# Exploring the Demarco Archive

An interactive tool for exploring the connections between the assets of the [Demarco Archive](https://www.demarco-archive.ac.uk/).

------

## Import packages
This is the off-the-shelf free software that we will use.

In [1]:
import plotly.express as px
import requests
import pandas as pd
from bs4 import BeautifulSoup
import itertools

## Preparing our geographical data
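Because we want to plot our data on a map, we first need a reference table of European countries and their ISO codes. The table also covers the historical states Yugoslavia and Czechoslovakia, plus Malta and North Macedonia, which are missing from the gapminder dataset.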

In [2]:
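# Start from the European countries in plotly's bundled gapminder dataset
europe = px.data.gapminder().query("year==2007 and continent=='Europe'")
# .copy() avoids a SettingWithCopyWarning when the new column is added below
europe = europe[["country", "iso_alpha"]].copy()
# "georef" is the name that tokens are matched against; by default, the country itself
europe["georef"] = europe["country"]
# Successor states get an extra row filed under the historical state's name
yug = europe[europe.country.isin(["Bosnia and Herzegovina", "Croatia", "Slovenia", "Serbia", "Montenegro"])].reset_index()
yug["georef"] = "Yugoslavia"
czk = europe[europe.country.isin(["Czech Republic", "Slovak Republic"])].reset_index()
czk["georef"] = "Czechoslovakia"
# Countries absent from gapminder are added by hand
missing = pd.DataFrame({"country":["Malta", "North Macedonia"], "iso_alpha":["MLT", "MKD"], "georef":["Malta", "Yugoslavia"]})
europe_ext = pd.concat([europe, yug, czk, missing])
# Drop the "index" column left over from reset_index() and renumber the rows
europe_ext = europe_ext.drop(columns=["index"])
europe_ext = europe_ext.reset_index(drop=True)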

In [37]:
europe_ext.head()

| | country | iso_alpha | georef |
|---|---|---|---|
| 0 | Albania | ALB | Albania |
| 1 | Austria | AUT | Austria |
| 2 | Belgium | BEL | Belgium |
| 3 | Bosnia and Herzegovina | BIH | Bosnia and Herzegovina |
| 4 | Bulgaria | BGR | Bulgaria |
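
To illustrate how this reference table is used later in `analyse_token`, here is a minimal sketch of a left merge between the table and a set of token counts. The three reference rows are a hand-written stand-in for `europe_ext`, and the counts are invented for the example; neither comes from the archive.

```python
import pandas as pd

# Hand-written stand-in for a few rows of europe_ext (not the full table)
reference = pd.DataFrame({
    "country": ["Albania", "Austria", "Malta"],
    "iso_alpha": ["ALB", "AUT", "MLT"],
    "georef": ["Albania", "Austria", "Malta"],
})

# Invented token counts; "Malta" appears twice to show the aggregation
counts = pd.DataFrame({"token": ["Malta", "Austria", "Malta"], "count": [3, 1, 2]})

# Sum the counts per token, then attach them to the reference rows
totals = counts.groupby("token")["count"].sum().reset_index()
merged = reference.merge(totals, left_on="georef", right_on="token", how="left")

print(merged[["country", "iso_alpha", "count"]])
```

Because the merge is a left join, every country in the reference table is kept, and countries with no matching token end up with a missing count rather than being dropped.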


The next step is to map a few locations to their countries, so that tokens mentioning a place name or a national adjective can be attributed to a country.

In [5]:
locations_dict = {
    "Edinburgh": "United Kingdom",
    "Rome": "Italy",
    "Bucharest": "Romania",
    "Warsaw": "Poland",
    "Krakow": "Poland",
    "Cracov": "Poland",
    "Athens": "Greece",
    "Glasgow": "United Kingdom",
    "Romanian": "Romania",
    "Polish": "Poland",
    "Scottish": "United Kingdom",
    "Dresden":"Germany",
    "Berlin": "Germany",
    "Valletta": "Malta",
    "Hagar Qim": "Malta",
    "Motovun": "Croatia",
    "At Gigantija": "Malta",
    "Gozo": "Malta",
    "Florence": "Italy",
    "Follonica": "Italy",
    "Milan": "Italy",
    "Venice":"Italy"
}
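
As a rough illustration of how such a dictionary can be applied, the sketch below looks up the country for a single token by substring matching. The helper `country_for_token`, the shortened `sample_locations` dictionary, and the example tokens are all hypothetical; the notebook itself performs the equivalent step on a whole pandas column.

```python
# Hypothetical excerpt of the lookup table; the full dictionary is defined above
sample_locations = {
    "Edinburgh": "United Kingdom",
    "Krakow": "Poland",
    "Hagar Qim": "Malta",
}

def country_for_token(token, locations):
    """Return the country of the first location name found in `token`, or None."""
    # Commas are replaced by spaces before matching, as in the notebook
    cleaned = token.replace(",", " ")
    for place, country in locations.items():
        if place in cleaned:
            return country
    return None

# Invented tokens, for illustration only
print(country_for_token("Edinburgh Arts 1975", sample_locations))
print(country_for_token("Temple at Hagar Qim, Malta", sample_locations))
print(country_for_token("Portrait", sample_locations))
```

Dictionary order matters here: when a token mentions more than one known place, the first matching key decides the country.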

## The code
In the next step we define a "function", a block of reusable code that will perform the operations we need:
- query the archive for a specific search string
- plot all the "tokens" associated with our search string
- plot all the occurrences of recognised geospatial tokens on a map of Europe
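
The first of these steps, querying the archive, comes down to building a search URL. A minimal sketch using the standard library's `urlencode` is shown here, assuming the parameter names that appear in the request string inside `analyse_token`; the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.demarco-archive.ac.uk/search"

def build_search_url(search_word):
    """Build the archive search URL for a search string (hypothetical helper)."""
    params = {
        "utf8": "\u2713",            # the check mark, encoded as %E2%9C%93
        "search_type": "assets",
        "q": str(search_word).strip(),
        "after": "",
        "before": "",
    }
    # urlencode percent-encodes special characters and turns spaces into "+"
    return f"{BASE_URL}?{urlencode(params)}"

print(build_search_url("Paolo Cardazzo"))
```

Compared with replacing spaces by hand, `urlencode` also escapes characters such as `&` or `#` that would otherwise break the query string.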

In [38]:
def analyse_token(search_word: str):
    search_string = str(search_word).strip().replace(" ", "+")
    request_string = f"https://www.demarco-archive.ac.uk/search?utf8=%E2%9C%93&search_type=assets&q={search_string}&after=&before="
    response = requests.get(request_string, timeout=30)  # give up if the archive does not answer within 30 seconds
    if response.status_code == 200:
        soup2 = BeautifulSoup(response.text, "html.parser")
        content = [t.h1.text.split(".") for t in soup2.find_all("a", {"class": ['gallery-overlay', 'gallery-popup']})]
        tokens = [t for t in list(itertools.chain.from_iterable(content)) if t!=""]
        if len(tokens)==0:
            print("no result found")
            return 1
        token_count = pd.Series(tokens).value_counts().reset_index().rename(columns={"index":"token", 0:"count"})
        fig = px.bar(token_count, x="token", y="count")
        fig.update_xaxes(tickangle = 45)
        fig.show()
        for k in locations_dict.keys():
            token_count.loc[token_count.token.str.replace(",", " ").str.contains(str(k)), "token"] = locations_dict.get(k)
        for country in europe.country:
            token_count.loc[token_count.token.str.contains(str(country)), "token"] = str(country)
        df = pd.merge(europe_ext.drop_duplicates(subset="georef"), token_count.groupby(['token'])['count'].sum().reset_index(),  right_on=["token"], left_on=["georef"], how="left")
        mapfig = px.choropleth(df, locations="iso_alpha",
                            color="count", # summed token count for each country
                            hover_name="country", # column to add to hover information
                            color_continuous_scale=px.colors.sequential.Viridis_r,
                            scope="europe"
                            )
        mapfig.show() 
        return 1
    else:
        print("the request was not successful")
        return 0
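
To see the token-counting step in isolation, without contacting the archive, the sketch below splits a few titles on full stops and counts the resulting tokens with the standard library. The titles are invented stand-ins for the heading text scraped from a results page, and stripping whitespace around each token is an addition that the function above does not perform.

```python
from collections import Counter
from itertools import chain

# Invented asset titles, standing in for the <h1> text of each search result
titles = [
    "Venice. Paolo Cardazzo. 1975",
    "Venice. Galleria del Cavallino",
    "Edinburgh. Paolo Cardazzo",
]

# Split every title on full stops and flatten the pieces into one sequence
pieces = chain.from_iterable(title.split(".") for title in titles)

# Strip surrounding whitespace and drop empty pieces
tokens = [piece.strip() for piece in pieces if piece.strip()]

token_count = Counter(tokens)
print(token_count.most_common())
```

`Counter` plays the role that `value_counts()` plays in the pandas version: it maps each distinct token to the number of times it occurs.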

## Let's run our code, at last!

In [39]:
analyse_token("Venice")

1

## And again!

In [40]:
analyse_token("Paolo Cardazzo")

1