# Collections choropleth visualisation

In this notebook I’m going to start building choropleth maps for the British Museum, the V&A, and then the combined dataset. I’ll be using the `ModernCountry` column to plot each country’s object count onto a world map. The dataset should already have consistent country names, which means Plotly should be able to recognise them without much extra work.

The plan is to create:
- a world map for the V&A objects  
- a world map for the British Museum objects  
- a world map for the combined dataset  

I’m following a couple of Plotly resources to guide the structure of the code, but I’ll keep the explanations here simple and focused on what I’m doing at each step. I’m also going to use a custom colour palette instead of the default Plotly colours, so the visuals match the palette I generated earlier.

In [1]:
# This will be included in the requirements.txt
# %pip install plotly

In [2]:
import pandas as pd
import plotly.express as px

combined_path = "../data/combined_collections_dataset.csv"

combined_df = pd.read_csv(combined_path)

# quick check of the first few rows
combined_df.head()


Unnamed: 0,RecordID,Museum,LocalID,AcqDate,ObjectType,ItemDate,StartDate,EndDate,MidpointDate,ItemPlace,ModernCountry,Culture,ItemMaterial,ItemTechnique
0,1,BM,"No: 1886,0401.45",,acroterion,520BC-490BC (circa); 550BC - 500BC,-520.0,-490.0,-505.0,South Ionia,Greece,Archaic Greek,marble,painted
1,2,BM,"No: 1816,0610.321",1816.0,acroterion,420BC-400BC,-420.0,-400.0,-410.0,,Greece,Classical Greek,marble,
2,3,BM,"No: 1893,0315.1",,acroterion,300BC (circa),-300.0,-300.0,-300.0,Taranto,Italy,Apulian (Greek),limestone,
3,4,BM,"No: 1843,0531.26",1843.0,acroterion,447BC-432BC,-447.0,-432.0,-440.0,,Greece,Classical Greek,marble,
4,5,BM,No: EA11468,1848.0,amulet; figure,,-1069.0,-332.0,-700.5,,Egypt,Third Intermediate; Late Period,glazed composition,glazed


## Step 1. Counting objects per country

Now that the combined dataset is loaded, the first thing I need is a simple count of how many objects each modern country has. I want this in three ways:

- counts for the **British Museum** only  
- counts for the **V&A** only  
- counts for **both museums combined**

I’ll do this by grouping the data by:

- `ModernCountry` (the cleaned modern country names), and  
- `Museum` (so I can separate BM and V&A)

Once I’ve got these counts in a tidy table, I can feed them into Plotly to build separate choropleth maps for each museum and then a combined one.


In [3]:
# count objects per modern country and museum
country_counts = (
    combined_df
    .dropna(subset=["ModernCountry"])
    .groupby(["Museum", "ModernCountry"])
    .size()
    .reset_index(name="count")
)

# split into separate tables for BM and V&A
bm_counts = (
    country_counts
    .query("Museum == 'BM'")
    .drop(columns="Museum")
    .sort_values("count", ascending=False)
)

va_counts = (
    country_counts
    .query("Museum == 'VAM'")
    .drop(columns="Museum")
    .sort_values("count", ascending=False)
)

# combined counts across both museums
combined_counts = (
    combined_df
    .dropna(subset=["ModernCountry"])
    .groupby("ModernCountry")
    .size()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
)

# show the three tables for inspection

print("\n=== British Museum: Object Counts by Country ===\n")
print(bm_counts.to_string(index=False))

print("\n=== V&A: Object Counts by Country ===\n")
print(va_counts.to_string(index=False))

print("\n=== Combined: Object Counts by Country ===\n")
print(combined_counts.to_string(index=False))




=== British Museum: Object Counts by Country ===

               ModernCountry  count
                      Greece    844
                       Egypt    665
                       Italy    642
                        Iraq    376
                       China    253
                       India    172
              United Kingdom    164
                      Turkey    119
                      Cyprus    104
                        Iran     83
                      Mexico     62
                    Pakistan     59
                       Japan     41
                       Libya     30
                       Sudan     30
                      France     26
                   Sri Lanka     23
                     Tunisia     23
                     Nigeria     20
                       Nepal     19
                   Indonesia     17
Democratic Republic of Congo     12
                       Syria     11
                      Canada      9
                     Germany      9
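
As a cross-check on the three tables, the same counts can also be laid out in a single wide table. This is a minimal sketch using `pd.crosstab` on a few invented rows standing in for `combined_df`; the toy data and the `counts_wide` name are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Toy rows standing in for combined_df (values invented for illustration)
toy_df = pd.DataFrame({
    "Museum": ["BM", "BM", "VAM", "VAM", "BM", "VAM"],
    "ModernCountry": ["Greece", "Egypt", "Greece", None, "Greece", "Italy"],
})

# Drop rows with no modern country, as in the cell above
toy_clean = toy_df.dropna(subset=["ModernCountry"])

# Rows = countries, columns = museums.
# margins=True adds a total row and a total column, named "Combined" here.
counts_wide = pd.crosstab(
    toy_clean["ModernCountry"],
    toy_clean["Museum"],
    margins=True,
    margins_name="Combined",
)

print(counts_wide)
```

The "Combined" column should match the combined counts, and the BM and VAM columns should match the per-museum tables, so any mismatch would point to a grouping problem.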
             

## Step 2. First choropleth map (combined dataset)

Now that I’ve got counts per country for each museum, I can try the first choropleth. For this one I’m using the combined counts (BM + V&A together) so I only need:
- one row per modern country
- a single value showing how many objects that country has in total

Plotly will match the `ModernCountry` names to country shapes on a world map. I’m also using my custom colour palette, so the colours run from orange for the lowest counts, through pinks and purples, into deep blue for the highest.
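
When a plain list of colours is passed as a continuous colour scale, Plotly spaces the colours evenly across the value range, with the first colour at the low end. The sketch below makes that spacing explicit in plain Python, using the palette from this notebook. The helper name `evenly_spaced_stops` is my own and is not a Plotly function.

```python
# Palette copied from this notebook, in the order it is passed to Plotly
custom_colors = [
    "#ffa600", "#ff7c43", "#f95d6a", "#d45087",
    "#a05195", "#665191", "#2f4b7c", "#003f5c",
]

def evenly_spaced_stops(colors):
    """Pair each colour with its position between 0 (lowest value) and 1 (highest)."""
    last = len(colors) - 1
    return [(i / last, colour) for i, colour in enumerate(colors)]

stops = evenly_spaced_stops(custom_colors)

# Print position and colour so the direction of the scale is easy to read
for position, colour in stops:
    print(f"{position:.2f}  {colour}")
```

With this order, `#ffa600` (orange) sits at the lowest counts and `#003f5c` (dark blue) at the highest. Passing `custom_colors[::-1]` instead would flip the direction.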


In [4]:
import plotly.express as px

# recompute combined counts to be safe
combined_counts = (
    combined_df
    .dropna(subset=["ModernCountry"])
    .groupby("ModernCountry")
    .size()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
)

# colour palette, listed from orange (lowest counts) through to dark blue (highest counts)
custom_colors = [
    "#ffa600",
    "#ff7c43",
    "#f95d6a",
    "#d45087",
    "#a05195",
    "#665191",
    "#2f4b7c",
    "#003f5c"
]

# pale yellow for "no data" land
land_no_data_color = "#FFF7E6"

def make_choropleth(df, title):
    # main choropleth
    fig = px.choropleth(
        df,
        locations="ModernCountry",
        locationmode="country names",
        color="count",
        color_continuous_scale=custom_colors,
        labels={"count": "Object count"},
    )

    # tidy borders and hover text
    fig.update_traces(
        marker_line_color="white",
        marker_line_width=0.5,
        hovertemplate="<b>%{location}</b><br>Count: %{z}<extra></extra>"
    )

    # map styling: white country borders, pale yellow land, pale blue sea
    fig.update_geos(
        projection_type="natural earth",
        showcountries=True,
        countrycolor="white",
        landcolor=land_no_data_color,
        showcoastlines=False,
        showframe=False,
    )

    fig.update_layout(
        title=title,
        geo=dict(
            bgcolor="#bce8e6",   # pale blue sea
        ),
        coloraxis_colorbar=dict(
            title="Objects",
            ticks="outside"
        ),
        # reduce overall white space
        width=900,
        height=450,
        margin=dict(l=40, r=80, t=60, b=20),
    )

    return fig


In [5]:
# This will be included in the requirements.txt
# %pip install --upgrade kaleido
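
The next cell writes PNG and HTML files into `../visualisations/`, and writing will fail if that folder does not exist yet. A minimal sketch of creating it first with `pathlib` is shown here. The helper name `ensure_output_dir` is hypothetical, and the example runs inside a temporary directory so it does not touch the project.

```python
import tempfile
from pathlib import Path

def ensure_output_dir(folder):
    """Create the folder (and any missing parents) if needed, then return it as a Path."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return out_dir

# Demonstrated in a temporary directory so nothing is written to the project
with tempfile.TemporaryDirectory() as tmp:
    out_dir = ensure_output_dir(Path(tmp) / "visualisations")
    png_path = out_dir / "combined_choropleth.png"
    created = out_dir.is_dir()
    print(created, png_path.name)
```

In the notebook itself this would be `out_dir = ensure_output_dir("../visualisations")`, with the save calls then given `str(out_dir / "combined_choropleth.png")` and so on.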

In [7]:
# First choropleth using combined counts
fig_combined = make_choropleth(
    combined_counts,
    "Combined collections – objects per country"
)
fig_combined.show()

# Save the choropleth as a PNG in the visualisations folder
fig_combined.write_image("../visualisations/combined_choropleth.png", scale=2)

# save an interactive HTML version
fig_combined.write_html("../visualisations/combined_choropleth.html")



The library used by the *country names* `locationmode` option is changing in an upcoming version. Country names in existing plots may not work in the new version. To ensure consistent behavior, consider setting `locationmode` to *ISO-3*.
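
The warning suggests switching `locationmode` to ISO-3 codes. One way to prepare for that is a small name-to-code lookup. The sketch below is hand-written for the countries visible in the British Museum output above; the `NAME_TO_ISO3` dictionary and `to_iso3` helper are my own assumptions, and any other name in `ModernCountry` would need adding.

```python
# Hand-written lookup (ISO 3166-1 alpha-3) for the countries listed in the BM output.
# Assumption: other ModernCountry values exist and would need their own entries.
NAME_TO_ISO3 = {
    "Greece": "GRC", "Egypt": "EGY", "Italy": "ITA", "Iraq": "IRQ",
    "China": "CHN", "India": "IND", "United Kingdom": "GBR", "Turkey": "TUR",
    "Cyprus": "CYP", "Iran": "IRN", "Mexico": "MEX", "Pakistan": "PAK",
    "Japan": "JPN", "Libya": "LBY", "Sudan": "SDN", "France": "FRA",
    "Sri Lanka": "LKA", "Tunisia": "TUN", "Nigeria": "NGA", "Nepal": "NPL",
    "Indonesia": "IDN", "Democratic Republic of Congo": "COD", "Syria": "SYR",
    "Canada": "CAN", "Germany": "DEU",
}

def to_iso3(names):
    """Return (codes, unmatched): one code per name (None if unknown), plus the unknown names."""
    codes = [NAME_TO_ISO3.get(name) for name in names]
    unmatched = sorted({name for name, code in zip(names, codes) if code is None})
    return codes, unmatched

# "Atlantis" is a deliberately unknown name to show how gaps are reported
codes, unmatched = to_iso3(["Greece", "Egypt", "Atlantis"])
print(codes)
print(unmatched)
```

With the counts tables, the same lookup could be applied as `combined_counts["iso3"] = combined_counts["ModernCountry"].map(NAME_TO_ISO3)`, and the map built with `locations="iso3"` and `locationmode="ISO-3"`. The hover text would then show the code unless the country name is passed in separately.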



## Step 3. Separate choropleths for each museum

Now that the combined choropleth is working, I want to see how the two museums compare on their own. In this step I'll:

- reuse the same `make_choropleth` function  
- plug in the **British Museum** country counts  
- then plug in the **V&A** country counts  

Each map will still use the `ModernCountry` names and the same colour palette, so the three maps (combined, BM only, V&A only) are easy to compare side by side.
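
One caveat for side-by-side comparison: by default each map scales its colours to its own minimum and maximum, so the same colour can mean different counts on different maps. A possible fix is to give every map the same colour range. The sketch below only computes that shared range. The helper name is hypothetical, and the second list of counts is invented for illustration.

```python
def shared_colour_range(*count_columns):
    """Return (0, largest count) across all the given columns of counts."""
    largest = max(max(column) for column in count_columns)
    return (0, largest)

# First list: the top BM counts from the output above.
# Second list: invented numbers standing in for another museum's counts.
example_range = shared_colour_range([844, 665, 642], [300, 120, 12])
print(example_range)
```

This would only take effect if `make_choropleth` were extended to accept the range and pass it to `px.choropleth` through its `range_color` argument, which is not something the function currently does.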


In [8]:
# choropleth for the British Museum
fig_bm = make_choropleth(
    bm_counts,
    "British Museum – objects per country"
)
fig_bm.show()

# Save as PNG
fig_bm.write_image("../visualisations/bm_choropleth.png", scale=2)

# Save as HTML 
fig_bm.write_html("../visualisations/bm_choropleth.html")






In [13]:
# choropleth for the V&A
fig_va = make_choropleth(
    va_counts,
    "V&A – objects per country"
)
fig_va.show()

# Save as PNG
fig_va.write_image("../visualisations/va_choropleth.png", scale=2)

# Save as HTML
fig_va.write_html("../visualisations/va_choropleth.html")






## Summary of the choropleth visualisation work

- I followed the Plotly guide on making choropleth maps, adapting the examples so they worked with my dataset.

- I started by creating a choropleth for the combined dataset using my custom colour palette.  
  It worked as expected, but the overall effect felt a bit flat on the default grey background.

- To make the map more visually interesting and easier to read, I made three stylistic changes:
  - I changed the sea to a pale blue (`#bce8e6`), which helps the coloured countries stand out.
  - I set countries with no objects in the data to a very pale yellow so they are still visible and outlined, rather than disappearing into the background.
  - I removed the extra white space surrounding the choropleth.

- I then applied the same approach to the British Museum and V&A datasets individually.  
  Each map now uses:
  - the same reversed colour palette  
  - white outlines for all country borders  
  - the pale blue sea  
  - pale yellow for any country with no objects in that museum’s data

- Each version of the choropleth has been saved as a static PNG image and an interactive HTML file in a newly created folder, `visualisations`.

These adjustments make all three maps much clearer and more visually balanced. The combined map, the BM map, and the V&A map now read consistently and show the geographical distribution of objects in a way that’s easy to compare.
