# Visualising In the Spotlight Data Over Time

In this notebook we will produce some visualisations of [*In the Spotlight*](https://www.libcrowds.com/collection/playbills) performance data over time to see if we can begin to identify any trends.

As we move into more complicated territory, we won't explain every function in detail, but most readers should still find something here that they can follow.

We will again use pandas and plotly as our core Python libraries, both of which were introduced in previous notebooks.

In [214]:
import pandas
import plotly

## The dataset

Our input will again be the dataframe of performance data introduced in a [previous notebook](intro_to_analysing_its_data_using_python.ipynb). The dataframe is loaded in the code block below.

In [215]:
import os
import sys
module_path = os.path.abspath(os.path.join('..', 'data', 'scripts'))
if module_path not in sys.path:
    sys.path.append(module_path)
from get_its_performances import get_performances_df
df = get_performances_df()

As a reminder of how this dataframe looks, we can call its `head()` method.

In [216]:
df.head()

| Unnamed: 0 | title | date | genre | link | theatre | city | source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 1 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 2 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 3 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 4 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |


## Adding days, months and years to the dataframe

As we begin looking at our date information more closely it might be useful to add separate columns for day, month and year to our dataframe so that we can plot other entities against these values.

We will also want to remove any rows that do not contain a date, or that contain an incomplete date, as is the case for many of the playbills. The following line of code checks each value in the date column against a regular expression and keeps only the rows that match the pattern for a complete date.

In [217]:
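# Raw string avoids invalid-escape warnings; .copy() lets us add columns to the result safely.
df = df[df.date.str.contains(r'\d{4}-\d{2}-\d{2}', na=False)].copy()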
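
To see what a filter like this keeps, the following sketch applies the same pattern to a small made-up series. The sample values are hypothetical and not taken from the dataset. It uses `str.fullmatch` rather than `str.contains` as a stricter alternative, since `fullmatch` requires the whole value to match the pattern.

```python
import pandas

# Hypothetical date values, invented for illustration only.
sample = pandas.Series(['1829-12-03', '1829-12', '1829', None, '1829-12-03?'])

# A complete date is four digits, two digits and two digits, separated by hyphens.
pattern = r'\d{4}-\d{2}-\d{2}'

# na=False treats missing values as non-matching.
is_complete = sample.str.fullmatch(pattern, na=False)
complete_dates = sample[is_complete].tolist()
print(complete_dates)
```

Only the first value survives here. With `str.contains` the last value would also be kept, because the pattern appears somewhere inside it.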

The date column is then converted from strings to the pandas datetime type.

In [218]:
df['date'] = pandas.to_datetime(df['date'])

We are now ready to create our additional columns.

In [219]:
df['day'] = df['date'].dt.strftime('%d').astype('int32')
df['month'] = df['date'].dt.strftime('%m').astype('int32')
df['year'] = df['date'].dt.strftime('%Y').astype('int32')

In [220]:
df.head()

| Unnamed: 0 | title | date | genre | link | theatre | city | source | day | month | year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 3 | 12 | 1829 |
| 1 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 3 | 12 | 1829 |
| 2 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 3 | 12 | 1829 |
| 3 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 3 | 12 | 1829 |
| 4 | Black-Eyed Susan | 1829-12-03 | Nautical Drama | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 3 | 12 | 1829 |
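
The day, month and year columns can also be read directly from the `dt` accessor, which avoids the round trip through formatted strings. The following is a minimal sketch on a made-up two-row frame; the dates are hypothetical and `sample_df` is not part of the notebook's data.

```python
import pandas

# Hypothetical rows, invented for illustration only.
sample_df = pandas.DataFrame({'date': ['1829-12-03', '1830-01-15']})
sample_df['date'] = pandas.to_datetime(sample_df['date'])

# dt.day, dt.month and dt.year return the date parts as integers.
sample_df['day'] = sample_df['date'].dt.day
sample_df['month'] = sample_df['date'].dt.month
sample_df['year'] = sample_df['date'].dt.year
print(sample_df[['day', 'month', 'year']])
```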


## Mapping entities against date parts

Now that we have our date parts in separate columns, we can begin to map other entities against them. The first code block below chooses an entity and a date part to map against each other and finds the most common values of that entity (here, the ten most common genres). The second creates a dictionary with counts of each of those values for each date part.

In [221]:
entity = 'genre'
date_part = 'month'
limit = 10
min_part = df[date_part].min()
max_part = df[date_part].max()
top_entities = df[entity].value_counts().index.tolist()[:limit]
groups = df.groupby(entity)
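
The `top_entities` line relies on `value_counts()` returning its results sorted by count, largest first. The sketch below shows the same idea on a small made-up series; the genre values and the names `sample`, `sample_limit` and `sample_top` are hypothetical.

```python
import pandas

# Hypothetical genre values, invented for illustration only.
sample = pandas.Series(['Comedy', 'Farce', 'Comedy', 'Opera', 'Comedy', 'Farce'])

# value_counts() sorts by count in descending order, so slicing the
# index gives the most common values.
sample_limit = 2
sample_top = sample.value_counts().index.tolist()[:sample_limit]
print(sample_top)
```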

In [222]:
data = {}
for key, group_df in groups:
    # Skip any entity outside the top `limit`.
    if key not in top_entities:
        continue
    # Count the rows for each date part.
    counts = group_df[date_part].value_counts().to_dict()
    # Cover the full range, using 0 where a date part has no rows.
    x = list(range(min_part, max_part + 1))
    y = [counts.get(item, 0) for item in x]
    data[key] = {'x': x, 'y': y}
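
The loop above can also be expressed with `pandas.crosstab`, which builds the whole table of counts in one call. The sketch below is one possible alternative rather than the notebook's method. It uses a small made-up frame (the genres and months are hypothetical), fills missing months with zero as the loop does, and leaves out the top-entities filter for brevity.

```python
import pandas

# Hypothetical performances, invented for illustration only.
sample_df = pandas.DataFrame({
    'genre': ['Comedy', 'Comedy', 'Farce', 'Comedy', 'Farce'],
    'month': [1, 3, 3, 3, 4],
})

# Rows are months, columns are genres, cells are counts.
counts = pandas.crosstab(sample_df['month'], sample_df['genre'])

# Reindex so that every month in the range appears, with 0 where no rows exist.
full_range = range(sample_df['month'].min(), sample_df['month'].max() + 1)
counts = counts.reindex(full_range, fill_value=0)

# Convert to the same {'x': [...], 'y': [...]} shape used for plotting.
sample_data = {
    genre: {'x': counts.index.tolist(), 'y': counts[genre].tolist()}
    for genre in counts.columns
}
print(sample_data)
```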

We can now plot the chart, adding one trace for each entity.

In [223]:
f = plotly.graph_objs.FigureWidget()
for key, value in data.items():
    f.add_scatter(x=value['x'], y=value['y'], name=key)
f

[Output: interactive plotly `FigureWidget` scatter chart; the first trace is named 'Comedy'.]

## Summary

Work in progress!