# An Introduction to Visualising In the Spotlight Data Using Python

This notebook demonstrates a way of producing visualisations of [*In the Spotlight*](https://www.libcrowds.com/collection/playbills) results data, using Python.

Taking the dataframe produced in the notebook [An Introduction to Analysing In the Spotlight Data Using Python](intro_to_analysing_its_data_using_python.ipynb), we will see how to produce statistical charts using just a few lines of code.

The Python library [plotly.py](https://plot.ly/d3-js-for-python-and-pandas-charts/) is a graphing library that can be used to produce many chart types that can be viewed in Jupyter notebooks.

We begin by importing the required libraries.

In [2]:
import pandas
import plotly

## The dataset

In the notebook [An Introduction to Analysing In the Spotlight Data Using Python](intro_to_analysing_its_data_using_python.ipynb) we imported all of the results from [*In the Spotlight*](https://www.libcrowds.com/collection/playbills) into a pandas dataframe. Towards the end of the notebook we stored the dataframe to disk. Here, we will load it back into memory so that we can use its contents for our visualisations.

In [3]:
df = pandas.read_pickle('../data/its_transcriptions.pkl')

## Pie charts

Pie charts are probably one of the most straightforward types of visualisation to get started with. All we need is a list of labels and a list of values.

As with the other types of chart we will see later, there are additional options available for changing the chart's colours, hiding the legend, displaying additional information when hovering over the chart and so on. For now, we will stick to the default options.

For this chart, we will plot the top ten genres collected so far, which we can get by using the `value_counts()` method that was introduced in an [earlier notebook](intro_to_analysing_its_data_using_python.ipynb). The first few rows are also displayed below to give us a quick snapshot of the data.

In [78]:
genre_df = df[df['tag'] == 'genre']
genre_counts = genre_df['transcription'].value_counts()
genre_counts.head()

Comedy     450
Farce      441
Drama      210
Tragedy    161
Play       134
Name: transcription, dtype: int64

As we can see above, we now have an index of genres against a list of values, which can now be used to define the input for our pie chart. Below, the first ten items of the index and values are converted to lists and assigned to the variables `labels` and `values`.

In [84]:
labels = genre_counts[:10].index.tolist()
values = genre_counts[:10].tolist()

Plotly charts are generated from a list of traces. A trace is just the name we give to a collection of data together with the specification of how we want that data plotted. Some charts can plot multiple traces; in this case, we only have one.

Below, a pie chart trace is defined, then used as the only item in a list to become our chart data. 

In [85]:
trace = plotly.graph_objs.Pie(labels=labels, values=values)
chart_data = [trace]

Finally, we can plot the chart with the following line of code.

In [86]:
plotly.offline.iplot(chart_data)
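
The additional options mentioned earlier (colours, legend, hover text) are set partly on the trace and partly on a layout that applies to the whole chart. The following is a minimal sketch rather than a definitive recipe. It writes the trace and layout as plain dictionaries, which plotly accepts in place of the `graph_objs` classes, and the `example_` names, colour codes and title are illustrative assumptions.

```python
# Hypothetical stand-ins for the labels and values lists built above
example_labels = ['Comedy', 'Farce', 'Drama']
example_values = [450, 441, 210]

# A trace written as a plain dictionary; 'type' selects the chart type
pie_trace = {
    'type': 'pie',
    'labels': example_labels,
    'values': example_values,
    'hoverinfo': 'label+percent',  # text shown when hovering over a slice
    'marker': {'colors': ['#1f77b4', '#ff7f0e', '#2ca02c']},  # one colour per slice
}

# Options that apply to the whole chart rather than to a single trace
layout = {'title': 'Most common genres', 'showlegend': False}

# A figure bundles the list of traces with the layout
figure = {'data': [pie_trace], 'layout': layout}

# In a notebook, the figure could then be drawn with:
# plotly.offline.iplot(figure)
```

Passing a figure instead of a bare list of traces is what makes room for the layout; the list of traces is still there under the `data` key.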

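## Line charts

For this section, we will use the dates transcribed so far. Each date transcription is a string with the year, month and day separated by hyphens, so we begin by splitting it into three columns. We will then count the transcriptions for each month and plot the counts, first as a bar chart and then as a line chart.
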
In [113]:
date_df = df[df['tag'] == 'date']
date_df = date_df['transcription'].str.split(pat='-', expand=True)
date_df.head()

|      | 0    | 1 | 2  |
|------|------|---|----|
| 4096 | 1828 | 3 | 11 |
| 4183 | 1828 | 4 | 14 |
| 4216 | 1828 | 5 | 29 |
| 4458 | 1829 | 4 | 21 |
| 6337 | 1828 | 6 | 7  |


In [119]:
month_counts = date_df[1].value_counts()
x = month_counts.index.tolist()
y = month_counts.tolist()
trace = plotly.graph_objs.Bar(x=x, y=y)

date_data = [trace]
plotly.offline.iplot(date_data)

Similar charts for the year or day could be generated by replacing `1` in the first line of the above code block with `0` or `2`.
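
Year and month can also be combined in a single chart by giving each year its own trace, since a bar chart can plot multiple traces. The block below is a rough sketch under stated assumptions: `toy_dates` is made-up data in the same shape as `date_df` (column `0` for the year, column `1` for the month), and the traces are written as plain dictionaries rather than `graph_objs` objects.

```python
import pandas

# Hypothetical rows in the same shape as date_df (assumption, not real results)
toy_dates = pandas.DataFrame({
    0: ['1828', '1828', '1828', '1829', '1829'],  # year
    1: ['3', '4', '4', '4', '6'],                 # month
})

year_traces = []
for year, rows in toy_dates.groupby(0):
    # Count the rows per month for this year, ordered by month number
    counts = pandas.to_numeric(rows[1]).value_counts().sort_index()
    year_traces.append({
        'type': 'bar',
        'name': year,  # label shown in the legend
        'x': counts.index.tolist(),
        'y': counts.tolist(),
    })

# In a notebook, the traces could then be drawn together with:
# plotly.offline.iplot(year_traces)
```

Each dictionary in `year_traces` becomes one set of bars, so the years can be compared month by month.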

In [123]:
trace = plotly.graph_objs.Scatter(x=x, y=y)

date_data = [trace]
plotly.offline.iplot(date_data)
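
One thing to watch for with line charts is the order of the points. `value_counts()` returns its results sorted by count rather than by month, and the months are still strings at this stage, so a line drawn through them may jump back and forth along the x axis. A minimal sketch of one way to put the months in calendar order follows; `toy_months` is hypothetical data standing in for `date_df[1]`, and dropping values that cannot be read as numbers is an assumption about how gaps should be handled.

```python
import pandas

# Hypothetical month strings, including one missing value (assumption, not real results)
toy_months = pandas.Series(['3', '4', '5', '4', '6', '10', '3', '4', None])

# Convert to numbers, turning anything unreadable into NaN, then drop those rows
numeric_months = pandas.to_numeric(toy_months, errors='coerce').dropna().astype(int)

# Count each month, then order the result by month number instead of by count
sorted_counts = numeric_months.value_counts().sort_index()

sorted_x = sorted_counts.index.tolist()
sorted_y = sorted_counts.tolist()

# In a notebook, these lists could be passed to a trace as before:
# trace = plotly.graph_objs.Scatter(x=sorted_x, y=sorted_y)
```

Converting to integers before sorting matters because sorting the strings directly would place `'10'` before `'3'`.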