# Event-Driven Data Visualization Lab

## Setup

1. We need to install **pandas** and **matplotlib**.

In [None]:
!pip install pandas matplotlib

2. We import **pandas** and **matplotlib** to create datasets and charts.   
We also import two helpers for JSON handling:  
`JSON` (from `IPython.display`) and `json.loads` will help us visualize the generated JSON.

In [None]:
from IPython.display import JSON
import json
import pandas as pd
import matplotlib.pyplot as plt

3. We set the **inspector** API address.

In [None]:
api = 'http://localhost:8082/v1.0'

## Products

If we use **pandas** to read the product API directly,  
we get a single column in which each cell holds one product in JSON format.

In [None]:
df = pd.read_json(f'{api}/product')
df
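
To see why we end up with a single column, here is a minimal sketch on an inline payload. The payload shape and its fields (`name`, `watched`, `bought`) are assumptions made for illustration; the real API response may differ.

```python
from io import StringIO
import pandas as pd

# Hypothetical payload: one top-level "products" key holding a list of objects.
payload = (
    '{"products": ['
    '{"name": "Sugar", "watched": 10, "bought": 4}, '
    '{"name": "Salt", "watched": 6}'
    ']}'
)

raw = pd.read_json(StringIO(payload))

# The top-level key becomes the only column; each cell holds one product as a dict.
print(raw.columns.tolist())
print(type(raw.products[0]))
```

Because pandas maps the top-level key to a column, the product fields stay nested inside each cell until we reshape them.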

We can reshape our JSON by calling `to_json` with the option `orient` set to `records`.  
We use `JSON` and `json.loads` to visualize the generated JSON.

In [None]:
JSON(json.loads(df.products.to_json(orient='records')))

We can then read the reshaped JSON back into a DataFrame, which gives one column per product field.

In [None]:
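from io import StringIO  # pandas >= 2.1 deprecates passing a literal JSON string to read_json
df = pd.read_json(StringIO(df.products.to_json(orient='records')))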
df
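
The reshaping step can be tried without the API. The sketch below uses made-up products (an assumption, not real lab data) and shows that fields missing from a record come back as `NaN`.

```python
from io import StringIO
import pandas as pd

# Made-up nested column, shaped like the one read from the product API.
raw = pd.DataFrame({'products': [
    {'name': 'Sugar', 'watched': 10, 'bought': 4},
    {'name': 'Salt', 'watched': 6},  # no 'bought' key in this record
]})

# A Series serialized with orient='records' becomes a JSON array of objects.
records = raw.products.to_json(orient='records')

# Reading that array back gives one column per field.
flat = pd.read_json(StringIO(records))
print(flat)
```

If the round trip through JSON feels heavy, `pd.DataFrame(raw.products.tolist())` is a possible alternative that builds the same kind of table directly from the dicts.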

A `NaN` means the product was never watched or never bought.  
We can therefore replace all the `NaN` values with 0.

In [None]:
df = df.fillna(0)
df
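
Filling `NaN` leaves the affected columns as floats. If we prefer integer counts, one option is to fill and cast only the count columns, as in this sketch on made-up data (the column names are assumed to match the product table).

```python
import pandas as pd

# Made-up product table with a missing 'bought' count.
products = pd.DataFrame({
    'name': ['Sugar', 'Salt'],
    'watched': [10, 6],
    'bought': [4.0, None],
})

# Fill only the count columns, then cast them to integers.
products = products.fillna({'watched': 0, 'bought': 0})
products = products.astype({'watched': int, 'bought': int})
print(products)
```

Restricting the fill to the count columns also avoids writing a 0 into a text column such as `name` if a value were ever missing there.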

We can visualize how often each product was watched and bought, which gives a sense of its conversion rate.  
We can observe which products are often watched but rarely bought.  

In [None]:
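fig, ax = plt.subplots()
# One group every 3 units on the x axis, with one bar on each side of the tick
ticks = [3 * i for i in range(len(df))]
ax.bar([t - 0.5 for t in ticks], df.watched, 1, label='watched')
ax.bar([t + 0.5 for t in ticks], df.bought, 1, label='bought')
ax.set_ylabel('Count')
ax.set_xticks(ticks)
ax.set_xticklabels(df.name)
ax.legend()
plt.show()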
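
The chart shows raw counts. If we want the conversion rate as a number, one possible definition is `bought / watched`. The sketch below uses made-up counts and treats a product that was never watched as having a rate of 0; both choices are assumptions.

```python
import pandas as pd

# Made-up counts; 'Pepper' was never watched.
products = pd.DataFrame({
    'name': ['Sugar', 'Salt', 'Pepper'],
    'watched': [10, 6, 0],
    'bought': [4, 0, 0],
})

# Mask zero view counts to avoid dividing by zero, then report those rates as 0.
watched = products.watched.where(products.watched > 0)
products['conversion'] = (products.bought / watched).fillna(0)

print(products.sort_values('conversion', ascending=False))
```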

We can now compare the proportion of products in two groups: `watched` and `bought`.

In [None]:
columns = df.columns.drop('name')
fig, axes = plt.subplots(1, columns.size)

for i, name in enumerate(columns):
    data = df[name]
    axes[i].pie(data, labels=df.name, radius=data.sum()/40, autopct='%1.1f%%', shadow=True)
    axes[i].set_title(name, loc='left', pad=50.0, fontdict={'fontweight': 'bold'})
    
plt.subplots_adjust(right=1.5)
plt.show()
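
The percentages printed on each pie are the share of every product within its group. As a rough check, we can compute them directly; the counts below are made up for illustration.

```python
import pandas as pd

# Made-up counts, indexed by product name.
products = pd.DataFrame({
    'name': ['Sugar', 'Salt', 'Pepper'],
    'watched': [10, 6, 4],
    'bought': [4, 0, 1],
}).set_index('name')

# Divide each column by its own total to get percentages per group.
shares = (products / products.sum() * 100).round(1)
print(shares)
```

Each column of `shares` adds up to 100, which matches what a pie chart of that column displays.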

## Customers

We use **pandas** to read the customer API, reshaping the JSON and filling the `NaN` values the same way as for the products.

In [None]:
df = pd.read_json(f'{api}/customer')
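from io import StringIO  # pandas >= 2.1 deprecates passing a literal JSON string to read_json
df = pd.read_json(StringIO(df.customers.to_json(orient='records')))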
df = df.fillna(0)
df

Here we can see which customers buy the most.  
We can also see the proportion of each product they purchase.

In [None]:
fig, ax = plt.subplots()
data = df.sort_values('products')
ax.barh(data.name, data.Sugar, 0.5, label='Sugar')
ax.barh(data.name, data.Salt, 0.5, label='Salt', left=data.Sugar)
ax.barh(data.name, data.Pepper, 0.5, label='Pepper', left=data.Sugar+data.Salt)
ax.set_xlabel('Count')
ax.legend()
plt.show()
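
The stacked bars work because each segment starts where the previous ones end (the `left` argument). A minimal sketch of how those offsets could be computed for any list of products follows; the customer names and counts are made up, and the product columns are assumed to be `Sugar`, `Salt` and `Pepper` as in the chart.

```python
import pandas as pd

# Made-up customer table with one column per product.
customers = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'Sugar': [3, 1],
    'Salt': [2, 0],
    'Pepper': [1, 4],
})
items = ['Sugar', 'Salt', 'Pepper']

# Left edge of each segment = purchases of all items stacked before it.
lefts = customers[items].cumsum(axis=1) - customers[items]

# Total purchases per customer, usable as a sort key.
total = customers[items].sum(axis=1)
print(lefts)
print(total)
```

Each `lefts[item]` column can then be passed as the `left` argument of `ax.barh` in a loop over `items`, instead of writing the sums by hand.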