# Interactive Visualization Lab

Complete the following set of exercises to solidify your knowledge of interactive visualization using Plotly, Cufflinks, and IPyWidgets.

In [1]:
import pandas as pd
import chart_studio.plotly as py
import cufflinks as cf
from ipywidgets import interact

cf.go_offline()

In [2]:
data = pd.read_excel('../data/Online Retail.xlsx')
data.head()

Unnamed: 0,InvoiceNo,InvoiceDate,StockCode,Description,Quantity,UnitPrice,Revenue,CustomerID,Country
0,536365,2010-12-01 08:26:00,85123A,CREAM HANGING HEART T-LIGHT HOLDER,6,2.55,15.3,17850,United Kingdom
1,536373,2010-12-01 09:02:00,85123A,CREAM HANGING HEART T-LIGHT HOLDER,6,2.55,15.3,17850,United Kingdom
2,536375,2010-12-01 09:32:00,85123A,CREAM HANGING HEART T-LIGHT HOLDER,6,2.55,15.3,17850,United Kingdom
3,536390,2010-12-01 10:19:00,85123A,CREAM HANGING HEART T-LIGHT HOLDER,64,2.55,163.2,17511,United Kingdom
4,536394,2010-12-01 10:39:00,85123A,CREAM HANGING HEART T-LIGHT HOLDER,32,2.55,81.6,13408,United Kingdom


## 1. Create an interactive bar chart showing total quantity and revenue by country (excluding United Kingdom) for the month of April 2011.

In [3]:
# Select April 2011 with the datetime accessor instead of matching the date as text.
april = (data['InvoiceDate'].dt.year == 2011) & (data['InvoiceDate'].dt.month == 4)
filtro = data[(data['Country'] != 'United Kingdom') & april]
value = filtro.groupby('Country').agg({'Quantity': 'sum', 'Revenue': 'sum'})
value.iplot(kind='bar', xTitle='Country', yTitle='Total', title='Total quantity and revenue by country, April 2011', filename='cufflinks/categorical-bar-chart')



## 2. Create an interactive line chart showing quantity and revenue sold to France between January 1st and May 31st 2011.

In [4]:
import plotly.express as px
# January 1st through May 31st 2011 inclusive: the upper bound is the start of June.
filtro2 = data[(data['Country']=='France')&(data['InvoiceDate']>='2011-01-01')&(data['InvoiceDate']<'2011-06-01')]
filtro2.groupby('InvoiceDate').agg({'Quantity':'sum','Revenue':'sum'}).iplot(filename='cufflinks/line-example')


## 3. Create an interactive scatter plot showing the relationship between average quantity (x-axis) and average unit price (y-axis) for the product PARTY BUNTING with the plot points color-coded by country (categories).

In [5]:
filtro3 = data[(data['Description']=='PARTY BUNTING')]
# One point per country: mean quantity (x) against mean unit price (y).
filtro3 = filtro3.groupby('Country').agg({'Quantity':'mean','UnitPrice':'mean'}).reset_index()
fig = px.scatter(filtro3, x="Quantity", y="UnitPrice", color="Country")
fig.show()

## 4. Create a set of interactive histograms showing the distributions of quantity per invoice for the following countries: EIRE, Germany, France, and Netherlands.

In [21]:
filtro4 = data[data['Country'].isin(['EIRE', 'Germany', 'France', 'Netherlands'])]
# Total quantity per invoice, so each invoice contributes one observation.
filtro4 = filtro4.groupby(['Country','InvoiceNo']).agg({'Quantity':'sum'}).reset_index()
# Leave y unset so the bars count invoices per quantity bin.
fig = px.histogram(filtro4, x="Quantity", color="Country", marginal="rug", hover_data=filtro4.columns)
fig.show()

## 5. Create an interactive side-by-side bar chart showing the revenue by country listed below (bars) for each of the products listed below.

In [28]:
product_list = ['JUMBO BAG RED RETROSPOT', 
                'CREAM HANGING HEART T-LIGHT HOLDER',
                'REGENCY CAKESTAND 3 TIER']

country_list = ['EIRE', 'Germany', 'France', 'Netherlands']

In [44]:
filtro5 = data[(data['Country'].isin(country_list))&(data['Description'].isin(product_list))]
# aggfunc='sum' gives total revenue; pivot_table defaults to the mean.
filtro5 = filtro5.pivot_table(columns="Country",index="Description",values="Revenue",aggfunc="sum")
filtro5.iplot(kind ='bar')

## 6. Create an interactive line chart showing quantity sold by day for the United Kingdom. Add drop-down boxes for Year and Month that allow you to filter the date range that appears in the chart.

In [67]:
data['Year'] = pd.DatetimeIndex(data['InvoiceDate']).year
data['Month'] = pd.DatetimeIndex(data['InvoiceDate']).month
data['Day'] = pd.DatetimeIndex(data['InvoiceDate']).day
uk = data[data['Country']=='United Kingdom']

In [65]:
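# Keep `uk` intact for later cells and sum quantities (pivot_table defaults to the mean).
uk_daily = uk.pivot_table(values='Quantity', columns=['Year','Month'], index='Day', aggfunc='sum')
uk_daily.iplot(kind='line', xTitle ='Day', yTitle='Quantity')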
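
The chart above draws every Year/Month combination at once and has no drop-down boxes. A minimal sketch of the drop-down version follows, assuming `InvoiceDate` holds datetimes. The helper names `daily_quantity` and `show_dropdowns` and the tiny `sample` frame are hypothetical and not part of the lab. The plotting libraries are imported inside `show_dropdowns` so the filtering step can be checked on its own.

```python
import pandas as pd


def daily_quantity(df, year, month):
    """Return total Quantity per calendar day for one year and month."""
    dates = pd.to_datetime(df['InvoiceDate'])
    mask = (dates.dt.year == year) & (dates.dt.month == month)
    days = dates[mask].dt.day
    return df.loc[mask].groupby(days)['Quantity'].sum().rename_axis('Day')


def show_dropdowns(df):
    """Wire daily_quantity to Year and Month drop-downs (run inside Jupyter)."""
    from ipywidgets import interact
    import plotly.express as px

    dates = pd.to_datetime(df['InvoiceDate'])
    years = [int(y) for y in sorted(dates.dt.year.unique())]

    def plot(Year, Month):
        frame = daily_quantity(df, Year, Month).reset_index()
        px.line(frame, x='Day', y='Quantity').show()

    # Passing a list to interact renders a drop-down for that argument.
    interact(plot, Year=years, Month=list(range(1, 13)))


# Made-up rows standing in for the United Kingdom subset (illustration only).
sample = pd.DataFrame({
    'InvoiceDate': pd.to_datetime(['2011-04-01 08:00', '2011-04-01 12:00',
                                   '2011-04-02 09:00', '2011-05-01 10:00']),
    'Quantity': [6, 4, 3, 7],
})
april = daily_quantity(sample, 2011, 4)
```

In the notebook, `show_dropdowns(data[data['Country']=='United Kingdom'])` would display the two drop-downs and redraw the line each time a selection changes.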

## 7. Create an interactive scatter plot that plots number of invoices (x-axis) vs. number of customers (y-axis) and the plot points represent individual products. Add two sliders that control the x and y axis ranges.

In [68]:
agg_func = {'InvoiceNo':'nunique',
            'Quantity':'sum',
            'UnitPrice':'mean',
            'Revenue':'sum',
            'CustomerID':'nunique'}

products = uk.groupby('Description').agg(agg_func)

In [69]:
# Number of invoices (x) against number of customers (y), one marker per product.
products.iplot(kind='scatter', x='InvoiceNo', y='CustomerID', mode='markers', xTitle='Invoices', yTitle='Customers')
products

Description,InvoiceNo,Quantity,UnitPrice,Revenue,CustomerID
4 PURPLE FLOCK DINNER CANDLES,35,134,2.318421,255.46,30
50'S CHRISTMAS GIFT BAG LARGE,100,1721,1.247900,2067.25,98
DOLLY GIRL BEAKER,100,661,1.250000,826.25,77
I LOVE LONDON MINI BACKPACK,55,181,4.150000,751.15,46
NINE DRAWER OFFICE TIDY,25,44,14.761538,628.40,24
...,...,...,...,...,...
ZINC T-LIGHT HOLDER STARS SMALL,220,4258,0.839005,3399.62,168
ZINC TOP 2 DOOR WOODEN SHELF,9,10,16.950000,169.50,9
ZINC WILLIE WINKIE CANDLE STICK,169,2007,0.877052,1716.87,123
ZINC WIRE KITCHEN ORGANISER,11,24,6.881818,146.40,11
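
The exercise also asks for two sliders that control the axis ranges. One possible sketch follows, assuming a `products` frame shaped like the table above (indexed by `Description`, with `InvoiceNo` and `CustomerID` counts). The helper names `products_in_range` and `show_sliders` are hypothetical. The sample rows are copied from the first five rows of the output above.

```python
import pandas as pd


def products_in_range(products, x_limit, y_limit):
    """Keep products whose invoice and customer counts fit inside both limits."""
    mask = (products['InvoiceNo'] <= x_limit) & (products['CustomerID'] <= y_limit)
    return products[mask]


def show_sliders(products):
    """Scatter plot with one slider per axis limit (run inside Jupyter)."""
    from ipywidgets import interact, IntSlider
    import plotly.express as px

    x_max = int(products['InvoiceNo'].max())
    y_max = int(products['CustomerID'].max())

    def plot(x_limit, y_limit):
        visible = products_in_range(products, x_limit, y_limit).reset_index()
        fig = px.scatter(visible, x='InvoiceNo', y='CustomerID',
                         hover_name='Description',
                         range_x=[0, x_limit], range_y=[0, y_limit])
        fig.show()

    interact(plot,
             x_limit=IntSlider(min=1, max=x_max, value=x_max),
             y_limit=IntSlider(min=1, max=y_max, value=y_max))


# First five rows of the products table shown above.
sample = pd.DataFrame(
    {'InvoiceNo': [35, 100, 100, 55, 25], 'CustomerID': [30, 98, 77, 46, 24]},
    index=pd.Index(['4 PURPLE FLOCK DINNER CANDLES', "50'S CHRISTMAS GIFT BAG LARGE",
                    'DOLLY GIRL BEAKER', 'I LOVE LONDON MINI BACKPACK',
                    'NINE DRAWER OFFICE TIDY'], name='Description'),
)
visible = products_in_range(sample, 60, 50)
```

Filtering before plotting keeps the figure small when the sliders zoom in. Passing only `range_x` and `range_y` without filtering would give the same view.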


## 8. Create an interactive bar chart that shows revenue by product description. Add a text field widget that filters the results to show the products that contain the text entered in their description.
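
A possible starting point, sketched under the assumption that the frame has `Description` and `Revenue` columns as in the data preview. The helper names `revenue_matching` and `show_text_filter`, the default search text, and the sample rows are hypothetical.

```python
import pandas as pd


def revenue_matching(df, text):
    """Total Revenue per Description for descriptions containing `text`."""
    descriptions = df['Description'].astype(str)
    # Plain substring match, ignoring case; regex=False treats `text` literally.
    mask = descriptions.str.contains(text, case=False, regex=False)
    totals = df[mask].groupby('Description')['Revenue'].sum()
    return totals.sort_values(ascending=False)


def show_text_filter(df):
    """Bar chart filtered by a text box (run inside Jupyter)."""
    from ipywidgets import interact
    import plotly.express as px

    def plot(text):
        totals = revenue_matching(df, text).reset_index()
        px.bar(totals, x='Description', y='Revenue').show()

    # A string default makes interact render a text field.
    interact(plot, text='BUNTING')


# Made-up rows for illustration only.
sample = pd.DataFrame({
    'Description': ['PARTY BUNTING', 'PARTY BUNTING', 'SPOTTY BUNTING',
                    'JUMBO BAG RED RETROSPOT'],
    'Revenue': [10.0, 5.0, 4.0, 8.0],
})
bunting = revenue_matching(sample, 'bunting')
```

An empty text field matches every description, so on the full data set the chart would show every product until some text is entered.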