# Plotting Multiple Data Series

Complete the following exercises to solidify your knowledge of plotting multiple data series with pandas, matplotlib, and seaborn. Part of the challenge of plotting multiple data series is transforming the data into the shape the visualization needs. For some exercises in this lab, you will first need to reshape the data into the form best suited to the plot, and then create the plot.
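As a rough illustration of the kind of reshaping involved, the sketch below pivots a small long-format table into wide format, so that each category becomes its own column (and therefore its own plotted series). The values are made up for illustration and are not taken from the lab's CSV; only the column names mirror it.

```python
import pandas as pd

# Toy long-format data (hypothetical values, not from the lab's data set).
long_df = pd.DataFrame({
    "Month": [1, 1, 2, 2],
    "ItemType": ["BEER", "WINE", "BEER", "WINE"],
    "RetailSales": [10.0, 20.0, 30.0, 40.0],
})

# Aggregate and pivot: one row per Month, one column per ItemType.
# Most plotting APIs then draw one series per column.
wide_df = long_df.pivot_table(index="Month", columns="ItemType",
                              values="RetailSales", aggfunc="sum")
print(wide_df)
```

Calling a plot method on `wide_df` would then produce one line or bar group per item type.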

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import cufflinks as cf
import plotly.express as px

warnings.filterwarnings('ignore')
%matplotlib inline

In [2]:
cf.go_offline()
?pd.DataFrame.iplot

In [3]:
data = pd.read_csv('../data/liquor_store_sales.csv')
data.head()

|   | Year | Month | Supplier | ItemCode | Description | ItemType | RetailSales | RetailTransfers | WarehouseSales |
|---|------|-------|----------|----------|-------------|----------|-------------|-----------------|----------------|
| 0 | 2017 | 4 | ROYAL WINE CORP | 100200 | GAMLA CAB - 750ML | WINE | 0.0 | 1.0 | 0.0 |
| 1 | 2017 | 4 | SANTA MARGHERITA USA INC | 100749 | SANTA MARGHERITA P/GRIG ALTO - 375ML | WINE | 0.0 | 1.0 | 0.0 |
| 2 | 2017 | 4 | JIM BEAM BRANDS CO | 10103 | KNOB CREEK BOURBON 9YR - 100P - 375ML | LIQUOR | 0.0 | 8.0 | 0.0 |
| 3 | 2017 | 4 | HEAVEN HILL DISTILLERIES INC | 10120 | J W DANT BOURBON 100P - 1.75L | LIQUOR | 0.0 | 2.0 | 0.0 |
| 4 | 2017 | 4 | ROYAL WINE CORP | 101664 | RAMON CORDOVA RIOJA - 750ML | WINE | 0.0 | 4.0 | 0.0 |


## 1. Create a bar chart with bars for total Retail Sales, Retail Transfers, and Warehouse Sales by Item Type.

In [4]:
# Total each sales column per item type.
sales = data[['RetailSales', 'RetailTransfers', 'WarehouseSales', 'ItemType']].groupby('ItemType').sum().reset_index()

# One bar per sales column, grouped by item type.
sales.iplot(kind='bar',
            x='ItemType',
            xTitle='Item Type',
            yTitle='Total Sales',
            title='Sales by Item Type')

## 2. Create a horizontal bar chart showing sales mix for the top 10 suppliers with the most total sales. 

In [5]:
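data['TotalSales'] = data['RetailSales'] + data['WarehouseSales'] + data['RetailTransfers']

# Rank suppliers by summed sales; count() would rank them by number of rows instead.
sales_cols = ['RetailSales', 'RetailTransfers', 'WarehouseSales']
topten = (data.groupby('Supplier')[sales_cols + ['TotalSales']].sum()
              .reset_index()
              .sort_values(by='TotalSales', ascending=False)
              .head(10))

# Stacked horizontal bars show each supplier's sales mix.
px.bar(data_frame=topten, x=sales_cols, y='Supplier', orientation='h')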


## 3. Create a multi-line chart that shows average Retail Sales, Retail Transfers, and Warehouse Sales per month over time.

In [6]:
# Average each sales column per month (the exercise asks for averages, not totals).
sales_cols = ['RetailSales', 'RetailTransfers', 'WarehouseSales']
datan = data.groupby(['Year', 'Month'])[sales_cols].mean().reset_index()

# Build a readable "Mon YYYY" label for the x-axis.
month_short = {"1": "Jan", "2": "Feb", "3": "Mar", "4": "Apr",
               "5": "May", "6": "Jun", "7": "Jul", "8": "Aug",
               "9": "Sep", "10": "Oct", "11": "Nov", "12": "Dec"}
datan['Timeline'] = datan['Month'].astype(str).map(month_short) + ' ' + datan['Year'].astype(str)

# Sort chronologically so the lines run in time order.
datanp = datan.sort_values(['Year', 'Month'], ascending=[True, True])

salesmonth = datanp[['Timeline', 'RetailSales', 'RetailTransfers', 'WarehouseSales']]

salesmonth.iplot(kind='line',
                 x='Timeline',
                 xTitle='Date',
                 yTitle='Average Sales',
                 title='Average Sales per Month')



## 4. Plot the same information as above but as a bar chart.

In [7]:
salesmonth.iplot(kind='bar',
                 x='Timeline',
                 xTitle='Date',
                 yTitle='Sales',
                 title='Sales per Month')

## 5. Create a multi-line chart that shows Retail Sales summed by Item Type over time (Year & Month).

*Hint: There should be a line representing each Item Type.*

In [8]:
datan = data[['RetailSales', 'ItemType', 'Year', 'Month']].groupby(['ItemType', 'Year', 'Month']).sum().reset_index()

datan['S_Month'] = datan['Month'].astype(str)
datan['S_Year'] = datan['Year'].astype(str)
month_short = {"1": "Jan", "2": "Feb", "3": "Mar", "4": "Apr",
               "5": "May", "6": "Jun", "7": "Jul", "8": "Aug",
               "9": "Sep", "10": "Oct", "11": "Nov", "12": "Dec"}
datan['SS_Month'] = datan['S_Month'].map(month_short)

datan['Timeline'] = datan['SS_Month'] + ' ' + datan['S_Year']

datanp = datan.sort_values(['Year', 'Month'], ascending=[True, True])

datanp

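|    | ItemType | Year | Month | RetailSales | S_Month | S_Year | SS_Month | Timeline |
|----|----------|------|-------|-------------|---------|--------|----------|----------|
| 0 | BEER | 2017 | 4 | 0.00 | 4 | 2017 | Apr | Apr 2017 |
| 26 | LIQUOR | 2017 | 4 | 0.00 | 4 | 2017 | Apr | Apr 2017 |
| 36 | NON-ALCOHOL | 2017 | 4 | 0.00 | 4 | 2017 | Apr | Apr 2017 |
| 45 | REF | 2017 | 4 | 0.00 | 4 | 2017 | Apr | Apr 2017 |
| 54 | STR_SUPPLIES | 2017 | 4 | 0.00 | 4 | 2017 | Apr | Apr 2017 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 35 | LIQUOR | 2018 | 2 | 28852.31 | 2 | 2018 | Feb | Feb 2018 |
| 44 | NON-ALCOHOL | 2018 | 2 | 812.84 | 2 | 2018 | Feb | Feb 2018 |
| 53 | REF | 2018 | 2 | 41.52 | 2 | 2018 | Feb | Feb 2018 |
| 62 | STR_SUPPLIES | 2018 | 2 | 47.76 | 2 | 2018 | Feb | Feb 2018 |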


In [10]:
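# One column per item type; Year and Month in the index keep the rows in chronological order.
pivot_sales_items = datanp.pivot_table(values='RetailSales',
                                       columns='ItemType',
                                       index=['Year', 'Month', 'Timeline'],
                                       aggfunc='sum')

# Drop Year and Month so the x-axis shows only the readable Timeline label.
pivot_sales_items = pivot_sales_items.droplevel(['Year', 'Month'])

pivot_sales_items.iplot(kind='line',
                        xTitle='Date',
                        yTitle='Retail Sales',
                        title='Retail Sales by Item Type')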

## 6. Plot the same information as above but as a bar chart.

In [11]:
pivot_sales_items.iplot(kind='bar')

## 7. Create a scatter plot showing the relationship between Retail Sales (x-axis) and Retail Transfers (y-axis) with the plot points color-coded according to their Item Type.

*Hint: Seaborn's lmplot is the easiest way to generate the scatter plot.*
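A minimal sketch of the approach the hint describes, using seaborn's `lmplot` with `hue` for the color coding and `fit_reg=False` to suppress the regression line so that only the scatter points are drawn. The `toy` DataFrame holds made-up values for illustration; with the lab data you would pass the real DataFrame instead.

```python
import pandas as pd
import seaborn as sns

# Toy data (hypothetical values); column names mirror the lab's data set.
toy = pd.DataFrame({
    "RetailSales": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "RetailTransfers": [1.5, 1.8, 3.2, 3.9, 5.1, 6.3],
    "ItemType": ["BEER", "BEER", "BEER", "WINE", "WINE", "WINE"],
})

# hue colors the points by ItemType; fit_reg=False leaves a plain scatter plot.
grid = sns.lmplot(data=toy, x="RetailSales", y="RetailTransfers",
                  hue="ItemType", fit_reg=False)
grid.set_axis_labels("Retail Sales", "Retail Transfers")
```

`lmplot` returns a `FacetGrid`, so titles and labels are set on the grid rather than through `plt` calls.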

In [12]:
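# `sales` holds one row of totals per item type, so each type appears as a single point.
sales.iplot(kind='scatter',
            mode='markers',
            x='RetailSales',
            y='RetailTransfers',
            categories='ItemType',
            xTitle='Retail Sales',
            yTitle='Retail Transfers',
            title='Retail Sales & Transfers per Type')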

## 8. Create a scatter matrix using all the numeric fields in the data set with the plot points color-coded by Item Type.

*Hint: Seaborn's pairplot may be your best option here.*

In [13]:
# A scatter matrix compares every pair of numeric columns, so plot the row-level
# `data` rather than the per-type totals in `sales`. pairplot uses all numeric
# columns by default and colors the points by ItemType.
sns.pairplot(data, hue='ItemType')