# Demo 3.2: Grouped and Stacked Bar Charts 


- **Grouped Bar Charts**    
- **Stacked Bar Charts**   

#### Question:  What is the Average City MPG by Vehicle Type *and* Manufacturing_Origin? 


**Key Steps:**   
  1. Read data into a dataframe  
  2. Change Data Types as needed  
  3. Aggregate with *groupby()*: Group on ***Two*** Columns     
  4. Plot!  



Datafile:  **Cars.csv** 

In [1]:
import pandas as pd
import plotly.express as px

# 1. Read the datafile into a dataframe

In [2]:
df = pd.read_csv('Data/Cars.csv')

print(df.shape)
df.head(10)

(428, 13)


,Vehicle_Make,Vehicle_Model,Vehicle_Type,Manufacturing_Origin,MPG_City,MPG_Hwy,MSRP,Invoice,Weight,Wheelbase,DriveTrain,EngineSize,Horsepower
0,Acura,MDX,SUV,Asia,17,23,36945,33337,4451,106,All,3.5,265
1,Acura,RSX Type S 2dr,Sedan,Asia,24,31,23820,21761,2778,101,Front,2.0,200
2,Acura,TSX 4dr,Sedan,Asia,22,29,26990,24647,3230,105,Front,2.4,200
3,Acura,TL 4dr,Sedan,Asia,20,28,33195,30299,3575,108,Front,3.2,270
4,Acura,3.5 RL 4dr,Sedan,Asia,18,24,43755,39014,3880,115,Front,3.5,225
5,Acura,3.5 RL w/Navigation 4dr,Sedan,Asia,18,24,46100,41100,3893,115,Front,3.5,225
6,Acura,NSX coupe 2dr manual S,Sports,Asia,17,24,89765,79978,3153,100,Rear,3.2,290
7,Audi,A4 1.8T 4dr,Sedan,Europe,22,31,25940,23508,3252,104,Front,1.8,170
8,Audi,A41.8T convertible 2dr,Sedan,Europe,23,30,35940,32506,3638,105,Front,1.8,170
9,Audi,A4 3.0 4dr,Sedan,Europe,20,28,31840,28846,3462,104,Front,3.0,220


# 2. Change data types as needed  

In [3]:
# data types 'Before' 
df.dtypes

Vehicle_Make             object
Vehicle_Model            object
Vehicle_Type             object
Manufacturing_Origin     object
MPG_City                  int64
MPG_Hwy                   int64
MSRP                      int64
Invoice                   int64
Weight                    int64
Wheelbase                 int64
DriveTrain               object
EngineSize              float64
Horsepower                int64
dtype: object

In [4]:
# Convert MSRP, Invoice, MPG_City, MPG_Hwy to floats
df['MSRP'] = df['MSRP'].astype(float)
df['Invoice'] = df['Invoice'].astype(float)

df['MPG_City'] = df['MPG_City'].astype(float)
df['MPG_Hwy'] = df['MPG_Hwy'].astype(float)

In [5]:
# data types 'After' 
df.dtypes

Vehicle_Make             object
Vehicle_Model            object
Vehicle_Type             object
Manufacturing_Origin     object
MPG_City                float64
MPG_Hwy                 float64
MSRP                    float64
Invoice                 float64
Weight                    int64
Wheelbase                 int64
DriveTrain               object
EngineSize              float64
Horsepower                int64
dtype: object
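
The four conversions above could also be written as a single `astype()` call that takes a column-to-dtype mapping. This is a minimal sketch, assuming a small hand-built dataframe (`cars`, with values copied from the first rows shown earlier) rather than the full Cars.csv file; the names `cars` and `float_cols` are only illustrative.

```python
import pandas as pd

# Toy stand-in for the Cars dataframe (values from the first three rows shown above)
cars = pd.DataFrame({
    "Vehicle_Make": ["Acura", "Acura", "Acura"],
    "MPG_City": [17, 24, 22],
    "MPG_Hwy": [23, 31, 29],
    "MSRP": [36945, 23820, 26990],
    "Invoice": [33337, 21761, 24647],
})

# Convert several columns at once by passing a {column: dtype} mapping
float_cols = ["MSRP", "Invoice", "MPG_City", "MPG_Hwy"]
cars = cars.astype({col: float for col in float_cols})

print(cars.dtypes)
```

Columns not named in the mapping keep their original dtype.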

# Question:  What is the Average City MPG by Vehicle Type *and* Manufacturing_Origin?  
- Categorical Variables to Group On:  **Vehicle_Type** *and* **Manufacturing_Origin**   
- Continuous Variable We're Interested In:  **MPG_City** 
- Aggregation Function:  **mean** 


In [6]:
# Optional:  Display the unique values in the columns we want to group on
df['Vehicle_Type'].unique()

array(['SUV', 'Sedan', 'Sports', 'Wagon', 'Truck', 'Hybrid'], dtype=object)

In [7]:
df['Manufacturing_Origin'].unique()

array(['Asia', 'Europe', 'USA'], dtype=object)
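
Beyond listing the unique values, it can help to see how many rows fall into each combination of the two columns, since only combinations that actually occur in the data produce a group. A minimal sketch using `pd.crosstab`, assuming a small hand-built dataframe (`cars`, five rows taken from the `df.head(10)` output above) instead of the full file:

```python
import pandas as pd

# Toy stand-in: five rows taken from the df.head(10) output above
cars = pd.DataFrame({
    "Vehicle_Type": ["SUV", "Sedan", "Sedan", "Sports", "Sedan"],
    "Manufacturing_Origin": ["Asia", "Asia", "Asia", "Asia", "Europe"],
})

# Count the rows for every (Vehicle_Type, Manufacturing_Origin) pair
counts = pd.crosstab(cars["Vehicle_Type"], cars["Manufacturing_Origin"])
print(counts)
```

A zero in this table marks a combination with no rows, which will not appear in the `groupby()` result for these plain (non-categorical) columns.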

In [8]:
df.head(2)

,Vehicle_Make,Vehicle_Model,Vehicle_Type,Manufacturing_Origin,MPG_City,MPG_Hwy,MSRP,Invoice,Weight,Wheelbase,DriveTrain,EngineSize,Horsepower
0,Acura,MDX,SUV,Asia,17.0,23.0,36945.0,33337.0,4451,106,All,3.5,265
1,Acura,RSX Type S 2dr,Sedan,Asia,24.0,31.0,23820.0,21761.0,2778,101,Front,2.0,200


# 3. Aggregate the Data: Grouping On Two Columns 

In [9]:
ser = df.groupby(["Vehicle_Type", 'Manufacturing_Origin'])['MPG_City'].mean()

#print(type(ser))
print(ser.shape)
ser.head()

(15,)


Vehicle_Type  Manufacturing_Origin
Hybrid        Asia                    55.000000
SUV           Asia                    17.320000
              Europe                  14.500000
              USA                     15.520000
Sedan         Asia                    22.840426
Name: MPG_City, dtype: float64
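
Grouping on two columns returns a Series with a two-level index (Vehicle_Type, then Manufacturing_Origin). One way to read it is to pivot the inner level into columns with `unstack()`. The sketch below rebuilds the first four values printed above by hand; `ser_demo` and `wide` are illustrative names, not part of the demo.

```python
import pandas as pd

# Toy stand-in built from the first four group means printed above
idx = pd.MultiIndex.from_tuples(
    [("Hybrid", "Asia"), ("SUV", "Asia"), ("SUV", "Europe"), ("SUV", "USA")],
    names=["Vehicle_Type", "Manufacturing_Origin"],
)
ser_demo = pd.Series([55.0, 17.32, 14.5, 15.52], index=idx, name="MPG_City")

# Pivot the inner index level into columns:
# one row per vehicle type, one column per origin
wide = ser_demo.unstack("Manufacturing_Origin")
print(wide)
```

Combinations that are missing from the Series show up as `NaN` in the wide table.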

### Convert the Series to a Dataframe and Move the Index Levels into Columns    

In [10]:
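# Turn the Series into a one-column dataframe, then move both index levels into columns.
# Note: this reuses the name df, replacing the original 428-row dataframe.
df = ser.to_frame()
df.reset_index(inplace=True)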

print(df.shape)
df.head()

(15, 3)


,Vehicle_Type,Manufacturing_Origin,MPG_City
0,Hybrid,Asia,55.0
1,SUV,Asia,17.32
2,SUV,Europe,14.5
3,SUV,USA,15.52
4,Sedan,Asia,22.840426
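
As an alternative to converting the Series afterwards, `groupby()` accepts `as_index=False`, which keeps the grouping keys as ordinary columns. A minimal sketch, assuming a small hand-built dataframe (`cars`, four rows taken from the `df.head(10)` output above); `agg` is an illustrative name.

```python
import pandas as pd

# Toy stand-in: four rows taken from the df.head(10) output above
cars = pd.DataFrame({
    "Vehicle_Type": ["SUV", "Sedan", "Sedan", "Sedan"],
    "Manufacturing_Origin": ["Asia", "Asia", "Asia", "Europe"],
    "MPG_City": [17.0, 24.0, 22.0, 22.0],
})

# as_index=False keeps the grouping keys as regular columns,
# so no to_frame() / reset_index() step is needed afterwards
agg = cars.groupby(
    ["Vehicle_Type", "Manufacturing_Origin"], as_index=False
)["MPG_City"].mean()
print(agg)
```

The result has the same three-column layout that the plotting step expects.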


# 4. Plot  

### Grouped Bar Chart   

In [11]:
fig = px.bar(df, 
             x='Vehicle_Type', 
             y='MPG_City',
             color = 'Manufacturing_Origin',
             barmode = 'group', 
             title = 'Demo 3.2: Grouped Bar Chart')
fig.show()

### Stacked Bar Chart   

In [12]:
fig = px.bar(df, 
             x='Vehicle_Type', 
             y='MPG_City',
             color = 'Manufacturing_Origin',
             barmode = 'stack', 
             title = 'Demo 3.2: Stacked Bar Chart')
fig.show()