# Demo 3.1 Vertical and Horizontal Bar Charts 

- **Demonstrates:**   
  - [**Vertical Bar Charts**](#Vertical-Bar-Chart)      
  - [**Horizontal Bar Charts**](#Horizontal-Bar-Chart)    
  - [**Templates**](#Display-Templates)  


- Datafile:  **Cars.csv** 

In [6]:
import pandas as pd
import plotly.express as px

### Read the datafile into a dataframe  

In [7]:
df = pd.read_csv('Data/Cars.csv')

print(df.shape)
df.head(2)

(428, 13)


Unnamed: 0,Vehicle_Make,Vehicle_Model,Vehicle_Type,Manufacturing_Origin,MPG_City,MPG_Hwy,MSRP,Invoice,Weight,Wheelbase,DriveTrain,EngineSize,Horsepower
0,Acura,MDX,SUV,Asia,17,23,36945,33337,4451,106,All,3.5,265
1,Acura,RSX Type S 2dr,Sedan,Asia,24,31,23820,21761,2778,101,Front,2.0,200


### Change data types as needed  
- pandas can only do numeric calculations on a column that it recognizes as numeric, so check the data types first. 
- Convert a column to float (rather than integer) when you need decimal values.
- Otherwise calculations may raise errors or give unexpected results!  
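
To illustrate the kind of unexpected result the bullets above warn about, here is a minimal sketch using a made-up Series of prices stored as text. The values and variable names are hypothetical and are not taken from Cars.csv.

```python
import pandas as pd

# Hypothetical toy data: prices stored as text (object dtype), not numbers
prices = pd.Series(['100', '250', '75'], dtype=object)

print(prices.dtype)                      # object: pandas sees strings here
text_total = prices.sum()                # strings are concatenated, not added
print(text_total)

# Convert the text to a numeric dtype before doing any math
numeric_prices = pd.to_numeric(prices)
numeric_total = numeric_prices.sum()     # now a real numeric sum
print(numeric_total)
```

In the Cars data the numeric columns are already read in as `int64`, so only the `astype(float)` step is needed there.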


In [8]:
# data types 'Before' 
df.dtypes

Vehicle_Make             object
Vehicle_Model            object
Vehicle_Type             object
Manufacturing_Origin     object
MPG_City                  int64
MPG_Hwy                   int64
MSRP                      int64
Invoice                   int64
Weight                    int64
Wheelbase                 int64
DriveTrain               object
EngineSize              float64
Horsepower                int64
dtype: object

In [9]:
df['MSRP'] = df['MSRP'].astype(float)
df['Invoice'] = df['Invoice'].astype(float)

df['MPG_City'] = df['MPG_City'].astype(float)
df['MPG_Hwy'] = df['MPG_Hwy'].astype(float)

In [10]:
# data types 'After' 
df.dtypes

Vehicle_Make             object
Vehicle_Model            object
Vehicle_Type             object
Manufacturing_Origin     object
MPG_City                float64
MPG_Hwy                 float64
MSRP                    float64
Invoice                 float64
Weight                    int64
Wheelbase                 int64
DriveTrain               object
EngineSize              float64
Horsepower                int64
dtype: object
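
The four separate `astype(float)` calls above could also be written as a single call. This is only a sketch of one alternative, using a tiny stand-in Dataframe built from the two rows shown by `df.head(2)`. The names `cars` and `float_cols` are illustrative.

```python
import pandas as pd

# Small stand-in for the Cars data (values from the two rows shown earlier)
cars = pd.DataFrame({
    'MSRP': [36945, 23820],
    'Invoice': [33337, 21761],
    'MPG_City': [17, 24],
    'MPG_Hwy': [23, 31],
    'Weight': [4451, 2778],
})

# Columns we want as floats; all other columns keep their current dtype
float_cols = ['MSRP', 'Invoice', 'MPG_City', 'MPG_Hwy']

# astype() accepts a dict mapping column name -> target dtype
cars = cars.astype({col: float for col in float_cols})

print(cars.dtypes)
```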

# Question 1:  What is the Average City MPG By Vehicle Type?  
- Categorical Variable to Group On:  **Vehicle_Type**  
- Continuous Variable We're Interested In:  **MPG_City** 
- Aggregation Function:  **mean** 
 
- **Notes:**  
  - If we select only a single continuous variable/column, groupby() will create a pandas **Series** rather than a Dataframe  
  - A Series is similar to a Dataframe, but I think Dataframes are easier to work with and more familiar to you, so we're going to convert the Series to a Dataframe.
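
As a rough illustration of the note above, the sketch below shows how the way a column is selected changes what groupby() returns. The toy Dataframe and its values are made up for this example, and the `as_index=False` variant is just one alternative to the Series-to-Dataframe conversion used in this demo.

```python
import pandas as pd

# Hypothetical toy data (not from Cars.csv)
toy = pd.DataFrame({
    'Vehicle_Type': ['SUV', 'SUV', 'Sedan', 'Sedan'],
    'MPG_City': [16.0, 18.0, 20.0, 24.0],
})

# Single brackets select one column -> the result is a Series
as_series = toy.groupby('Vehicle_Type')['MPG_City'].mean()

# Double brackets select a list of columns -> the result is a Dataframe
as_frame = toy.groupby('Vehicle_Type')[['MPG_City']].mean()

# as_index=False keeps Vehicle_Type as a regular column in a Dataframe
flat = toy.groupby('Vehicle_Type', as_index=False)['MPG_City'].mean()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(flat)
```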


# Aggregate on a *Single* Column:  *Vehicle_Type*   


In [11]:
# Optional:  Display the unique values in the column we want to Group on
df['Vehicle_Type'].unique()

array(['SUV', 'Sedan', 'Sports', 'Wagon', 'Truck', 'Hybrid'], dtype=object)

In [12]:
df.head(2)

Unnamed: 0,Vehicle_Make,Vehicle_Model,Vehicle_Type,Manufacturing_Origin,MPG_City,MPG_Hwy,MSRP,Invoice,Weight,Wheelbase,DriveTrain,EngineSize,Horsepower
0,Acura,MDX,SUV,Asia,17.0,23.0,36945.0,33337.0,4451,106,All,3.5,265
1,Acura,RSX Type S 2dr,Sedan,Asia,24.0,31.0,23820.0,21761.0,2778,101,Front,2.0,200


In [13]:
# This groupby will create a pandas Series rather than a Dataframe
ser = df.groupby("Vehicle_Type")['MPG_City'].mean()

ser

Vehicle_Type
Hybrid    55.000000
SUV       16.100000
Sedan     21.083969
Sports    18.408163
Truck     16.500000
Wagon     21.100000
Name: MPG_City, dtype: float64

# If It's a Series, Convert It to a Dataframe  

In [14]:
type(ser)

pandas.core.series.Series

In [15]:
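# isinstance() is the idiomatic way to check an object's type
if isinstance(ser, pd.Series):
    print("It's a Series! Change it to a Dataframe!")
    df = ser.to_frame()   # Note: df now holds the aggregated results, not the full dataset
else:
    print("It's NOT a Series!")
    df = ser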
    
print(df.shape)
df.head()

It's a Series! Change it to a Dataframe!
(6, 1)


Vehicle_Type,MPG_City
Hybrid,55.0
SUV,16.1
Sedan,21.083969
Sports,18.408163
Truck,16.5


# Move the Index Column into the Dataframe  
- `reset_index()` turns the **Vehicle_Type** index into a regular column. Since it is no longer the Index, pandas will create a new default index with values 0, 1, 2, etc...  

In [16]:
df.reset_index(inplace=True)

print(df.shape)
df.head()

(6, 2)


Unnamed: 0,Vehicle_Type,MPG_City
0,Hybrid,55.0
1,SUV,16.1
2,Sedan,21.083969
3,Sports,18.408163
4,Truck,16.5
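
The two steps above (convert to a Dataframe, then reset the index) can probably be collapsed into one, because a Series also has a `reset_index()` method. The sketch below rebuilds part of the result with three of the averages shown earlier. The variable names are illustrative.

```python
import pandas as pd

# Rebuild part of the groupby result (three of the averages shown above)
ser = pd.Series(
    [55.0, 16.1, 16.5],
    index=pd.Index(['Hybrid', 'SUV', 'Truck'], name='Vehicle_Type'),
    name='MPG_City',
)

# On a named Series, reset_index() returns a Dataframe directly:
# the old index becomes a column and a new 0, 1, 2, ... index is created
result = ser.reset_index()

print(result)
```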


# Plot  

### Vertical Bar Chart

In [17]:
fig = px.bar(df,              
             x='Vehicle_Type', 
             y='MPG_City',
             orientation='v',   # Optional: Plotly infers 'v' here if omitted  
             title='Demo 3.1: Vertical Bar Chart')
fig.show()

### Horizontal Bar Chart   
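- **Note:**  Plotly draws the first row of the Dataframe at the *bottom* of a Horizontal Bar chart, so when you sort the Dataframe to change the order of the Bars, you need to sort the opposite way you probably expect! For example, sort ascending to put the largest bar on top.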

In [18]:
fig = px.bar(df,              
             x='MPG_City', 
             y='Vehicle_Type',
             orientation='h',
             title='Demo 3.1: Horizontal Bar Chart')
fig.show()

# Display *Templates*  

- ggplot2  
- seaborn  
- simple_white  
- plotly  
- plotly_white  
- plotly_dark  
- presentation  
- xgridoff  
- ygridoff  
- gridon  
- none

In [19]:
fig = px.bar(df,              
             x='Vehicle_Type', 
             y='MPG_City',
             template='plotly_dark', # This sets the display template to use!
             title='Demo 3.1: <i>plotly_dark</i> Display Template')
fig.show()