<div align='center'><picture><source srcset="https://plotly-marketing-website-2.cdn.prismic.io/plotly-marketing-website-2/Z7eNlZ7c43Q3gCJw_Plotly-Logo-White.svg" type="image/webp"><img src="https://plotly-marketing-website-2.cdn.prismic.io/plotly-marketing-website-2/Z7eNlZ7c43Q3gCJw_Plotly-Logo-White.svg" width="300" height="300"></picture></div>

# **Article 125 : Plotly (Express)** [![Static Badge](https://img.shields.io/badge/Open%20in%20Colab%20-%20orange?style=plastic&logo=googlecolab&labelColor=grey)](https://colab.research.google.com/github/sshrizvi/DataScienceMastery/blob/main/DataVisualization/Notebooks/125_plotly_express.ipynb)

|🔴 **NOTE** 🔴|
|:-----------:|
| This notebook contains the practical implementations of the concepts discussed in the following article.|
| Here is Article 125 - [Plotly (Express)](../Articles/125_plotly_express.md) |

### 📦 **Importing Relevant Libraries**

In [2]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

### ⚠️ **Data Warning**  
For the visualizations ahead, we will use the following built-in datasets from Plotly Express.

In [3]:
tips_df = px.data.tips()

In [4]:
iris_df = px.data.iris()

In [5]:
gap_df = px.data.gapminder()

### 🎯 **Scatter Plots**

#### **1. Using Plotly Graph Objects**  
1. **Plotly Graph Objects (*GO*)** is a low-level, fully customizable interface for building figures.  
2. **GO** has four main building blocks that handle plotting:
   1. *go.Figure*
   2. Trace classes such as *go.Scatter* (the generic *go.Trace* class is deprecated)
   3. *go.Layout*
   4. *go.FigureWidget*

In [6]:
gap_filtered_df = gap_df[gap_df['year'] == 2007]

Now, to create a Scatter Plot using ***GO***, we need to work in a hierarchy, which is as follows:  
1. Create a blank canvas on which your Scatter Plot will be displayed. This is done with *go.Figure*.
2. Create a Trace using *go.Scatter* from your data.
3. Create a Layout using *go.Layout* to set the title, x-axis label and y-axis label of your figure.
4. Pass the trace to the `data` parameter and the layout to the `layout` parameter of the *go.Figure* class.
5. Call the `show` method on your figure to display the Scatter Plot.

In [7]:
trace = go.Scatter(x=gap_filtered_df['lifeExp'], y=gap_filtered_df['gdpPercap'], mode='markers')
layout = go.Layout(title='Life Expectancy VS GDP Per Capita',
                   xaxis={'title':'Life Expectancy'},
                   yaxis={'title':'GDP Per Capita'})
fig = go.Figure(data=trace, layout=layout)
fig.show()

Similarly, we can add multiple visualizations (traces) to the same *Figure*.

In [11]:
trace = go.Scatter(x=gap_filtered_df['lifeExp'], y=gap_filtered_df['gdpPercap'], mode='markers', name='Scatter')
# Second trace: a hand-made line drawn over the same axes
line_trace = go.Scatter(x=[40, 50, 60, 70, 80], y=[10000, 20000, 40000, 30000, 20000], name='Line')
layout = go.Layout(title='Life Expectancy VS GDP Per Capita',
                   xaxis={'title':'Life Expectancy'},
                   yaxis={'title':'GDP Per Capita'})
fig = go.Figure(data=[trace, line_trace], layout=layout)
fig.show()

An observation here is that plotting visualizations using ***GO*** takes quite a lot of work. So, Plotly introduced ***Plotly Express***, which is easier to use and needs far less code.

#### **2. Using Plotly Express (PX)**  
**Plotly Express (*px*)** is the high-level plotting interface of Plotly. Think of it as the shortcut or quick-access version of Plotly.  
1. It is fast to write.
2. It allows us to plot most visualizations in a single function call.
3. It handles the aesthetics of the visualizations automatically.
4. It understands Pandas DataFrames.
   1. It treats a column as a variable.
   2. It treats a row as an observation.
5. It offers less customization than ***GO***.

So, let's plot the same Scatter Plot using ***PX***.

In [14]:
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita'
}
px.scatter(data_frame=gap_filtered_df, x='lifeExp', y='gdpPercap',
           title='Life Expectancy VS GDP Per Capita', labels=labels)

Yes, that is the same plot as the ***GO*** one.

##### **The `color` Parameter**
Now, let's color our points on the basis of the `continent` column.

In [16]:
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita',
    'continent': 'Continent'
}
px.scatter(data_frame=gap_filtered_df, x='lifeExp', y='gdpPercap',
           title='Life Expectancy VS GDP Per Capita', labels=labels,
           color='continent')

##### **The `size` and `size_max` Parameters**
Now, let's scale our point size on the basis of the `pop` column, i.e., population.

In [19]:
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita',
    'continent': 'Continent',
    'pop': 'Population'
}
px.scatter(data_frame=gap_filtered_df, x='lifeExp', y='gdpPercap',
           title='Life Expectancy VS GDP Per Capita', labels=labels,
           color='continent', size='pop', size_max=100)

You might have observed that on hovering the cursor over a point, some information about that point shows up. We can add a bit more to it with the help of the `hover_name` parameter of the `px.scatter()` function.

##### **The `hover_name` Parameter**
Now, let's add the Country name to the hover card as its title.

In [25]:
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita',
    'continent': 'Continent',
    'pop': 'Population'
}
px.scatter(data_frame=gap_filtered_df, x='lifeExp', y='gdpPercap',
           title='Life Expectancy VS GDP Per Capita', labels=labels,
           color='continent', size='pop', size_max=100, hover_name='country')

We know that our data holds information across several years. This prompts us to think of making an animation in which we can see the growth of GDP Per Capita and Life Expectancy.  
We can do it with the help of the `animation_frame` and `animation_group` parameters.

##### **The `animation_frame` and `animation_group` Parameters**
Now, let's animate our Scatter Plot.

In [45]:
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita',
    'continent':'Continent',
    'pop':'Population',
    'year':'Year'
}
px.scatter(data_frame=gap_df, x='lifeExp', y='gdpPercap',
           title='Life Expectancy VS GDP Per Capita', labels=labels,
           color='continent', size='pop', size_max=100, hover_name='country',
           animation_frame='year', animation_group='country', range_x=[20, 90],
           width=1080, height=500)

### 🎯 **Line Plots**

Now, let's create a visualization to show India's Population Growth over Time.

In [49]:
india_df = gap_df[gap_df['country'] == 'India']

In [50]:
labels = {
    'pop':'Population',
    'year':'Year'
}
px.line(data_frame=india_df, x='year', y='pop',
        title='India Population Growth', labels=labels)

Now, let's create a line chart to show the Life Expectancy growth of India, China and Pakistan.

In [56]:
icp_df = gap_df[gap_df['country'].isin(['India', 'China', 'Pakistan'])]
icp_df = icp_df.pivot(index='year', columns='country', values='lifeExp')

In [76]:
labels = {
    'year':'Year',
    'country':'Country',
    'value':'Life Expectancy'
}
px.line(data_frame=icp_df, x=icp_df.index, y=icp_df.columns,
        title='Life Expectancy Growth of Countries', labels=labels)

### 🎯 **Bar Plots**

Now, let's create the bar chart version of India's Population Growth over Time.

In [61]:
india_df = gap_df[gap_df['country'] == 'India']

In [62]:
labels = {
    'pop':'Population',
    'year':'Year'
}
px.bar(data_frame=india_df, x='year', y='pop',
       title='India Population Growth', labels=labels)

#### **1. Grouped Bar Chart**

Now, let's compare the Population of India, China and Pakistan using a Grouped Bar Chart.  
**Note :** While calling `px.bar()`, you have to pass the parameter `barmode='group'` to create a grouped bar chart.

In [77]:
icp_df = gap_df[gap_df['country'].isin(['India', 'China', 'Pakistan'])]
icp_df = icp_df.pivot(index='year', columns='country', values='pop')

In [78]:
labels = {
    'year':'Year',
    'country':'Country',
    'value':'Population'
}
px.bar(data_frame=icp_df, x=icp_df.index, y=icp_df.columns,
       title='Population Growth of Countries', labels=labels,
       barmode='group')

You may be surprised to know that in ***PX***, we can display the numbers on the bars. 😲  
YES, we can do that by setting the `text_auto` parameter to `True`.

In [79]:
labels = {
    'year':'Year',
    'country':'Country',
    'value':'Population'
}
px.bar(data_frame=icp_df, x=icp_df.index, y=icp_df.columns,
       title='Population Growth of Countries', labels=labels,
       barmode='group', text_auto=True)

It is not very readable, as the bars of Pakistan are quite small. No issue, we can put the y-axis on a logarithmic scale by setting the `log_y` parameter to `True`.

In [88]:
labels = {
    'year':'Year',
    'country':'Country',
    'value':'Population'
}
px.bar(data_frame=icp_df, x=icp_df.index, y=icp_df.columns,
       title='Population Growth of Countries', labels=labels,
       barmode='group', text_auto=True, log_y=True)

#### **2. Stacked Bar Chart**

Now, let's create a stacked bar chart to visualize how much the population of each country contributes to the population of its continent in the year 2007.

In [92]:
year2007_df = gap_df[gap_df['year'] == 2007]

In [146]:
labels = {
    'year':'Year',
    'country':'Country',
    'pop':'Population',
    'continent':'Continent'
}
px.bar(data_frame=year2007_df, x='continent', y='pop',
       title='Contribution of Countries', labels=labels,
       color='country', log_y=True,
       height=600)

Now, let's animate the population growth of continents over the years.

In [137]:
continent_pop_df = gap_df.groupby(by=['continent', 'year'])['pop'].agg('sum').reset_index()
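
The aggregation above adds up the population of all countries within each (continent, year) pair. A minimal sketch of the same pattern, assuming a small hand-made DataFrame with hypothetical countries and illustrative numbers:

```python
import pandas as pd

df = pd.DataFrame({
    'continent': ['Asia', 'Asia', 'Europe', 'Asia', 'Asia', 'Europe'],
    'country': ['A', 'B', 'C', 'A', 'B', 'C'],   # hypothetical countries
    'year': [2002, 2002, 2002, 2007, 2007, 2007],
    'pop': [100, 200, 50, 110, 220, 55],
})

# Sum the population of all countries within each (continent, year) pair
continent_pop = df.groupby(by=['continent', 'year'])['pop'].agg('sum').reset_index()
print(continent_pop)
```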

In [142]:
labels = {
    'year':'Year',
    'country':'Country',
    'pop':'Population',
    'continent':'Continent'
}
px.bar(data_frame=continent_pop_df, x='continent', y='pop',
       title='Population Growth of Continents Over Years', labels=labels,
       color='continent', range_y=[0, 4000000000],
       animation_frame='year', animation_group='continent')

### 🎯 **Histograms**

Let's plot a histogram of the Life Expectancy of all countries in `2007`.

In [147]:
year2007_df = gap_df[gap_df['year'] == 2007]

In [154]:
labels = {
    'lifeExp' : 'Life Expectancy'
}
px.histogram(data_frame=year2007_df, x='lifeExp', text_auto=True,
             title='Histogram of Life Expectancy of All Countries', labels=labels)

We can also change the number of bins with the help of the `nbins` parameter.

In [159]:
labels = {
    'lifeExp' : 'Life Expectancy'
}

px.histogram(data_frame=year2007_df, x='lifeExp', text_auto=True, nbins=30,
             title='Histogram of Life Expectancy of All Countries', labels=labels)

Now, let's plot a histogram of `sepal_length` for all `IRIS` Species.

In [168]:
labels = {
    'species' : 'Species',
    'sepal_length' : 'Sepal Length',
    'count' : 'Count'
}
fig = px.histogram(data_frame=iris_df, x='sepal_length', nbins=30, text_auto=True,
                   title='Histogram of Sepal Length of All Iris Species', labels=labels,
                   color='species')
fig.update_layout(yaxis_title_text='Count')
fig.show()

### 🎯 **Pie Chart**

Let's plot a Pie Chart of the Population of the Top 5 European Countries in 2007.

In [176]:
# Keep the five most populous countries (head() would only take the first five rows)
eur2007_df = gap_df[(gap_df['continent'] == 'Europe') & (gap_df['year'] == 2007)].nlargest(5, 'pop')

In [178]:
labels = {
    'country' : 'Country',
    'pop' : 'Population'
}
px.pie(data_frame=eur2007_df, values='pop', names='country',
       title='Pie Chart of Population of Top 5 European Countries', labels=labels)
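
When selecting a "Top 5", keep in mind that `head()` returns the first five rows in stored order, whereas `nlargest()` ranks the rows by a column. A minimal sketch of the difference, assuming hypothetical countries with illustrative populations:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['A', 'B', 'C', 'D', 'E', 'F'],   # hypothetical countries, alphabetical
    'pop': [3, 60, 8, 82, 10, 61],               # illustrative populations (millions)
})

first_five = df.head()            # first five rows in stored order
top_five = df.nlargest(5, 'pop')  # five rows with the largest 'pop'

print(first_five['country'].tolist())
print(top_five['country'].tolist())
```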

### 🎯 **Sunburst Plot**

Let's make a Sunburst Plot, which is a kind of hierarchical pie chart.

In [179]:
year2007_df = gap_df[gap_df['year'] == 2007]

In [209]:
px.sunburst(data_frame=year2007_df, path=['continent', 'country'], values='pop',
            title='Sunburst Plot of Population of Continent & Countries in 2007', color='lifeExp')

Let's make another Sunburst Plot on the `TIPS` Dataset.

In [215]:
labels = {
    'size' : 'Size'
}
px.sunburst(data_frame=tips_df, path=['sex', 'smoker', 'day', 'time'], values='total_bill',
            title='Sunburst Plot on TIPS Dataset', color='size', labels=labels)

### 🎯 **Treemap**

In [None]:
year2007_df = gap_df[gap_df['year'] == 2007]

In [207]:
px.treemap(data_frame=year2007_df, path=[px.Constant('World'), 'continent', 'country'], values='pop',
           title='Treemap of Population of Continent & Countries in 2007', color='lifeExp',
           color_continuous_scale='Portland')

In [225]:
labels = {
    'size' : 'Size'
}
px.treemap(data_frame=tips_df, path=[px.Constant('Customer'), 'sex', 'smoker', 'day', 'time'], values='total_bill',
           title='Treemap on TIPS Dataset', color='size', labels=labels, color_continuous_scale='Picnic')

### 🎯 **Heatmap**

Let's make a Heatmap of the Average Life Expectancy of all Continents across Years.

In [227]:
avg_lifeExp_pivot = gap_df.pivot_table(index='year', columns='continent', values='lifeExp', aggfunc='mean')

In [232]:
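# px.imshow reads axis and colorbar titles from the keys 'x', 'y' and 'color'
labels = {
    'x' : 'Continent',
    'y' : 'Year',
    'color' : 'Average Life Expectancy'
}
px.imshow(avg_lifeExp_pivot,
          title='Average Life Expectancy of Continents across Years', labels=labels,
          text_auto=True)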

### 🎯 **3D Scatter Plot**

Let's make an interactive 3D Scatter Plot of all Countries (Life Expectancy, GDP Per Capita, and Population) for the year 2007.

In [245]:
year2007_df = gap_df[gap_df['year'] == 2007]
labels = {
    'lifeExp':'Life Expectancy',
    'gdpPercap':'GDP Per Capita',
    'continent': 'Continent',
    'pop': 'Population'
}
px.scatter_3d(data_frame=year2007_df, x='lifeExp', y='pop', z='gdpPercap',
              log_y=True, color='continent', hover_name='country', labels=labels,
              height=600)

Let's make another 3D Scatter Plot on the IRIS Data (Sepal Length, Sepal Width and Petal Length) for all Species.

In [249]:
labels = {
    'sepal_length' : 'Sepal Length',
    'sepal_width' : 'Sepal Width',
    'petal_length' : 'Petal Length',
    'species' : 'Species'
}
px.scatter_3d(data_frame=iris_df, x='sepal_length', y='sepal_width', z='petal_length',
              log_y=True, color='species', labels=labels, height=600)

### 🎯 **Scatter Matrix**

Let's make a scatter matrix on the IRIS Data (Sepal Length, Sepal Width, Petal Length and Petal Width).

In [254]:
labels = {
    'sepal_length' : 'SL',
    'sepal_width' : 'SW',
    'petal_length' : 'PL',
    'petal_width' : 'PW',
    'species' : 'Species'
}
px.scatter_matrix(data_frame=iris_df, dimensions=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
                  labels=labels, color='species')