# Interactive Visualization with Plotly


![](https://i.imgur.com/jPcmxke.jpg)

Plotly is a Python library used for creating interactive visualizations (graphs & charts). Unlike Matplotlib and Seaborn which create static images, Plotly renders an HTML document and uses JavaScript under the hood to enable interactivity. Plotly also offers a large selection of chart types to choose from.


The following topics are covered in this tutorial:

- Creating figures & adding interactive elements
- A quick tour of popular interactive charts
- Using Plotly as a plotting backend for Pandas 
- Creating and exploring 3D graphs
- Adding controls and animating graphs

Let's begin by installing and importing the required libraries.

In [2]:
%pip install pandas-profiling numpy plotly --upgrade --quiet

Note: you may need to restart the kernel to use updated packages.





In [10]:
%pip install nbformat --upgrade --quiet



In [11]:
%pip install plotly --upgrade --quiet



In [3]:
#import jovian
import pandas as pd
import numpy as np

## Creating Figures and Adding Interactive Elements

Like Matplotlib, Plotly provides several low-level functions for creating and customizing figures. While they offer fine-grained control over various aspects of a graph, they're quite verbose and can be cumbersome to use. We'll start out by using Plotly Express, a high-level API similar to Seaborn, that allows creating and customizing charts with a single line of code.

Plotly Express is often imported using the alias `px`.

In [4]:
import plotly.express as px

Let's download a [country-wise population dataset](https://data.worldbank.org/indicator/SP.POP.TOTL) from World Bank Open Data.

In [5]:
population_csv_url = 'https://gist.githubusercontent.com/aakashns/bbd36fbd7c0be266f0c875ad2006a9fd/raw/1763ee47c8919995c4115bb063c99511ced34712/population.csv'
population_df = pd.read_csv(population_csv_url, index_col='Year')
population_df[:5]

Year,Aruba,Afghanistan,Angola,Albania,Andorra,Arab World,United Arab Emirates,Argentina,Armenia,American Samoa,...,Virgin Islands (U.S.),Vietnam,Vanuatu,World,Samoa,Kosovo,"Yemen, Rep.",South Africa,Zambia,Zimbabwe
1960,54211.0,8996973.0,5454933.0,1608800.0,13411.0,92197753.0,92418.0,20481779.0,1874121.0,20123.0,...,32500.0,32670039.0,63689.0,3031438000.0,108629.0,947000.0,5315355.0,17099840.0,3070776.0,3776681.0
1961,55438.0,9169410.0,5531472.0,1659800.0,14375.0,94724510.0,100796.0,20817266.0,1941492.0,20602.0,...,34300.0,33666110.0,65705.0,3072481000.0,112105.0,966000.0,5393036.0,17524533.0,3164329.0,3905034.0
1962,56225.0,9351441.0,5608539.0,1711319.0,15370.0,97334442.0,112118.0,21153052.0,2009526.0,21253.0,...,35000.0,34683407.0,67794.0,3125457000.0,115776.0,994000.0,5473671.0,17965725.0,3260650.0,4039201.0
1963,56695.0,9543205.0,5679458.0,1762621.0,16412.0,100034179.0,125130.0,21488912.0,2077578.0,22034.0,...,39800.0,35721217.0,69946.0,3190564000.0,119559.0,1022000.0,5556766.0,18423161.0,3360104.0,4178726.0
1964,57032.0,9744781.0,5735044.0,1814135.0,17469.0,102832760.0,138039.0,21824425.0,2145001.0,22854.0,...,40800.0,36779999.0,72115.0,3256065000.0,123342.0,1050000.0,5641597.0,18896307.0,3463213.0,4322861.0


Let's use `px.line` to create a line chart showing the population of Hungary from 1960 to 2019.

In [6]:
?px.line

Signature:
px.line(
    data_frame=None,
    x=None,
    y=None,
    line_group=None,
    color=None,
    line_dash=None,
    symbol=None,
    hover_name=None,
    hover_data=None,
    custom_data=None,
    text=None,
    facet_row=None,
    facet_col=None,
    ...

In [7]:
px.line(population_df['Hungary'], title='Population')

Note the following:

* `px.line` automatically uses the index of the series as the X-axis.
* You can hover over any point on the line to view the exact value.
* You can zoom in and out using the controls to take a closer look at specific areas of the chart.
* There are several other controls, e.g., pan, autoscale, and download as PNG.

For fine-grained control over various aspects of the chart, we can use the `Figure` object returned by `px.line`. Let's change the axis labels, chart colors, and ensure that the Y-axis starts at 0.

In [7]:
fig = px.line(population_df['Hungary'])

In [8]:
# Set axis & legend labels
fig.update_layout(
    title="Year-Wise Population",
    xaxis_title="Year",
    yaxis_title="Population",
    legend_title="Country",
    plot_bgcolor='#ffcc9c',
    font=dict(
        family="Arial",
        size=14,
        color="#cc3e0e"
    )
)

# Start the Y axis from 0
fig.update_yaxes(rangemode='tozero')

Here's a list of properties you can set using `update_layout`: https://plotly.com/python/reference/layout/

Plotly also has built-in support for Pandas dataframes.


**Note**: If `fig.show()` does not display the graph in your Jupyter notebook, run the code below in a code cell.


```
#Import necessary libraries
import plotly.offline as pyo
import plotly.graph_objs as go
#Set notebook mode to work in offline
pyo.init_notebook_mode()

```

In [9]:
europe_df = population_df[['Hungary', 'Czech Republic', 'Switzerland']]

In [10]:
fig = px.line(europe_df,
              title='Population', 
              color_discrete_sequence=["aquamarine", "cornflowerblue", "goldenrod"])

fig.update_layout(yaxis_title='Population',
                  legend_title='Countries', 
                  font_size=14)
fig.update_yaxes(rangemode='tozero')
fig.show()

Note that apart from providing RGB hexcodes for colors, we can also use named CSS colors: https://www.w3schools.com/cssref/css_colors.asp

Switching from a line chart to a bar chart is simply a matter of replacing `px.line` with `px.bar`.

In [11]:
px.bar(population_df[['Bangladesh', 'Pakistan']],
       title="Population",
       barmode="group")

> **EXERCISE**: Compare the annual population increase of India and China using a line chart. Which of the two countries is growing faster? Hint: Use `population_df.diff()`

In [12]:
annual_population_increase_df = population_df[['India', 'China']].diff(axis=0)

In [13]:
fig = px.line(annual_population_increase_df, title="Annual population increase of India and China")
fig.update_layout(yaxis_title='Population Increase',
                  xaxis_title="Year",
                  legend_title='Countries')
fig.show()

**OBSERVATION**: Rapid population growth is often described as *exponential growth* (a definition taken from Toppr). In the graph above, China shows a rapid rise followed by a decline in its annual population increase, whereas India shows a consistent increase over the years.
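
To see what `diff()` computes, here is a minimal sketch on a hypothetical two-column data frame (`toy_df` and its values are made up for illustration). Each row of the result is the difference from the previous row, so the first row has no value.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy_df = pd.DataFrame(
    {'A': [100, 110, 125], 'B': [50, 70, 80]},
    index=pd.Index([2000, 2001, 2002], name='Year'),
)

# Row-wise difference: value in a year minus the value in the previous year
toy_increase_df = toy_df.diff(axis=0)
print(toy_increase_df)
```

Because the first year has no predecessor, its entries are `NaN`, and Plotly simply leaves that point out of the line.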

> **EXERCISE**: Compare the populations of 10 most populous African countries (as of 2019) over the last 50 years using line charts. 

In [14]:
continent_df = pd.read_csv('https://raw.githubusercontent.com/dbouquin/IS_608/master/NanosatDB_munging/Countries-Continents.csv')
african_countries = list(continent_df[continent_df['Continent']=='Africa']['Country'])

In [15]:
population_of_2019 = pd.DataFrame(population_df.loc[2019]).reset_index()
population_of_2019.rename(columns = {'index' : 'Country', 2019 : 'Population'}, inplace=True)
is_african_country = population_of_2019.Country.isin(african_countries)
top_10_african_countries = population_of_2019[is_african_country].sort_values('Population', ascending=False).head(10)
list_of_top_10_african_countries = list(top_10_african_countries['Country'])
Most_populous_african_countries = population_df[list_of_top_10_african_countries].loc[1970:]

In [16]:
fig = px.line(Most_populous_african_countries)
fig.show()
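
The selection of the most populous countries can also be written more compactly with `Series.nlargest`. The following is a sketch on hypothetical data: `toy_population_df`, `toy_group`, and the country names are made up and only mirror the shape of `population_df` and `african_countries`.

```python
import pandas as pd

# Hypothetical toy data shaped like population_df (years as index, countries as columns)
toy_population_df = pd.DataFrame(
    {'Aland': [5, 6], 'Borduria': [20, 22], 'Cascara': [9, 12], 'Dorne': [50, 55]},
    index=pd.Index([2018, 2019], name='Year'),
)

# Hypothetical group of interest; 'Elbonia' is deliberately absent from the data
toy_group = ['Aland', 'Borduria', 'Cascara', 'Elbonia']

# Take the latest year, keep only members of the group, then pick the largest values
latest = toy_population_df.loc[2019]
top_two = latest[latest.index.isin(toy_group)].nlargest(2).index.tolist()
print(top_two)

# Use the resulting names to select columns for plotting
top_two_df = toy_population_df[top_two]
```

Filtering with `isin` tolerates names that are missing from the data, which matters when the two sources spell some country names differently.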

> **EXERCISE**: Explore the documentation for `px.line`, `fig.update_layout` and `fig.update_yaxes`. Use their arguments to style the charts you've created.

In [17]:
fig = px.line(Most_populous_african_countries, title='Top 10(2019) African countries population')
fig.update_layout(yaxis_title='Population',
                  xaxis_title='Year',
                  legend_title='Countries',
                  )
fig.update_xaxes(color='brown')
fig.update_yaxes(color='brown')

## A quick tour of popular interactive charts

Plotly Express provides more than 30 functions for creating different types of figures. Let's explore some popular interactive visualization techniques. We'll use the [built-in datasets](https://plotly.com/python-api-reference/generated/plotly.express.data.html) from `px.data` to demonstrate their usage.

### Scatter Plot



In [18]:
iris_df = px.data.iris()
iris_df

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species,species_id
0,5.1,3.5,1.4,0.2,setosa,1
1,4.9,3.0,1.4,0.2,setosa,1
2,4.7,3.2,1.3,0.2,setosa,1
3,4.6,3.1,1.5,0.2,setosa,1
4,5.0,3.6,1.4,0.2,setosa,1
...,...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,virginica,3
146,6.3,2.5,5.0,1.9,virginica,3
147,6.5,3.0,5.2,2.0,virginica,3
148,6.2,3.4,5.4,2.3,virginica,3


In [19]:
px.scatter(iris_df,
           x='sepal_width',
           y='sepal_length',
           color='species',
           size='petal_length',
           hover_data=['petal_width'])

> **EXERCISE**: Download the dataset `px.data.gapminder()` and visualize the relationship between GDP per capita and life expectancy in the year 2007 using a scatter plot. Set proper titles for the figure and the axes. Color the dots using the values in `continent` column, and show the country name and population on hover. 

In [20]:
px.data.gapminder()

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
0,Afghanistan,Asia,1952,28.801,8425333,779.445314,AFG,4
1,Afghanistan,Asia,1957,30.332,9240934,820.853030,AFG,4
2,Afghanistan,Asia,1962,31.997,10267083,853.100710,AFG,4
3,Afghanistan,Asia,1967,34.020,11537966,836.197138,AFG,4
4,Afghanistan,Asia,1972,36.088,13079460,739.981106,AFG,4
...,...,...,...,...,...,...,...,...
1699,Zimbabwe,Africa,1987,62.351,9216418,706.157306,ZWE,716
1700,Zimbabwe,Africa,1992,60.377,10704340,693.420786,ZWE,716
1701,Zimbabwe,Africa,1997,46.809,11404948,792.449960,ZWE,716
1702,Zimbabwe,Africa,2002,39.989,11926563,672.038623,ZWE,716


In [21]:
# Load the dataset once, keep only the rows for 2007 and the columns we need
gdp_lifex_df = px.data.gapminder().query("year == 2007")[['country', 'continent', 'lifeExp', 'pop', 'gdpPercap']]
px.scatter(gdp_lifex_df, 
           title='Relationship between GDP per capita and life expectancy in 2007',
           x='lifeExp',
           y='gdpPercap',
           color='continent',
           hover_data=['country', 'pop'])

### Bar Chart

We'll use the dataset `medals_long`, which represents the medal table for Olympic Short Track Speed Skating for the top three nations as of 2020.

In [22]:
long_df = px.data.medals_long()
long_df

Unnamed: 0,nation,medal,count
0,South Korea,gold,24
1,China,gold,10
2,Canada,gold,9
3,South Korea,silver,13
4,China,silver,15
5,Canada,silver,12
6,South Korea,bronze,11
7,China,bronze,8
8,Canada,bronze,12


In [23]:
fig = px.bar(long_df,
             x='nation',
             y='count',
             color='medal',
             title='Olympic Short Track Speed Skating Medals',
             color_discrete_sequence=["#AF9500", "#B4B4B4", "#6A3805"])
fig.show()

> **EXERCISE**: Look up the documentation of `px.bar` and show the bars in the above chart side by side instead of stacking them on top of each other.

In [24]:
fig = px.bar(long_df,
             x='nation',
             y='count',
             barmode='group',
             color='medal',
             title='Olympic Short Track Speed Skating Medals',
             color_discrete_sequence=["#AF9500", "#B4B4B4", "#6A3805"])
fig.show()

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

In [25]:
election = px.data.election()
election.head()

Unnamed: 0,district,Coderre,Bergeron,Joly,total,winner,result,district_id
0,101-Bois-de-Liesse,2481,1829,3024,7334,Joly,plurality,101
1,102-Cap-Saint-Jacques,2525,1163,2675,6363,Joly,plurality,102
2,11-Sault-au-Récollet,3348,2770,2532,8650,Coderre,plurality,11
3,111-Mile-End,1734,4782,2514,9030,Bergeron,majority,111
4,112-DeLorimier,1770,5933,3044,10747,Bergeron,majority,112


In [26]:
fig = px.bar(election,
             x='winner',
             y='total',
             barmode='group',
             color='district')
fig.show()

### Treemap and Sunburst

In [27]:
# Copy the filtered rows so that adding a column below does not trigger a pandas warning
gapminder_df = px.data.gapminder().query("year == 2007").copy()
gapminder_df['world'] = "world"
gapminder_df

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num,world
11,Afghanistan,Asia,2007,43.828,31889923,974.580338,AFG,4,world
23,Albania,Europe,2007,76.423,3600523,5937.029526,ALB,8,world
35,Algeria,Africa,2007,72.301,33333216,6223.367465,DZA,12,world
47,Angola,Africa,2007,42.731,12420476,4797.231267,AGO,24,world
59,Argentina,Americas,2007,75.320,40301927,12779.379640,ARG,32,world
...,...,...,...,...,...,...,...,...,...
1655,Vietnam,Asia,2007,74.249,85262356,2441.576404,VNM,704,world
1667,West Bank and Gaza,Asia,2007,73.422,4018332,3025.349798,PSE,275,world
1679,"Yemen, Rep.",Asia,2007,62.698,22211743,2280.769906,YEM,887,world
1691,Zambia,Africa,2007,42.384,11746035,1271.211593,ZMB,894,world


In [28]:
fig = px.treemap(gapminder_df,
                 path=['world', 'continent', 'country'],
                 values='pop',
                 color='lifeExp',
                 color_continuous_scale='RdBu'
                 )
fig.show()

> **EXERCISE**: Replace `px.treemap` with `px.sunburst` in the above example and study the figure that's created. Can you tell what it represents?

In [29]:
fig = px.sunburst(gapminder_df,
                 path=['world', 'continent', 'country'],
                 values='pop',
                 color='lifeExp',
                 color_continuous_scale='RdBu'
                 )
fig.show()

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

Learn more about Treemap and Sunburst charts here: 

* https://plotly.com/python/treemaps/
* https://plotly.com/python/sunburst-charts/


### Histogram and Rug Plots

In [30]:
tips_df = px.data.tips()
tips_df

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.50,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3
240,27.18,2.00,Female,Yes,Sat,Dinner,2
241,22.67,2.00,Male,Yes,Sat,Dinner,2
242,17.82,1.75,Male,No,Sat,Dinner,2


In [31]:
fig = px.histogram(tips_df,
                   x='total_bill',
                   color='sex',
                   marginal='rug',
                   hover_data=tips_df.columns)
fig.show()

> **EXERCISE**: Replace `"rug"` with `"box"` or `"violin"` in the above example and study the charts. Do you understand what the marginal plots represent?

In [32]:
fig = px.histogram(tips_df,
                   x='total_bill',
                   color='sex',
                   marginal='box',
                   hover_data=tips_df.columns)
fig.show()

In [33]:
fig = px.histogram(tips_df,
                   x='total_bill',
                   color='sex',
                   marginal='violin',
                   hover_data=tips_df.columns)
fig.show()

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

Learn more about histograms in Plotly here: https://plotly.com/python/histograms/

### Polar Chart

In [34]:
wind_df = px.data.wind()
wind_df

Unnamed: 0,direction,strength,frequency
0,N,0-1,0.5
1,NNE,0-1,0.6
2,NE,0-1,0.5
3,ENE,0-1,0.4
4,E,0-1,0.4
...,...,...,...
123,WSW,6+,0.1
124,W,6+,0.9
125,WNW,6+,2.2
126,NW,6+,1.5


In [35]:
fig = px.line_polar(wind_df, 
                r="frequency", 
                theta="direction",
                color="strength",
                color_discrete_sequence=px.colors.sequential.Plasma_r,
                template="plotly_dark")
fig.show()

> **EXERCISE**: Replace `line_polar` with `scatter_polar` or `bar_polar` in the example above and interpret the chart. When would you use one vs. the other?

In [36]:
fig = px.scatter_polar(wind_df, 
                    r="frequency",
                    theta="direction", 
                    color="strength", 
                    color_discrete_sequence=px.colors.sequential.Plasma_r,
                    template="plotly_dark")
fig.show()

In [37]:
fig = px.bar_polar(wind_df, 
                    r="frequency",
                    theta="direction", 
                    color="strength", 
                    color_discrete_sequence=px.colors.sequential.Plasma_r,
                    template="plotly_dark")
fig.show()

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

Learn more about polar charts in Plotly here: https://plotly.com/python/polar-chart/

## Using Plotly as a plotting backend for Pandas

We can configure Pandas to use Plotly as the backend for the `plot` methods of Pandas data frames & series. You can learn more about this here: https://plotly.com/python/pandas-backend/. 

The Plotly backend can be enabled as follows:

In [38]:
pd.options.plotting.backend = "plotly"

The Plotly backend supports the following kinds of Pandas plots: `scatter`, `line`, `area`, `bar`, `barh`, `hist` and `box`. Let's look at some examples.

In [39]:
europe_df = population_df[['Germany', 'United Kingdom', 'France', 'Italy']]

europe_df.plot(kind='line', title="Population")

In [40]:
long_df

Unnamed: 0,nation,medal,count
0,South Korea,gold,24
1,China,gold,10
2,Canada,gold,9
3,South Korea,silver,13
4,China,silver,15
5,Canada,silver,12
6,South Korea,bronze,11
7,China,bronze,8
8,Canada,bronze,12


In [41]:
long_df.plot(x='count',
             y='nation',
             kind='barh',
             color='medal',
             barmode='group',
             title='Olympic Short Track Speed Skating Medals',
             color_discrete_sequence=["#AF9500", "#B4B4B4", "#6A3805"])

In [42]:
tips_df

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.50,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3
240,27.18,2.00,Female,Yes,Sat,Dinner,2
241,22.67,2.00,Male,Yes,Sat,Dinner,2
242,17.82,1.75,Male,No,Sat,Dinner,2


In [43]:
fig = tips_df.plot('total_bill', 
                   kind='hist', 
                   title="Distribution of Total Bill")
fig.update_layout(bargap=0.1)
fig.show()

In [44]:
tips_df.plot('tip', kind='box', color='sex')

> **EXERCISE**: Replicate the above examples with different datasets. Find a dataset online or pick from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

## Creating and exploring 3D graphs

Plotly can also be used to create 3D graphs. Let's look at an example of 3D surface plots, by plotting the elevation of a mountain.

In [45]:
import pandas as pd

z_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/api_docs/mt_bruno_elevation.csv', index_col=0)
z_data

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,14,15,16,17,18,19,20,21,22,23
0,27.80985,49.61936,83.08067,116.6632,130.414,150.7206,220.1871,156.1536,148.6416,203.7845,...,49.96142,21.89279,17.02552,11.74317,14.75226,13.6671,5.677561,3.31234,1.156517,-0.147662
1,27.71966,48.55022,65.21374,95.27666,116.9964,133.9056,152.3412,151.934,160.1139,179.5327,...,33.08871,38.40972,44.24843,69.5786,4.019351,3.050024,3.039719,2.996142,2.967954,1.999594
2,30.4267,33.47752,44.80953,62.47495,77.43523,104.2153,102.7393,137.0004,186.0706,219.3173,...,48.47132,74.71461,60.0909,7.073525,6.089851,6.53745,6.666096,7.306965,5.73684,3.625628
3,16.66549,30.1086,39.96952,44.12225,59.57512,77.56929,106.8925,166.5539,175.2381,185.2815,...,60.55916,55.92124,15.17284,8.248324,36.68087,61.93413,20.26867,68.58819,46.49812,0.23601
4,8.815617,18.3516,8.658275,27.5859,48.62691,60.18013,91.3286,145.7109,116.0653,106.2662,...,47.42691,69.20731,44.95468,29.17197,17.91674,16.25515,14.65559,17.26048,31.22245,46.71704
5,6.628881,10.41339,24.81939,26.08952,30.1605,52.30802,64.71007,76.30823,84.63686,99.4324,...,140.2647,81.26501,56.45756,30.42164,17.28782,8.302431,2.981626,2.698536,5.886086,5.268358
6,21.83975,6.63927,18.97085,32.89204,43.15014,62.86014,104.6657,130.2294,114.8494,106.9873,...,122.4221,123.9698,109.0952,98.41956,77.61374,32.49031,14.67344,7.370775,0.03711,0.642339
7,53.34303,26.79797,6.63927,10.88787,17.2044,56.18116,79.70141,90.8453,98.27675,80.87243,...,68.1749,46.24076,39.93857,31.21653,36.88335,40.02525,117.4297,12.70328,1.729771,0.0
8,25.66785,63.05717,22.1414,17.074,41.74483,60.27227,81.42432,114.444,102.3234,101.7878,...,59.19355,42.47175,14.63598,6.944074,6.944075,27.74936,0.0,0.0,0.094494,0.077323
9,12.827,69.20554,46.76293,13.96517,33.88744,61.82613,84.74799,121.122,145.2741,153.1797,...,79.34425,25.93483,6.944074,6.944074,6.944075,7.553681,0.0,0.0,0.0,0.0


We need to use the low-level `graph_objects` API to create a 3D surface and attach it to a figure.

In [46]:
import plotly.graph_objects as go

surface=go.Surface(z=z_data.values)
fig = go.Figure(surface)
fig.update_layout(title='Mt. Bruno Elevation')
fig.show()

Plotly Express can be used to create 3D line and scatter plots.

In [47]:
df = px.data.gapminder().query("country=='Brazil'")
fig = px.line_3d(df, x="gdpPercap", y="pop", z="year")
fig.show()

In [48]:
px.scatter_3d(iris_df, 
              x='sepal_length', 
              y='sepal_width', 
              z='petal_width', color='species')

> **EXERCISE**: Create some 3D surface, line and scatter plots using some other datasets.

## Adding controls and animating graphs

Plotly express graphs can be animated by specifying an `animation_frame` column. Additionally, an `animation_group` can be specified to uniquely identify objects across frames. An animated graph also provides slider controls to skip to any frame.

In [49]:
df = px.data.gapminder()
df

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
0,Afghanistan,Asia,1952,28.801,8425333,779.445314,AFG,4
1,Afghanistan,Asia,1957,30.332,9240934,820.853030,AFG,4
2,Afghanistan,Asia,1962,31.997,10267083,853.100710,AFG,4
3,Afghanistan,Asia,1967,34.020,11537966,836.197138,AFG,4
4,Afghanistan,Asia,1972,36.088,13079460,739.981106,AFG,4
...,...,...,...,...,...,...,...,...
1699,Zimbabwe,Africa,1987,62.351,9216418,706.157306,ZWE,716
1700,Zimbabwe,Africa,1992,60.377,10704340,693.420786,ZWE,716
1701,Zimbabwe,Africa,1997,46.809,11404948,792.449960,ZWE,716
1702,Zimbabwe,Africa,2002,39.989,11926563,672.038623,ZWE,716


In [50]:
df = px.data.gapminder()

fig = px.scatter(df, 
                 x="gdpPercap", 
                 y="lifeExp", 
                 animation_frame="year", 
                 animation_group="country",
#                  size="pop",     
                 color="continent", 
                 hover_name="country",
                 log_x=True, 
                 size_max=80, 
                 range_x=[100,100000], 
                 range_y=[25,90])

fig.show()

In [51]:
tips_df

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.50,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3
240,27.18,2.00,Female,Yes,Sat,Dinner,2
241,22.67,2.00,Male,Yes,Sat,Dinner,2
242,17.82,1.75,Male,No,Sat,Dinner,2


In [52]:
px.box(tips_df, 
       x='sex', 
       y='total_bill', 
       color='smoker', 
       animation_frame='day')

> **EXERCISE**: Create some graphs with controls and animations using other datasets.

Learn more about Plotly animations and custom controls here: https://plotly.com/python/#controls

In [53]:
!pip install jovian --upgrade --quiet

In [54]:
import jovian

In [55]:
jovian.commit(filename='interactive-visualization-plotly.ipynb')


## Summary and Further Reading

We've covered the following topics in this tutorial:

- Creating figures & adding interactive elements
- Replicating common graphs using Plotly
- Using Plotly as a plotting backend for Pandas 
- Creating and exploring 3D graphs
- Adding controls and animating graphs



## Questions for Revision
1.	What is Plotly?
2.	How is Plotly different from Matplotlib and Seaborn?
3.	How do you plot a line chart with Plotly? Illustrate with an example.
4.	What is the difference between `plt.bar`, `sns.barplot` and `px.bar`?
5.	What are the popular interactive charts in Plotly?
6.	What is the `color_discrete_sequence` parameter in Plotly?
7.	How do you set titles for axes in Plotly?
8.	What is the `hover_data` parameter in Plotly?
9.	How do you plot a sunburst chart with Plotly? Illustrate with an example.
10.	Illustrate the usage of a Treemap in Plotly.
11.	What is the `marginal` parameter in Plotly?
12.	What is a polar chart? Illustrate with an example.
13.	How can you use Plotly as the backend for the plot methods of Pandas data frames and series? 
14.	What is a boxplot? Plot it using Plotly.
15.	What is `graph_objects` in Plotly?
16.	How do you import `graph_objects`?
17.	What kinds of 3D plots can be plotted using Plotly?
18.	What is an animated graph? Illustrate with an example.
19.	What is an `animation_frame`?
20.	What is an `animation_group`?

## Solutions for Exercises


> **EXERCISE**: Compare the annual population increase of India and China using a line chart. Which of the two countries is growing faster? Hint: Use `population_df.diff()`

**OBSERVATION**: Rapid population growth is often described as *exponential growth* (a definition taken from Toppr). In the graph above, China shows a rapid rise followed by a decline in its annual population increase, whereas India shows a consistent increase over the years.

> **EXERCISE**: Compare the populations of 10 most populous African countries (as of 2019) over the last 50 years using line charts. 

**OBSERVATION**: Nigeria has the highest population growth in the last 50 years.

> **EXERCISE**: Explore the documentation for `px.line`, `fig.update_layout` and `fig.update_yaxes`. Use their arguments to style the charts you've created.

> **EXERCISE**: Download the dataset `px.data.gapminder()` and visualize the relationship between GDP per capita and life expectancy in the year 2007 using a scatter plot. Set proper titles for the figure and the axes. Color the dots using the values in `continent` column, and show the country name and population on hover. 

**OBSERVATION**: GDP per capita and life expectancy in the year 2007 seem to have a curvilinear relationship. 
- Curvilinear relationship: a relationship between two variables in which they do not change at a constant rate relative to each other, so the points follow a curve rather than a straight line. In some cases, one variable increases with the other only up to a certain point, after which it levels off or decreases.

> **EXERCISE**: Look up the documentation of `px.bar` and show the bars in the above chart side by side instead of stacking them on top of each other. 

- [*Reference to Bar Chart `long_df` visualization.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C42)

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

- [*Reference to Bar chart visualization exercise*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C42)

**OBSERVATION**: Interestingly, tips for males at dinner time are the highest. One possible reason for the large difference is that evening and night shifts are mostly taken up by men. 

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

- [*Reference to treemap and sunburst visualization exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C52)

**OBSERVATION**: Coderre has the highest number of votes with the most plurality wins, and Joly has the fewest votes with only two majority wins.

> **EXERCISE**: Replace `"rug"` with `"box"` or `"violin"` in the above example and study the charts. Do you understand what the marginal plots represent?

- [*Reference to `tips_df` histogram visualization.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C62)

**OBSERVATION**: Marginal plots represent the small subplots above the main plot, which show the distribution of data along only one dimension. 

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

- [*Reference to histogram visualization exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C62)

**OBSERVATION**: `car_hours` seems to follow a normal distribution with a few outliers.

> **EXERCISE**: Replace `line_polar` with `scatter_polar` or `bar_polar` in the example above and interpret the chart. When would you use one vs. the other?

- [*Reference to `wind_df` polar chart visualization.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C74)

**OBSERVATION**: Bar polar charts are mostly used for categorical data, whereas scatter polar charts are used for data containing amount and direction values.

> **EXERCISE**: Replicate the above example with a different dataset. Find a dataset online or pick one from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

- [*Reference to polar chart visualization exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C74)

**OBSERVATION**: Dinner time of the day during weekends recorded the highest paid bills at the restaurant.

**OBSERVATION**: Dinner time of the day during weekends received the maximum number of tips.

> **EXERCISE**: Replicate the above examples with different datasets. Find a dataset online or pick from this page: https://plotly.com/python-api-reference/generated/plotly.express.data.html

- [*Reference to Using Plotly as a plotting backend for Pandas exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C83)

**OBSERVATION**: Coderre has the highest number of votes with the most plurality wins, and Joly has the fewest votes with only two majority wins.

**OBSERVATION**: Coderre has the maximum number of entries in the `elections` dataset indicating that this person must have won elections in many districts. 

**OBSERVATION**: Europe and Africa have more outliers in life expectancy than the other continents. Asia has the highest life expectancy at 82.6 years, and Africa has the lowest at 23.5 years.

> **EXERCISE**: Create some 3D surface, line and scatter plots using some other datasets.

- [*Reference to Creating and exploring 3D graphs exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C96)

> **EXERCISE**: Create some graphs with controls and animations using other datasets.

- [*Reference to Adding controls and animating graphs exercise.*](https://jovian.ai/aakashns/interactive-visualization-plotly/v/18#C108)