# <center> **Plotly (Express)**

<h3> About Plotly</h3>

Plotly is a data visualization library developed by Plotly, a company based in Canada, with support for languages such as Python, JavaScript, and Julia.

<h4> Advantages </h4>

- Multi language support
- Lots of graph types
- Interactive plots
- Beautiful plots

Limitation: Plotly on its own does not work with live data streams. Dash can be explored for that.

<h3> The Plotly Roadmap</h3>

- Plotly Go
- Plotly Express
- Dash

>In today's lecture we are going to use Plotly Graph Objects (GO) as well as Plotly Express (PX), but mostly Plotly Express.

## <b><center> Working with Plotly Go

In [1]:
# import the libraries

import plotly.graph_objects as go
import plotly.express as px
import numpy as np
import pandas as pd

In [2]:
# import datasets

tips = px.data.tips()
iris = px.data.iris()
gap = px.data.gapminder() # for each country: population and other indicators at 5-year intervals

# we are going to use these 3 datasets in today's lecture


In [3]:
tips.head()

,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [4]:
iris.head()

,sepal_length,sepal_width,petal_length,petal_width,species,species_id
0,5.1,3.5,1.4,0.2,setosa,1
1,4.9,3.0,1.4,0.2,setosa,1
2,4.7,3.2,1.3,0.2,setosa,1
3,4.6,3.1,1.5,0.2,setosa,1
4,5.0,3.6,1.4,0.2,setosa,1


In [5]:
gap.head()

,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
0,Afghanistan,Asia,1952,28.801,8425333,779.445314,AFG,4
1,Afghanistan,Asia,1957,30.332,9240934,820.85303,AFG,4
2,Afghanistan,Asia,1962,31.997,10267083,853.10071,AFG,4
3,Afghanistan,Asia,1967,34.02,11537966,836.197138,AFG,4
4,Afghanistan,Asia,1972,36.088,13079460,739.981106,AFG,4


### <B><Center>Scatter Plot using Plotly Go</Center></b>
We also want to see what the problem with GO is and why we need Plotly Express.

In [6]:
# scatter on gap dataset, columns - lifeExp and gdpPercap

gap2007 = gap[gap['year'] == 2007]

gap2007

,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
11,Afghanistan,Asia,2007,43.828,31889923,974.580338,AFG,4
23,Albania,Europe,2007,76.423,3600523,5937.029526,ALB,8
35,Algeria,Africa,2007,72.301,33333216,6223.367465,DZA,12
47,Angola,Africa,2007,42.731,12420476,4797.231267,AGO,24
59,Argentina,Americas,2007,75.320,40301927,12779.379640,ARG,32
...,...,...,...,...,...,...,...,...
1655,Vietnam,Asia,2007,74.249,85262356,2441.576404,VNM,704
1667,West Bank and Gaza,Asia,2007,73.422,4018332,3025.349798,PSE,275
1679,"Yemen, Rep.",Asia,2007,62.698,22211743,2280.769906,YEM,887
1691,Zambia,Africa,2007,42.384,11746035,1271.211593,ZMB,894


In [7]:
trace1 = go.Scatter(x=gap2007['lifeExp'], y=gap2007['gdpPercap'], mode='markers')

data = [trace1] # data is a list; it can hold any number of items, and the items are called traces

layout = go.Layout(title='Life Exp Vs GDP per Capita for 2007', xaxis={'title':'Life Exp'}, yaxis={'title': 'GDP'})

fig = go.Figure(data, layout) # data carries the traces; layout specifies fonts, axes, and so on

fig.show()

Let's have another trace

In [8]:
trace1 = go.Scatter(x=gap2007['lifeExp'], y=gap2007['gdpPercap'], mode='markers')
trace2 = go.Scatter(x=[0,1,2], y=[0,900,30000], mode='lines')

data = [trace1, trace2]

layout = go.Layout(title='Life Exp Vs GDP per Capita for 2007', xaxis={'title':'Life Exp'}, yaxis={'title': 'GDP'})
fig = go.Figure(data, layout) 

fig.show()

## <b><center> Plotly Express

### <b> 1. Scatter Plot

In [9]:
# plot life exp and gdp scatter plot

px.scatter(gap2007, x = 'lifeExp', y = 'gdpPercap') # less customization than above graph but easy to plot

In [10]:
# continent as color 

px.scatter(gap2007, x = 'lifeExp', y = 'gdpPercap', color = 'continent') # Cool thing: click a continent name in the legend to toggle it on or off.

In [11]:
# population as size and size max

px.scatter(gap2007, x = 'lifeExp', y = 'gdpPercap', color = 'continent', size = 'pop', size_max = 40)

In [12]:
# hover name

px.scatter(gap2007, x = 'lifeExp', y = 'gdpPercap', color = 'continent', size = 'pop', size_max = 40, hover_name='country')

### <b> Animation

><b>Plot an animation of the above chart on the basis of year

In [13]:
# Now we want a scatter plot for every 5-year step, with all of them combined into one animation

px.scatter(gap, x = 'lifeExp', y = 'gdpPercap', 
           color = 'continent', size = 'pop', 
           size_max = 40, 
           animation_frame='year', # one animation frame per year
           animation_group='country') # match each country across frames so its marker moves smoothly

In [14]:
# range_x/range_y

fig = px.scatter(gap, x = 'lifeExp', y = 'gdpPercap', 
           color = 'continent', size = 'pop', 
           size_max = 40,
           range_x=[30,95],
           animation_frame='year', 
           animation_group='country'
           )


In [15]:
fig.write_html("animated_chart.html") # save the animated chart as an HTML file on the local machine

**Question**: Can you slow the animation speed?

### <b>2. Line Plot

**Question**: ***Plot India's population as a line plot***

In [16]:
india_gap = gap[gap['country'] == 'India']

india_gap

,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
696,India,Asia,1952,37.373,372000000,546.565749,IND,356
697,India,Asia,1957,40.249,409000000,590.061996,IND,356
698,India,Asia,1962,43.605,454000000,658.347151,IND,356
699,India,Asia,1967,47.193,506000000,700.770611,IND,356
700,India,Asia,1972,50.651,567000000,724.032527,IND,356
701,India,Asia,1977,54.208,634000000,813.337323,IND,356
702,India,Asia,1982,56.596,708000000,855.723538,IND,356
703,India,Asia,1987,58.553,788000000,976.512676,IND,356
704,India,Asia,1992,60.223,872000000,1164.406809,IND,356
705,India,Asia,1997,61.765,959000000,1458.817442,IND,356


In [17]:
px.line(india_gap, x = 'year', y='pop', title= 'India Population Graph', subtitle='Pop Vs Year')

***Question: Plot lifeExp for India, Pakistan and China on the same graph***

In [18]:
gap[gap['country'].isin(['India', 'Pakistan', 'China'])]

,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
288,China,Asia,1952,44.0,556263527,400.448611,CHN,156
289,China,Asia,1957,50.54896,637408000,575.987001,CHN,156
290,China,Asia,1962,44.50136,665770000,487.674018,CHN,156
291,China,Asia,1967,58.38112,754550000,612.705693,CHN,156
292,China,Asia,1972,63.11888,862030000,676.900092,CHN,156
293,China,Asia,1977,63.96736,943455000,741.23747,CHN,156
294,China,Asia,1982,65.525,1000281000,962.421381,CHN,156
295,China,Asia,1987,67.274,1084035000,1378.904018,CHN,156
296,China,Asia,1992,68.69,1164970000,1655.784158,CHN,156
297,China,Asia,1997,70.426,1230075000,2289.234136,CHN,156


>The data is in long format: each year appears in three rows, one per country, while here we want one lifeExp column per country.
>>So we reshape the data into wide format using pivot().

In [19]:
temp_df = gap[gap['country'].isin(['India', 'Pakistan', 'China'])].pivot(index='year', columns='country', values= 'lifeExp')

temp_df

year,China,India,Pakistan
1952,44.0,37.373,43.436
1957,50.54896,40.249,45.557
1962,44.50136,43.605,47.67
1967,58.38112,47.193,49.8
1972,63.11888,50.651,51.929
1977,63.96736,54.208,54.043
1982,65.525,56.596,56.158
1987,67.274,58.553,58.245
1992,68.69,60.223,60.838
1997,70.426,61.765,61.818


In [20]:
px.line(temp_df, x=temp_df.index, y= temp_df.columns)

In [21]:
px.line(temp_df)

### <b>3. Bar chart

🎯***Question: India's population over the years***

In [22]:
px.bar(india_gap, x = 'year', y='pop', title= 'India Population Graph', subtitle='Pop Vs Year')

🎯***Question: Plot India, Pakistan and China GDP per capita as a grouped bar chart on the same graph***

In [23]:
temp_df = gap[gap['country'].isin(['India', 'Pakistan', 'China'])].pivot(index='year', columns='country', values= 'gdpPercap')

temp_df

year,China,India,Pakistan
1952,400.448611,546.565749,684.597144
1957,575.987001,590.061996,747.083529
1962,487.674018,658.347151,803.342742
1967,612.705693,700.770611,942.408259
1972,676.900092,724.032527,1049.938981
1977,741.23747,813.337323,1175.921193
1982,962.421381,855.723538,1443.429832
1987,1378.904018,976.512676,1704.686583
1992,1655.784158,1164.406809,1971.829464
1997,2289.234136,1458.817442,2049.350521


In [24]:
px.bar(temp_df, temp_df.index, temp_df.columns) # stacked bar chart

In [25]:
px.bar(temp_df, temp_df.index, temp_df.columns, barmode = 'group')

🎯**Question:** As the populations of China, India and Pakistan differ so much, can we use a log scale for a more readable graph?  
✨**Answer:** Yes, we can use a log axis by setting ***log_x=True*** or ***log_y=True***

In [26]:
temp_df = gap[gap['country'].isin(['India', 'Pakistan', 'China'])].pivot(index='year', columns='country', values= 'pop')

temp_df # India and China have very large populations while Pakistan's is much smaller, so a linear-scale graph is hard to read

year,China,India,Pakistan
1952,556263527,372000000,41346560
1957,637408000,409000000,46679944
1962,665770000,454000000,53100671
1967,754550000,506000000,60641899
1972,862030000,567000000,69325921
1977,943455000,634000000,78152686
1982,1000281000,708000000,91462088
1987,1084035000,788000000,105186881
1992,1164970000,872000000,120065004
1997,1230075000,959000000,135564834


In [27]:
px.bar(temp_df, temp_df.index, temp_df.columns, barmode = 'group', log_y=True)

🎯**Question:** How can we show the values on the bars?  
✨**Answer:** By using the ***text_auto*** parameter.

In [28]:
px.bar(temp_df, temp_df.index, temp_df.columns, barmode = 'group', log_y=True, text_auto=True)

**Question:** Plot each country's contribution to its continent's population as a stacked bar for a particular year (2007).  

**Answer:** Yes, we can, by passing ***color='country'*** so that each continent's bar is stacked by country.

In [29]:
px.bar(gap2007, x= 'continent', y = 'pop', color='country',title='Continent Population in 2007')

<center><b>Bar Chart Animation</b></center>  

🎯**Question:** Show how each continent's population grows year by year.  
✨**Answer:** Yes, we can do this using ***animation_frame*** and ***animation_group***.

In [30]:
px.bar(gap, x= 'continent', y = 'pop', color='country',
       title='Continent Population Changes Year by year',
       labels={'pop': 'Population'},
       animation_frame='year',
       animation_group= 'country',
       )

In [31]:
px.bar(gap, x= 'continent', y = 'pop', color='country',
       title='Continent Population Changes Year by year',
       labels={'pop': 'Population'},
       animation_frame='year',
       animation_group= 'country',
       range_y=[0,4000000000]
       )

### <b>4. Histogram

In [32]:
px.histogram(gap2007, x='lifeExp') # Histogram of life expectancy across all countries in 2007

In [33]:
px.histogram(gap2007, x='lifeExp', nbins=30, text_auto = True)

<center><b>Multiple Histogram</b></center>  

🎯**Question:** Plot a histogram of sepal length for all iris species.  
✨**Answer:** Yes, we can, by passing ***color='species'***. The code is below.

In [34]:
iris.sample(5)

,sepal_length,sepal_width,petal_length,petal_width,species,species_id
99,5.7,2.8,4.1,1.3,versicolor,2
63,6.1,2.9,4.7,1.4,versicolor,2
69,5.6,2.5,3.9,1.1,versicolor,2
81,5.5,2.4,3.7,1.0,versicolor,2
112,6.8,3.0,5.5,2.1,virginica,3


In [35]:
px.histogram(iris, x='sepal_length', color='species', nbins=30, text_auto=True)

### <b>5. Pie Chart

🎯**Question:** Plot a pie chart of the population of European countries in 2007.  
✨**Answer:** px.pie()

In [36]:
temp_df = gap2007[gap2007['continent'] == 'Europe']

temp_df

,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
23,Albania,Europe,2007,76.423,3600523,5937.029526,ALB,8
83,Austria,Europe,2007,79.829,8199783,36126.4927,AUT,40
119,Belgium,Europe,2007,79.441,10392226,33692.60508,BEL,56
155,Bosnia and Herzegovina,Europe,2007,74.852,4552198,7446.298803,BIH,70
191,Bulgaria,Europe,2007,73.005,7322858,10680.79282,BGR,100
383,Croatia,Europe,2007,75.748,4493312,14619.22272,HRV,191
407,Czech Republic,Europe,2007,76.486,10228744,22833.30851,CZE,203
419,Denmark,Europe,2007,78.332,5468120,35278.41874,DNK,208
527,Finland,Europe,2007,79.313,5238460,33207.0844,FIN,246
539,France,Europe,2007,80.657,61083916,30470.0167,FRA,250


In [37]:
px.pie(temp_df, values='pop', names = 'country')

🎯**Question:** Plot a pie chart of the world population in 1952, continent-wise, and explode one slice using ***pull***.

In [38]:
temp_df = gap[gap['year'] == 1952].groupby('continent')['pop'].sum().reset_index()

temp_df

,continent,pop
0,Africa,237640501
1,Americas,345152446
2,Asia,1395357351
3,Europe,418120846
4,Oceania,10686006


In [39]:
px.pie(temp_df, values='pop', names= 'continent')

In [40]:
# Create a pull list: 0 for all slices except Europe, which gets a non-zero value (0.2)
pull_values = [0.2 if continent == "Europe" else 0 for continent in temp_df['continent']]


fig = px.pie(temp_df, values='pop', names='continent')

# Update the trace with pull values
fig.update_traces(pull=pull_values)

fig.show()


### <b>6. Sunburst plot</b>  
Sunburst plots visualize hierarchical data spanning outwards radially from root to leaves.

>Hierarchical data are often stored as a rectangular dataframe, with different columns corresponding to different levels of the hierarchy. px.sunburst can take a path parameter corresponding to a list of columns. Note that ids and parents should not be provided if path is given.

In [41]:
px.sunburst(gap2007, path=['continent', 'country'], values='pop', color='lifeExp')

>Now let's use a sunburst plot on the tips dataset.

In [42]:
px.sunburst(tips, path=['day','time','smoker','sex',], values='total_bill', color='size')

### <b>7. Treemap Plot

In [43]:
px.treemap(gap2007, path=[px.Constant('World'),'continent', 'country'], values='pop', color='lifeExp')

<center> <b> Treemap with Rounded Corners

In [44]:
fig = px.treemap(gap2007, path=[px.Constant('World'),'continent', 'country'], values='pop', color='lifeExp')

fig.update_traces(marker=dict(cornerradius=5))
fig.show()

### <b>8. Heatmap plot

In [45]:
temp_df = tips.pivot_table(index='day', columns='sex', values='total_bill', aggfunc = 'sum')

temp_df

day,Female,Male
Fri,127.31,198.57
Sat,551.05,1227.35
Sun,357.7,1269.46
Thur,534.89,561.44


In [46]:
px.imshow(temp_df)

🎯**Question:** Plot a heatmap of average life expectancy by year for all continents.  
✨**Answer:** 👇

In [47]:
temp_df= gap.pivot_table(index='year', columns='continent', values='lifeExp', aggfunc='mean')

temp_df

year,Africa,Americas,Asia,Europe,Oceania
1952,39.1355,53.27984,46.314394,64.4085,69.255
1957,41.266346,55.96028,49.318544,66.703067,70.295
1962,43.319442,58.39876,51.563223,68.539233,71.085
1967,45.334538,60.41092,54.66364,69.7376,71.31
1972,47.450942,62.39492,57.319269,70.775033,71.91
1977,49.580423,64.39156,59.610556,71.937767,72.855
1982,51.592865,66.22884,62.617939,72.8064,74.29
1987,53.344788,68.09072,64.851182,73.642167,75.32
1992,53.629577,69.56836,66.537212,74.4401,76.945
1997,53.598269,71.15048,68.020515,75.505167,78.19


In [48]:
px.imshow(temp_df)

### <b>9. 3D Scatter Plot

🎯**Question:** Plot a 3D scatter plot of all countries' data - ***[lifeExp, pop, gdpPercap]*** for 2007.  
✨**Answer:** 👇

In [52]:
px.scatter_3d(gap2007, x='lifeExp', y ='pop', z= 'gdpPercap',
              color='continent', log_y= True, hover_name='country' )

>The above graph does not look good, so let's add axis grids.

In [50]:
fig = px.scatter_3d(
    gap2007,
    x='lifeExp',
    y='pop',
    z='gdpPercap',
    color='continent',
    log_y=True,
    hover_name='country'
)

# Customize grid and background
fig.update_layout(
    scene=dict(
        xaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white"),
        yaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white"),
        zaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white")
    )
)

fig.show()


>Now let's use a 3D scatter plot on the iris dataset.

In [55]:
px.scatter_3d(iris, 'sepal_length', 'sepal_width', 'petal_length', color = 'species')

In [56]:
fig = px.scatter_3d(iris, 'sepal_length', 'sepal_width', 'petal_length', color = 'species')

# Customize grid and background
fig.update_layout(
    scene=dict(
        xaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white"),
        yaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white"),
        zaxis=dict(showgrid=True, gridcolor='lightgray', backgroundcolor="white")
    )
)

fig.show()

#### <b> Scatter_matrix -> dimensions

In [60]:
px.scatter_matrix(iris, dimensions = ['sepal_length','sepal_width','petal_length','petal_width'], color = 'species')