### Content of training<br>
1. Plotly <br>
    a. Installation<br>
    b. Supported data types<br>
    c. Multiple plots<br>
        i. Single plot
        ii. Subplots
    d. Plotly Express<br>
2. Geo Plots<br>
    a. Built-in plots <br>
    b. GeoJSON plots
        - Covid Tracker

# Plotly<br>
The plotly Python library (plotly.py) is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, financial, geographic, scientific, and 3-dimensional use-cases.

Built on top of the Plotly JavaScript library (plotly.js), plotly.py enables Python users to create beautiful interactive web-based visualizations that can be displayed in Jupyter notebooks, saved to standalone HTML files, or served as part of pure Python-built web applications using Dash.

#### What additional functionality do we have in Plotly over matplotlib?<br>
The main advantage is that only a few lines of code are needed to create aesthetically pleasing, interactive plots. A few others:
1. Makes it easy to modify and export your plot
2. Offers richer visualizations, which are well suited to conveying the important insights hidden within your dataset.<br><br><br>
###### Installation (not for DSEN) and import

In [None]:
# Installation
!pip install plotly
# Prerequisites for notebook use (may already be installed)
!pip install notebook
!pip install ipywidgets
# Restart the kernel after the installation completes

In [1]:
# import the libraries
import plotly as py

In [2]:
# Checking the version of your plotly

print(py.__version__)

4.7.0


In [3]:
#To work with plotly offline
from plotly.offline import download_plotlyjs, init_notebook_mode, plot,iplot

In [4]:
init_notebook_mode(connected=True)

In the context of the plotly.js library, a figure is specified by a declarative JSON data structure.

Python dictionaries can be automatically serialized into the JSON data structure that the plotly.js graphing library understands.
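As a minimal sketch of that serialization, the standard-library `json` module can turn a figure dictionary into JSON text and back. The dictionary values here are illustrative, and plotly.py uses its own encoder internally, so this only approximates what the library does.

```python
import json

# A figure described as a plain Python dictionary (illustrative values)
fig_dict = {
    "data": [{"type": "bar", "x": [1, 2, 3], "y": [1, 3, 2]}],
    "layout": {"title": "Serialized figure"},
}

# Serialize the dictionary to JSON text, the form a plotly.js front end consumes
fig_json = json.dumps(fig_dict)
print(fig_json)

# Parsing the JSON text gives back an equal dictionary
restored = json.loads(fig_json)
```

Because the whole figure is plain data, it can be stored, compared, or sent over the network like any other JSON document.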

##### Components of this dictionary
###### The "data" Key
The "data" key stores a list that describes the trace or traces that make up a figure.
Each trace in that list is itself defined by a dictionary. The type of the trace ("bar", "scatter", "contour", etc.) is specified with a "type" key, and the remaining keys in the trace dictionary (x, y, etc.) define the properties specific to that trace type.

###### The "layout" Key
The "layout" key stores a dictionary that specifies properties related to customizing how the figure looks, such as its title, typography, margins, axes, annotations, shapes, legend and more. In contrast to trace configuration options, which apply only to individual traces, layout configuration options apply to the figure as a whole.

In [10]:
fig = {
    "data": [{"type": "bar",
              "x": [1, 2, 3],
              "y": [1, 3, 2]},
             {"type": "scatter",
              "x": [1, 2, 3],
              "y": [2, 3, 3]}],
    "layout":{"title" : "A Figure Specified By Python Dictionary",
              "height": 500,
              "width" : 800}
}

plot(fig)
#iplot(fig,show_link = False)

'temp-plot.html'

As an <i>alternative to working with Python dictionaries</i>, the plotly.py graphing library provides a hierarchy of classes called <b>"graph objects"</b> that may be used to construct figures. Graph objects have several benefits compared to plain Python dictionaries.

1. Graph objects provide precise data validation. If you provide an invalid property name or an invalid property value as the key to a graph object, an exception will be raised with a helpful error message describing the problem. This is not the case if you use plain Python dictionaries and lists to build your figures.

2. Graph objects contain descriptions of each valid property as Python docstrings. You can use these docstrings in the development environment of your choice to learn about the available properties as an alternative to consulting the online Full Reference.

3. Properties of graph objects can be accessed using either dictionary-style key lookup (e.g. fig["layout"]) or class-style property access (e.g. fig.layout).

4. Graph objects support higher-level convenience functions for making updates to already constructed figures, as described below.



In [7]:
import plotly.graph_objects as go
fig = go.Figure(
    data=[go.Bar(x=[1, 2, 3], y=[1, 3, 2])],
    layout=go.Layout(title="A Figure Specified By A Graph Object",width=800,height=500)
    )
iplot(fig)

Python dictionaries (dict) and graph objects (GO) are interconvertible.

In [12]:
#Convert dict to GO
dict_of_fig = {
    "data": [{"type": "bar",
              "x": [1, 2, 3],
              "y": [1, 3, 2]}],
    "layout":{"title" : "A Figure Specified By Python Dictionary",
              "height": 500,
              "width" : 800}
}


fig = go.Figure(dict_of_fig)
iplot(fig)

In [13]:
#Convert GO to dict
fig_new=fig.to_dict()
#print("Dictionary Representation of the Graph Object:\n" + str(fig_new))
print("Dictionary Representation of data:\n" + str(fig_new["data"]))

Dictionary Representation of data:
[{'x': [1, 2, 3], 'y': [1, 3, 2], 'type': 'bar'}]


## Types of Curves that can be plotted
 #### A LOT!! Explore: 
 https://plotly.com/python/ <br><br>
To name a few standard plots:
1. Bubble Plot 
2. Line Plot
3. Bar Plot
4. Pie Chart
5. Histogram 
6. Box Plots<br>


### Plotting multiple charts

1. Single plot with legend
2. Subplots
<br><br>
#### Single Plot with legend

In [26]:
x=[1,2,3,4,5,6,7,8,9]
y1=[1,6,2,0,3,-8,-4,8,-1]
y2=[1,7,9,9,12,4,0,8,7]

#Method1: Declaring multiple Traces in data
data = [go.Scatter(x=x,
                   y=y1,
                   name='Growth'),
        go.Bar(x=x,
               y=y2,
               name='N')]
fig = go.Figure(data = data,
                layout = go.Layout(title='Comparison Plot',
                                   width = 500,
                                   height = 500))


iplot(fig)

In [28]:
#Method 2: Adding Traces by command (add_trace or add_{type})
data=[go.Scatter(x=x,
                 y=y1,
                 name='y1')]
fig = go.Figure(data = data,
                layout = go.Layout(title='Comparison Plot',
                                   width = 800,
                                   height = 300))

#iplot(fig)

#USING add_trace 
#fig.add_trace(go.Bar(y=y2,
 #                    name='y2'))

#USING add_{type} 
#fig.add_bar(y=y2,
 #           name='y2')
fig.add_scatter(x=x,
                y=y2,
                name='y2')


iplot(fig)

#### Making Subplots

In [21]:
from plotly.subplots import make_subplots

fig = make_subplots(rows=2, cols=2)

#Scatter Plot
x1=[1,2,3]
y1=[-2,0,2]
fig.add_trace(go.Scatter(x=x1,
                         y=y1,
                         mode='markers'),
              row=1,col=1)

#line plot
x2=[1,2,3]
y2=[4,2,1]
fig.add_trace(go.Scatter(x=x2,
                         y=y2, 
                         mode="lines"), 
              row=1, col=2)

#Bar plot
y3=[2, 1, 3]
fig.add_trace(go.Bar(y=y3), 
              row=2, col=1)

#Bubble Plot
x4=[1,2,3,4]
y4=[10,11,12,13]
s=[40,60,80,100]
fig.add_trace(go.Scatter(x=x4, 
                         y=y4,
                         mode='markers', 
                         marker_size=s),
              row=2,col=2)

plot(fig)

'temp-plot.html'

### Updating a trace or layout

In [22]:
#TRACE
#fig.update_traces(marker_color='MediumPurple')
fig.update_traces(marker_color='MediumPurple',
                  selector={"mode":'markers'})

plot(fig)

'temp-plot.html'

In [23]:
#LAYOUT

fig.update_layout(title='Updated Title')

plot(fig)

'temp-plot.html'

## Integrating with Pandas
<br>

### Plotly Express 
Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on "tidy" data and produces easy-to-style figures.

In [48]:
# importing pandas to work with "tidy" data
import pandas as pd

#importing Plotly Express
import plotly.express as px

#loading the "tips" dataset from the Plotly Express built-in datasets
df = px.data.tips()

In [30]:
df.head()

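| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.5 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |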


In [31]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
total_bill    244 non-null float64
tip           244 non-null float64
sex           244 non-null object
smoker        244 non-null object
day           244 non-null object
time          244 non-null object
size          244 non-null int64
dtypes: float64(2), int64(1), object(4)
memory usage: 13.4+ KB


In [32]:
fig=px.scatter(data_frame=df,
               x='total_bill',
               y='tip',
               height=600,
               width=600)
iplot(fig)

Smoker vs Non Smoker

In [33]:
#color by smoker status and add an OLS trendline (trendline='ols' requires statsmodels)
fig=px.scatter(df,x='total_bill',
               y='tip',
               color='smoker',
               trendline='ols')
plot(fig)

'temp-plot.html'

#### Distribution
Smoker vs Non Smoker

In [29]:
#marginal_x / marginal_y add a distribution plot along each axis
fig=px.scatter(df,x='total_bill',
               y='tip',
               color='smoker',
               marginal_x='rug',
               marginal_y='rug',
               trendline='ols')
plot(fig)

'temp-plot.html'

Genderwise

In [34]:
fig=px.scatter(df,x='total_bill',
               y='tip',
               color='sex',
               marginal_x='violin',
               marginal_y='histogram')
plot(fig)

'temp-plot.html'

In [49]:
#Facet_col/Facet_row and color_continuous_scale
fig = px.scatter(df, x="total_bill", 
                 y="tip", 
                 color="size", 
                 facet_col="sex",
                 color_continuous_scale=px.colors.sequential.Viridis)
plot(fig)

'temp-plot.html'

Time

In [86]:
fig=px.scatter(df,x='total_bill',y='tip',color='time',marginal_x='violin',marginal_y='violin')
plot(fig)

'temp-plot.html'

Daywise

In [35]:
#histfunc and Barmode
fig=px.histogram(df,
                 x='day',y='tip',
                 color='sex',
                 histfunc='avg',
                 barmode='group')

plot(fig)

'temp-plot.html'

In [36]:
#ordering categorical variable
fig=px.histogram(df,x='day',y='tip',color='sex',
                 histfunc='max',barmode='group',
                 category_orders={'day':["Thur", "Fri", "Sat", "Sun"]})
plot(fig)

'temp-plot.html'

In [37]:
fig=px.scatter(df,x='total_bill',y='tip',animation_frame='day')
plot(fig)

'temp-plot.html'

#### Scatter Matrix

In [38]:
fig=px.scatter_matrix(df,dimensions=['tip','size','total_bill'],
                      color='time',
                      opacity=0.2)
plot(fig)

'temp-plot.html'

#### 3D Plotting

In [39]:
fig=px.scatter_3d(df,x='size',y='tip',z='total_bill',color='time')
plot(fig)

'temp-plot.html'

## Geographical Plotting


## Choropleth Map plot

1. Using Plotly's built-in location modes<br>
    a. World Map (ISO codes/alpha)<br>
    b. USA states Map<br>
    <br>
2. Supplying GeoJSON with lat-long coordinates of polygon vertices<br>


### 1. Using Built-in maps

##### a. World Map

Plotly's built-in world maps identify each country by its ISO 3166-1 code (the three-letter alpha-3 code by default)<br>
https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes
### ISO Codes
<b>iso_alpha</b> holds the three-letter Alpha-3 code (e.g. AFG) <br>
<b>iso_num</b> holds the numeric code (e.g. 4)


In [40]:
#Import "Gapminder" built-in dataset
df=px.data.gapminder()
df.head()

| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.445314 | AFG | 4 |
| 1 | Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.85303 | AFG | 4 |
| 2 | Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.10071 | AFG | 4 |
| 3 | Afghanistan | Asia | 1967 | 34.02 | 11537966 | 836.197138 | AFG | 4 |
| 4 | Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.981106 | AFG | 4 |


In [41]:
#Taking data for 2007
df_2007=df[df['year']==2007]
df_2007.head()

| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 11 | Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.580338 | AFG | 4 |
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 35 | Algeria | Africa | 2007 | 72.301 | 33333216 | 6223.367465 | DZA | 12 |
| 47 | Angola | Africa | 2007 | 42.731 | 12420476 | 4797.231267 | AGO | 24 |
| 59 | Argentina | Americas | 2007 | 75.32 | 40301927 | 12779.37964 | ARG | 32 |


In [42]:
fig = px.choropleth(df_2007, locations="iso_alpha",
                    color="lifeExp",
                    hover_name="country")
plot(fig)

'temp-plot.html'

#### Time Lapse using animation features

In [43]:
fig = px.choropleth(df, locations='iso_alpha',
                    color='lifeExp',
                    hover_name='country',
                    animation_frame='year',
                    range_color=(20,80))
plot(fig)

'temp-plot.html'

In [44]:
#Bubble plot

fig = px.scatter_geo(df,
                     locations='iso_alpha', 
                     color='continent', 
                     hover_name='country', 
                     size='pop',
                     animation_frame='year', 
                     projection='natural earth')
plot(fig)

'temp-plot.html'

##### NOTE: The animation feature can also be used in scatter plots, bar charts, etc.

In [45]:
fig = px.bar(df, x="continent", y="pop", color="continent",
  animation_frame="year", animation_group="country", range_y=[0,4000000000])
plot(fig)

'temp-plot.html'

##### b. USA States Map

In [46]:
#US states are identified by their two-letter state codes
fig = px.choropleth(locations=["CA", "TX", "NY"], 
                    locationmode="USA-states", 
                    color=[1,2,3], 
                    scope="usa")
plot(fig)

'temp-plot.html'

### 2. Using GeoJSON
GeoJSON is a JSON-based format for encoding geographic features (points, lines, polygons) together with their non-spatial properties.
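To make the structure concrete, here is a minimal, made-up FeatureCollection containing a single square polygon. The `id` and the `NAME_1` property are illustrative and mirror the fields used later in this section.

```python
import json

# Hypothetical GeoJSON document with one feature
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": 1,
      "properties": {"NAME_1": "Demo State"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [[77.0, 28.0], [78.0, 28.0], [78.0, 29.0], [77.0, 29.0], [77.0, 28.0]]
        ]
      }
    }
  ]
}
"""

collection = json.loads(geojson_text)
feature = collection["features"][0]

# A polygon is a list of rings; each position is [longitude, latitude]
ring = feature["geometry"]["coordinates"][0]

# The first and last positions of a ring are identical, which closes it
print(feature["properties"]["NAME_1"], ring[0] == ring[-1])
```

Real boundary files follow the same layout, only with many more features and far longer coordinate lists.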

In [22]:
#Loading Indian States file using JSON library
import json
with open('ind_states.geojson') as f:
    states=json.load(f)

In [49]:
states['features'][34]

{'type': 'Feature',
 'properties': {'ID_0': 105,
  'ISO': 'IND',
  'NAME_0': 'India',
  'ID_1': 35,
  'NAME_1': 'West Bengal',
  'NL_NAME_1': None,
  'VARNAME_1': 'Bangla|Bengala Occidentale|Bengala Ocidental|Bengale occidental',
  'TYPE_1': 'State',
  'ENGTYPE_1': 'State'},
 'geometry': {'type': 'MultiPolygon',
  'coordinates': [[[[88.018608093262, 21.57277870178217],
     [88.01889038085955, 21.572500228882006],
     [88.01944732666033, 21.572500228882006],
     [88.01972198486334, 21.57220840454113],
     [88.02055358886713, 21.572221755981616],
     [88.0213851928711, 21.57138824462885],
     [88.02139282226568, 21.57110977172846],
     [88.02111053466825, 21.57083320617687],
     [88.02111053466825, 21.57055473327648],
     [88.0213851928711, 21.570278167724666],
     [88.02139282226568, 21.569440841674748],
     [88.02166748046903, 21.569166183471737],
     [88.02166748046903, 21.56888961792015],
     [88.02139282226568, 21.56861114501976],
     [88.0213851928711, 21.568332672119

In [28]:
import random
#Dummy data: random active-case counts for state codes 1 to 35 (not real Covid-19 figures)
covid=pd.DataFrame()
covid['state_cd']=range(1,36)
covid['active_cases']=[random.randint(1,10000) for i in range(1,36)]
covid.head()

| | state_cd | active_cases |
|---|---|---|
| 0 | 1 | 4058 |
| 1 | 2 | 9952 |
| 2 | 3 | 5334 |
| 3 | 4 | 6866 |
| 4 | 5 | 4458 |


In [24]:
#Give each GeoJSON feature an 'id' (1 to 35) so it can be matched against 'state_cd'
for i in range(0,35):
    states['features'][i]['id']=i+1

In [29]:
fig = px.choropleth(covid,color='active_cases',
                    geojson=states, locations='state_cd',
                    color_continuous_scale="hot",
                    range_color=(1,10000),
                    scope='asia',
                    title='India Covid-19 Tracker'
                    )
plot(fig)

'temp-plot.html'

In [30]:
fig.write_html("India_covid.html")