<br>
<h1 style = "font-size:40px; font-family:Garamond ; font-weight : normal; background-color: #f6f5f5 ; color : #fe346e; text-align: center; border-radius: 100px 100px;">Understanding Plotly with Folium using World Happiness Dataset</h1>
<br>
    
<center><img src="https://miro.medium.com/max/4800/1*jSzSsGQdJhcG7_PIH8eLag.png"></center>


### <h3 style="color:#fe346e">Abstract</h3>

When data visualization first took off in the Python world, data practitioners plotted graphs while exploring their data with Pandas. The library that made this possible was Matplotlib. It became popular because it is:
* easy to use
* fast
* well integrated with Pandas

However, people who used it long enough came to find its output dull.

Even descriptions of Matplotlib’s functionality make the point:
**Matplotlib is mainly deployed for basic plotting. — Matplotlib**

Hence, data practitioners who wanted a more interesting library, with more plot types, more options, and an easier syntax, flocked to Seaborn. At the time, Seaborn was the go-to library for many people.

### <h3 style="color:#fe346e">Dynamic Visuals to the Rescue</h3>
However, there was still a problem. All of these libraries offer static plots. These plots can only tell you what they show on screen. You aren’t able to dive deeper into the plots, hover over points to find out information or add filters.

Plots that can do these things are called interactive visualizations.
Interactive visualizations are popular because they let you layer far more information onto the plots you’re presenting, open up new ways to explore the data, and make your work look 10x cooler. That is hard to put into words, so let’s look at it in practice.

### <h3 style="color:#fe346e">Plotly</h3>
Plotly is a Python library for interactive data visualization. It lets you build richer, more interactive graphs than either Matplotlib or Seaborn.

What kinds of graphs can Plotly plot?
1. The charts you know from Matplotlib and Seaborn
2. Statistical charts, including but not limited to parallel categories and probability tree plots
3. Scientific charts you may never have thought of, ranging from network graphs to radar charts
4. Financial charts that are useful for time-series analysis, such as candlesticks, funnels, and bullet charts
5. Geographical maps and three-dimensional plots that you can interact with

### <h3 style="color:#fe346e">Why is Plotly so Popular?</h3>

1. Interactive plots
2. Prettier than Matplotlib/Seaborn (up for debate?)
3. More detailed visualizations, which help in exploring, understanding, and communicating your data
4. Extensive customization for your plots, including sliders and filters
5. Cleaner, easier-to-read plotting code
6. Backed by Plotly, a company that builds interactive web-based visualizations and web applications

### <h3 style="color:#fe346e">How do I Install Plotly?</h3>
`pip install plotly`<br>
`pip install cufflinks`
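
Later cells in this notebook also import `geopy` and `folium`, so it can help to check what is installed before running everything. The helper below is a small sketch using only the standard library; the function name `installed_version` is my own and not part of any package.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Packages imported somewhere in this notebook
for name in ["plotly", "cufflinks", "geopy", "folium"]:
    print(name, installed_version(name))
```

Any package reported as `None` can be installed with `pip install` in the same way as above.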

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Import Libraries&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
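# Data handling
import numpy as np
import pandas as pd
# Static plotting (Matplotlib and Seaborn)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# Interactive plotting (Plotly)
import plotly
import plotly.express as ex
import plotly.graph_objects as go  # graph_objects is the current name for graph_objs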

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Loading Data 💎&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
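# 2021 report: one row per country
data=pd.read_csv('/kaggle/input/world-happiness-report-2021/world-happiness-report-2021.csv')
# Historical report: one row per country per year (despite its name, data21 holds the multi-year file)
data21=pd.read_csv('/kaggle/input/world-happiness-report-2021/world-happiness-report.csv')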

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Check the shape&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
display(data.shape)
display(data21.shape)

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">View data&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
data.columns

In [None]:
display(data.head())
display(data21.head())

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Sort the data&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
data.sort_values(by='Ladder score',ascending=False,inplace=True)

In [None]:
data['Regional indicator'].value_counts()

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Region wise GroupBy- Ladder Score Mean&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
# Count of countries and mean ladder score per region
RI_grp=(data.groupby(['Regional indicator'])
            .agg({'Country name':'count','Ladder score':'mean'})
            .reset_index()
            .rename({'Country name':'Country count','Ladder score':'Avg. Ladder score'},axis=1))
RI_grp['Avg. Ladder score']=RI_grp['Avg. Ladder score'].round(3)
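
The group-then-rename step above can also be written with pandas named aggregation, which sets the output column names inside `agg` itself. This is a sketch on a made-up three-row frame, not the real dataset; the output column names `country_count` and `avg_ladder_score` are my own choices.

```python
import pandas as pd

# Toy rows for illustration only (values are made up)
toy = pd.DataFrame({
    "Regional indicator": ["East", "East", "West"],
    "Country name": ["A", "B", "C"],
    "Ladder score": [5.0, 6.0, 7.0],
})

# Named aggregation: output_column=(input_column, function)
summary = (
    toy.groupby("Regional indicator")
       .agg(country_count=("Country name", "count"),
            avg_ladder_score=("Ladder score", "mean"))
       .round(3)
       .reset_index()
)
print(summary)
```

The trade-off is that named aggregation needs keyword-style column names, so labels with spaces such as `Avg. Ladder score` still call for a `rename` afterwards.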

<html><h3 style='font-family:Garamond;background:lightgrey;  color:black; font-size:30px; padding:15px; border:3px solid #fe346e;'><center><b>Distribution of Region wise Count and Avg. ladder score</b></center></h3></html>

In [None]:
## Plot region-wise country count (bars, left axis) and average ladder score (dashed line, right axis)

from plotly.subplots import make_subplots
fig=go.Figure()
fig.add_trace(go.Bar(
    x=RI_grp['Regional indicator'],
    y=RI_grp['Country count'],
    name='Country Count',
    marker_color='lightblue',
    text=RI_grp['Country count'],
    textposition='inside',
    yaxis='y1'
))
fig.add_trace(go.Scatter(
    x=RI_grp['Regional indicator'],
    y=RI_grp['Avg. Ladder score'],
    name='Average Ladder Score',
    mode='markers+text+lines',
    marker_color='magenta',
    marker_size=10,
    text=RI_grp['Avg. Ladder score'],
    textposition='top center',
    line=dict(color='#5D69B1',dash='dash'),
    yaxis='y2'

))
fig.update_layout(
    title="Region Wise Counts and Avg ladder Score",
    xaxis_title="Region",
    yaxis_title="Count of Country",
    template='ggplot2',
    font=dict(
        size=12,
        color="Black",
        family="Garamond"
        
    ),
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False),
    plot_bgcolor='white',
    yaxis2=dict(showgrid=True,overlaying='y',side='right',title='Avg. Ladder Score'),
    legend=dict(yanchor="top",
    y=1.3,
    xanchor="left",
    x=0.78)
)
fig.show()

<html><h3 style='font-family:Garamond;background:lightgrey;  color:black; font-size:30px; padding:15px; border:3px solid #fe346e;'><center><b>Distribution of Region wise Avg. Social support & Avg. life expectancy</b></center></h3></html>

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Region wise GroupBy- Social Support Mean & Healthy life expectancy Mean&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
RI_grp1=(data.groupby('Regional indicator')
             .agg({'Social support':'mean','Healthy life expectancy':'mean'})
             .reset_index()
             .rename({'Social support':'Avg. Social support','Healthy life expectancy':'Avg. Healthy life expectancy'},axis=1))

In [None]:
RI_grp1['Avg. Social support']=round(RI_grp1['Avg. Social support'],3)
RI_grp1['Avg. Healthy life expectancy']=round(RI_grp1['Avg. Healthy life expectancy'],3)

In [None]:
from plotly.subplots import make_subplots
fig=go.Figure()
fig.add_trace(go.Bar(
    x=RI_grp1['Regional indicator'],
    y=RI_grp1['Avg. Healthy life expectancy'],
    name='Avg. Healthy life expectancy',
    marker_color='mediumspringgreen',
    text=RI_grp1['Avg. Healthy life expectancy'],
    yaxis='y1'
))
fig.add_trace(go.Scatter(
    x=RI_grp1['Regional indicator'],
    y=RI_grp1['Avg. Social support'],
    name='Avg. Social support',
    mode='markers+text',
    marker_color='black',
    marker_size=10,
    text=RI_grp1['Avg. Social support'],
    textposition='bottom center',
    textfont=dict(color='black'),
    yaxis='y2'

))
fig.update_layout(
    title="Region Wise Avg. Social support & Healthy life expectancy",
    xaxis_title="Region",
    yaxis_title="Avg. Healthy life expectancy",
    template='ggplot2',
    font=dict(
        size=10,
        color="black",
        family="Garamond"
        
    ),
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False),
    plot_bgcolor='white',
    yaxis2=dict(showgrid=True,overlaying='y',side='right',title='Avg. Social support'),
    legend=dict(yanchor="top",
    y=1.3,
    xanchor="left",
    x=0.78)
)
fig.show()

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Histogram-Perceptions of corruption&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
# defining data
trace = go.Histogram(x=data['Perceptions of corruption'],nbinsx=40)
df = [trace]
# defining layout
layout = go.Layout(title="Perception of Corruption Distribution")
# defining figure and plotting
fig = go.Figure(data = df,layout = layout)
fig.update_layout(xaxis_title="Perception of Corruption",
                  template='ggplot2',
                  xaxis=dict(showgrid=False),
                  yaxis=dict(showgrid=False),
                  plot_bgcolor='white',
                  font=dict(family="Garamond"))
fig.show()

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Scatter plot-Perceptions of corruption & Dystopia + residual&nbsp;&nbsp;&nbsp;&nbsp;</h1>

In [None]:
#defining data
trace = go.Scatter(x = data['Perceptions of corruption'],y=data['Dystopia + residual'],text = data['Country name'],mode='markers',marker={'color':'red'})
df=[trace]
#defining layout
layout = go.Layout(title='Perceptions of corruption & Dystopia + residual : Scatter Plot',xaxis=dict(title='Perceptions of corruption'),yaxis=dict(title='Dystopia + residual'),hovermode='closest')
#defining figure and plotting
figure = go.Figure(data=df,layout=layout)
figure.update_layout(template='ggplot2',
                  xaxis=dict(showgrid=True),
                  yaxis=dict(showgrid=True),
                  plot_bgcolor='lightgrey',
                  font=dict(family="Garamond"))
figure.show()

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Map Latitude and Longitude&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
# Function to get latitude and longitude from a country name
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="<masked>")  # pass a string that identifies you, e.g. a valid mail ID
def geolocate(country):
    try:
        # Geolocate the center of the country
        loc = geolocator.geocode(country)
    except Exception:
        # Network or service error: return a missing value
        return np.nan
    # geocode() returns None when nothing matches the name
    if loc is None:
        return np.nan
    # Otherwise return latitude and longitude
    return (loc.latitude, loc.longitude)
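
As far as I know, the public Nominatim service asks clients to stay at about one request per second, and the `apply` call in the next cell issues one lookup per country. The wrapper below is a hedged sketch of one way to space out calls and cache repeats. `make_polite_lookup` and `fake_geolocate` are hypothetical names of my own, and the stand-in function returns a fixed tuple so the example runs offline.

```python
import time

def make_polite_lookup(lookup, min_delay_seconds=1.0):
    """Wrap `lookup` so repeated names hit a cache and new names are spaced out."""
    cache = {}
    last_call = [None]  # time of the previous real lookup

    def wrapped(name):
        if name in cache:
            return cache[name]
        # Wait until min_delay_seconds have passed since the last real call
        if last_call[0] is not None:
            wait = min_delay_seconds - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
        result = lookup(name)
        last_call[0] = time.monotonic()
        cache[name] = result
        return result

    return wrapped

# Stand-in for geolocate(); it returns a fixed tuple so the sketch runs offline
calls = []
def fake_geolocate(name):
    calls.append(name)
    return (0.0, 0.0)

polite = make_polite_lookup(fake_geolocate, min_delay_seconds=0.0)
polite("Exampleland")
polite("Exampleland")  # served from the cache, no second lookup
```

In the notebook, the same wrapper could be applied to `geolocate` before calling `apply`. geopy also ships its own `RateLimiter` helper in `geopy.extra.rate_limiter`, which may be the more natural choice when no caching is needed.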

In [None]:
data['lat_long']=data['Country name'].apply(geolocate)

In [None]:
data.head()

In [None]:
### Find the countries for which no lat/long was returned
# data['lat_long'].isna().sum()
# data[data['lat_long'].isna()] ## Hong Kong S.A.R. of China

## Fill in the missing lat/long by hand (as a string; the column is cast to strings before splitting anyway)
data['lat_long']=np.where(data['Country name']=='Hong Kong S.A.R. of China','(22.3193,114.1694)',data['lat_long'])

In [None]:
data[data['lat_long'].isna()]### 0 records

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Split the Latitude and Longitude&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
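# Turn each "(lat, long)" value into plain text without parentheses
data['lat_long']=data['lat_long'].astype(str)
data['lat_long']=data['lat_long'].str.replace('(','',regex=False)
data['lat_long']=data['lat_long'].str.replace(')','',regex=False)
# Split on the comma and cast to float so the map receives numbers, not strings
lat_long=data['lat_long'].str.split(',',expand=True).rename({0:'lat',1:'long'},axis=1).astype(float)
data=pd.concat([data,lat_long],axis=1)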
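
The string round trip above is needed because the column mixes tuples with a hand-written string. If every entry were already a `(lat, long)` tuple, the split could be done directly, as in this sketch. It assumes no missing values; the Hong Kong pair is the one used earlier in this notebook and "Exampleland" is a placeholder row.

```python
import pandas as pd

# Toy frame: "Exampleland" is a placeholder, not a real entry in the dataset
toy = pd.DataFrame({
    "Country name": ["Hong Kong S.A.R. of China", "Exampleland"],
    "lat_long": [(22.3193, 114.1694), (0.0, 0.0)],
})

# Build the two columns straight from the tuples; index=toy.index keeps rows aligned
coords = pd.DataFrame(toy["lat_long"].tolist(), columns=["lat", "long"], index=toy.index)
toy = pd.concat([toy, coords], axis=1)
print(toy[["Country name", "lat", "long"]])
```

This keeps the coordinates numeric throughout, so no later cast is required.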

In [None]:
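# A notebook cell only echoes its last expression, so display both counts explicitly
display(data['lat'].isna().sum())
display(data['long'].isna().sum())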

<html><h3 style='font-family:Garamond;background:lightgrey;  color:black; font-size:30px; padding:15px; border:3px solid #fe346e;'><center><b>Plot on World Map -> Ladder Score and Logged GDP per capita</b></center></h3></html>

In [None]:
# Create a world map showing each country's ladder score and logged GDP per capita
import folium
from folium.plugins import MarkerCluster
# Empty base map
world_map= folium.Map(tiles="cartodbpositron")
marker_cluster = MarkerCluster().add_to(world_map)
# For each country, add a circle marker with a popup describing it
for i in range(len(data)):
    row = data.iloc[i]
    lat = row['lat']
    long = row['long']
    radius=5
    popup_text = """Country : {}<br>
                Logged GDP per capita : {}<br>
                Ladder score : {}<br>"""
    popup_text = popup_text.format(row['Country name'],
                                   row['Logged GDP per capita'],row['Ladder score'])
    folium.CircleMarker(location = [lat, long], radius=radius, popup= popup_text, fill =True).add_to(marker_cluster)
# Show the map
world_map

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Trend Lines&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
# data21.head()
data21[data21['Country name']=='Afghanistan']

In [None]:
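### Top 10 countries by Ladder score (relies on data already being sorted in descending order)
top10=data['Country name'][:10].tolist()
# .copy() gives an independent frame, so later in-place changes do not touch data21
data21_top10=data21[data21['Country name'].isin(top10)].copy()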
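
Slicing the first ten rows only works because the frame was sorted earlier. A sketch of an order-independent alternative is `nlargest`, shown here on made-up rows that are deliberately unsorted.

```python
import pandas as pd

# Toy rows for illustration only (values are made up), deliberately unsorted
toy = pd.DataFrame({
    "Country name": ["A", "B", "C", "D"],
    "Ladder score": [5.0, 7.5, 6.0, 7.0],
})

# nlargest picks the top rows by score regardless of the current row order
top2 = toy.nlargest(2, "Ladder score")["Country name"].tolist()
print(top2)  # ['B', 'D']
```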

In [None]:
data21_top10['Country name'].unique()

In [None]:
data21_top10.reset_index(drop=True,inplace=True)

<h1 style="font-family: Verdana; font-size: 24px; font-style: normal; font-weight: bold; text-decoration: none; text-transform: none; letter-spacing: 3px; background-color: #ffffff; color: navy;">Create a color dictionary for country&nbsp;&nbsp;&nbsp;&nbsp;</h1> 

In [None]:
color_list=['#FF0000','#00FF00','#0000FF','#FFFF00','#00FFFF','#FF00FF','#808080','#800000','#808000','#008000']
# Map each top-10 country to one colour, in ranking order
color=dict(zip(top10,color_list))

In [None]:
for k,v in color.items():
    print(v)

In [None]:
## Plot the Life Ladder trend over the years for the top 10 countries
from plotly.subplots import make_subplots
fig=go.Figure()
for k,v in color.items():
    fig.add_trace(go.Scatter(
    x=data21_top10[data21_top10['Country name']==k]['year'],
    y=data21_top10[data21_top10['Country name']==k]['Life Ladder'],
    name=k,
    mode='markers+text+lines',
    marker_color='black',
    marker_size=3,
    line=dict(color=color[k]),
    yaxis='y1'))
    
fig.update_layout(
    title="Top 10 Country wise Life Ladder trend",
    xaxis_title="Year",
    yaxis_title="Life Ladder",
    template='ggplot2',
    font=dict(
        size=10,
        color="Black",
        family="Garamond"
    ),
    xaxis=dict(showgrid=True),
    yaxis=dict(showgrid=True)
)
fig.show()