# Choropleth Maps

This notebook plots the **Child Mortality Rate** in India for the year 2011. The dataset is taken from data.gov.in.

Python libraries used:

- Pandas
- NumPy
- Plotly

Packages are imported where they are first needed, not all at the beginning.


Importing Pandas

In [None]:
import pandas as pd

Load the dataset as it is. This gives us an idea of what needs to be cleaned.

In [None]:
initdf = pd.read_csv('infant_mortality_rate.csv')

Looking at the top rows of the dataset.

In [None]:
initdf.head()

The `info()` method gives more information about the dataset, such as column names, dtypes and non-null counts.

In [None]:
initdf.info()

The plan is to plot the data for a single year, so not all of the columns are needed. The column name `' States/Union Territories'` also starts with a space, which is inconvenient. I would like to rename the headers in the final dataframe: `['Sl.No', 'States', '2005', '2006', '2007', '2008', '2009', '2010', '2011']` will be sufficient. Even though I will plot only one year, I am keeping the data for 2005-2011.

Creating the target header for dataset.

In [None]:
c_header = ['Sl.No', 'States', '2005', '2006', '2007', '2008', '2009', '2010', '2011']
#usecols =['Sl.No.', ' States/Union Territories','2005', '2006', '2007', '2008', '2009', '2010', '2011']

Let's create the required dataset by reading only the columns we need.

In [None]:
df = pd.read_csv(
    'infant_mortality_rate.csv',
    usecols=['Sl.No.', ' States/Union Territories',
             '2005', '2006', '2007', '2008', '2009', '2010', '2011'],
)

A first look at the imported dataset.

In [None]:
df.head()

In [None]:
df.info()

Now that we know the exact header names, we can change them to the desired set.

In [None]:
df.columns = c_header
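Assigning the whole header list relies on the columns arriving in the expected order. A minimal sketch of a name-based alternative follows, assuming the only header problem is stray whitespace. The in-memory CSV, the name `demo` and the numbers in it are made up for illustration and are not the real data.

```python
import io
import pandas as pd

# Hypothetical two-row sample that mimics the leading space in the header.
# The numbers are placeholders, not values from the real dataset.
sample_csv = io.StringIO(
    "Sl.No., States/Union Territories,2010,2011\n"
    "1,Andhra Pradesh,10,20\n"
    "2,Assam,30,40\n"
)

demo = pd.read_csv(sample_csv)

# Strip stray whitespace from every header, then rename by name, not position.
demo.columns = demo.columns.str.strip()
demo = demo.rename(columns={"Sl.No.": "Sl.No", "States/Union Territories": "States"})

print(list(demo.columns))
```

Renaming by name keeps working if the column order in the source file changes, at the cost of a slightly longer cell.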

In [None]:
df.head()

Let's have a look at the States column.

In [None]:
df['States']

The dataset needs some cleaning. ALL INDIA is not a state name, so let's delete that row from the dataset.

In [None]:
# Name-based alternative:
# index = df[df['States'] == 'ALL INDIA'].index
# df.drop(index, inplace=True)
df = df.drop(35)  # the 'ALL INDIA' row sits at index 35 in this file
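Dropping by a hard-coded index breaks if the file gains or loses a row. A small sketch of filtering by name instead, on a toy frame with made-up numbers (the names `demo` and `states_only` are illustrative):

```python
import pandas as pd

# Toy frame with placeholder numbers; only the row labels matter here.
demo = pd.DataFrame({"States": ["Assam", "Bihar", "ALL INDIA"], "2011": [1, 2, 3]})

# Keep every row whose name is not the national aggregate.
# strip() and upper() guard against spacing and capitalisation differences.
is_aggregate = demo["States"].str.strip().str.upper() == "ALL INDIA"
states_only = demo[~is_aggregate].reset_index(drop=True)

print(states_only)
```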

A few other state names need modification. Before that, it is better to load the geojson file from which we will create the map, so that we know how the state names are written there. To read the geojson file, we import the json library.

In [None]:
import json

The geojson file can be taken from anywhere. To get the exact geojson file which is used here, [check here](https://un-mapped.carto.com/tables/states_india/public/map)

In [None]:
# Use a context manager so the file handle is closed after loading.
with open('states_india.geojson', 'r') as f:
    ind_states = json.load(f)

Checking the keys. 

In [None]:
ind_states['features'][0].keys()

In [None]:
ind_states['features'][0]['properties']

Declaring a dictionary. It will be used to link the geojson file to our dataset by mapping each state name to its id.

In [None]:
mapping = {}

In [None]:
for feature in ind_states['features']:
    feature['id'] = feature['properties']['state_code']
    mapping[feature['properties']['st_nm']] = feature['id']

Let's check what is now stored in the mapping.

In [None]:
mapping
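To make the loop above easier to follow, here is the same pattern on a toy GeoJSON object. The property names `st_nm` and `state_code` mirror the file used here, but the feature names, codes and missing geometries are made up.

```python
# Toy GeoJSON with two features; geometries are omitted and codes are invented.
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"st_nm": "Alpha", "state_code": 1}, "geometry": None},
        {"type": "Feature", "properties": {"st_nm": "Beta", "state_code": 2}, "geometry": None},
    ],
}

name_to_id = {}
for feature in toy_geojson["features"]:
    # Plotly matches `locations` against the top-level feature "id" by default,
    # so the code is copied out of "properties" into "id".
    feature["id"] = feature["properties"]["state_code"]
    name_to_id[feature["properties"]["st_nm"]] = feature["id"]

print(name_to_id)  # {'Alpha': 1, 'Beta': 2}
```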

The reason for loading the geojson file before correcting the names in our dataset was to check whether anything needs to be changed on the geojson side. Here, **Arunachal Pradesh** is written as Aruna**n**chal Pradesh, and I prefer **Delhi** over **NCT of Delhi**. This is a matter of personal choice: if you are comfortable with the geojson names, you can change the dataframe instead.

Normally, renaming a key in a Python dictionary takes two steps: add the new key, then delete the old key. With **pop** we can do it in one step.

The syntax is:

`dictionary_name['new key'] = dictionary_name.pop('old key')`

In [None]:
mapping['Arunachal Pradesh'] = mapping.pop('Arunanchal Pradesh')
mapping['Delhi'] = mapping.pop('NCT of Delhi')
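The pop-based rename can be wrapped in a small helper that refuses to overwrite an existing key. This is only a sketch: `rename_key` is a hypothetical helper and the codes are made up.

```python
def rename_key(d, old, new):
    """Rename a key in place; refuse to silently overwrite another entry."""
    if new in d:
        raise KeyError(f"{new!r} already exists")
    d[new] = d.pop(old)  # pop itself raises KeyError if `old` is missing
    return d


codes = {"Arunanchal Pradesh": 12, "NCT of Delhi": 7}  # invented codes
rename_key(codes, "Arunanchal Pradesh", "Arunachal Pradesh")
rename_key(codes, "NCT of Delhi", "Delhi")

print(codes)
```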

Checking that the changes are OK.

In [None]:
mapping

Now, let's look at our dataframe to identify what changes need to be made.

In [None]:
df

Compared with the geojson file, the names `['D&N Haveli', 'A&N Islands', 'Chhatisgarh', 'Uttaranchal']` in our dataframe do not match. Let's correct those values.

In [None]:
# Replace each dataframe spelling with the spelling used in the geojson file.
df['States'] = df['States'].replace(
    ['D&N Haveli', 'A&N Islands', 'Chhatisgarh', 'Uttaranchal'],
    ['Dadara & Nagar Havelli', 'Andaman & Nicobar Island', 'Chhattisgarh', 'Uttarakhand'],
)

Let's look at the corrected values.

In [None]:
df

Looks good. Now, let's link the two and create an 'id' column in our dataframe.

In [None]:
df['id'] = df['States'].apply(lambda x: mapping[x])
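The lookup above raises a `KeyError` as soon as one state name is missing from the mapping. A quick pre-check can list the offending names first. The sketch below uses a toy frame and made-up codes; `name_to_id` and `demo` are illustrative names.

```python
import pandas as pd

name_to_id = {"Assam": 18, "Bihar": 10}  # invented codes
demo = pd.DataFrame({"States": ["Assam", "Bihar", "Chhatisgarh"]})

# Names present in the dataframe but absent from the mapping keys.
unmatched = sorted(set(demo["States"]) - set(name_to_id))
print(unmatched)

# Series.map leaves NaN for unmatched names instead of raising KeyError.
demo["id"] = demo["States"].map(name_to_id)
print(demo)
```

An empty `unmatched` list means every row will receive an id.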

Let's see how the dataframe looks now.

In [None]:
df

The dataframe looks OK. We are ready to create our choropleth. For that we need the Plotly library.

In [None]:
import plotly.express as px

Setting a renderer through `plotly.io` is optional. In Jupyter Notebook/Lab, figures are displayed inline by default. If you would rather open them in a separate browser tab, for example when running the code from an IDE, uncomment the lines below.

In [None]:
import plotly.io as pio

In [None]:
#import webbrowser

In [None]:
#pio.renderers.default = 'browser'

We will create multiple plots for comparison. The first plot simply places the data on the default world map.

In [None]:
fig = px.choropleth(df, locations='id', geojson=ind_states, color='2011')

In [None]:
fig.show()

This map looks too small: with the whole world in view, India occupies only a small part of the figure. Let's narrow the scope to Asia.

In [None]:
fig1 = px.choropleth(df, locations='id', geojson=ind_states, color='2011', scope='asia')

In [None]:
fig1.show()

Let's hide the surrounding areas that are not relevant to this map and fit the view to the plotted states.

In [None]:
fig1.update_geos(fitbounds="locations", visible=False)

In [None]:
#fig1.show()

Plotly also supports tile-based maps through **[Mapbox](https://plotly.com/python/maps/)**, which offers more detailed base maps. Let's use it to make our map more beautiful. Note: nothing extra needs to be imported, as `choropleth_mapbox` is part of `plotly.express`, and the `carto-positron` style used here does not require a Mapbox access token.

In [None]:
fig2 = px.choropleth_mapbox(
    df,
    locations='id',
    geojson=ind_states,
    color='2011',
    hover_name="States",
    hover_data=["2011"],
    mapbox_style="carto-positron",
    zoom=4,
    center={"lat": 24, "lon": 78},
    opacity=0.5,
)
fig2.show()

In [None]:
df

In [None]:
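# The '2011_log' column is not created in any cell shown above.
# Assumption: it is the base-10 logarithm of the 2011 values.
import numpy as np
df['2011_log'] = np.log10(df['2011'])

# Colour by the log-scaled values, but show the raw 2011 values on hover.
fig2 = px.choropleth(df,
                     locations='id',
                     geojson=ind_states,
                     color='2011_log',
                     scope='asia',
                     hover_name="States",
                     hover_data=["2011"]
                    )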

In [None]:
fig2.update_geos(fitbounds="locations", visible = False)
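When a map is coloured by a log-scaled column, the colour bar shows log values, which are hard to read. One way to label it with raw values is to compute tick positions on the log axis. The sketch below assumes a base-10 log and uses illustrative tick values, not numbers from the dataset.

```python
import numpy as np

raw_ticks = [10, 20, 50, 100]          # raw values to label (illustrative)
tick_positions = np.log10(raw_ticks)   # where they sit on a log10 colour axis
tick_labels = [str(v) for v in raw_ticks]

# With a Plotly Express figure `fig` coloured by the log column,
# these could be applied as:
# fig.update_layout(coloraxis_colorbar=dict(tickvals=tick_positions,
#                                           ticktext=tick_labels))
print(dict(zip(tick_labels, tick_positions.round(2))))
```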

In [None]:
# Using Mapbox
fig3 = px.choropleth_mapbox(
    df,
    locations='id',
    geojson=ind_states,
    color='2011_log',
    hover_name="States",
    hover_data=["2011"],
    mapbox_style="carto-positron",
    zoom=4,
    center={"lat": 24, "lon": 78},
    opacity=0.5,
)
fig3.show()