# Population density of Bangladesh by district
---
1. The current population of Bangladesh is 166,387,872 as of Friday, July 16, 2021, based on Worldometer's elaboration of the latest United Nations data.
2. Bangladesh's 2020 population is estimated at 164,689,383 people at mid-year, according to UN data.
3. Bangladesh's population is equivalent to 2.11% of the total world population.
4. Bangladesh ranks number 8 in the list of countries (and dependencies) by population.
5. The population density in Bangladesh is 1,265 people per km² (3,277 people per sq. mile).
6. The total land area is 130,170 km² (50,259 sq. miles).
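
The density figures follow from dividing population by land area. As a quick check, assuming the mid-2020 estimate is the population behind the quoted density (the list does not say which figure was used), the arithmetic can be reproduced as:

```python
# Figures quoted in the list above
population_2020 = 164_689_383   # mid-2020 UN estimate
area_km2 = 130_170              # total land area in square kilometres
area_mi2 = 50_259               # total land area in square miles

# Density = population / area
density_km2 = population_2020 / area_km2
density_mi2 = population_2020 / area_mi2

print(round(density_km2))  # people per square kilometre
print(round(density_mi2))  # people per square mile
```

Both rounded values agree with the density figures in item 5.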

## Plotting Choropleth Bangladesh Map
---
A choropleth map is a type of thematic map in which areas are shaded or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic within each area, such as population density or per-capita income.
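
To make "shaded in proportion to a statistical variable" concrete, here is a minimal sketch of linear shading. The region names and values are made up for illustration, and `shade` is a hypothetical helper, not part of any plotting library.

```python
def shade(value, vmin, vmax):
    """Map a value to a shade between 0.0 (lightest) and 1.0 (darkest)."""
    if vmax == vmin:
        return 0.0  # all regions share one value, so use a single shade
    return (value - vmin) / (vmax - vmin)

# Hypothetical per-region values (not real data)
values = {"Region A": 200, "Region B": 500, "Region C": 800}
vmin, vmax = min(values.values()), max(values.values())

shades = {name: shade(v, vmin, vmax) for name, v in values.items()}
print(shades)
```

A plotting library does the same kind of mapping internally, then turns each shade into a color from the chosen color scale.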

### Process
- Load the district-level GeoJSON data
- Collect population data from Wikipedia and save it as a CSV file
- Load the CSV file into a pandas dataframe
- Merge the two datasets by district using a unique id
- Plot a choropleth map of Bangladesh


Load all districts from geojson file

In [None]:
from json import load
# Use a context manager so the file is closed once it has been read
with open('../input/bangladesh-geojson-adm2-64-districts-zillas/bangladesh_geojson_adm2_64_districts_zillas.json', 'r') as f:
    bd_districts = load(f)

Let's check the keys available in one feature of the GeoJSON file

In [None]:
bd_districts['features'][61].keys()

In [None]:
bd_districts["features"][61]['properties']
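
For readers unfamiliar with GeoJSON, the sketch below builds a tiny FeatureCollection by hand and inspects it the same way. The id, name, and coordinates are invented; only the `ADM2_EN` property name is taken from this notebook, and the real file may contain additional keys.

```python
import json

# A hand-written, hypothetical FeatureCollection with a single feature
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "id": "demo-1",
      "properties": {"ADM2_EN": "Demo District"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[90.0, 23.0], [90.5, 23.0], [90.5, 23.5], [90.0, 23.0]]]
      }
    }
  ]
}
""")

feature = sample["features"][0]
print(sorted(feature.keys()))            # top-level keys of one feature
print(feature["properties"]["ADM2_EN"])  # the district name property
```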

To get population data from Wikipedia, we can use the pandas read_html function. It works fine in Jupyter Notebook, but when I tried it on Kaggle, reading HTML data was not supported. I am still adding the process here, commented out, because it is very simple. If you know a solution for this, please let me know.

In [None]:
#import pandas as pd
#dfs= pd.read_html('https://en.wikipedia.org/wiki/Districts_of_Bangladesh')

The page contains many tables, and read_html returns them as a list of dataframes. We can check the length of the list and pick the required table by selecting the correct index.

In [None]:
# len(dfs)

Store data as csv format for offline use

In [None]:
# for table in dfs:
#     # Keep only the table that has the population column
#     if "Population (thousands)[28]" in table:
#         table.to_csv("Districts_of_Bangladesh.csv")
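
The selection step can be tried without network access. The sketch below stands in for the list returned by read_html with two small hand-made dataframes (the values are placeholders, not Wikipedia data) and picks the one that has the population column.

```python
import pandas as pd

# Stand-in for the list of tables that pd.read_html would return
tables = [
    pd.DataFrame({"Division": ["X"], "Capital": ["Y"]}),
    pd.DataFrame({"District": ["Demo District"],
                  "Population (thousands)[28]": [1000]}),
]

wanted = "Population (thousands)[28]"

# `in` on a dataframe tests its column labels, so this finds the right table
matches = [t for t in tables if wanted in t]
population_table = matches[0] if matches else None

print(len(tables), "tables,", len(matches), "with the population column")
```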

Reading csv data into a dataframe

In [None]:
import pandas as pd
df=pd.read_csv("../input/districts-of-bangladesh/Districts_of_Bangladesh.csv")

Checking dataframe head

In [None]:
df.head()

Remove the " District" suffix from each row, because the district names in the GeoJSON data do not include it.

In [None]:
df.District

In [None]:
# Strip the literal " District" text from every name
df['District'] = df['District'].str.replace(" District", "", regex=False)

In [None]:
df.District 
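
One caveat: replace removes " District" wherever it occurs in a name, not only at the end. If that ever matters, a suffix-only variant could look like the sketch below. The names are examples chosen for illustration, and `strip_suffix` is a hypothetical helper.

```python
SUFFIX = " District"

def strip_suffix(name, suffix=SUFFIX):
    """Remove the suffix only when the name actually ends with it."""
    return name[:-len(suffix)] if name.endswith(suffix) else name

names = ["Dhaka District", "Cox's Bazar District", "Khulna"]
cleaned = [strip_suffix(n) for n in names]
print(cleaned)
```

With the dataframe in this notebook, the helper could be applied as `df.District.apply(strip_suffix)`.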

Now it is time to link this dataframe to the GeoJSON file. To do this, we build a lookup from each district name to the id of its GeoJSON feature.

In [None]:
district_id_map = {}
for feature in bd_districts["features"]:
    # Each feature already carries an "id"; map the district name to it
    district_id_map[feature["properties"]["ADM2_EN"]] = feature["id"]

In [None]:
district_id_map

Add the matching GeoJSON id to each row of the dataframe

In [None]:
df['id'] = df.District.apply(lambda x: district_id_map[x])
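
The lookup above raises a KeyError as soon as one district name is spelled differently in the two sources. A minimal sketch of checking for mismatches first, using invented names and ids in place of the real data:

```python
# Hypothetical stand-ins for district_id_map and df.District
district_id_map = {"Alpha": "id-1", "Beta": "id-2", "Gamma": "id-3"}
csv_names = ["Alpha", "Beta", "Gama"]  # "Gama" is a deliberate misspelling

# Names present in one source but not in the other
missing_in_geojson = sorted(set(csv_names) - set(district_id_map))
unused_in_csv = sorted(set(district_id_map) - set(csv_names))

print("Not found in GeoJSON:", missing_in_geojson)
print("Not found in CSV:", unused_in_csv)
```

If both lists are empty, every row can be given an id safely.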

Now we can see an id column in the dataframe

In [None]:
df.head()

Rename the columns to remove the footnote markers

In [None]:
df = df.rename(columns={
    'Population (thousands)[28]' : 'Population (thousands)',
    'Area (km2)[28]' : 'Area (km2)' })

A bar plot can be used to show the population of each district

In [None]:
import numpy as np
from matplotlib import cm
color = cm.inferno_r(np.linspace(.3, .7, 64))

In [None]:
df.set_index('District')["Population (thousands)"].plot.bar(
    xlabel='District',
    rot=90,
    figsize=(20,10),
    fontsize=10,
    color=color
    )

Now let's make a choropleth map of Bangladesh colored by population

In [None]:
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)
import plotly.express as px
import plotly.io as pio
#pio.renderers.default = 'browser'

In [None]:
fig = px.choropleth(
    df,
    locations='id',
    geojson=bd_districts,
    color='Population (thousands)',
    title='Bangladesh Population',
)
fig.update_geos(fitbounds="locations", visible=False)
fig.show()

Because Dhaka has the largest population, it appears yellow. The other districts are hard to tell apart because their populations are much smaller than Dhaka's. However, we can use a log scale to solve this issue.

In [None]:
df['Population scale'] = np.log10(df['Population (thousands)'])
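
To see why the log scale helps, compare how far apart a very large and a small value sit before and after the transform. The two population values below are made up for illustration.

```python
import math

# Hypothetical populations in thousands (not real district figures)
large, small = 10_000, 100

# On a linear scale the small value is a tiny fraction of the large one
linear_ratio = small / large

# After log10 the two values are much closer together
log_large = math.log10(large)
log_small = math.log10(small)
log_ratio = log_small / log_large

print(linear_ratio, log_ratio)
```

The smaller value moves from 1% of the larger one to half of it, so it lands in a clearly visible part of the color scale.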

Now the dataframe has a new column named "Population scale"

In [None]:
df.head()

By changing color to 'Population scale' and adding hover_name and hover_data, we can get a more informative graph.

In [None]:
fig = px.choropleth(
    df,
    locations='id',
    geojson=bd_districts,
    color='Population scale',
    hover_name='Bengali',
    hover_data=['Population (thousands)','Area (km2)'],
    title='Bangladesh Population'
)
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
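
One drawback of coloring by log10 values is that the color bar shows exponents rather than populations. A possible remedy, sketched below, is to place ticks at whole powers of ten and label them with the original values. The population range is an assumption chosen for illustration, and the final plotly call is shown only as a comment because it needs the `fig` object from the cell above.

```python
import math

# Assumed range of 'Population (thousands)'; adjust to the real data
pop_min, pop_max = 300, 12_000

# Whole powers of ten that fall inside the range
first = math.ceil(math.log10(pop_min))
last = math.floor(math.log10(pop_max))
tickvals = list(range(first, last + 1))        # positions on the log scale
ticktext = [f"{10 ** v:,}" for v in tickvals]  # labels in the original units

print(tickvals, ticktext)

# With the figure from the cell above, the labels could be applied with:
# fig.update_layout(coloraxis_colorbar=dict(tickvals=tickvals, ticktext=ticktext))
```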

Customizing the choropleth with Mapbox looks better.

In [None]:
px.choropleth_mapbox(df,
    locations='id',
    geojson=bd_districts,
    color='Population scale',
    hover_name='Bengali',
    hover_data=['Population (thousands)','Area (km2)'],
    title='Bangladesh Population',
    mapbox_style='carto-positron',
    center= { 'lat' : 23.6850, 'lon' : 90.3563},
    zoom=4.8,
    opacity=0.6)
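
The maps above color districts by population, while the notebook's title refers to population density. A density column could be derived from the two renamed columns as sketched below. The rows are placeholder values, not figures from the dataset, and the new column name is my own choice.

```python
import pandas as pd

# Placeholder rows using the notebook's column names
demo = pd.DataFrame({
    "District": ["Demo A", "Demo B"],
    "Population (thousands)": [2000, 500],
    "Area (km2)": [1000, 2500],
})

# Population is stored in thousands, so convert to people before dividing
demo["Density (per km2)"] = (
    demo["Population (thousands)"] * 1000 / demo["Area (km2)"]
)

print(demo[["District", "Density (per km2)"]])
```

Passing `color='Density (per km2)'` to px.choropleth would then shade the map by density. If the area column was read from the CSV as text, it would need converting to numbers first.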

Reference: https://github.com/ahnaf-tahmid-chowdhury/Choropleth-Bangladesh