# Ethiopia Mapping Section

This Python Jupyter notebook calculates and builds out the requirements for the Ethiopian design. The same work could be done in Excel, but a notebook lets us reference the source data and rerun the design as the requirements change.

The first step is to collect the information. The files are stored in my GitHub account (johnmeye) and are referenced directly from this notebook, so anyone using Conda/Jupyter can fetch them. If you run into problems, reach out to me on Teams or by email (johnmeye@cisco.com).

In [None]:
from urllib.request import urlopen
import json
import pandas as pd
import numpy as np
import seaborn as sns

%matplotlib inline

import ipywidgets as widgets #Importing Widgets to allow for the changing of variables on the fly as questions are asked.

## Geographical Data

The files below come from the Ethiopian dataset available on the following site:
https://data.humdata.org/dataset/ethiopia-population-data-_-admin-level-0-3

The site has both topography and Admin Level 0-3 population data per county/province. Vodacom only provided data at Admin 1 (provincial) level for this RFQ, but we can go into more depth to see whether any additional information strengthens our position.

The following section pulls that information from my GitHub so that you don't have to fetch it yourself, then loads it as JSON into memory. The files contain Polygon-type features with GPS coordinates that mark out the boundaries of the different layers/levels in the country.

In [None]:
#Pull the Data I stored in my Github account for the analysis.

with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm1_csa_bofed_20190827.json') as response:
    counties1 = json.load(response)
    
with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm2_csa_bofed_20190827.json') as response:
    counties2 = json.load(response)
    
with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm3_csa_bofed_20190827.json') as response:
    counties3 = json.load(response)

#Feature data is available in the JSON files, but it is easier to manage as a table, so the tabular version is loaded below as well.
    
Boundaries_Data1 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/eth_adminboundaries_tabulardata.xlsx",
                  sheet_name='Admin1')

Boundaries_Data3 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/eth_adminboundaries_tabulardata.xlsx",
                    sheet_name='Admin3')


In [None]:
counties3["features"][0]['properties'] #Just a sample on how to pull out specific information from the Counties json Files.
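
To get a feel for the structure beyond a single feature, a small helper along these lines can list the property keys and geometry types in a GeoJSON `FeatureCollection`. This is a minimal sketch: `summarise_geojson` and the two-feature `sample` (including its names and codes) are made up for illustration; only the `ADM1_EN` and `ADM1_PCODE` property names come from the files used in this notebook.

```python
from collections import Counter


def summarise_geojson(collection):
    """Return (sorted property keys, geometry type counts) for a FeatureCollection."""
    keys = set()
    geometry_types = Counter()
    for feature in collection["features"]:
        keys.update(feature["properties"].keys())
        geometry_types[feature["geometry"]["type"]] += 1
    return sorted(keys), dict(geometry_types)


# Hypothetical two-feature sample; the real files hold one feature per admin area.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"ADM1_EN": "Region A", "ADM1_PCODE": "XX01"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[38.0, 9.0], [39.0, 9.0],
                                       [39.0, 10.0], [38.0, 9.0]]]}},
        {"type": "Feature",
         "properties": {"ADM1_EN": "Region B", "ADM1_PCODE": "XX02"},
         "geometry": {"type": "MultiPolygon",
                      "coordinates": [[[[40.0, 9.0], [41.0, 9.0],
                                        [41.0, 10.0], [40.0, 9.0]]]]}},
    ],
}

property_keys, geometry_counts = summarise_geojson(sample)
print(property_keys)
print(geometry_counts)
```

On the real data the equivalent call would be `summarise_geojson(counties3)`.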

## Admin Level Data

As mentioned above, there is both geo and admin data, and the two are matched against each other on a shared parameter. Since the file is nicely structured according to standards, we will stick to the humanitarian markings.

Below I read the information from the different levels into the variables Admin1-3 so that we can use them to draw choropleth maps of the country.

Once the data is read into memory, it is possible to find matches on specific parameters in both the GeoJSON and the admin files, so I run a few sample commands to view what the data looks like.



In [None]:
Admin1 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin1')

Admin2 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin2')

Admin3 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin3')
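
The matching between the GeoJSON and the admin tables described above can be checked before plotting, since any code present on only one side will simply not be drawn. The following is a rough sketch of such a check; `pcode_mismatches` and the sample codes are hypothetical, not part of the source files.

```python
def pcode_mismatches(collection, property_key, table_codes):
    """Return (codes only in the GeoJSON, codes only in the table), both sorted."""
    geo_codes = {f["properties"][property_key] for f in collection["features"]}
    table_codes = set(table_codes)
    return sorted(geo_codes - table_codes), sorted(table_codes - geo_codes)


# Hypothetical data: one code matches, one is missing on each side.
sample_geojson = {
    "features": [
        {"properties": {"ADM1_PCODE": "XX01"}},
        {"properties": {"ADM1_PCODE": "XX02"}},
    ]
}
sample_table_codes = ["XX01", "XX03"]

only_in_geojson, only_in_table = pcode_mismatches(
    sample_geojson, "ADM1_PCODE", sample_table_codes
)
print(only_in_geojson)  # polygons without a population row
print(only_in_table)    # population rows without a polygon
```

With the notebook's variables this would be called as `pcode_mismatches(counties1, "ADM1_PCODE", Admin1["admin1Pcode"])`.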

# Plotting the information on a Choropleth map

The information gathered above needs to be visualised before it can help with the analysis.

Vodacom has provided Admin 1 information, so we will plot against the Admin1 codes in the data.

I have set the maps up at different levels so that the lighter ones can be run on their own when the code needs to run quicker.

In [None]:
import plotly.express as px

In [None]:
fig = px.choropleth_mapbox(Admin1, 
                           geojson=counties1, 
                           locations='admin1Pcode', featureidkey="properties.ADM1_PCODE",
                           color='Total Population',
                           color_continuous_scale="portland",
                           range_color=(500000, 30000000),
                           mapbox_style="carto-positron",
                           zoom=3, center = {"lat": 9, "lon": 39},
                           opacity=0.5,
                           labels={'Total Population':'Total Population'}
                          )
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

#fig2 = px.choropleth_mapbox(Admin3, 
#                           geojson=counties3, 
#                           locations='admin3Pcode', featureidkey="properties.ADM3_PCODE",
#                           color='Total Population',
#                           color_continuous_scale="portland",
#                           range_color=(25000, 300000),
#                           mapbox_style="carto-positron",
#                           zoom=3, center = {"lat": 9, "lon": 39},
#                           opacity=0.5,
#                           labels={'Total Population':'Total Population'}
#                          )
#fig2.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
#
fig.show()
#fig2.show()

# Manipulating the data to find the sites requirements

Vodacom has provided the Ethiopia site numbers, with the expected counts by year and by type. Vodacom also breaks the sites down by height and rooftop, but this might not be necessary from our point of view and should not affect the way we calculate this.

For this we will need to figure out how to define rural/urban and so forth.

In [None]:
Sites = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/TX%20BoQ%20v3-Python.xlsx",
                   sheet_name='Site_Numbers')

Sites.rename(columns = {'Location':'admin1Name_en'}, inplace = True)

Site_Year = Sites.groupby(['admin1Name_en']).sum()

TotalSites = pd.merge(Site_Year, Boundaries_Data1[['admin1Name_en', 'Shape_Area']],how='left', on=['admin1Name_en']) #Site combined with the geodata. Sizing found early.

TotalSites.rename(columns = {'Shape_Area':'Shape_Area_Admin1'}, inplace = True)

print(TotalSites)
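
Because the merge above joins on the province name, a spelling difference between the site sheet and the boundary table would silently leave `Shape_Area` empty. A minimal sketch of how this could be checked with the `indicator` option of `pd.merge` follows; the two small frames and the deliberately misspelled name are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the site sheet and the boundary table.
sites = pd.DataFrame({
    "admin1Name_en": ["Region A", "Region B", "Regoin C"],  # last name misspelled on purpose
    "Year1": [10, 5, 2],
})
boundaries = pd.DataFrame({
    "admin1Name_en": ["Region A", "Region B", "Region C"],
    "Shape_Area": [1.5, 2.0, 0.5],
})

# indicator=True adds a "_merge" column saying where each row was found.
checked = pd.merge(sites, boundaries, how="left", on="admin1Name_en", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "admin1Name_en"].tolist()
print(unmatched)
```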



In [None]:
TotalSites.sum(axis = 0, skipna = True)

In [None]:
SiteTable = pd.merge(Admin3, TotalSites, how='outer', on=['admin1Name_en']) #Site by year

In [None]:
SiteTable.replace(to_replace=0, value=np.nan, inplace=True)

In [None]:
Fulltable = pd.merge(SiteTable, Boundaries_Data3[['admin3Pcode', 'Shape_Area']],how='left', on=['admin3Pcode']) #Site combined with the geodata. Sizing found early.

print(Fulltable)

In [None]:
Fulltable['Shape_Area'] = Fulltable['Shape_Area'].fillna(0)

In [None]:
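# Shape_Area was filled with 0 in the cell above, so this mask is expected to be all False (sanity check).
is_NaN = pd.isnull(Fulltable['Shape_Area'])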

In [None]:
Fulltable[is_NaN]  

### TODO:
The plan is to create a dynamic map from the section below so that you can switch between the years and view what is happening.

In [None]:
#fig = px.choropleth_mapbox(Fulltable, 
#                           geojson=counties1, 
#                           locations='admin1Name_en', featureidkey="properties.ADM1_EN",
#                           color='Year1',
#                           color_continuous_scale="portland",
#                           range_color=(0, 350),
#                           mapbox_style="carto-positron",
#                           zoom=3, center = {"lat": 9, "lon": 39},
#                           opacity=0.5,
#                           
#                          )
#fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
#fig.show()
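
One possible route to the year switch in the TODO is to reshape the per-year columns into long form, since plotly express can animate over a single column. The sketch below only shows the reshaping step on invented data: `Year1` appears in this notebook, while `Year2`, the region names and the counts are assumptions.

```python
import pandas as pd

# Hypothetical wide table with one column per year.
wide = pd.DataFrame({
    "admin1Name_en": ["Region A", "Region B"],
    "Year1": [10, 5],
    "Year2": [20, 8],
})

# One row per (province, year) pair instead of one column per year.
long_sites = wide.melt(id_vars="admin1Name_en", var_name="Year", value_name="Sites")
print(long_sites)
```

The long table could then be passed to `px.choropleth_mapbox` with `color='Sites'` and `animation_frame='Year'`, which should add a slider for stepping through the years.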

# Calculating population density

The files provide a value for the size of each area and the overall land coverage. The total land area in question is 1,104,300 square km. The following section shows the calculations used to determine the density for the various provinces, which we can then use for the density and rural/urban calculations.

In [None]:
Ethiopia_Area = 1104300
Ethiopia_Shape_Area = Fulltable['Shape_Area'].sum()
Unit_Area = Ethiopia_Area / Ethiopia_Shape_Area
Fulltable['Area_Km'] = Fulltable['Shape_Area'].apply(lambda x: x*Unit_Area)
Fulltable.rename(columns = {'Total Population':'Total_Population'}, inplace = True)
Fulltable.info()
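
The scaling in the cell above is proportional: each area gets the share of the 1,104,300 square km that matches its share of the summed `Shape_Area`. A small sketch on made-up `Shape_Area` values shows the same arithmetic in vectorised form and confirms that the scaled areas add back up to the total.

```python
import pandas as pd

ETHIOPIA_AREA_KM2 = 1104300  # total land area quoted in the text

# Hypothetical Shape_Area values, in whatever units the boundary file uses.
shape_area = pd.Series([2.0, 3.0, 5.0], name="Shape_Area")

unit_area = ETHIOPIA_AREA_KM2 / shape_area.sum()  # square km per Shape_Area unit
area_km = shape_area * unit_area                  # vectorised, no apply needed

print(area_km.tolist())
print(area_km.sum())
```

Rows whose `Shape_Area` was filled with 0 receive an area of 0 under this scheme, so they do not affect the other rows.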

In [None]:
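def PopDensity(Population,Area):
    # People per square km. Areas of 0 (missing Shape_Area filled above) return 0.
    # An explicit check is used because numpy floats give inf on division by zero
    # rather than raising ZeroDivisionError.
    if Area == 0:
        return 0
    return Population/Area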

Fulltable['Population_Density'] = Fulltable.apply(lambda x: PopDensity(x.Total_Population, x.Area_Km), axis=1)
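
As an alternative to the row-by-row `apply`, the same density can be computed on whole columns. The sketch below uses invented numbers and assumes that a zero area should map to a density of 0, as in `PopDensity`.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: the second one has no area.
table = pd.DataFrame({
    "Total_Population": [1000.0, 500.0],
    "Area_Km": [10.0, 0.0],
})

# Column-wise division gives inf where the area is 0, so inf is mapped back to 0.
density = (
    table["Total_Population"]
    .div(table["Area_Km"])
    .replace([np.inf, -np.inf], 0)
)
print(density.tolist())
```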

In [None]:
Fulltable['Population_Density'].sort_values().tail(15)
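
Once the densities are available, one way to approach the rural/urban split is to bin them with `pd.cut`. The cut-off values and the three labels below are placeholders chosen only to make the sketch run; they are not definitions from Vodacom or from the data source and would need to be agreed before use.

```python
import pandas as pd

# Hypothetical cut-offs in people per square km; replace with agreed definitions.
DENSITY_BINS = [0, 150, 1500, float("inf")]
DENSITY_LABELS = ["Rural", "Suburban", "Urban"]

# Invented densities standing in for Fulltable['Population_Density'].
density = pd.Series([12.0, 150.0, 400.0, 5200.0], name="Population_Density")

# Bins are closed on the right, so 150 still counts as Rural here.
area_type = pd.cut(density, bins=DENSITY_BINS, labels=DENSITY_LABELS, include_lowest=True)
print(area_type.tolist())
```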

In [None]:
sns.boxplot(x="admin1Name_en", y="Total_Population", data=Fulltable,palette='rainbow')

In [None]:
fig1 = px.choropleth_mapbox(Fulltable, 
                           geojson=counties3, 
                           locations='admin3Pcode', featureidkey="properties.ADM3_PCODE",
                           color='Total_Population',
                           color_continuous_scale="portland",
                           range_color=(25000, 300000),
                           mapbox_style="carto-positron",
                           zoom=3, center = {"lat": 9, "lon": 39},
                           opacity=0.5,
                           labels={'Total_Population':'Total_Population'}
                          )
fig1.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig1.show()

In [None]:
fig2 = px.choropleth_mapbox(Fulltable, 
                           geojson=counties3, 
                           locations='admin3Pcode', featureidkey="properties.ADM3_PCODE",
                           color='Population_Density',
                           color_continuous_scale="portland",
                           range_color=(50, 450),
                           mapbox_style="carto-positron",
                           zoom=3, center = {"lat": 9, "lon": 39},
                           opacity=0.5,
                           labels={'Population_Density':'Population_Density'}
                          )
fig2.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

fig2.show()

# Todo

Do a bar/whisker plot of density by province/area for this.

In [None]:
sns.boxplot(x="admin1Name_en", y="Total_Population", data=Fulltable,palette='rainbow')


In [None]:
Not_Addis_Dire = Fulltable[ (Fulltable['admin1Name_en']!='Dire Dawa') &(Fulltable['admin1Name_en']!='Addis Ababa') ]

In [None]:
Not_Addis_Dire

In [None]:
sns.boxplot(x="admin1Name_en", y="Population_Density", data=Not_Addis_Dire,palette='rainbow')
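
The box plot gives the visual picture; a numeric summary per province can sit alongside it when exact figures are needed. This is a sketch on invented rows, using the same column names as `Fulltable`.

```python
import pandas as pd

# Hypothetical rows standing in for Not_Addis_Dire.
table = pd.DataFrame({
    "admin1Name_en": ["Region A", "Region A", "Region A", "Region B", "Region B"],
    "Population_Density": [10.0, 20.0, 30.0, 100.0, 300.0],
})

# Count, minimum, median and maximum density per province.
summary = (
    table.groupby("admin1Name_en")["Population_Density"]
    .agg(["count", "min", "median", "max"])
)
print(summary)
```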