# Ethiopia Mapping Section

This Jupyter notebook calculates and builds out the requirements for the Ethiopian design. The same could be done in Excel, but a notebook lets us reference the inputs and rework the design as the requirements change.

The first step is to collect the information. The input files are stored in my GitHub account (username johnmeye) and are referenced directly from this notebook, so that anyone who uses Conda/Jupyter will be able to get them. If you run into any challenges, reach out to me on Teams or by email (johnmeye@cisco.com).

## Second Revision

This is the second revision of this file, following a business case change from Vodacom. I am rewriting it to make the calculation of the required outputs clearer and smoother.

## Inputs

The following inputs to this file were received from Vodacom:
1. Site specifications
2. Site locations
3. Business case
4. Consumption assumption of users

## Outputs 

The following are the expected outputs of this file:
1. BoM for sites in the categories of:

    a. POC1
    
    b. POC2
    
    c. POC3
    
    d. Access
    
    e. Peering
    
    
2. BoMs will be in the correct format for CCW upload, to allow for quick creation of the total costs.

In [3]:
from urllib.request import urlopen
import json
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

#For the Map Plotting
import plotly
import plotly.express as px

#For the Inline display of figures.
from IPython.display import HTML
from IPython.display import IFrame

#For the Widgets
#Importing Widgets to allow for the changing of variables on the fly as questions are asked.
import ipywidgets as widgets 


# Input Data 

## Geographical Data

The files below come from the Ethiopia datasets available on the following site:
https://data.humdata.org/dataset/ethiopia-population-data-_-admin-level-0-3

This site has both topography and the Level 0-3 admin data on population levels per county/province. Vodacom only provided data at Admin 1 (provincial) level for this RFQ, but we are able to go into more depth to work out whether there is any additional information we can use to strengthen our position.

The following section pulls that information from my GitHub, so that you don't have to fetch it yourself, and loads each JSON file into memory. The files contain Polygon-type features with GPS coordinates that mark out the different layers/levels in the country.

If needed, you can pull the information from the GeoJSON files as well, but I have also included the boundaries data as a dataframe.
Example: counties3["features"][0]['properties'] #Just a sample of how to pull specific information out of the counties JSON files.
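To make the access pattern in the example concrete, here is a minimal sketch on a small hand-made GeoJSON dictionary. The property names `ADM1_EN` and `ADM1_PCODE` and the coordinates are assumptions made for illustration; the real keys should be checked against the downloaded files.

```python
# Toy stand-in for one of the downloaded boundary files.
# Property names and coordinates are assumptions, not taken from the real data.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ADM1_EN": "Tigray", "ADM1_PCODE": "ET01"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[38.0, 13.0], [39.0, 13.0], [39.0, 14.0], [38.0, 13.0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"ADM1_EN": "Dire Dawa", "ADM1_PCODE": "ET15"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[41.0, 9.0], [42.0, 9.0], [42.0, 10.0], [41.0, 9.0]]],
            },
        },
    ],
}


def list_properties(geojson):
    """Return the 'properties' dictionary of every feature, in file order."""
    return [feature["properties"] for feature in geojson["features"]]


props = list_properties(sample_geojson)
first_geometry_type = sample_geojson["features"][0]["geometry"]["type"]

print(props[0])             # properties of the first feature
print(first_geometry_type)  # geometry type of the first feature
```

The same indexing should work on counties1, counties2 and counties3 once they are loaded, provided they follow the standard FeatureCollection layout.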

## Admin Level Data

As mentioned above, there is both geo and admin data, and the two can be matched against a shared parameter. Since the files are nicely structured according to the humanitarian standards, we will stick to the humanitarian markings.

Below I read the information from the different levels into the variables Admin1-3, so that we are able to use them to draw choropleth maps of the country.

Once the data is read into memory, it is possible to find matches on specific parameters between the GeoJSON and the Admin files, so I run a few sample commands to view what the data looks like.
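One way to check that the GeoJSON and the Admin tables really do match on a shared key is sketched below with toy data. The GeoJSON property name `ADM1_PCODE` and the row labelled "Placeholder region" are assumptions for illustration; only the column name `admin1Pcode` comes from the Admin sheets.

```python
import pandas as pd

# Toy GeoJSON with an assumed property name for the province code.
geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ADM1_PCODE": "ET01"}, "geometry": None},
        {"type": "Feature", "properties": {"ADM1_PCODE": "ET15"}, "geometry": None},
    ],
}

# Toy Admin table; the last row is a made-up entry with no matching polygon.
admin = pd.DataFrame({
    "admin1Name_en": ["Tigray", "Dire Dawa", "Placeholder region"],
    "admin1Pcode": ["ET01", "ET15", "ET99"],
})

# Flatten the feature properties into a table and left-join on the code.
geo_props = pd.DataFrame([f["properties"] for f in geo["features"]])
matched = admin.merge(
    geo_props,
    how="left",
    left_on="admin1Pcode",
    right_on="ADM1_PCODE",
    indicator=True,  # adds a _merge column: 'both' or 'left_only'
)

# Admin rows that would get no polygon on a choropleth map.
unmatched = matched.loc[matched["_merge"] == "left_only", "admin1Name_en"].tolist()
print(unmatched)
```

If the list of unmatched names is empty, the same key pair can be passed to a choropleth call, for example through the `locations` and `featureidkey` arguments of `plotly.express.choropleth`.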

## Vodacom Sites Data
Vodacom has provided the Ethiopia site numbers, broken down by expected year and by type. Vodacom also breaks the sites down by height and rooftop, but this might not be necessary from our point of view and should not impact the way we calculate this.

For this we will need to figure out how to define rural, urban and so forth.
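One possible starting point for the rural/urban split is a simple population density cut-off. The sketch below is only an illustration of that idea: the threshold value, the `Area_Type` column and the sample rows are all hypothetical and would need to be agreed before use.

```python
import pandas as pd

# Hypothetical cut-off in people per km^2; not supplied by Vodacom or the source data.
URBAN_DENSITY_THRESHOLD = 1000.0


def classify_area(density, threshold=URBAN_DENSITY_THRESHOLD):
    """Label an area 'urban' when its density reaches the threshold, else 'rural'."""
    return "urban" if density >= threshold else "rural"


# Made-up sample rows using the notebook's column naming.
demo = pd.DataFrame({
    "admin3Name_en": ["Dense sample", "Sparse sample"],
    "Total_Population": [50000.0, 50000.0],
    "Area_Km": [10.0, 500.0],
})
demo["Population_Density"] = demo["Total_Population"] / demo["Area_Km"]
demo["Area_Type"] = demo["Population_Density"].apply(classify_area)

print(demo[["admin3Name_en", "Population_Density", "Area_Type"]])
```

A single threshold is the simplest choice; further bands (for example a suburban class) could be added in the same function if the business case needs them.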

In [12]:
#*Geographical Data*
#Pull the Data I stored in my Github account for the analysis.

with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm1_csa_bofed_20190827.json') as response:
    counties1 = json.load(response)
    
with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm2_csa_bofed_20190827.json') as response:
    counties2 = json.load(response)
    
with urlopen('https://github.com/johnmeye/Ethiopia/raw/master/Ethiopia_JSON/eth_admbnda_adm3_csa_bofed_20190827.json') as response:
    counties3 = json.load(response)

#*Feature Data*
#This data is available in the JSON files, but it is easier to manage as a table, so I have pulled it below as well.
#Feature data is the area, coordinates and naming conventions of each province/suburb/district in Ethiopia.
    
Boundaries_Data1 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/eth_adminboundaries_tabulardata.xlsx",
                  sheet_name='Admin1')

Boundaries_Data2 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/eth_adminboundaries_tabulardata.xlsx",
                    sheet_name='Admin2')

Boundaries_Data3 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/eth_adminboundaries_tabulardata.xlsx",
                    sheet_name='Admin3')


#*Admin Level Data*

Admin1 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin1')

Admin2 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin2')

Admin3 = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/ethiopia-population-data-_-admin-level-0-3.xlsx",
                   dtype={"admin1Pcode": str},
                   skiprows=[1],
                   sheet_name='Admin3')


### Site information input

The input file from Vodacom gives the number of sites by province and by installation year.

This information is stored in the DataFrame **TotalSites**, which includes the *Total by year* and the *Total by Province*, and looks as follows:

In [14]:
#*Vodacom Data*
# this information was supplied by Vodacom and contains the expected sites per region.
Sites = pd.read_excel("https://github.com/johnmeye/Ethiopia/raw/master/TX%20BoQ%20v3-Python.xlsx",
                   sheet_name='Site_Numbers')

Sites.rename(columns = {'Location':'admin1Name_en'}, inplace = True)
Site_Year = Sites.groupby(['admin1Name_en']).sum()
TotalSites = pd.merge(Site_Year, Boundaries_Data1[['admin1Name_en', 'Shape_Area']],how='left', on=['admin1Name_en']) #Site combined with the geodata. Sizing found early.
TotalSites.rename(columns = {'Shape_Area':'Shape_Area_Admin1'}, inplace = True)
TotalSites = TotalSites.drop(['Total'], axis=1)
TotalSites.loc['Year_Total']= TotalSites.sum(numeric_only=True, axis=0)
TotalSites.loc[:,'Total']= TotalSites.loc[:,'Year1':'Year11'].sum(numeric_only=True, axis=1)
TotalSites

| | admin1Name_en | Year1 | Year2 | Year3 | Year4 | Year5 | Year6 | Year7 | Year8 | Year9 | Year10 | Year11 | Shape_Area_Admin1 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Addis Ababa | 282.0 | 0.0 | 0.0 | 0.0 | 0.0 | 94.0 | 68.0 | 155.0 | 253.0 | 26.0 | 6.0 | 0.044369 | 884.0 |
| 1 | Afar | 61.0 | 2.0 | 3.0 | 37.0 | 43.0 | 26.0 | 68.0 | 30.0 | 67.0 | 22.0 | 72.0 | 7.879581 | 431.0 |
| 2 | Amhara | 364.0 | 422.0 | 271.0 | 133.0 | 61.0 | 258.0 | 226.0 | 204.0 | 207.0 | 354.0 | 411.0 | 12.901454 | 2911.0 |
| 3 | Benishangul Gumz | 28.0 | 0.0 | 2.0 | 21.0 | 29.0 | 36.0 | 38.0 | 34.0 | 48.0 | 3.0 | 32.0 | 4.149305 | 271.0 |
| 4 | Dire Dawa | 44.0 | 0.0 | 0.0 | 0.0 | 0.0 | 17.0 | 20.0 | 30.0 | 17.0 | 4.0 | 6.0 | 0.086757 | 138.0 |
| 5 | Gambela | 15.0 | 0.0 | 2.0 | 8.0 | 10.0 | 10.0 | 13.0 | 28.0 | 36.0 | 1.0 | 15.0 | 2.56091 | 138.0 |
| 6 | Harari | 32.0 | 0.0 | 0.0 | 2.0 | 0.0 | 3.0 | 8.0 | 23.0 | 16.0 | 0.0 | 13.0 | 0.030512 | 97.0 |
| 7 | Oromia | 231.0 | 412.0 | 495.0 | 518.0 | 461.0 | 481.0 | 400.0 | 415.0 | 443.0 | 541.0 | 766.0 | 26.47466 | 5163.0 |
| 8 | SNNP | 68.0 | 131.0 | 179.0 | 257.0 | 179.0 | 123.0 | 115.0 | 87.0 | 63.0 | 386.0 | 127.0 | 8.928528 | 1715.0 |
| 9 | Somali | 163.0 | 23.0 | 31.0 | 61.0 | 83.0 | 75.0 | 140.0 | 102.0 | 126.0 | 52.0 | 161.0 | 25.466707 | 1017.0 |
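The table above is built by grouping per province and then appending a totals row and a totals column. A minimal sketch of that pattern on made-up numbers is given below; the province names and site counts are invented for illustration, while the `Year_Total` and `Total` labels follow the notebook.

```python
import pandas as pd

# Made-up site counts; two input rows belong to the same province.
sites = pd.DataFrame({
    "admin1Name_en": ["North", "North", "South"],
    "Year1": [10, 5, 7],
    "Year2": [0, 3, 2],
})

# One row per province, summing the yearly counts.
per_province = sites.groupby("admin1Name_en").sum()

# Totals row: the sum of every year column across all provinces.
per_province.loc["Year_Total"] = per_province.sum(numeric_only=True, axis=0)

# Totals column: the sum across the year columns for each row.
per_province["Total"] = per_province.loc[:, "Year1":"Year2"].sum(axis=1)

print(per_province)
```

Adding the totals row before the totals column means the bottom-right cell ends up holding the grand total, which is a quick sanity check against the figure in the Vodacom input.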


### Merging the Provincial Data and the Site Data given by Vodacom

This section merges the provincial (admin) data with the site information provided by Vodacom. The result is a single large table that holds all kinds of information about population, number of sites, land area and so forth.
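Because the site numbers exist only at province level while the admin data goes down to Admin 3, the merge repeats each province's site counts on every row of that province. The sketch below shows this on toy data, with all names and numbers invented; the `indicator` flag is an optional addition used here to spot rows that have no match.

```python
import pandas as pd

# Toy Admin 3 rows (two in 'North', one in 'South').
woredas = pd.DataFrame({
    "admin3Name_en": ["W1", "W2", "W3"],
    "admin1Name_en": ["North", "North", "South"],
    "Total_Population": [1000.0, 2000.0, 1500.0],
})

# Toy province-level site counts; 'East' has no Admin 3 rows.
province_sites = pd.DataFrame({
    "admin1Name_en": ["North", "South", "East"],
    "Year1": [15, 7, 4],
})

# Outer merge keeps unmatched rows from both sides; _merge says where each row came from.
merged = pd.merge(woredas, province_sites, how="outer", on="admin1Name_en", indicator=True)

# Province rows with no Admin 3 detail (comparable to a totals row in the real data).
orphans = merged[merged["_merge"] == "right_only"]

# Keep only the rows that describe an actual Admin 3 area.
clean = merged.dropna(subset=["admin3Name_en"]).drop(columns="_merge")

print(clean)
```

Note that `Year1` for W1 and W2 both show the full province figure of 15, so summing the year columns over the merged table would overcount the sites.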


In [19]:
#SiteTable is now created to hold the Sites and the Admin3 data together.
SiteTable = pd.merge(Admin3, TotalSites, how='outer', on=['admin1Name_en']) #Site by year
SiteTable.replace(to_replace=0, value=np.nan, inplace=True)

#The Full Table now adds the Shape Area to this Table to allow us to know the total space available for the specific area.
Fulltable = pd.merge(SiteTable, Boundaries_Data3[['admin3Pcode', 'Shape_Area']],how='left', on=['admin3Pcode']) #Site combined with the geodata. Sizing found early.
Fulltable = Fulltable.dropna(subset=['admin3Name_en'])

# Some Shape_Area values are NaN, which is a problem when calculating density, so I'm filling these empty cells with 0.
Fulltable['Shape_Area'] = Fulltable['Shape_Area'].fillna(0)

#Ethiopia_Area is the total size of Ethiopia in km^2, based on information from Wikipedia.
#The Shape_Area is treated as a percentage of the total area.
#I originally scaled by the sum of Shape_Area (commented out below), but I think the total area on Wikipedia and the
#Shape_Area values don't add up to 100% because the Shape_Area at Admin3 doesn't include the large bodies of water
#in the country, which are not part of the geographic boundaries.
#I'm going to assume that the geo boundaries add up to 100%.
    #Ethiopia_Shape_Area = Fulltable['Shape_Area'].sum()
    #Unit_Area = Ethiopia_Area / Ethiopia_Shape_Area

Ethiopia_Area = 1104300
Unit_Area = Ethiopia_Area / 100 

Fulltable['Area_Km'] = Fulltable['Shape_Area'].apply(lambda x: x*Unit_Area)
Fulltable.rename(columns = {'Total Population':'Total_Population'}, inplace = True)


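def POPDENSITY(Population, Area, AreaName):
    """Return people per km^2; report the area and return 0 when its size is zero."""
    # Area is 0 where Shape_Area was missing and filled with 0 above.
    # An explicit check also covers NumPy floats, which return inf instead of raising ZeroDivisionError.
    if Area == 0:
        print("Population Lost:", AreaName)
        return 0
    return Population / Area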

Fulltable['Population_Density'] = Fulltable.apply(lambda x: POPDENSITY(x.Total_Population, x.Area_Km, x.admin3Name_en), axis=1)


Population Lost: Dawe Serer
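As a worked check of the area and density arithmetic in the cell above, the sketch below redoes the conversion for one row using vectorised column operations. The Tahtay Adiyabo figures are taken from the table in this notebook, the zero-area row is made up, and the vectorised form is an alternative suggestion rather than what the notebook currently runs.

```python
import numpy as np
import pandas as pd

ETHIOPIA_AREA_KM2 = 1104300          # same figure as in the cell above
UNIT_AREA = ETHIOPIA_AREA_KM2 / 100  # km^2 per unit of Shape_Area, under the notebook's 100% assumption

demo = pd.DataFrame({
    "admin3Name_en": ["Tahtay Adiyabo", "Zero-area example"],  # second row is invented
    "Total_Population": [104658.344988, 5000.0],
    "Shape_Area": [0.322642, 0.0],
})

# Column-wise multiplication replaces the row-wise lambda.
demo["Area_Km"] = demo["Shape_Area"] * UNIT_AREA

# Mask zero areas so the division yields NaN there, then report 0 like POPDENSITY does.
area = demo["Area_Km"].replace(0, np.nan)
demo["Population_Density"] = (demo["Total_Population"] / area).fillna(0)

# Names of the areas whose population cannot be given a density.
lost = demo.loc[demo["Area_Km"] == 0, "admin3Name_en"].tolist()

print(demo)
print("Population Lost:", lost)
```

For Tahtay Adiyabo this gives 0.322642 x 11043 = 3562.935606 km^2, which matches the Area_Km value shown for that row in the Fulltable output.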


In [17]:
Fulltable

| | admin3Name_en | admin3Pcode | admin2Name_en | admin2Pcode | admin1Name_en | admin1Pcode | admin0Name_en | admin0Pcode | Total_Population | Male | ... | Year6 | Year7 | Year8 | Year9 | Year10 | Year11 | Shape_Area_Admin1 | Total | Shape_Area | Area_Km |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Tahtay Adiyabo | ET010101 | North Western | ET0101 | Tigray | ET01 | Ethiopia | ET | 104658.344988 | 52769.849874 | ... | 130.0 | 107.0 | 114.0 | 110.0 | 67.0 | 75.0 | 4.399121 | 959.0 | 0.322642 | 3562.935606 |
| 1 | Laelay Adiabo | ET010102 | North Western | ET0101 | Tigray | ET01 | Ethiopia | ET | 127534.455026 | 63006.755649 | ... | 130.0 | 107.0 | 114.0 | 110.0 | 67.0 | 75.0 | 4.399121 | 959.0 | 0.150153 | 1658.139579 |
| 2 | Medebay Zana | ET010103 | North Western | ET0101 | Tigray | ET01 | Ethiopia | ET | 146129.151842 | 71929.954163 | ... | 130.0 | 107.0 | 114.0 | 110.0 | 67.0 | 75.0 | 4.399121 | 959.0 | 0.087179 | 962.717697 |
| 3 | Tahtay Koraro | ET010104 | North Western | ET0101 | Tigray | ET01 | Ethiopia | ET | 77413.547199 | 38350.178533 | ... | 130.0 | 107.0 | 114.0 | 110.0 | 67.0 | 75.0 | 4.399121 | 959.0 | 0.055941 | 617.756463 |
| 4 | Asgede Tsimbila | ET010105 | North Western | ET0101 | Tigray | ET01 | Ethiopia | ET | 162416.356257 | 82233.832814 | ... | 130.0 | 107.0 | 114.0 | 110.0 | 67.0 | 75.0 | 4.399121 | 959.0 | 0.198092 | 2187.529956 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 978 | Police Maret | ET150114 | Dire Dawa urban | ET1501 | Dire Dawa | ET15 | Ethiopia | ET | 4846.373489 | 2431.457007 | ... | 17.0 | 20.0 | 30.0 | 17.0 | 4.0 | 6.0 | 0.086757 | 138.0 | 0.000733 | 8.094519 |
| 979 | Aseliso | ET150201 | Dire Dawa rural | ET1502 | Dire Dawa | ET15 | Ethiopia | ET | 67673.068839 | 34032.121324 | ... | 17.0 | 20.0 | 30.0 | 17.0 | 4.0 | 6.0 | 0.086757 | 138.0 | 0.008446 | 93.269178 |
| 980 | Jeldessa | ET150206 | Dire Dawa rural | ET1502 | Dire Dawa | ET15 | Ethiopia | ET | 47375.035480 | 23824.439808 | ... | 17.0 | 20.0 | 30.0 | 17.0 | 4.0 | 6.0 | 0.086757 | 138.0 | 0.028006 | 309.270258 |
| 981 | Wahil | ET150207 | Dire Dawa rural | ET1502 | Dire Dawa | ET15 | Ethiopia | ET | 35409.185411 | 17806.931392 | ... | 17.0 | 20.0 | 30.0 | 17.0 | 4.0 | 6.0 | 0.086757 | 138.0 | 0.015083 | 166.561569 |
