## Discrepancy between Global IPC aggregation methods
For some dates, the population totals differ depending on how they are calculated. The Excel sheet reports the numbers directly at admin0 (national) level, so we can use those. Alternatively, we can sum the numbers reported at admin1 level or at admin2 level, and these sums should give the same national total. However, they do not always match. This notebook briefly explores where the differences occur.

In [1]:
import pandas as pd

In [2]:
admin_level=0
country="somalia"

In [3]:
df=pd.read_excel("../Data/GlobalIPC/somalia_globalipc_newcolumnnames.xlsx",index_col=0)

In [4]:
df["date"] = pd.to_datetime(df["date"])

In [5]:
df = df[(df["date"].notnull()) & (df[f"ADMIN{admin_level}"].notnull())]

In [6]:
df.head()

Unnamed: 0,ADMIN0,ADMIN1,ADMIN2,ADMIN2_ID,Analysis Name,date,Country Population,pop_CS,% of total county Pop,Area Phase,...,ML2_2,perc_ML2_2,ML2_3,perc_ML2_3,ML2_4,perc_ML2_4,ML2_5,perc_ML2_5,ML2_3p,perc_ML2_3p
0,Somalia: Acute Food Insecurity August 2020,,,,,2020-09-01,12327530.0,12327530.0,1.0,,...,,,,,,,,,,
1,Awdal,,,,,2020-09-01,,724573.0,,,...,,,,,,,,,,
2,Somalia,Awdal,Baki,18976802.0,Acute Food Insecurity August 2020,2020-09-01,,99157.0,,2.0,...,,,,,,,,,,
3,Somalia,Awdal,Borama,18976825.0,Acute Food Insecurity August 2020,2020-09-01,,453434.0,,2.0,...,,,,,,,,,,
4,Somalia,Awdal,Lughaye,18976913.0,Acute Food Insecurity August 2020,2020-09-01,,99157.0,,2.0,...,,,,,,,,,,


In [7]:
df_agg = df[df["ADMIN0"].str.lower().str.fullmatch(country.lower())].groupby(["ADMIN0","date"],as_index=False).sum()
df_precalc = df[df["ADMIN0"].str.lower().str.match(f"{country.lower()}:")]
df_adm1agg = df[~df["ADMIN0"].str.lower().str.contains(f"{country.lower()}")].groupby("date",as_index=False).sum()
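
The three masks above split the sheet's rows by what is stored in the `ADMIN0` column. As a minimal sketch, the toy frame below (a few rows modelled on the `df.head()` output, not the real sheet) shows which rows each mask selects. The variable names `is_admin2_row`, `is_national_row` and `is_admin1_row` are introduced here for illustration only.

```python
import pandas as pd

country = "somalia"
# Toy rows mimicking the three row types seen in df.head(); not the real sheet.
toy = pd.DataFrame({
    "ADMIN0": ["Somalia: Acute Food Insecurity August 2020", "Awdal", "Somalia", "Somalia"],
    "ADMIN2": [None, None, "Baki", "Borama"],
    "pop_CS": [12327530.0, 724573.0, 99157.0, 453434.0],
})
label = toy["ADMIN0"].str.lower()

# ADMIN0 is exactly the country name: admin2-level rows
is_admin2_row = label.str.fullmatch(country)
# ADMIN0 starts with "<country>:": the precalculated national row
is_national_row = label.str.match(f"{country}:")
# ADMIN0 does not mention the country: it holds an admin1 name instead
is_admin1_row = ~label.str.contains(country)

print(toy[is_admin2_row]["ADMIN2"].tolist())
print(toy[is_national_row]["pop_CS"].tolist())
print(toy[is_admin1_row]["ADMIN0"].tolist())
```

Note that these string methods treat the pattern as a regular expression, so a country name containing special characters would presumably need escaping with `re.escape`.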

In [8]:
df_comb=df_agg.merge(df_precalc,on="date",suffixes=("_agg","_prec")).merge(df_adm1agg.rename(columns={"pop_CS":"pop_CS_adm1"}),on='date')

In [9]:
df_comb["pop_CS_diff_aggprec"]=df_comb["pop_CS_agg"]-df_comb["pop_CS_prec"]
df_comb["pop_CS_diff_aggadm1"]=df_comb["pop_CS_agg"]-df_comb["pop_CS_adm1"]
df_comb["pop_CS_diff_precadm1"]=df_comb["pop_CS_prec"]-df_comb["pop_CS_adm1"]

In [10]:
df_comb[["date","pop_CS_agg","pop_CS_prec","pop_CS_adm1","pop_CS_diff_aggprec","pop_CS_diff_aggadm1","pop_CS_diff_precadm1"]]

Unnamed: 0,date,pop_CS_agg,pop_CS_prec,pop_CS_adm1,pop_CS_diff_aggprec,pop_CS_diff_aggadm1,pop_CS_diff_precadm1
0,2017-01-01,12273659.0,12327530.0,12327529.0,-53871.0,-53870.0,1.0
1,2017-07-01,12109771.0,12327530.0,12327529.0,-217759.0,-217758.0,1.0
2,2018-01-01,12109771.0,12327530.0,12109771.0,-217759.0,0.0,217759.0
3,2018-07-01,12327532.0,12327530.0,12327532.0,2.0,0.0,-2.0
4,2019-01-01,12327532.0,12327530.0,12327533.0,2.0,-1.0,-3.0
5,2019-08-01,12327532.0,12327530.0,12327531.0,2.0,1.0,-1.0
6,2020-01-01,12327530.0,12327530.0,12327530.0,0.0,0.0,0.0
7,2020-09-01,12327530.0,12327530.0,19530536.0,0.0,-7203006.0,-7203006.0


### Conclusions on ADM0 level:
The three methods show large discrepancies on 2017-01, 2017-07, 2018-01, and 2020-09.  
On 2020-09, `df_adm1agg` differs from the other two. This is due to an incorrect sum in the raw data for the Woqooyi Galbeed region.  
On 2017-07, the admin1 population reported in the sheet for Mudug is larger than the sum of the admin2 regions within Mudug.  
On 2017-01, the sum of the admin2 regions is 53,870 lower than the sum of the admin1 regions, which matches the national total to within 1.  
On 2018-01, the national total contains 217,759 more people than the sums of the admin1 and admin2 regions, which agree with each other.
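
The dates listed above were read off the comparison table by eye. A minimal sketch of automating that step is shown below. The helper `flag_discrepancies` and the tolerance of 10 people (to ignore rounding) are assumptions made for illustration; the values are copied from the `df_comb` output above.

```python
import pandas as pd

# Values copied from the df_comb output above.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-01", "2017-07-01", "2018-01-01", "2018-07-01",
                            "2019-01-01", "2019-08-01", "2020-01-01", "2020-09-01"]),
    "pop_CS_agg": [12273659.0, 12109771.0, 12109771.0, 12327532.0,
                   12327532.0, 12327532.0, 12327530.0, 12327530.0],
    "pop_CS_prec": [12327530.0] * 8,
    "pop_CS_adm1": [12327529.0, 12327529.0, 12109771.0, 12327532.0,
                    12327533.0, 12327531.0, 12327530.0, 19530536.0],
})

def flag_discrepancies(df, cols, tol=10):
    """Hypothetical helper: keep rows where the spread between methods exceeds tol."""
    spread = df[cols].max(axis=1) - df[cols].min(axis=1)
    mask = spread > tol
    out = df.loc[mask, ["date"] + cols].copy()
    out["spread"] = spread[mask]
    return out

flagged = flag_discrepancies(toy, ["pop_CS_agg", "pop_CS_prec", "pop_CS_adm1"])
print(flagged)
```

With this tolerance, the differences of 1 to 3 people seen on the remaining dates are treated as rounding noise.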

### ADMIN1

In [11]:
df_adm1_precalc = df[~df["ADMIN0"].str.lower().str.contains(f"{country.lower()}")]
df_adm1_precalc=df_adm1_precalc.drop("ADMIN1",axis=1)
df_adm1_precalc=df_adm1_precalc.rename(columns={"ADMIN0":"ADMIN1"})

In [12]:
admin_level=2

In [13]:
df_adm2 = df[(df["date"].notnull()) & (df[f"ADMIN{admin_level}"].notnull())]

In [14]:
df_adm1_agg = df_adm2.groupby(["date", "ADMIN1"], dropna=False, as_index=False).sum()

In [15]:
df_adm1_agg.head()

Unnamed: 0,date,ADMIN1,ADMIN2_ID,Country Population,pop_CS,% of total county Pop,Area Phase,CS_1,perc_CS_1,CS_2,...,ML2_2,perc_ML2_2,ML2_3,perc_ML2_3,ML2_4,perc_ML2_4,ML2_5,perc_ML2_5,ML2_3p,perc_ML2_3p
0,2017-01-01,Awdal,50773531.0,0.0,673264.0,0.0,9.0,483122.0,254.0,117913.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2017-01-01,Bakool,50773606.0,0.0,367227.0,0.0,12.0,191157.0,211.0,61817.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2017-01-01,Banadir,12693395.0,0.0,1650228.0,0.0,2.0,1138657.0,69.0,511571.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2017-01-01,Bari,76160370.0,0.0,730147.0,0.0,14.0,453476.0,351.0,226830.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2017-01-01,Bay,50773641.0,0.0,792182.0,0.0,12.0,441051.0,222.0,148907.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [16]:
df_adm1_precalc.equals(df_adm1_agg)

False

In [17]:
df_adm1_precalc.shape

(144, 54)

In [18]:
df_adm1_agg.shape

(144, 50)
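
Since the two frames have different numbers of columns (54 versus 50), `equals` returns `False` regardless of the values. A minimal sketch of a more informative comparison, assuming a hypothetical helper `compare_on_shared_columns` that aligns both frames on their key columns and subtracts only the numeric columns they share, is given below. The toy frames use the 2020-09-01 `pop_CS` values for Awdal and Woqooyi Galbeed; the `Analysis Name` values are placeholders.

```python
import pandas as pd

def compare_on_shared_columns(left, right, keys):
    """Hypothetical helper: align on `keys` and subtract the shared numeric columns."""
    left_num = left.set_index(keys).select_dtypes("number")
    right_num = right.set_index(keys).select_dtypes("number")
    shared = left_num.columns.intersection(right_num.columns)
    # Reindex so that row order in `right` does not matter
    return left_num[shared] - right_num[shared].reindex(left_num.index)

# Toy frames with different column sets and row order.
agg = pd.DataFrame({"date": ["2020-09-01"] * 2,
                    "ADMIN1": ["Awdal", "Woqooyi Galbeed"],
                    "pop_CS": [724573.0, 1321524.0]})
prec = pd.DataFrame({"date": ["2020-09-01"] * 2,
                     "ADMIN1": ["Woqooyi Galbeed", "Awdal"],
                     "pop_CS": [8524530.0, 724573.0],
                     "Analysis Name": ["placeholder", "placeholder"]})

print(agg.equals(prec))  # shapes differ, so this cannot be True
diff = compare_on_shared_columns(agg, prec, ["date", "ADMIN1"])
print(diff)
```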

In [19]:
df_adm1_comb=df_adm1_agg.merge(df_adm1_precalc,on=["date","ADMIN1"],suffixes=("_agg","_prec"))

In [20]:
df_adm1_comb["pop_CS_diff"]=df_adm1_comb["pop_CS_agg"]-df_adm1_comb["pop_CS_prec"]

In [21]:
df_adm1_comb[df_adm1_comb["date"]=="2017-07-01"][["ADMIN1","pop_CS_agg","pop_CS_prec","pop_CS_diff"]]

Unnamed: 0,ADMIN1,pop_CS_agg,pop_CS_prec,pop_CS_diff
18,Awdal,673264.0,673264.0,0.0
19,Bakool,367227.0,367227.0,0.0
20,Banadir,1650228.0,1650228.0,0.0
21,Bari,730147.0,730147.0,0.0
22,Bay,792182.0,792182.0,0.0
23,Galgaduud,569434.0,569434.0,0.0
24,Gedo,508403.0,508403.0,0.0
25,Hiraan,520686.0,520686.0,0.0
26,Juba Dhexe,362921.0,362921.0,0.0
27,Juba Hoose,489307.0,489307.0,0.0


In [22]:
df_adm1_comb[df_adm1_comb["date"]=="2020-09-01"][["ADMIN1","pop_CS_agg","pop_CS_prec","pop_CS_diff"]]

Unnamed: 0,ADMIN1,pop_CS_agg,pop_CS_prec,pop_CS_diff
126,Awdal,724573.0,724573.0,0.0
127,Bakool,284353.0,284353.0,0.0
128,Banadir,2228463.0,2228463.0,0.0
129,Bari,712934.0,712934.0,0.0
130,Bay,846600.0,846600.0,0.0
131,Galgaduud,427809.0,427809.0,0.0
132,Gedo,430943.0,430943.0,0.0
133,Hiraan,422993.0,422993.0,0.0
134,Juba Dhexe,286538.0,648936.0,-362398.0
135,Juba Hoose,648936.0,911502.0,-262566.0


In [23]:
df_adm1_comb[df_adm1_comb["ADMIN1"]=="Woqooyi Galbeed"][["ADMIN1","pop_CS_agg","pop_CS_prec","pop_CS_diff"]]

Unnamed: 0,ADMIN1,pop_CS_agg,pop_CS_prec,pop_CS_diff
17,Woqooyi Galbeed,1242003.0,1242003.0,0.0
35,Woqooyi Galbeed,1242003.0,1242003.0,0.0
53,Woqooyi Galbeed,1242003.0,1242003.0,0.0
71,Woqooyi Galbeed,1242003.0,1242003.0,0.0
89,Woqooyi Galbeed,1242003.0,1242003.0,0.0
107,Woqooyi Galbeed,1242003.0,1242003.0,0.0
125,Woqooyi Galbeed,1321524.0,1321524.0,0.0
143,Woqooyi Galbeed,1321524.0,8524530.0,-7203006.0


In [24]:
for i in range(1,6):
    df_adm1_comb[f"CS_{i}_diff"]=df_adm1_comb[f"CS_{i}_agg"]-df_adm1_comb[f"CS_{i}_prec"]

In [25]:
df_adm1_comb[df_adm1_comb["ADMIN1"]=="Woqooyi Galbeed"][["date"]+["pop_CS_agg","pop_CS_prec","pop_CS_diff"]+[f"CS_{i}_agg" for i in range (1,6)]+[f"CS_{i}_prec" for i in range (1,6)]]

Unnamed: 0,date,pop_CS_agg,pop_CS_prec,pop_CS_diff,CS_1_agg,CS_2_agg,CS_3_agg,CS_4_agg,CS_5_agg,CS_1_prec,CS_2_prec,CS_3_prec,CS_4_prec,CS_5_prec
17,2017-01-01,1242003.0,1242003.0,0.0,851011.0,354209.0,36783.0,0.0,0.0,807003.0,367000.0,63000.0,5000.0,
35,2017-07-01,1242003.0,1242003.0,0.0,697314.0,410654.0,98223.0,24462.0,0.0,665003.0,433000.0,111000.0,33000.0,0.0
53,2018-01-01,1242003.0,1242003.0,0.0,726086.0,336066.0,163221.0,14871.0,0.0,686000.0,370000.0,171000.0,15000.0,0.0
71,2018-07-01,1242003.0,1242003.0,0.0,664043.0,298122.0,265963.0,7929.0,5947.0,664003.0,298000.0,266000.0,8000.0,6000.0
89,2019-01-01,1242003.0,1242003.0,0.0,596154.0,483819.0,154877.0,5364.0,0.0,596003.0,486000.0,154000.0,6000.0,0.0
107,2019-08-01,1242003.0,1242003.0,0.0,616766.0,562006.0,45447.0,17030.0,0.0,621003.0,558000.0,50000.0,13000.0,0.0
125,2020-01-01,1321524.0,1321524.0,0.0,855073.0,318140.0,134270.0,3600.0,0.0,859524.0,321000.0,135000.0,6000.0,0.0
143,2020-09-01,1321524.0,8524530.0,-7203006.0,884810.0,298054.0,78578.0,60082.0,0.0,886713.0,297258.0,81917.0,55635.0,0.0


In [26]:
df_adm1_comb[df_adm1_comb["ADMIN1"]=="Woqooyi Galbeed"][["date"]+["pop_CS_agg","pop_CS_prec","pop_CS_diff"]+[f"CS_{i}_diff" for i in range (1,6)]]

Unnamed: 0,date,pop_CS_agg,pop_CS_prec,pop_CS_diff,CS_1_diff,CS_2_diff,CS_3_diff,CS_4_diff,CS_5_diff
17,2017-01-01,1242003.0,1242003.0,0.0,44008.0,-12791.0,-26217.0,-5000.0,
35,2017-07-01,1242003.0,1242003.0,0.0,32311.0,-22346.0,-12777.0,-8538.0,0.0
53,2018-01-01,1242003.0,1242003.0,0.0,40086.0,-33934.0,-7779.0,-129.0,0.0
71,2018-07-01,1242003.0,1242003.0,0.0,40.0,122.0,-37.0,-71.0,-53.0
89,2019-01-01,1242003.0,1242003.0,0.0,151.0,-2181.0,877.0,-636.0,0.0
107,2019-08-01,1242003.0,1242003.0,0.0,-4237.0,4006.0,-4553.0,4030.0,0.0
125,2020-01-01,1321524.0,1321524.0,0.0,-4451.0,-2860.0,-730.0,-2400.0,0.0
143,2020-09-01,1321524.0,8524530.0,-7203006.0,-1903.0,796.0,-3339.0,4447.0,0.0


Reported population vs summed population over 5 phases

In [27]:
df_adm1_comb["pop_CS_sum"]=df_adm1_comb[[f"CS_{i}_agg" for i in range(1,6)]].sum(axis=1)

In [28]:
df_adm1_comb["pop_CS_sum_diff"]=df_adm1_comb["pop_CS_agg"]-df_adm1_comb["pop_CS_sum"]

In [29]:
df_adm1_comb[df_adm1_comb["ADMIN1"]=="Woqooyi Galbeed"][["date","pop_CS_agg","pop_CS_prec","pop_CS_sum","pop_CS_sum_diff"]]

Unnamed: 0,date,pop_CS_agg,pop_CS_prec,pop_CS_sum,pop_CS_sum_diff
17,2017-01-01,1242003.0,1242003.0,1242003.0,0.0
35,2017-07-01,1242003.0,1242003.0,1230653.0,11350.0
53,2018-01-01,1242003.0,1242003.0,1240244.0,1759.0
71,2018-07-01,1242003.0,1242003.0,1242004.0,-1.0
89,2019-01-01,1242003.0,1242003.0,1240214.0,1789.0
107,2019-08-01,1242003.0,1242003.0,1241249.0,754.0
125,2020-01-01,1321524.0,1321524.0,1311083.0,10441.0
143,2020-09-01,1321524.0,8524530.0,1321524.0,0.0
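
The check above compares each reported total with the sum of its five phase columns. A minimal sketch of the same check on three Woqooyi Galbeed rows (phase counts copied from the aggregated output above) follows. The tolerance `tol` is an assumed rounding allowance, not something defined in the data.

```python
import pandas as pd

phase_cols = [f"CS_{i}" for i in range(1, 6)]
# Woqooyi Galbeed rows copied from the aggregated (admin2-summed) output above.
toy = pd.DataFrame({
    "date": ["2017-07-01", "2020-01-01", "2020-09-01"],
    "pop_CS": [1242003.0, 1321524.0, 1321524.0],
    "CS_1": [697314.0, 855073.0, 884810.0],
    "CS_2": [410654.0, 318140.0, 298054.0],
    "CS_3": [98223.0, 134270.0, 78578.0],
    "CS_4": [24462.0, 3600.0, 60082.0],
    "CS_5": [0.0, 0.0, 0.0],
})
# Row-wise sum over the five phases (NaN values are skipped by default)
toy["pop_CS_sum"] = toy[phase_cols].sum(axis=1)
toy["pop_CS_sum_diff"] = toy["pop_CS"] - toy["pop_CS_sum"]

tol = 5  # assumed rounding allowance, in people
toy["phases_match"] = toy["pop_CS_sum_diff"].abs() <= tol
print(toy[["date", "pop_CS", "pop_CS_sum", "pop_CS_sum_diff", "phases_match"]])
```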


### ADMIN2

In [30]:
df=pd.read_excel("../Data/GlobalIPC/somalia_globalipc_newcolumnnames.xlsx",index_col=0)

In [31]:
admin_level=2

In [32]:
df_adm2 = df[(df["date"].notnull()) & (df[f"ADMIN{admin_level}"].notnull())].copy()

In [33]:
df_adm2["pop_CS_sum"]=df_adm2[[f"CS_{i}" for i in range(1,6)]].sum(axis=1)
df_adm2["pop_CS_sum_diff"]=df_adm2["pop_CS"]-df_adm2["pop_CS_sum"]


In [34]:
df_adm2[["date","ADMIN1","ADMIN2","pop_CS","pop_CS_sum","pop_CS_sum_diff"]].sort_values(by="pop_CS_sum_diff")

Unnamed: 0,date,ADMIN1,ADMIN2,pop_CS,pop_CS_sum,pop_CS_sum_diff
200,2019-08-01,Banadir,Banadir,1650228.0,1666731.0,-16503.0
293,2019-01-01,Banadir,Banadir,1650228.0,1666731.0,-16503.0
737,2017-01-01,Togdheer,Burco,460354.0,464958.0,-4604.0
717,2017-01-01,Sanaag,Laasqoray,402743.0,406770.0,-4027.0
191,2019-08-01,Awdal,Borama,397760.0,401738.0,-3978.0
...,...,...,...,...,...,...
581,2017-07-01,Bay,Baydhaba,315679.0,312522.0,3157.0
240,2019-08-01,Mudug,Gaalkacyo,420293.0,416090.0,4203.0
575,2017-07-01,Bari,Bossaso,469566.0,464870.0,4696.0
651,2017-07-01,Woqooyi Galbeed,Hargeysa,959081.0,949491.0,9590.0


In [35]:
df_adm2[df_adm2["ADMIN1"]=="Woqooyi Galbeed"][["date","ADMIN1","ADMIN2","pop_CS","pop_CS_sum","pop_CS_sum_diff"]]

Unnamed: 0,date,ADMIN1,ADMIN2,pop_CS,pop_CS_sum,pop_CS_sum_diff
90,2020-09-01,Woqooyi Galbeed,Berbera,179997.0,179997.0,0.0
91,2020-09-01,Woqooyi Galbeed,Gebiley,97441.0,97441.0,0.0
92,2020-09-01,Woqooyi Galbeed,Hargeysa,1044086.0,1044086.0,0.0
184,2020-01-01,Woqooyi Galbeed,Berbera,179997.0,179997.0,0.0
185,2020-01-01,Woqooyi Galbeed,Gebiley,97441.0,97441.0,0.0
186,2020-01-01,Woqooyi Galbeed,Hargeysa,1044086.0,1033645.0,10441.0
277,2019-08-01,Woqooyi Galbeed,Berbera,178810.0,177022.0,1788.0
278,2019-08-01,Woqooyi Galbeed,Gebiley,103449.0,104484.0,-1035.0
279,2019-08-01,Woqooyi Galbeed,Hargeysa,959744.0,959743.0,1.0
370,2019-01-01,Woqooyi Galbeed,Berbera,178810.0,177022.0,1788.0


In [36]:
df_adm2[(df_adm2["ADMIN1"]=="Woqooyi Galbeed") & (df_adm2["date"]=="2020-01-01")][["ADMIN1","ADMIN2","pop_CS","pop_CS_sum","pop_CS_sum_diff"]]

Unnamed: 0,ADMIN1,ADMIN2,pop_CS,pop_CS_sum,pop_CS_sum_diff
184,Woqooyi Galbeed,Berbera,179997.0,179997.0,0.0
185,Woqooyi Galbeed,Gebiley,97441.0,97441.0,0.0
186,Woqooyi Galbeed,Hargeysa,1044086.0,1033645.0,10441.0


In [37]:
df_adm2[(df_adm2["ADMIN1"]=="Woqooyi Galbeed") & (df_adm2["date"]=="2020-01-01")].pop_CS.sum()

1321524.0

### Percentage differences

In [60]:
# Normalise by the sum of the five phases rather than by the reported pop_CS
df_adm1_comb["CS_sum_agg"]=df_adm1_comb[[f"CS_{i}_agg" for i in range(1,6)]].sum(axis=1)
df_adm1_comb["CS_sum_prec"]=df_adm1_comb[[f"CS_{i}_prec" for i in range(1,6)]].sum(axis=1)
for i in range(1,6):
    df_adm1_comb[f"perc_CS_{i}_agg"]=df_adm1_comb[f"CS_{i}_agg"]/df_adm1_comb["CS_sum_agg"]*100
    df_adm1_comb[f"perc_CS_{i}_prec"]=df_adm1_comb[f"CS_{i}_prec"]/df_adm1_comb["CS_sum_prec"]*100
    # Difference between the two methods, in percentage points
    df_adm1_comb[f"perc_CS_{i}_diff"]=df_adm1_comb[f"perc_CS_{i}_agg"]-df_adm1_comb[f"perc_CS_{i}_prec"]

# Share of the population in phase 3 or higher
df_adm1_comb["perc_CS_3p_agg"]=df_adm1_comb[[f"CS_{i}_agg" for i in range(3,6)]].sum(axis=1)/df_adm1_comb["CS_sum_agg"]*100
df_adm1_comb["perc_CS_3p_prec"]=df_adm1_comb[[f"CS_{i}_prec" for i in range(3,6)]].sum(axis=1)/df_adm1_comb["CS_sum_prec"]*100
df_adm1_comb["perc_CS_3p_diff"]=df_adm1_comb["perc_CS_3p_agg"]-df_adm1_comb["perc_CS_3p_prec"]
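
As a worked example of the calculation above, the sketch below applies it to a single row: Woqooyi Galbeed on 2020-09-01, with phase counts copied from the earlier output. The single-row frame `row` is an illustration only. Because the shares are divided by the sum of the phases and not by `pop_CS`, the erroneous reported total of 8524530 does not affect them.

```python
import pandas as pd

# Woqooyi Galbeed, 2020-09-01: phase counts copied from the output above.
agg_counts = [884810.0, 298054.0, 78578.0, 60082.0, 0.0]
prec_counts = [886713.0, 297258.0, 81917.0, 55635.0, 0.0]
row = pd.DataFrame({
    **{f"CS_{i}_agg": [v] for i, v in enumerate(agg_counts, start=1)},
    **{f"CS_{i}_prec": [v] for i, v in enumerate(prec_counts, start=1)},
})

for kind in ("agg", "prec"):
    cols = [f"CS_{i}_{kind}" for i in range(1, 6)]
    total = row[cols].sum(axis=1)  # sum of the five phases
    for i in range(1, 6):
        row[f"perc_CS_{i}_{kind}"] = row[f"CS_{i}_{kind}"] / total * 100
    # cols[2:] are phases 3, 4 and 5
    row[f"perc_CS_3p_{kind}"] = row[cols[2:]].sum(axis=1) / total * 100

# Difference in percentage points between the two methods
row["perc_CS_3p_diff"] = row["perc_CS_3p_agg"] - row["perc_CS_3p_prec"]
print(row[["perc_CS_3p_agg", "perc_CS_3p_prec", "perc_CS_3p_diff"]].round(3))
```

The precalculated phase counts in this row sum to 1321523, which is within 1 of the admin2-summed population of 1321524.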

In [61]:
df_adm1_comb

Unnamed: 0,date,ADMIN1,ADMIN2_ID_agg,Country Population_agg,pop_CS_agg,% of total county Pop_agg,Area Phase_agg,CS_1_agg,perc_CS_1_agg,CS_2_agg,...,pop_CS_sum,pop_CS_sum_diff,CS_sum_agg,CS_sum_prec,perc_CS_1_diff,perc_CS_2_diff,perc_CS_3_diff,perc_CS_4_diff,perc_CS_5_diff,perc_CS_3p_diff
0,2017-01-01,Awdal,50773531.0,0.0,673264.0,0.0,9.0,483122.0,71.758074,117913.0,...,673265.0,-1.0,673265.0,673264.0,0.572923,-0.904129,0.331206,,,0.331206
1,2017-01-01,Bakool,50773606.0,0.0,367227.0,0.0,12.0,191157.0,51.970540,61817.0,...,367818.0,-591.0,367818.0,367227.0,6.705077,-2.799992,-3.905085,,,-3.905085
2,2017-01-01,Banadir,12693395.0,0.0,1650228.0,0.0,2.0,1138657.0,68.999981,511571.0,...,1650228.0,0.0,1650228.0,1650228.0,21.719968,-10.569994,-10.301607,-0.848368,,-11.149974
3,2017-01-01,Bari,76160370.0,0.0,730147.0,0.0,14.0,453476.0,62.091846,226830.0,...,730331.0,-184.0,730331.0,730147.0,7.288224,-2.770284,-3.799975,-0.717965,,-4.517940
4,2017-01-01,Bay,50773641.0,0.0,792182.0,0.0,12.0,441051.0,55.675393,148907.0,...,792183.0,-1.0,792183.0,792182.0,4.906504,-2.536436,-1.612667,-0.757402,,-2.370068
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
139,2020-09-01,Shabelle Dhexe,75907344.0,0.0,436759.0,0.0,6.0,334148.0,76.506265,71092.0,...,436759.0,0.0,436759.0,337588.0,18.569372,-9.790108,-6.338750,-2.440513,0.0,-8.779264
140,2020-09-01,Shabelle Hoose,132838230.0,0.0,911502.0,0.0,13.0,701405.0,76.950378,143998.0,...,911503.0,-1.0,911503.0,562067.0,14.668301,-5.907734,-3.898198,-4.862369,0.0,-8.760567
141,2020-09-01,Sool,75907647.0,0.0,360432.0,0.0,8.0,245767.0,68.186787,75120.0,...,360432.0,0.0,360432.0,360432.0,-0.184501,0.033293,0.084343,0.066864,0.0,0.151207
142,2020-09-01,Togdheer,75907529.0,0.0,755793.0,0.0,8.0,576124.0,76.227649,128259.0,...,755794.0,-1.0,755794.0,755793.0,0.043694,0.034246,-0.014826,-0.063114,0.0,-0.077940


In [62]:
df_adm1_comb[["date","ADMIN1","pop_CS_agg","perc_CS_1_agg","perc_CS_1_prec","perc_CS_1_diff"]].sort_values(by="perc_CS_1_diff")

Unnamed: 0,date,ADMIN1,pop_CS_agg,perc_CS_1_agg,perc_CS_1_prec,perc_CS_1_diff
137,2020-09-01,Nugaal,337588.0,57.847856,76.417200,-18.569344
136,2020-09-01,Mudug,627723.0,62.541320,69.637535,-7.096215
28,2017-07-01,Mudug,500104.0,33.118378,39.960605,-6.842228
135,2020-09-01,Juba Hoose,648936.0,71.848996,76.961104,-5.112108
46,2018-01-01,Mudug,500104.0,52.472381,55.710306,-3.237926
...,...,...,...,...,...,...
139,2020-09-01,Shabelle Dhexe,436759.0,76.506265,57.936893,18.569372
20,2017-07-01,Banadir,1650228.0,71.999990,51.400655,20.599335
23,2017-07-01,Galgaduud,569434.0,48.248198,27.296227,20.951971
41,2018-01-01,Galgaduud,569434.0,70.255802,48.857645,21.398157


In [63]:
df_adm1_comb[["date","ADMIN1","pop_CS_agg","perc_CS_3p_agg","perc_CS_3p_prec","perc_CS_3p_diff"]].sort_values(by="perc_CS_3p_diff")

Unnamed: 0,date,ADMIN1,pop_CS_agg,perc_CS_3p_agg,perc_CS_3p_prec,perc_CS_3p_diff
20,2017-07-01,Banadir,1650228.0,10.000012,26.905373,-16.905361
23,2017-07-01,Galgaduud,569434.0,35.526313,51.805828,-16.279515
41,2018-01-01,Galgaduud,569434.0,16.267006,27.768014,-11.501008
2,2017-01-01,Banadir,1650228.0,0.000000,11.149974,-11.149974
25,2017-07-01,Hiraan,520686.0,38.220932,47.629473,-9.408541
...,...,...,...,...,...,...
134,2020-09-01,Juba Dhexe,286538.0,14.142228,11.557380,2.584848
135,2020-09-01,Juba Hoose,648936.0,11.522708,7.240796,4.281912
138,2020-09-01,Sanaag,562067.0,15.913939,10.832804,5.081135
28,2017-07-01,Mudug,500104.0,45.633400,38.447501,7.185898


In [64]:
df_adm1_comb[df_adm1_comb["date"]=="2020-09-01"][["ADMIN1","pop_CS_agg","perc_CS_1_agg","perc_CS_1_prec"]]

Unnamed: 0,ADMIN1,pop_CS_agg,perc_CS_1_agg,perc_CS_1_prec
126,Awdal,724573.0,64.143483,64.254809
127,Bakool,284353.0,77.919698,78.196115
128,Banadir,2228463.0,73.039534,73.030739
129,Bari,712934.0,60.722311,60.725677
130,Bay,846600.0,65.187379,65.154737
131,Galgaduud,427809.0,66.264697,66.10637
132,Gedo,430943.0,64.387252,64.264415
133,Hiraan,422993.0,75.265726,75.413305
134,Juba Dhexe,286538.0,69.33297,71.799993
135,Juba Hoose,648936.0,71.848996,76.961104
