# Water Risk Tables

In [41]:
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt

## Framework structure

The framework is structured as follows. The percentages below use the default weighting scheme for the groups (quantity, quality, and regulatory and reputational) and for the overall water risk.

Overall Water Risk (100%)
- Water Quantity Risk (69.4%)
    - Baseline Water Stress (16.3%) 
    - Baseline Water Depletion (16.3%)
    - Groundwater Table Decline (16.3%)
    - Interannual Variability (2.0%)
    - Seasonal Variability (2.0%)
    - Drought Risk (8.2%)
    - Riverine Flood Risk (4.1%)
    - Coastal Flood Risk (4.1%)

- Water Quality Risk (12.2%)
    - Untreated Collected Wastewater (8.2%)
    - Coastal Eutrophication Potential (4.1%)

- Regulatory and Reputational (18.4%)
    - Unimproved/no drinking water (8.2%)
    - Unimproved/no sanitation (8.2%)
    - RepRisk Index (2.0%)
    

**Group names**

|Group full                    |Group short    |
|------------------------------|---------------|
|Overall Water Risk            |TOT            |
|Water Quantity Risk           |QAN            |
|Water Quality Risk            |QAL            |
|Regulatory and Reputational   |RRR            |

**Indicator names**

|Indicator full                      |Indicator short    |
|------------------------------------|-------------------|
|Baseline Water Stress               |bws                |    
|Baseline Water Depletion            |bwd                |
|Groundwater Table Decline           |gtd                |
|Interannual Variability             |iav                |
|Seasonal Variability                |sev                |
|Drought Risk                        |drr                |
|Riverine Flood Risk                 |rfr                |
|Coastal Flood Risk                  |cfr                |
|Untreated Collected Wastewater      |ucw                |
|Coastal Eutrophication Potential    |cep                |
|Unimproved/no drinking water        |udw                |
|Unimproved/no sanitation            |usa                |
|RepRisk Index                       |rri                |
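
The indicator weights listed under the framework structure can be cross-checked against the group totals. The sketch below is only an illustration: it copies the percentages from the list above and uses the short names from the two tables, and the variable names (`default_weights`, `group_totals`) are made up here. Because the listed percentages are rounded to one decimal, the sums agree with the stated group totals only to within about 0.1.

```python
# Default indicator weights in percent of overall water risk, copied from the list above.
default_weights = {
    "QAN": {"bws": 16.3, "bwd": 16.3, "gtd": 16.3, "iav": 2.0,
            "sev": 2.0, "drr": 8.2, "rfr": 4.1, "cfr": 4.1},
    "QAL": {"ucw": 8.2, "cep": 4.1},
    "RRR": {"udw": 8.2, "usa": 8.2, "rri": 2.0},
}

# Group totals as stated in the framework structure.
stated_group_totals = {"QAN": 69.4, "QAL": 12.2, "RRR": 18.4}

# Sum the indicator weights per group and compare with the stated totals.
group_totals = {group: sum(weights.values()) for group, weights in default_weights.items()}
for group, total in group_totals.items():
    print(f"{group}: indicators sum to {total:.1f}%, stated {stated_group_totals[group]:.1f}%")

# The three groups together should account for (approximately) 100%.
overall = sum(group_totals.values())
print(f"TOT: {overall:.1f}%")
```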


## Weighting Scheme

**Weight names**

|Industry full           |Industry short |
|------------------------|---------------|
|Default                 |DEF            |
|Agriculture             |AGR            |
|Electric Power          |ELP            |
|Semiconductor           |SMC            |
|Oil and gas             |ONG            |
|Chemical                |CHE            |
|Mining                  |MIN            |
|Food and beverage       |FNB            |
|Construction materials  |CON            |
|Textile                 |TEX            |


## Annual data

### Pivot

In [59]:
annual_pivot = pd.read_csv('/Volumes/MacBook HD/data/aqueduct/data_source/AQ_2_water_risk_atlas/output_V03/annual/annual_pivot.csv')



**Clean table**

In [44]:
columns_default = ['aq30_id']
# Keep only the category, label and weight-fraction columns
keep_suffixes = ("_cat", "_label", "_weight_fraction")
columns_default += [col for col in annual_pivot.columns if col.endswith(keep_suffixes)]

In [23]:
annual_pivot_default = annual_pivot[columns_default]

**Add `string_id` and `gid_1`**

In [49]:
df_ids = gpd.read_file('/Volumes/MacBook HD/data/aqueduct/data_source/AQ_2_water_risk_atlas/Y2018M12D06_RH_Master_Shape_V01/output_V02/Y2018M12D06_RH_Master_Shape_V01.shp')

In [50]:
df_ids.drop(columns=['aqid', 'type', 'geometry'], inplace=True)
df_ids.head(1)

Unnamed: 0,aq30_id,gid_1,pfaf_id,string_id
0,0,EGY.11_1,111011,111011-EGY.11_1-3365


In [51]:
annual_pivot_default = pd.merge(left=annual_pivot_default,
                right=df_ids,
                on = "aq30_id",
                how = "left")
annual_pivot_default[1000:1005]

Unnamed: 0,aq30_id,aqid,area_km2,bwd_cat,bwd_label,bwd_raw,bwd_score,bws_cat,bws_label,bws_raw,...,w_awr_tex_rrr_score,w_awr_tex_rrr_weight_fraction,w_awr_tex_tot_cat,w_awr_tex_tot_label,w_awr_tex_tot_raw,w_awr_tex_tot_score,w_awr_tex_tot_weight_fraction,gid_1_y,pfaf_id_y,string_id_y
1000,14907,1324,26.73154,2.0,Medium - High (25-50%),0.427414,2.709658,4.0,Extremely High (>80%),1.108402,...,0.0,0.163265,2.0,Medium - High (2-3),1.744505,2.402707,0.591837,-9999,215001,215001-None-1324
1001,14973,-9999,79.894816,2.0,Medium - High (25-50%),0.439716,2.758863,4.0,Extremely High (>80%),1.054327,...,0.0,0.163265,2.0,Medium - High (2-3),1.801017,2.515867,0.591837,-9999,215007,215007-None-None
1002,15254,1431,30.114809,2.0,Medium - High (25-50%),0.402026,2.608105,4.0,Extremely High (>80%),0.947249,...,0.0,0.163265,2.0,Medium - High (2-3),1.714811,2.343247,0.591837,-9999,218001,218001-None-1431
1003,15275,1478,52.83041,2.0,Medium - High (25-50%),0.448701,2.794804,4.0,Extremely High (>80%),1.304529,...,0.0,0.163265,2.0,Medium - High (2-3),1.644917,2.203292,0.591837,-9999,219000,219000-None-1478
1004,15975,-9999,0.102887,2.0,Medium - High (25-50%),0.305912,2.223649,3.0,High (40-80%),0.494191,...,1.11366,0.163265,2.0,Medium - High (2-3),1.838581,2.591084,0.591837,-9999,224008,224008-None-None
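
The `_y` suffixes in the output above (`gid_1_y`, `pfaf_id_y`, `string_id_y`) suggest that the left table already contained those columns when the merge ran, so pandas renamed the overlapping ones. A minimal sketch of one way to avoid this, on toy data: only the join key plus the columns that are actually new are taken from the right table. The `bws_cat` values and the names `left`, `right` and `new_cols` are hypothetical; the ids are copied from the outputs in this notebook.

```python
import pandas as pd

# Toy stand-in for annual_pivot_default (already carries string_id).
left = pd.DataFrame({
    "aq30_id": [0, 14907],
    "bws_cat": [4.0, 4.0],  # hypothetical values
    "string_id": ["111011-EGY.11_1-3365", "215001-None-1324"],
})

# Toy stand-in for df_ids.
right = pd.DataFrame({
    "aq30_id": [0, 14907],
    "gid_1": ["EGY.11_1", "-9999"],
    "string_id": ["111011-EGY.11_1-3365", "215001-None-1324"],
})

# Take the join key plus only the columns the left table does not have yet.
new_cols = [col for col in right.columns if col not in left.columns]
merged = left.merge(right[["aq30_id"] + new_cols],
                    on="aq30_id",
                    how="left",
                    validate="m:1")  # fail loudly if aq30_id is not unique on the right
print(merged.columns.tolist())
```

With `validate="m:1"` the merge raises an error instead of silently duplicating rows if `aq30_id` turns out not to be unique in the right table.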


**Save tables**

In [28]:
annual_pivot_default.to_csv('/Volumes/MacBook HD/data/aqueduct/data_source/AQ_2_water_risk_atlas/water_risk_indicators_annual_new.csv')

### Normalized

In [22]:
annual_normalized = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V02/annual/annual_normalized.csv')

In [23]:
annual_normalized.head(1)

Unnamed: 0.1,Unnamed: 0,cat,group_short,indicator,industry_short,label,raw,score,string_id,weight_fraction,weighted_score
0,0,4.0,qan,bwd,che,Extremely High,0.987061,4.948243,111011-EGY.11_1-3365,0.07619,0.377009


In [24]:
annual_normalized.drop(columns=['Unnamed: 0','cat','industry_short', 'label', 'raw', 'weight_fraction', 'weighted_score'], inplace=True)
annual_normalized.head(1)

Unnamed: 0,group_short,indicator,score,string_id
0,qan,bwd,4.948243,111011-EGY.11_1-3365


**Add aq30_id**

In [25]:
df_aq30 = gpd.read_file('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work//data/AQ_2_water_risk_atlas/Y2018M12D06_RH_Master_Shape_V01/output_V02/Y2018M12D06_RH_Master_Shape_V01.shp')

In [26]:
df_aq30.drop(columns=['aqid', 'gid_1', 'pfaf_id', 'type', 'geometry'], inplace=True)
df_aq30.head(1)

Unnamed: 0,aq30_id,string_id
0,0,111011-EGY.11_1-3365


In [27]:
df_aq30.shape

(68511, 2)

In [28]:
df_all = pd.merge(left=annual_normalized,
                right=df_aq30,
                on = "string_id",
                how = "left")
#df_all.drop(columns='string_id', inplace=True)
df_all.sort_values('aq30_id', inplace=True)
df_all.head(1)

Unnamed: 0,group_short,indicator,score,string_id,aq30_id
0,qan,bwd,4.948243,111011-EGY.11_1-3365,0


**Drop rows where score is NaN**

In [29]:
df_all.dropna(subset=['score'], inplace=True)

In [33]:
df_all.shape

(9968327, 5)

In [31]:
df_all.head()

Unnamed: 0,group_short,indicator,score,string_id,aq30_id
0,qan,bwd,4.948243,111011-EGY.11_1-3365,0
109,rrr,usa,1.019067,111011-EGY.11_1-3365,0
120,rrr,rri,2.8,111011-EGY.11_1-3365,0
121,rrr,rri,2.8,111011-EGY.11_1-3365,0
122,rrr,rri,2.8,111011-EGY.11_1-3365,0


In [34]:
df_all.drop_duplicates(subset=['group_short','indicator', 'score', 'string_id', 'aq30_id'], inplace=True)

In [35]:
df_all.shape

(3002974, 5)

**Save table**

In [36]:
df_all.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_normalized.csv')

## Monthly data

In [54]:
monthly_bws = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/monthly/monthly_bws.csv')
monthly_bwd = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/monthly/monthly_bwd.csv')
monthly_iav = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/monthly/monthly_iav.csv')

In [55]:
monthly_bws.head()

Unnamed: 0.1,Unnamed: 0,pfaf_id,temporal_resolution,year,month,delta_id,raw,score,cat,label
0,0,913430,month,2014,1,-1,,-9999.0,-9999.0,NoData
1,1,913430,month,2014,2,-1,,-9999.0,-9999.0,NoData
2,2,913430,month,2014,3,-1,,-9999.0,-9999.0,NoData
3,3,913430,month,2014,4,-1,,-9999.0,-9999.0,NoData
4,4,913430,month,2014,5,-1,,-9999.0,-9999.0,NoData


**Clean table**

In [56]:
monthly_bws = monthly_bws[['pfaf_id', 'year', 'month', 'cat', 'label', 'raw', 'score']]
monthly_bwd = monthly_bwd[['pfaf_id', 'year', 'month', 'cat', 'label', 'raw', 'score']]
monthly_iav = monthly_iav[['pfaf_id', 'year', 'month', 'cat', 'label', 'raw', 'score']]

In [57]:
monthly_bws.rename(columns={'cat': 'bws_cat', 'label': 'bws_label', 'raw': 'bws_raw', 'score': 'bws_score'}, inplace= True)
monthly_bwd.rename(columns={'cat': 'bwd_cat', 'label': 'bwd_label', 'raw': 'bwd_raw', 'score': 'bwd_score'}, inplace= True)
monthly_iav.rename(columns={'cat': 'iav_cat', 'label': 'iav_label', 'raw': 'iav_raw', 'score': 'iav_score'}, inplace= True)

In [58]:
monthly = monthly_bws.merge(monthly_bwd, on=['pfaf_id', 'year', 'month'], how='left')
monthly = monthly.merge(monthly_iav, on=['pfaf_id', 'year', 'month'], how='left')

In [59]:
monthly[100100:100105]

Unnamed: 0,pfaf_id,year,month,bws_cat,bws_label,bws_raw,bws_score,bwd_cat,bwd_label,bwd_raw,bwd_score,iav_cat,iav_label,iav_raw,iav_score
100100,312414,2014,9,0.0,Low,0.004141,0.0,0.0,Low,0.000904,0.018082,0.0,Low,0.238772,0.955087
100101,312414,2014,12,0.0,Low,0.007512,0.0,0.0,Low,0.00167,0.033394,1.0,Low - Medium,0.374989,1.499956
100102,312414,2014,2,0.0,Low,0.018577,0.0,0.0,Low,0.004148,0.082956,1.0,Low - Medium,0.382703,1.530813
100103,312414,2014,5,0.0,Low,0.001603,0.0,0.0,Low,0.000358,0.007162,1.0,Low - Medium,0.39921,1.596841
100104,312414,2014,4,0.0,Low,0.005071,0.0,0.0,Low,0.001138,0.022768,4.0,Extremely High,1.04302,4.172082


**Save tables**

In [60]:
monthly.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_monthly.csv')

## Industry weights

In [None]:
industry_weights = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/industry_weights/industry_weights.csv')

In [None]:
industry_weights.drop(labels=['Unnamed: 0', 'id'], axis=1, inplace=True)

In [None]:
industry_weights.head()

**Save table**

In [None]:
industry_weights.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_industry_weights.csv')

## Future projections

In [2]:
projections = gpd.read_file('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/aqueduct_projections_20150309_shp/aqueduct_projections_20150309.shp')

**Table reconfiguration**

In [None]:
indicator_codes = {'ws': 'water_stress', 'sv': 'seasonal_variability', 'ut': 'water_demand', 'bt': 'water_supply'}
year_codes = {'20': 2020, '30': 2030, '40': 2040}
scenario_codes = {'24': 'optimistic', '28': 'business_as_usual', '38': 'pessimistic'}
data_types = {'c': 'change_from_baseline', 't': 'future_value'}
suffixes = {'l': 'label', 'r': 'value'}

# Collect one long-format frame per indicator/year/scenario/type combination
frames = []
label_suffix, value_suffix = suffixes.keys()  # 'l' (label), 'r' (value)

basinid_col = list(projections['BasinID'])
for indicator, indicator_name in indicator_codes.items():
    for year, year_value in year_codes.items():
        for scenario, scenario_name in scenario_codes.items():
            for data_type, data_type_name in data_types.items():
                # Source columns are named indicator + year + scenario + type + suffix
                prefix = indicator + year + scenario + data_type
                df = pd.DataFrame({'basinid': basinid_col,
                                   'indicator': indicator_name,
                                   'value': list(projections[prefix + value_suffix]),
                                   'label': list(projections[prefix + label_suffix]),
                                   'year': year_value,
                                   'scenario': scenario_name,
                                   'type': data_type_name})
                frames.append(df)

# Concatenate once instead of growing the frame inside the loop
projections_vertical = pd.concat(frames)

# np.int was removed from NumPy; the built-in int does the same cast
projections_vertical['basinid'] = projections_vertical['basinid'].astype(int)
projections_vertical.head()
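
The loop above builds each source column name by concatenating an indicator code, a year code, a scenario code, a data-type code and a suffix. As a small sketch of that naming scheme, the hypothetical helper `decode_column` below reverses it. The dictionaries are copied from the cell above; the helper itself and the example name `ws2024cl` are illustrative assumptions and have not been checked against the shapefile.

```python
indicator_codes = {'ws': 'water_stress', 'sv': 'seasonal_variability',
                   'ut': 'water_demand', 'bt': 'water_supply'}
year_codes = {'20': 2020, '30': 2030, '40': 2040}
scenario_codes = {'24': 'optimistic', '28': 'business_as_usual', '38': 'pessimistic'}
data_types = {'c': 'change_from_baseline', 't': 'future_value'}
suffixes = {'l': 'label', 'r': 'value'}


def decode_column(name):
    """Split a column name such as 'ws2024cl' into its five coded parts."""
    return {
        'indicator': indicator_codes[name[0:2]],
        'year': year_codes[name[2:4]],
        'scenario': scenario_codes[name[4:6]],
        'type': data_types[name[6]],
        'field': suffixes[name[7]],
    }


print(decode_column('ws2024cl'))
```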

In [None]:
df[(df['indicator'] == 'water_supply') & (df['type'] == 'future_value')]['label'].unique()

**Save table**

In [None]:
projections_vertical.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_projections.csv')

In [None]:
projections_vertical['label'].unique()

In [None]:
df = projections_vertical[(projections_vertical['indicator'] == 'water_stress') & (projections_vertical['type'] == 'change_from_baseline')]
print(df['label'].unique())
df = projections_vertical[(projections_vertical['indicator'] == 'water_stress') & (projections_vertical['type'] == 'future_value')]
print(df['label'].unique())

In [None]:
df = projections_vertical[(projections_vertical['indicator'] == 'seasonal_variability') & (projections_vertical['type'] == 'change_from_baseline')]
print(df['label'].unique())
df = projections_vertical[(projections_vertical['indicator'] == 'seasonal_variability') & (projections_vertical['type'] == 'future_value')]
print(df['label'].unique())

In [None]:
df = projections_vertical[(projections_vertical['indicator'] == 'water_demand') & (projections_vertical['type'] == 'change_from_baseline')]
print(df['label'].unique())
df = projections_vertical[(projections_vertical['indicator'] == 'water_demand') & (projections_vertical['type'] == 'future_value')]
print(df['label'].unique())

In [None]:
df = projections_vertical[(projections_vertical['indicator'] == 'water_supply') & (projections_vertical['type'] == 'change_from_baseline')]
print(df['label'].unique())
df = projections_vertical[(projections_vertical['indicator'] == 'water_supply') & (projections_vertical['type'] == 'future_value')]
print(df['label'].unique())

## Custom weights table

In [3]:
df_master = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/y2018m12d04_rh_master_merge_rawdata_gpd_v02_v05.csv')



In [4]:
df_master.shape

(890643, 13)

In [5]:
df_master.head(1)

Unnamed: 0,aqid,cat,delta_id,gid_0,gid_1,indicator,label,pfaf_id,raw,score,string_id,temporal_resolution,year
0,3365.0,0.0,-1.0,ERI,ERI.2_1,bwd,Low,111081.0,0.038672,0.773434,111081-ERI.2_1-3365,year,2014.0


**Drop columns**

In [6]:
df_master.drop(columns=['aqid','cat','delta_id','gid_0','gid_1', 'label', 'pfaf_id', 'raw', 'temporal_resolution', 'year'], inplace=True)
df_master.head(1)

Unnamed: 0,indicator,score,string_id
0,bwd,0.773434,111081-ERI.2_1-3365


In [7]:
# Certain GUs have invalid 'None' indicators; remove those.
# This happens when the id exists in the master shapefile but not in the indicator results.
df_valid = df_master.loc[df_master["indicator"].notnull()]

In [8]:
df_valid.shape

(809557, 3)

**Add group name**

In [9]:
df_weights = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/industry_weights/industry_weights.csv')

In [10]:
# Lowercase some columns
df_weights['group_short'] = df_weights['group_short'].astype(str).str.lower()
df_weights['indicator_short'] = df_weights['indicator_short'].astype(str).str.lower()
df_weights['industry_short'] = df_weights['industry_short'].astype(str).str.lower()

# Drop column
df_weights.drop(columns='Unnamed: 0', inplace=True)

In [11]:
df_groups = df_weights.loc[df_weights["industry_short"] =="def"][["indicator_short","group_short"]]

In [12]:
df_groups 

Unnamed: 0,indicator_short,group_short
1,ucw,qal
15,cep,qal
20,drr,qan
42,bws,qan
43,bwd,qan
44,gtd,qan
62,iav,qan
63,sev,qan
75,rfr,qan
76,cfr,qan


In [13]:
# Add group to dataframe
df_group = pd.merge(left=df_valid,
                 right=df_groups,
                 how="left",
                 left_on="indicator",
                 right_on="indicator_short")
# Drop columns
df_group.drop(["indicator_short"], axis=1,inplace=True)

In [14]:
df_group.head(1)

Unnamed: 0,indicator,score,string_id,group_short
0,bwd,0.773434,111081-ERI.2_1-3365,qan


**Add aq30_id**

In [15]:
df_aq30 = gpd.read_file('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work//data/AQ_2_water_risk_atlas/Y2018M12D06_RH_Master_Shape_V01/output_V02/Y2018M12D06_RH_Master_Shape_V01.shp')

In [16]:
df_aq30.drop(columns=['aqid', 'gid_1', 'pfaf_id', 'type', 'geometry'], inplace=True)
df_aq30.head(1)

Unnamed: 0,aq30_id,string_id
0,0,111011-EGY.11_1-3365


In [17]:
df_aq30.shape

(68511, 2)

In [18]:
df_all = pd.merge(left=df_group,
                right=df_aq30,
                on = "string_id",
                how = "left")
#df_all.drop(columns='string_id', inplace=True)
df_all.sort_values('aq30_id', inplace=True)
df_all.head(1)

Unnamed: 0,indicator,score,string_id,group_short,aq30_id
614218,rfr,4.180674,111011-EGY.11_1-3365,qan,0


**Drop rows where score is NaN**

In [91]:
df_all.dropna(subset=['score'], inplace=True)

In [92]:
df_all.shape

(736918, 5)

In [93]:
df_all.head()

Unnamed: 0,indicator,score,string_id,group_short,aq30_id
614218,rfr,4.180674,111011-EGY.11_1-3365,qan,0
712413,cep,1.25,111011-EGY.11_1-3365,qal,0
723470,cfr,0.0,111011-EGY.11_1-3365,qan,0
325363,sev,2.887187,111011-EGY.11_1-3365,qan,0
552901,usa,1.019067,111011-EGY.11_1-3365,rrr,0


**Save table**

In [95]:
df_all.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_normalized.csv')

## Basin name table

In [21]:
fao_major = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_major.csv')
fao_minor = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_minor.csv')
fao_link = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_link.csv')
fao_major.drop(labels=['Unnamed: 0'], axis=1, inplace=True)
fao_minor.drop(labels=['Unnamed: 0'], axis=1, inplace=True)
fao_link.drop(labels=['Unnamed: 0'], axis=1, inplace=True)

In [22]:
fao_major.head(1)

Unnamed: 0,maj_bas,maj_name,maj_area
0,1001,"Gulf of Mexico, North Atlantic Coast",701385


In [23]:
fao_minor.head(1)

Unnamed: 0,fao_id,sub_bas,to_bas,maj_bas,sub_name,sub_area
0,MAJ_BAS_1001_SUB_BAS_0001001,1001,1005,1001,Upper Roanoke,8689


In [24]:
fao_link.head(1)

Unnamed: 0,pfaf_id,fao_id
0,611001,MAJ_BAS_3001_SUB_BAS_0001002


In [25]:
df = pd.merge(left=fao_minor, right=fao_major, on = "maj_bas", how = "left")
df = pd.merge(left=fao_link, right=df, on = "fao_id", how = "left")
df.drop(labels=['sub_bas', 'to_bas', 'maj_bas', 'sub_area', 'maj_area'], axis=1, inplace=True)

In [26]:
fao_ids = []
sub_names = []
for n in df['pfaf_id'].unique():
    fao_ids.append(list(df[df['pfaf_id'] == n]['fao_id']))
    sub_names.append(list(df[df['pfaf_id'] == n]['sub_name']))

In [27]:
df.drop(labels=['fao_id', 'sub_name'], axis=1, inplace=True)
df.drop_duplicates(subset=['pfaf_id'], keep='first', inplace=True)

In [28]:
df['sub_names'] = sub_names
df['fao_ids'] = fao_ids

In [29]:
df.head()

Unnamed: 0,pfaf_id,maj_name,sub_names,fao_ids
0,611001,Caribbean Coast,"[Archipielago de San Blas Coast, Altrato 1]","[MAJ_BAS_3001_SUB_BAS_0001002, MAJ_BAS_3001_SU..."
2,611002,Caribbean Coast,"[Altrato 2, Sucio, Altrato 1]","[MAJ_BAS_3001_SUB_BAS_0001005, MAJ_BAS_3001_SU..."
5,611003,Caribbean Coast,"[Altrato 1, Golfo del Darien Coast]","[MAJ_BAS_3001_SUB_BAS_0001003, MAJ_BAS_3001_SU..."
7,611004,Caribbean Coast,[Golfo del Darien Coast],[MAJ_BAS_3001_SUB_BAS_0001006]
8,611005,Caribbean Coast,"[Sinu, Golfo del Darien Coast]","[MAJ_BAS_3001_SUB_BAS_0001007, MAJ_BAS_3001_SU..."
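
The per-basin lists assembled above with a loop over `pfaf_id` can also be produced in one pass with `groupby`. This is a sketch of an alternative rather than the method used in this notebook; the toy rows are copied from outputs shown in this notebook and the names `toy` and `basin_names` are made up for the example.

```python
import pandas as pd

# Toy version of the merged link table: one row per (pfaf_id, fao_id) pair.
toy = pd.DataFrame({
    "pfaf_id": [611001, 611001, 611004],
    "fao_id": ["MAJ_BAS_3001_SUB_BAS_0001002",
               "MAJ_BAS_3001_SUB_BAS_0001003",
               "MAJ_BAS_3001_SUB_BAS_0001006"],
    "sub_name": ["Archipielago de San Blas Coast",
                 "Altrato 1",
                 "Golfo del Darien Coast"],
    "maj_name": ["Caribbean Coast"] * 3,
})

# One row per pfaf_id: first major-basin name, plus lists of sub-basin names and FAO ids.
basin_names = (
    toy.groupby("pfaf_id", sort=False)
       .agg(maj_name=("maj_name", "first"),
            sub_names=("sub_name", list),
            fao_ids=("fao_id", list))
       .reset_index()
)
print(basin_names)
```

Grouping once avoids filtering the full table for every unique `pfaf_id`, which matters when the link table is large.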


**Save table**

In [34]:
df.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/basin_names.csv')

## FAO geometries

In [128]:
fao_major = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_major.csv')
fao_minor = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_minor.csv')
fao_link = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/fao_link.csv')
fao_major.drop(labels=['Unnamed: 0'], axis=1, inplace=True)
fao_minor.drop(labels=['Unnamed: 0'], axis=1, inplace=True)
fao_link.drop(labels=['Unnamed: 0'], axis=1, inplace=True)

df = pd.merge(left=fao_minor, right=fao_major, on = "maj_bas", how = "left")
df = pd.merge(left=fao_link, right=df, on = "fao_id", how = "left")
df.drop(labels=['sub_name', 'maj_name', 'sub_area', 'maj_area'], axis=1, inplace=True)
df = df.dropna(subset=['sub_bas'])
df['sub_bas'] = df['sub_bas'].astype(int)
df['to_bas'] = df['to_bas'].astype(int)
df['maj_bas'] = df['maj_bas'].astype(int)

In [129]:
df.head()

Unnamed: 0,pfaf_id,fao_id,sub_bas,to_bas,maj_bas
0,611001,MAJ_BAS_3001_SUB_BAS_0001002,1002,-999,3001
1,611001,MAJ_BAS_3001_SUB_BAS_0001003,1003,-999,3001
2,611002,MAJ_BAS_3001_SUB_BAS_0001005,1005,1003,3001
3,611002,MAJ_BAS_3001_SUB_BAS_0001004,1004,1003,3001
4,611002,MAJ_BAS_3001_SUB_BAS_0001003,1003,-999,3001


In [103]:
fao_geo = gpd.read_file('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/fao/Y2017M08D23_RH_Merge_FAONames_V01/output_V02/hydrobasins_fao_fiona_merged_v01.shp')
fao_geo.columns = [x.lower() for x in fao_geo.columns]

In [108]:
fao_geo.head()

Unnamed: 0,sub_bas,to_bas,maj_bas,sub_name,maj_name,sub_area,maj_area,legend,geometry
0,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,..."
1,1002,1006,5001,Onon,Amur,59873,2086009,1,"POLYGON ((113.3541666666661 51.10416666666587,..."
2,1003,-888,5001,Solonchak Zun Torey / Solonchak,Amur,50635,2086009,1,"POLYGON ((115.4333333333327 50.52916666666587,..."
3,1004,1011,5001,Ingoda,Amur,37746,2086009,1,"POLYGON ((114.4583333333327 53.0624999999991, ..."
4,1005,1010,5001,Aga,Amur,8627,2086009,1,"POLYGON ((115.2874999999993 51.69583333333254,..."


In [133]:
df_all = pd.merge(left=fao_geo,
                right=df,
                on = ["sub_bas"],
                how = "left")
df_all.head()

Unnamed: 0,sub_bas,to_bas_x,maj_bas_x,sub_name,maj_name,sub_area,maj_area,legend,geometry,pfaf_id,fao_id,to_bas_y,maj_bas_y
0,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,...",672099,MAJ_BAS_3001_SUB_BAS_0001001,-999,3001
1,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,...",422991,MAJ_BAS_5001_SUB_BAS_0001001,1019,5001
2,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,...",422993,MAJ_BAS_5001_SUB_BAS_0001001,1019,5001
3,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,...",422992,MAJ_BAS_5001_SUB_BAS_0001001,1019,5001
4,1001,1019,5001,Herlen Gol / Hulun Nur,Amur,104012,2086009,1,"(POLYGON ((115.5416666666659 48.0041666666659,...",422995,MAJ_BAS_5001_SUB_BAS_0001001,1019,5001


## Water Risk Indicators table for Food

We have to modify the new table so that it mimics the old one.

### Read tables

**Old table**

In [None]:
wri_old = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_v3.csv')

In [None]:
wri_old.head()

In [None]:
wri_old['scenario'].unique()

In [None]:
wri_old['indicator'].unique()

**New table**

In [None]:
wri_new = pd.read_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/output_V01/annual/annual_normalized.csv')

In [None]:
wri_new.head()

**Take only default weighting**

In [None]:
wri_new = wri_new[wri_new['industry_short'] == 'def']

### Create `PFAF_ID` and `aqid` columns

The `string_id` is a composite index of `PFAF_ID` + "-" + `GID_1` + "-" + `aqid` 

**Polygon names**

|What                                   |Name             |unique_identifier_integer |
|---------------------------------------|-----------------|--------------------------|
|hydrological sub-basins                |Hydrobasin6      |PFAF_ID                   |
|sub-national administrative boundaries |GADM_1           |GID_1_ID                  |
|Groundwater Aquifers                   |WHYMAP           |aqid                      |
|Union of the geometries above          |Aqueduct_Union   |aq30_id                   |

In the old table (water_risk_indicators_v3), `basinid` = `PFAF_ID`.
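
Since `string_id` is `PFAF_ID`, `GID_1` and `aqid` joined by hyphens, its parts can be recovered by splitting on `-`. A minimal sketch using the vectorized string accessor, on sample ids copied from the outputs in this notebook. It assumes that none of the three components contains a hyphen itself; the name `sample` is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample ids copied from outputs shown in this notebook.
sample = pd.DataFrame({"string_id": ["111011-EGY.11_1-3365",
                                     "215001-None-1324",
                                     "215007-None-None"]})

# Split into three columns (assumes no component contains a hyphen).
parts = sample["string_id"].str.split("-", expand=True)
parts.columns = ["PFAF_ID", "GID_1", "aqid"]

# The literal string 'None' marks a missing component.
parts = parts.replace("None", np.nan)

sample = sample.join(parts)
print(sample)
```

The split parts are strings; cast them explicitly if numeric ids are needed downstream.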


In [None]:
wri_new['PFAF_ID'] = wri_new.apply(lambda x: x['string_id'].split('-')[0], axis=1)
wri_new['aqid'] = wri_new.apply(lambda x: x['string_id'].split('-')[2], axis=1)

**We remove the sub-national administrative boundaries level** 

In [None]:
wri_new.drop_duplicates(subset=['indicator', 'PFAF_ID', 'aqid'], keep='first', inplace=True)

**Drop some columns**

In [None]:
wri_new.drop(labels=['Unnamed: 0', 'industry_short', 'raw', 
                       'string_id', 'weight_fraction', 'weighted_score'], axis=1, inplace=True)

In [None]:
wri_new.head()

In [None]:
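# Assign the result back; an in-place replace on a selected column may not update wri_new
wri_new['aqid'] = wri_new['aqid'].replace('None', np.nan)
wri_new['PFAF_ID'] = wri_new['PFAF_ID'].replace('None', np.nan)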

**Save table**

In [None]:
wri_new.to_csv('/Users/ikersanchez/Vizzuality/PROIEKTUAK/Aqueduct/work/data/AQ_2_water_risk_atlas/water_risk_indicators_food.csv')