# Precalculated data for gadm1 (including SPS) - February 2023

## Context
In this notebook, we use biodiversity data for subnational boundaries stored in AGOL to create tables with precalculated data for gadm1. It includes both global SPS values (from species lookup tables) and AOI-specific SPS values (calculated in this notebook).
 
**amphibians:** https://eowilson.maps.arcgis.com/home/item.html?id=30056f994d5748198ffd8f45619692a2

**birds:** https://eowilson.maps.arcgis.com/home/item.html?id=8663c992ab66475f8b818048725fa98e

**mammals:** https://eowilson.maps.arcgis.com/home/item.html?id=8f2ad6b4ef8547f79e82c9d98e481922

**reptiles:** https://eowilson.maps.arcgis.com/home/item.html?id=e92386ef1f4b423faae3f7afb1330319


## Table of contents
1. [Setup](#setup)
    1. [Import libraries](#libraries)
    2. [Utils](#utils)
    3. [Connect to ESRI](#esri)
    4. [Set paths](#paths)
2. [Prepare data](#data)
...


---
<a id='setup'></a>
## Setup

<a id='libraries'></a>
### Import libraries

In [1]:
import pandas as pd
import numpy as np
import geopandas as gpd
import arcgis
from arcgis.gis import GIS
import json
from arcgis.features import FeatureLayerCollection
import requests as re
from copy import deepcopy
from itertools import repeat
import functools


<a id='utils'></a>
### Utils

In [2]:
# Get hosted table from id
def getHTfromId(item_id):
    item = gis.content.get(item_id)
    flayer = item.tables[0]
    sdf = flayer.query().sdf
    return sdf

In [3]:
# Get hosted layer from id
def getHLfromId(item_id):
    item = gis.content.get(item_id)
    flayer = item.layers[0]
    sdf = flayer.query().sdf
    return sdf

In [41]:
def format_df(path, file_name, lookups_id):
    df = pd.read_csv(f'{path}/{file_name}')
    col_name = [col for col in df.columns if col in ['SUM_amphib','SUM_birds','SUM_presence','SUM_reptil']]
    df.rename(columns={'SliceNumbe':'SliceNumber',col_name[0]:'SUM'}, inplace=True)
    
    ### Get information from lookup tables:
    lookup = getHTfromId(lookups_id)
    df = df.merge(lookup[['SliceNumber','range_area_km2', 'SPS', 'conservation_target']], how='left',on = 'SliceNumber')
    
    ### Get species area against global species range:
    df['per_global'] = round(df['SUM']/df['range_area_km2']*100,2)
    df.loc[df['per_global']> 100,'per_global'] = 100 ### make max presence 100%
    
    ### Get species area against aoi area (this is currently not needed on the platform):
    # df = df.merge(gadm0[['MOL_ID','AREA_KM2']])
    # df['per_aoi'] = round(df['SUM']/df['AREA_KM2']*100,2)
    # df.loc[df['per_aoi']> 100,'per_aoi'] = 100 ### make max presence 100%
    
    return df
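The `per_global` step in `format_df` can be illustrated on a toy table. This is a minimal sketch, not the notebook's actual data flow: the two dataframes are built by hand instead of being read from the CSV and the AGOL lookup table, the `SliceNumber` 9999 row is made up to show the cap, and `SUM` is assumed to be in the same units as `range_area_km2`.

```python
import pandas as pd

# Toy inputs. The first two rows echo values that appear in the amphibians
# table of this notebook; the third row (SliceNumber 9999) is invented so
# that the regional area exceeds the global range.
regional = pd.DataFrame({
    "SliceNumber": [950, 1707, 9999],
    "SUM": [13691, 29675, 120],               # species area inside the region
})
lookup = pd.DataFrame({
    "SliceNumber": [950, 1707, 9999],
    "range_area_km2": [275425, 3706871, 100],  # global range area
})

merged = regional.merge(lookup, how="left", on="SliceNumber")

# Percentage of the global range that falls inside the region, capped at 100
merged["per_global"] = (
    (merged["SUM"] / merged["range_area_km2"] * 100).round(2).clip(upper=100)
)
print(merged)
```

Using `clip(upper=100)` is an alternative to the `.loc` assignment in `format_df`; both cap values above 100.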

<a id='esri'></a>
### Connect to ArcGIS API

In [5]:
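env_path = ".env"
env = {}
with open(env_path) as f:
    for line in f:
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "="
        env_key, env_value = line.split("=", 1)
        env[env_key] = env_value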

In [6]:
aol_password = env['ARCGIS_SOFIA_PASS']
aol_username = env['ARCGIS_SOFIA_USER']

In [7]:
gis = GIS("https://eowilson.maps.arcgis.com", aol_username, aol_password, profile = "eowilson")

Keyring backend being used (keyring.backends.OS_X.Keyring (priority: 5)) either failed to install or is not recommended by the keyring project (i.e. it is not secure). This means you can not use stored passwords through GIS's persistent profiles. Note that extra system-wide steps must be taken on a Linux machine to use the python keyring module securely. Read more about this at the keyring API doc (http://bit.ly/2EWDP7B) and the ArcGIS API for Python doc (http://bit.ly/2CK2wG8).


<a id='paths'></a>
### Set paths

In [30]:
path_in = '/Users/sofia/Documents/HE_Data/Precalculated/gadm1/Inputs'
path_out = '/Users/sofia/Documents/HE_Data/Precalculated/gadm1/Outputs'

<a id='data'></a>
## Prepare data
### Get subnational boundaries
The layer holding the first iteration of the subnational precalculations, [gadm1_precalculated](https://eowilson.maps.arcgis.com/home/item.html?id=fe214eeebd21493eb2782a7ce1466606#data), has two problems: many region names contain non-ASCII characters that were replaced by `?`, and the dataset corresponds to an older GADM version (3.6). Here we replace the subnational names with those from GADM 4, which accounts for renamed regions and avoids the character-encoding issue.

In [31]:
gadm1 = gpd.read_file(f'{path_in}/gadm1_geometries.geojson')
gadm1

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry
0,AFG,Afghanistan,AFG.1_1,Badakhshan,1,43692.210235,"POLYGON ((71.10155 35.95555, 71.08842 35.92924..."
1,AFG,Afghanistan,AFG.2_1,Badghis,2,20589.857163,"POLYGON ((63.09734 34.64551, 63.06237 34.69018..."
2,AFG,Afghanistan,AFG.3_1,Baghlan,3,21120.261382,"POLYGON ((67.35538 34.88549, 67.33750 34.91966..."
3,AFG,Afghanistan,AFG.4_1,Balkh,4,17253.634668,"POLYGON ((66.42347 35.64057, 66.51625 35.67334..."
4,AFG,Afghanistan,AFG.5_1,Bamyan,5,14173.489095,"POLYGON ((66.65279 34.00322, 66.67175 34.03791..."
...,...,...,...,...,...,...,...
3605,ZWE,Zimbabwe,ZWE.6_1,Mashonaland West,3606,57396.734463,"POLYGON ((30.37916 -18.83976, 30.36670 -18.835..."
3606,ZWE,Zimbabwe,ZWE.7_1,Masvingo,3607,56280.104175,"POLYGON ((31.06733 -22.34189, 31.11290 -22.336..."
3607,ZWE,Zimbabwe,ZWE.8_1,Matabeleland North,3608,75500.590205,"POLYGON ((28.66857 -20.30021, 28.63305 -20.260..."
3608,ZWE,Zimbabwe,ZWE.9_1,Matabeleland South,3609,54675.751465,"POLYGON ((30.99968 -22.31642, 30.98855 -22.327..."


In [32]:
gadm1[gadm1.NAME_1=='Iğdır'] # There is no region with that name because it's misspelled. 

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry


In [15]:
# Read gadm 4.0
gadm40 = gpd.read_file('/Users/sofia/Documents/HE_Data/gadm/gadm404-shp/gadm404.shp')

In [17]:
gadm40_GID = gadm40[['GID_1', 'NAME_1']]
gadm40_GID = gadm40_GID.groupby('GID_1')
gadm40_GID = gadm40_GID.first()
gadm40_GID = gadm40_GID.reset_index()
gadm40_GID = gadm40_GID.rename(columns={'GID_1':'GID', 'NAME_1':'NAME'})
gadm40_GID

Unnamed: 0,GID,NAME
0,?,?
1,AFG.10_1,Ghor
2,AFG.11_1,Hilmand
3,AFG.12_1,Hirat
4,AFG.13_1,Jawzjan
...,...,...
3651,ZWE.5_1,Mashonaland East
3652,ZWE.6_1,Mashonaland West
3653,ZWE.7_1,Masvingo
3654,ZWE.8_1,Matabeleland North


In [33]:
# Merge both datasets by GID
gadm1 = pd.merge(gadm1, gadm40_GID, how='left', left_on='GID_1', right_on='GID')
gadm1.head()

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry,GID,NAME
0,AFG,Afghanistan,AFG.1_1,Badakhshan,1,43692.210235,"POLYGON ((71.10155 35.95555, 71.08842 35.92924...",AFG.1_1,Badakhshan
1,AFG,Afghanistan,AFG.2_1,Badghis,2,20589.857163,"POLYGON ((63.09734 34.64551, 63.06237 34.69018...",AFG.2_1,Badghis
2,AFG,Afghanistan,AFG.3_1,Baghlan,3,21120.261382,"POLYGON ((67.35538 34.88549, 67.33750 34.91966...",AFG.3_1,Baghlan
3,AFG,Afghanistan,AFG.4_1,Balkh,4,17253.634668,"POLYGON ((66.42347 35.64057, 66.51625 35.67334...",AFG.4_1,Balkh
4,AFG,Afghanistan,AFG.5_1,Bamyan,5,14173.489095,"POLYGON ((66.65279 34.00322, 66.67175 34.03791...",AFG.5_1,Bamyan


In [34]:
# Identify regions that have changed the name from gadm version 3.6 to gadm 4.0
gadm2 = gadm1[gadm1.NAME_1!=gadm1.NAME]
gadm2[['NAME_1', 'NAME']]

Unnamed: 0,NAME_1,NAME
52,Brändö,
53,Eckerö,
54,Finström,
55,Föglö,
56,Geta,
...,...,...
3555,Gazima?usa,
3556,Girne,
3557,Güzelyurt,
3558,Iskele,


In [35]:
len(gadm1[gadm1.NAME.isnull()]) # 112 regions have a name in GADM 3.6 but no match in GADM 4, so we keep their old names

112

In [36]:
# Regions without a GADM 4 match keep the name they had in GADM 3.6
gadm1['NAME'] = gadm1['NAME'].fillna(gadm1['NAME_1'])

In [37]:
# Regions whose name differs between GADM 3.6 and GADM 4 (after filling the missing names)
gadm2 = gadm1[gadm1.NAME_1!=gadm1.NAME]
gadm2[['NAME_1', 'NAME']]

Unnamed: 0,NAME_1,NAME
295,Br?ko,Brčko
299,Homyel',Gomel
300,Hrodna,Grodno
301,Mahilyow,Mogilev
303,Vitsyebsk,Vitebsk
...,...,...
3523,V?nh Long,Vĩnh Long
3524,V?nh Phúc,Vĩnh Phúc
3548,?akovica,Đakovica
3551,Pe?ki,Pećki


In [38]:
# Overwrite NAME_1 with the updated names and drop the helper columns
gadm1['NAME_1'] = gadm1['NAME']
gadm1 = gadm1.drop(columns=['NAME', 'GID'])

In [39]:
gadm1[gadm1.NAME_1=='Iğdır'] # The names have been corrected

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry
3156,TUR,Turkey,TUR.38_1,Iğdır,3157,3929.341409,"POLYGON ((44.34463 40.02792, 44.37977 40.00528..."
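The name update above boils down to a left join on the region code followed by a fallback to the old name. The sketch below reproduces that pattern on hand-made data; the `GID` codes are invented, and `indicator=True` is an optional extra (not used in this notebook) that makes it easy to list the regions with no match in the newer version.

```python
import pandas as pd

# Toy stand-ins for the two GADM versions (codes are illustrative only)
old = pd.DataFrame({
    "GID_1": ["A.1_1", "A.2_1", "A.3_1"],
    "NAME_1": ["Br?ko", "Hrodna", "Geta"],
})
new = pd.DataFrame({
    "GID": ["A.1_1", "A.2_1"],
    "NAME": ["Brčko", "Grodno"],
})

# indicator=True adds a `_merge` column telling which rows found a match
merged = old.merge(new, how="left", left_on="GID_1", right_on="GID", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "GID_1"].tolist()

# Prefer the newer name; fall back to the old one where no match exists
merged["NAME_1"] = merged["NAME"].fillna(merged["NAME_1"])
updated = merged[["GID_1", "NAME_1"]]
print(unmatched)
print(updated)
```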


---
<a id='biodiversity'></a>
## Format Biodiversity data

In [42]:
### IDs of the lookup tables for each taxonomic group in ArcGIS Online
lookups = {'amphibians':'de2309ec6aa64223a8bea682c0200d34',
         'birds':'b5f5c8d693b74abd9b0d236915d8e739',
         'mammals':'1d3b50e3b8544730ae0e2a80f00b4119',
         'reptiles':'bc6de8b9b8df4fffb6aa4208f4bf1467'}

# Get data for all taxa
amphibians = format_df(path_in, 'amphibians_gadm1_final_20211003_0.csv', lookups['amphibians'])
birds = format_df(path_in, 'birds_gadm1_final_0.csv', lookups['birds'])
mammals = format_df(path_in, 'mammals_gadm1_final_0.csv', lookups['mammals'])
reptiles = format_df(path_in, 'reptiles_gadm1_final_20211003_0.csv', lookups['reptiles'])


In [43]:
amphibians

Unnamed: 0,OID,MOL_ID,SliceNumber,FREQUENCY,SUM,range_area_km2,SPS,conservation_target,per_global
0,1,1,951,3,158,103326,21,29,0.15
1,2,2,1707,3,5508,3706871,14,15,0.15
2,3,6,1707,1,235,3706871,14,15,0.01
3,4,7,950,11,13691,275425,12,15,4.97
4,5,7,1707,12,29675,3706871,14,15,0.80
...,...,...,...,...,...,...,...,...,...
75271,75272,3610,6037,1,1179,1156959,100,15,0.10
75272,75273,3610,6039,9,23410,709196,100,15,3.30
75273,75274,3610,6042,9,532,865504,100,15,0.06
75274,75275,3610,6148,11,46255,4276570,100,15,1.08


In [44]:
amphibians = amphibians.rename(columns = {'SPS': 'SPS_global'})
birds = birds.rename(columns = {'SPS': 'SPS_global'})
mammals = mammals.rename(columns = {'SPS': 'SPS_global'})
reptiles = reptiles.rename(columns = {'SPS': 'SPS_global'})

### Calculate SPS_aoi

In [47]:
# To calculate SPS_aoi we need the species found inside protected areas (WDPA) in each region (calculations done in Pro: AOI_Summaries_Precalculations.aprx)
wdpa_amph = pd.read_csv(f'{path_in}/WDPA_Regions/Amphibians_wdpa_regions.csv').astype(int).rename(columns={'SUM_amphibians': 'SUM_PA'})
wdpa_bird = pd.read_csv(f'{path_in}/WDPA_Regions/Birds_wdpa_regions.csv').astype(int).rename(columns={'SUM_birds': 'SUM_PA'})
wdpa_mamm = pd.read_csv(f'{path_in}/WDPA_Regions/Mammals_wdpa_regions.csv').astype(int).rename(columns={'SUM_presence': 'SUM_PA'})
wdpa_rept = pd.read_csv(f'{path_in}/WDPA_Regions/Reptiles_wdpa_regions.csv').astype(int).rename(columns={'SUM_reptiles': 'SUM_PA'})

In [48]:
wdpa_amph.head(1)

Unnamed: 0,OID_,MOL_ID,SliceNumber,FREQUENCY,SUM_PA,REGION_ID
0,1,2,3318,1,2,2211


In [49]:
# Aggregate data by region: Aggregate species (SliceNumber) located in different WDPA (MOL_ID) belonging to the same region (REGION_ID)
wdpa_amph2 = wdpa_amph[['REGION_ID', 'SliceNumber', 'SUM_PA']]
wdpa_amph2 = wdpa_amph2.groupby(['REGION_ID', 'SliceNumber']).sum().reset_index()
wdpa_bird2 = wdpa_bird[['REGION_ID', 'SliceNumber', 'SUM_PA']]
wdpa_bird2 = wdpa_bird2.groupby(['REGION_ID', 'SliceNumber']).sum().reset_index()
wdpa_mamm2 = wdpa_mamm[['REGION_ID', 'SliceNumber', 'SUM_PA']]
wdpa_mamm2 = wdpa_mamm2.groupby(['REGION_ID', 'SliceNumber']).sum().reset_index()
wdpa_rept2 = wdpa_rept[['REGION_ID', 'SliceNumber', 'SUM_PA']]
wdpa_rept2 = wdpa_rept2.groupby(['REGION_ID', 'SliceNumber']).sum().reset_index()

In [50]:
wdpa_amph2.head()

Unnamed: 0,REGION_ID,SliceNumber,SUM_PA
0,24,2224,72
1,36,38,77
2,36,212,273
3,36,238,287
4,36,2196,252
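As a small illustration of the aggregation step, the sketch below sums `SUM_PA` for a species that occurs in two protected areas of the same region. The table is hand-made and the protected-area IDs (10, 11, 12) are hypothetical.

```python
import pandas as pd

# Toy table: protected areas 10 and 11 both lie in region 36 and both
# contain species 38; region 24 has a single record.
wdpa = pd.DataFrame({
    "MOL_ID": [10, 11, 12],
    "SliceNumber": [38, 38, 2224],
    "SUM_PA": [50, 27, 72],
    "REGION_ID": [36, 36, 24],
})

# One row per (region, species), with the protected area summed
per_region = (
    wdpa.groupby(["REGION_ID", "SliceNumber"], as_index=False)["SUM_PA"].sum()
)
print(per_region)
```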


In [51]:
# Add this information about the species found in WDPA to master tables with all species per region
amphibians2= pd.merge(amphibians, wdpa_amph2, how='left', left_on= ['MOL_ID', 'SliceNumber'], right_on=['REGION_ID', 'SliceNumber']) 
amphibians2 = amphibians2.fillna(0).drop(columns= 'REGION_ID')
birds2= pd.merge(birds, wdpa_bird2, how='left', left_on= ['MOL_ID', 'SliceNumber'], right_on=['REGION_ID', 'SliceNumber']) 
birds2 = birds2.fillna(0).drop(columns= 'REGION_ID')
mammals2= pd.merge(mammals, wdpa_mamm2, how='left', left_on= ['MOL_ID', 'SliceNumber'], right_on=['REGION_ID', 'SliceNumber']) 
mammals2 = mammals2.fillna(0).drop(columns= 'REGION_ID')
reptiles2= pd.merge(reptiles, wdpa_rept2, how='left', left_on= ['MOL_ID', 'SliceNumber'], right_on=['REGION_ID', 'SliceNumber']) 
reptiles2 = reptiles2.fillna(0).drop(columns= 'REGION_ID')

In [52]:
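# Calculate SPS_aoi: percentage of the species' regional area inside protected areas (SUM_PA / SUM * 100),
# expressed as a percentage of the conservation target and truncated to an integer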
amphibians2['SPS_aoi'] = (((amphibians2['SUM_PA']/amphibians2['SUM'])*100/amphibians2['conservation_target'])*100).astype(int)
birds2['SPS_aoi'] = (((birds2['SUM_PA']/birds2['SUM'])*100/birds2['conservation_target'])*100).astype(int)
mammals2['SPS_aoi'] = (((mammals2['SUM_PA']/mammals2['SUM'])*100/mammals2['conservation_target'])*100).astype(int)
reptiles2['SPS_aoi'] = (((reptiles2['SUM_PA']/reptiles2['SUM'])*100/reptiles2['conservation_target'])*100).astype(int)

In [53]:
amphibians2

Unnamed: 0,OID,MOL_ID,SliceNumber,FREQUENCY,SUM,range_area_km2,SPS_global,conservation_target,per_global,SUM_PA,SPS_aoi
0,1,1,951,3,158,103326,21,29,0.15,0.0,0
1,2,2,1707,3,5508,3706871,14,15,0.15,0.0,0
2,3,6,1707,1,235,3706871,14,15,0.01,0.0,0
3,4,7,950,11,13691,275425,12,15,4.97,0.0,0
4,5,7,1707,12,29675,3706871,14,15,0.80,0.0,0
...,...,...,...,...,...,...,...,...,...,...,...
75271,75272,3610,6037,1,1179,1156959,100,15,0.10,1224.0,692
75272,75273,3610,6039,9,23410,709196,100,15,3.30,11331.0,322
75273,75274,3610,6042,9,532,865504,100,15,0.06,31.0,38
75274,75275,3610,6148,11,46255,4276570,100,15,1.08,14245.0,205


In [54]:
# Cap SPS_aoi at 100 (assign back instead of modifying the column in place)
amphibians2['SPS_aoi'] = amphibians2['SPS_aoi'].clip(upper=100)
birds2['SPS_aoi'] = birds2['SPS_aoi'].clip(upper=100)
mammals2['SPS_aoi'] = mammals2['SPS_aoi'].clip(upper=100)
reptiles2['SPS_aoi'] = reptiles2['SPS_aoi'].clip(upper=100)
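The arithmetic behind `SPS_aoi` can be checked by hand with a scalar version of the column operations. The function below is a hypothetical helper written for illustration; the guard for a zero regional area is an assumption that the notebook itself does not make. For the amphibian row with `SUM_PA = 11331`, `SUM = 23410` and `conservation_target = 15`, the uncapped value is 322, which the cap reduces to 100.

```python
def sps_aoi(sum_pa, sum_region, conservation_target, cap=100):
    """Scalar sketch of the SPS_aoi column arithmetic (hypothetical helper)."""
    if sum_region == 0:
        # Assumption: an empty regional range gets a score of 0
        return 0
    # Percentage of the regional range that lies inside protected areas
    protected_pct = sum_pa / sum_region * 100
    # Express it relative to the target; int() truncates like astype(int)
    score = int(protected_pct / conservation_target * 100)
    return min(score, cap)


print(sps_aoi(31, 532, 15))                 # below the cap
print(sps_aoi(11331, 23410, 15, cap=1000))  # value before the usual cap
print(sps_aoi(11331, 23410, 15))            # capped
```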

### Create table with biodiversity data for regions

In [55]:
# Serialize the biodiversity records of each region (MOL_ID) as one JSON string per taxonomic group
amphibians_bio = amphibians2.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('amphibians').reset_index()
birds_bio = birds2.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('birds').reset_index()
mammals_bio = mammals2.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('mammals').reset_index()
reptiles_bio = reptiles2.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('reptiles').reset_index()


In [56]:
amphibians_bio

Unnamed: 0,MOL_ID,amphibians
0,1,"[{""SliceNumber"":951,""per_global"":0.15,""SPS_glo..."
1,2,"[{""SliceNumber"":1707,""per_global"":0.15,""SPS_gl..."
2,6,"[{""SliceNumber"":1707,""per_global"":0.01,""SPS_gl..."
3,7,"[{""SliceNumber"":950,""per_global"":4.97,""SPS_glo..."
4,9,"[{""SliceNumber"":1191,""per_global"":14.29,""SPS_g..."
...,...,...
3356,3606,"[{""SliceNumber"":212,""per_global"":1.28,""SPS_glo..."
3357,3607,"[{""SliceNumber"":33,""per_global"":0.0,""SPS_globa..."
3358,3608,"[{""SliceNumber"":212,""per_global"":1.6,""SPS_glob..."
3359,3609,"[{""SliceNumber"":212,""per_global"":1.18,""SPS_glo..."


In [58]:
gadm1 = pd.merge(gadm1, amphibians_bio, how='left', on = 'MOL_ID')
gadm1 = pd.merge(gadm1, birds_bio, how='left', on = 'MOL_ID')
gadm1 = pd.merge(gadm1, mammals_bio, how='left', on = 'MOL_ID')
gadm1 = pd.merge(gadm1, reptiles_bio, how='left', on = 'MOL_ID')
gadm1.head()

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry,amphibians,birds,mammals,reptiles
0,AFG,Afghanistan,AFG.1_1,Badakhshan,1,43692.210235,"POLYGON ((71.10155 35.95555, 71.08842 35.92924...","[{""SliceNumber"":951,""per_global"":0.15,""SPS_glo...","[{""SliceNumber"":92,""per_global"":0.1,""SPS_globa...","[{""SliceNumber"":167,""per_global"":4.13,""SPS_glo...","[{""SliceNumber"":4,""per_global"":1.97,""SPS_globa..."
1,AFG,Afghanistan,AFG.2_1,Badghis,2,20589.857163,"POLYGON ((63.09734 34.64551, 63.06237 34.69018...","[{""SliceNumber"":1707,""per_global"":0.15,""SPS_gl...","[{""SliceNumber"":26,""per_global"":0.08,""SPS_glob...","[{""SliceNumber"":575,""per_global"":0.34,""SPS_glo...","[{""SliceNumber"":9,""per_global"":0.87,""SPS_globa..."
2,AFG,Afghanistan,AFG.3_1,Baghlan,3,21120.261382,"POLYGON ((67.35538 34.88549, 67.33750 34.91966...",,"[{""SliceNumber"":26,""per_global"":0.05,""SPS_glob...","[{""SliceNumber"":167,""per_global"":2.03,""SPS_glo...","[{""SliceNumber"":1,""per_global"":0.8,""SPS_global..."
3,AFG,Afghanistan,AFG.4_1,Balkh,4,17253.634668,"POLYGON ((66.42347 35.64057, 66.51625 35.67334...",,"[{""SliceNumber"":26,""per_global"":0.05,""SPS_glob...","[{""SliceNumber"":575,""per_global"":1.61,""SPS_glo...","[{""SliceNumber"":9,""per_global"":0.69,""SPS_globa..."
4,AFG,Afghanistan,AFG.5_1,Bamyan,5,14173.489095,"POLYGON ((66.65279 34.00322, 66.67175 34.03791...",,"[{""SliceNumber"":26,""per_global"":0.03,""SPS_glob...","[{""SliceNumber"":167,""per_global"":0.22,""SPS_glo...","[{""SliceNumber"":1,""per_global"":3.23,""SPS_globa..."


In [62]:
gadm1.loc[gadm1['MOL_ID']==1,'amphibians'].values[0]

'[{"SliceNumber":951,"per_global":0.15,"SPS_global":21,"SPS_aoi":0}]'
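Each taxon column therefore stores a JSON array as plain text, with one object per species. A consumer can decode it with the standard `json` module, as in this minimal sketch that reuses the string printed above; the `by_species` dictionary is only an illustrative way to index the records. Regions with no records for a taxon hold a missing value rather than a string, so real code would need to check for that first.

```python
import json

# The string shown above for MOL_ID == 1
raw = '[{"SliceNumber":951,"per_global":0.15,"SPS_global":21,"SPS_aoi":0}]'

records = json.loads(raw)  # list of dicts, one per species
by_species = {rec["SliceNumber"]: rec for rec in records}

print(len(records))
print(by_species[951]["SPS_global"])
```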

### Add nspecies

In [64]:
# Get data for all taxa
a = pd.read_csv(f'{path_in}/amphibians_gadm1_final_20211003_0.csv')
b = pd.read_csv(f'{path_in}/birds_gadm1_final_0.csv')
m = pd.read_csv(f'{path_in}/mammals_gadm1_final_0.csv')
r = pd.read_csv(f'{path_in}/reptiles_gadm1_final_20211003_0.csv')

In [67]:
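# Count the species records per region (MOL_ID) for each taxonomic group
# (the amphibian and reptile CSVs carry the truncated column name 'SliceNumbe')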
a_count = a.groupby('MOL_ID')['SliceNumbe'].count().astype(int)
b_count = b.groupby('MOL_ID')['SliceNumber'].count().astype(int)
m_count = m.groupby('MOL_ID')['SliceNumber'].count().astype(int)
r_count = r.groupby('MOL_ID')['SliceNumbe'].count().astype(int)

In [69]:
frame = { 'amph_nspecies': a_count, 'bird_nspecies': b_count, 'mamm_nspecies': m_count, 'rept_nspecies': r_count }
df = pd.DataFrame(frame).reset_index()
cols = ['amph_nspecies', 'bird_nspecies', 'mamm_nspecies', 'rept_nspecies']
df[cols] = df[cols].fillna(0)
df[cols] = df[cols].astype('int')
df['nspecies'] = df['amph_nspecies'] + df['bird_nspecies'] + df['mamm_nspecies'] + df['rept_nspecies']
df

Unnamed: 0,MOL_ID,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies
0,1,1,192,77,46,316
1,2,1,134,51,42,228
2,3,0,163,53,37,253
3,4,0,140,50,48,238
4,5,0,125,38,19,182
...,...,...,...,...,...,...
3605,3606,39,488,158,108,793
3606,3607,45,524,171,147,887
3607,3608,44,507,164,123,838
3608,3609,40,488,167,136,831
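The counting step relies on pandas aligning the per-taxon Series on `MOL_ID` when they are combined into one dataframe, which leaves `NaN` for regions that lack a taxon. The sketch below shows this on two hand-made tables. It swaps `count()` for `nunique()`, which gives the same result only if each species appears once per region; that is an assumption about the input files, not something verified here.

```python
import pandas as pd

# Toy per-taxon tables; region 3 has no amphibian records
amph = pd.DataFrame({"MOL_ID": [1, 2], "SliceNumber": [951, 1707]})
bird = pd.DataFrame({"MOL_ID": [1, 1, 2, 3], "SliceNumber": [92, 26, 26, 26]})

counts = pd.DataFrame({
    "amph_nspecies": amph.groupby("MOL_ID")["SliceNumber"].nunique(),
    "bird_nspecies": bird.groupby("MOL_ID")["SliceNumber"].nunique(),
})

# The Series are aligned on MOL_ID; a region missing from a taxon becomes NaN
counts = counts.fillna(0).astype(int).reset_index()
counts["nspecies"] = counts["amph_nspecies"] + counts["bird_nspecies"]
print(counts)
```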


In [70]:
gadm1_nspecies = gadm1.merge(df, how='left', on = 'MOL_ID')
gadm1_nspecies

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry,amphibians,birds,mammals,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies
0,AFG,Afghanistan,AFG.1_1,Badakhshan,1,43692.210235,"POLYGON ((71.10155 35.95555, 71.08842 35.92924...","[{""SliceNumber"":951,""per_global"":0.15,""SPS_glo...","[{""SliceNumber"":92,""per_global"":0.1,""SPS_globa...","[{""SliceNumber"":167,""per_global"":4.13,""SPS_glo...","[{""SliceNumber"":4,""per_global"":1.97,""SPS_globa...",1,192,77,46,316
1,AFG,Afghanistan,AFG.2_1,Badghis,2,20589.857163,"POLYGON ((63.09734 34.64551, 63.06237 34.69018...","[{""SliceNumber"":1707,""per_global"":0.15,""SPS_gl...","[{""SliceNumber"":26,""per_global"":0.08,""SPS_glob...","[{""SliceNumber"":575,""per_global"":0.34,""SPS_glo...","[{""SliceNumber"":9,""per_global"":0.87,""SPS_globa...",1,134,51,42,228
2,AFG,Afghanistan,AFG.3_1,Baghlan,3,21120.261382,"POLYGON ((67.35538 34.88549, 67.33750 34.91966...",,"[{""SliceNumber"":26,""per_global"":0.05,""SPS_glob...","[{""SliceNumber"":167,""per_global"":2.03,""SPS_glo...","[{""SliceNumber"":1,""per_global"":0.8,""SPS_global...",0,163,53,37,253
3,AFG,Afghanistan,AFG.4_1,Balkh,4,17253.634668,"POLYGON ((66.42347 35.64057, 66.51625 35.67334...",,"[{""SliceNumber"":26,""per_global"":0.05,""SPS_glob...","[{""SliceNumber"":575,""per_global"":1.61,""SPS_glo...","[{""SliceNumber"":9,""per_global"":0.69,""SPS_globa...",0,140,50,48,238
4,AFG,Afghanistan,AFG.5_1,Bamyan,5,14173.489095,"POLYGON ((66.65279 34.00322, 66.67175 34.03791...",,"[{""SliceNumber"":26,""per_global"":0.03,""SPS_glob...","[{""SliceNumber"":167,""per_global"":0.22,""SPS_glo...","[{""SliceNumber"":1,""per_global"":3.23,""SPS_globa...",0,125,38,19,182
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3605,ZWE,Zimbabwe,ZWE.6_1,Mashonaland West,3606,57396.734463,"POLYGON ((30.37916 -18.83976, 30.36670 -18.835...","[{""SliceNumber"":212,""per_global"":1.28,""SPS_glo...","[{""SliceNumber"":26,""per_global"":0.26,""SPS_glob...","[{""SliceNumber"":28,""per_global"":0.71,""SPS_glob...","[{""SliceNumber"":40,""per_global"":0.09,""SPS_glob...",39,488,158,108,793
3606,ZWE,Zimbabwe,ZWE.7_1,Masvingo,3607,56280.104175,"POLYGON ((31.06733 -22.34189, 31.11290 -22.336...","[{""SliceNumber"":33,""per_global"":0.0,""SPS_globa...","[{""SliceNumber"":26,""per_global"":0.27,""SPS_glob...","[{""SliceNumber"":28,""per_global"":0.64,""SPS_glob...","[{""SliceNumber"":40,""per_global"":0.05,""SPS_glob...",45,524,171,147,887
3607,ZWE,Zimbabwe,ZWE.8_1,Matabeleland North,3608,75500.590205,"POLYGON ((28.66857 -20.30021, 28.63305 -20.260...","[{""SliceNumber"":212,""per_global"":1.6,""SPS_glob...","[{""SliceNumber"":26,""per_global"":0.36,""SPS_glob...","[{""SliceNumber"":28,""per_global"":1.13,""SPS_glob...","[{""SliceNumber"":40,""per_global"":0.03,""SPS_glob...",44,507,164,123,838
3608,ZWE,Zimbabwe,ZWE.9_1,Matabeleland South,3609,54675.751465,"POLYGON ((30.99968 -22.31642, 30.98855 -22.327...","[{""SliceNumber"":212,""per_global"":1.18,""SPS_glo...","[{""SliceNumber"":26,""per_global"":0.26,""SPS_glob...","[{""SliceNumber"":28,""per_global"":0.81,""SPS_glob...","[{""SliceNumber"":40,""per_global"":0.0,""SPS_globa...",40,488,167,136,831


### Save table with biodiversity data

In [71]:
gadm1_nspecies.to_csv(f'{path_out}/gadm1_precalculated_SPS_biodiversity_only.csv')

---
<a id='contextual'></a>
## Add contextual data
We do not have the original datasets used to produce the first gadm1_precalculated table, so we reuse the contextual values already stored in that table, the same one the geometries were retrieved from. From it we extract the data on population, land encroachment and climate regime, and join them to the dataframe holding the new biodiversity data.

In [57]:
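# `gadm` is the original gadm1_precalculated table; it must already be loaded, since it is not defined above
gadm.head(1)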

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,birds,percentage_protected,percent_irrigated,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry
0,ECU,Ecuador,ECU.6_1,Cotopaxi,801,6172.385,"[ { ""SliceNumber"": 27, ""per_global"": 0.02, ""pe...",22.45057,6.61,8.01,62.57,,487626.1,176.0,Forest,Warm Temperate Moist,4,1,"MULTIPOLYGON (((-78.40904 -0.72033, -78.40891 ..."


In [58]:
gadm.columns

Index(['GID_0', 'NAME_0', 'GID_1', 'NAME_1', 'MOL_ID', 'AREA_KM2', 'birds',
       'percentage_protected', 'percent_irrigated', 'percent_rainfed',
       'percent_rangeland', 'percent_urban', 'population_sum',
       'majority_land_cover_climate_reg', 'land_cover_majority',
       'climate_regime_majority', 'country_size', 'ObjectId', 'geometry'],
      dtype='object')

In [59]:
contextual = gadm[['MOL_ID','percentage_protected','percent_irrigated', 'percent_rainfed', 'percent_rangeland',
       'percent_urban', 'population_sum', 'majority_land_cover_climate_reg', 'land_cover_majority', 'climate_regime_majority', 'country_size']]

In [60]:
# Start from gadm1_nspecies so the species counts are kept in the final table
gadm1_all = pd.merge(gadm1_nspecies, contextual, how='left', on='MOL_ID')
gadm1_all.head(5)

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,geometry,amphibians,birds,mammals,...,percentage_protected,percent_irrigated,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size
0,ECU,Ecuador,ECU.6_1,Cotopaxi,801,6172.385,"MULTIPOLYGON (((-78.40904 -0.72033, -78.40891 ...","[{""SliceNumber"":555,""per_global"":24.8,""per_aoi...","[{""SliceNumber"":27,""per_global"":0.02,""per_aoi""...","[{""SliceNumber"":59,""per_global"":0.84,""per_aoi""...",...,22.45057,6.61,8.01,62.57,,487626.1,176.0,Forest,Warm Temperate Moist,4
1,LBN,Lebanon,LBN.5_1,Mount Lebanon,1601,1985.055,"POLYGON ((35.62627 33.49696, 35.62548 33.49446...","[{""SliceNumber"":955,""per_global"":0.06,""per_aoi...","[{""SliceNumber"":121,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":259,""per_global"":0.0,""per_aoi""...",...,3.763879,0.04,57.94,,20.09,4637642.0,175.0,Shrubland,Warm Temperate Moist,5
2,ECU,Ecuador,ECU.7_1,El Oro,802,5868.456,"MULTIPOLYGON (((-80.44117 -3.17687, -80.44184 ...","[{""SliceNumber"":1010,""per_global"":1.45,""per_ao...","[{""SliceNumber"":27,""per_global"":0.05,""per_aoi""...","[{""SliceNumber"":56,""per_global"":0.38,""per_aoi""...",...,2.652022,7.19,4.91,58.84,2.7,698379.8,262.0,Forest,Sub Tropical Moist,4
3,LBN,Lebanon,LBN.6_1,Nabatiyeh,1602,1095.317,"POLYGON ((35.59720 33.27736, 35.59016 33.28218...","[{""SliceNumber"":947,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":97,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":33,""per_global"":0.0,""per_aoi"":...",...,2.829077,1.96,92.29,,5.29,762029.5,173.0,Cropland,Warm Temperate Moist,5
4,IDN,Indonesia,IDN.25_1,Sulawesi Barat,1201,16571.38,"MULTIPOLYGON (((119.35876 -3.48674, 119.35515 ...","[{""SliceNumber"":1700,""per_global"":0.01,""per_ao...","[{""SliceNumber"":43,""per_global"":9.85,""per_aoi""...","[{""SliceNumber"":23,""per_global"":8.61,""per_aoi""...",...,11.62976,1.93,30.06,5.05,6.89,1661324.0,262.0,Forest,Sub Tropical Moist,2


In [69]:
gadm1_all.columns

Index(['GID_0', 'NAME_0', 'GID_1', 'NAME_1', 'MOL_ID', 'AREA_KM2', 'geometry',
       'amphibians', 'birds', 'mammals', 'reptiles', 'amph_nspecies',
       'bird_nspecies', 'mamm_nspecies', 'rept_nspecies', 'nspecies',
       'percentage_protected', 'percent_irrigated', 'percent_rainfed',
       'percent_rangeland', 'percent_urban', 'population_sum',
       'majority_land_cover_climate_reg', 'land_cover_majority',
       'climate_regime_majority', 'country_size'],
      dtype='object')
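Because the contextual join is keyed on `MOL_ID` alone, a duplicated ID on either side would silently multiply rows. One way to guard against that, sketched here on hand-made data, is the `validate` argument of `merge`; this check is a suggestion and is not part of the notebook, and the `nspecies` values are made up.

```python
import pandas as pd

# Toy stand-ins for the biodiversity and contextual tables
bio = pd.DataFrame({"MOL_ID": [801, 802], "nspecies": [10, 12]})
ctx = pd.DataFrame({
    "MOL_ID": [801, 802],
    "land_cover_majority": ["Forest", "Forest"],
    "population_sum": [487626.1, 698379.8],
})

# validate raises pandas.errors.MergeError if MOL_ID repeats on either side
combined = bio.merge(ctx, how="left", on="MOL_ID", validate="one_to_one")
print(combined)
```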

In [64]:
# Save final dataset
gadm1_all.to_file(f"{path_out}/gadm1_precalculated_all.geojson",driver='GeoJSON') 

Finally, import this file into AGOL manually, either as a new feature layer or by overwriting an existing service.