# Precalculated data for WDPA (AOI Summaries) - March 2023
In this notebook, we generate the precalculated data for the `WDPA` layer.
The biodiversity and contextual data were generated in ArcGIS Pro. The current precalculations include:
- Global SPS (from the species lookup tables).
- SPS values specific to the AOI (SPS_aoi), which use the biodiversity data found within the protected areas of each AOI.
- New contextual data based on the human pressures time series.

## Table of contents
1. [Setup](#setup)
    1. [Import libraries](#libraries)
    2. [Utils](#utils)
    3. [Connect to ArcGIS API](#esri)
2. [Prepare data](#data)
3. [Calculate biodiversity](#biodiversity)
    1. [Calculate SPS_aoi](#spsaoi)
    2. [Format biodiversity table](#biotable)
    3. [Add nspecies](#nspecies)
4. [Contextual data](#contextual)
    1. [Population and ELU](#othercontextual)
    2. [Human pressures](#pressures)


<a id='setup'></a>
## Setup

<a id='libraries'></a>
### Import libraries

In [1]:
import pandas as pd
import numpy as np
import geopandas as gpd
import arcgis
from arcgis.gis import GIS
import json
from arcgis.features import FeatureLayerCollection
import requests  # not aliased to `re`, which would shadow the standard-library module
from copy import deepcopy
from itertools import repeat
import functools

<a id='utils'></a>
### Utils

**getHTfromId**

In [2]:
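def getHTfromId(item_id):
    """Return the first table of an ArcGIS Online item as a dataframe.

    Relies on the global `gis` connection created in the "Connect to ArcGIS API" step.
    """
    item = gis.content.get(item_id)
    flayer = item.tables[0]
    sdf = flayer.query().sdf
    return sdf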

**format_df**

In [3]:
def format_df(path, file_name, lookups_id):
    """Read a per-taxon CSV and join it with the species lookup table stored in ArcGIS Online."""
    df = pd.read_csv(f'{path}/{file_name}')
    # The value column is named differently in each file; standardize it to 'SUM'
    col_name = [col for col in df.columns if col in ['amphibians','birds','presence','reptiles']]
    df.rename(columns={'SliceNumbe':'SliceNumber',col_name[0]:'SUM'}, inplace=True)

    ### Get information from lookup tables:
    lookup = getHTfromId(lookups_id)
    df = df.merge(lookup[['SliceNumber','range_area_km2', 'SPS', 'conservation_target']], how='left',on = 'SliceNumber')

    ### Get species area against global species range (as a percentage):
    df['per_global'] = round(df['SUM']/df['range_area_km2']*100,2)
    df.loc[df['per_global']> 100,'per_global'] = 100 ### make max presence 100%

    return df
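
The `per_global` step in `format_df` can be illustrated without the ArcGIS lookup. The following is a minimal sketch on made-up data: the `sampled` and `lookup` frames are hypothetical stand-ins for the CSV and the lookup table, `SUM` is assumed to be an area in km2, and `clip` is used here as an equivalent of the `.loc` assignment.

```python
import pandas as pd

# Hypothetical stand-ins for the sampled CSV and the lookup table (values are made up).
sampled = pd.DataFrame({
    'MOL_ID': [1, 1, 2],
    'SliceNumber': [10, 11, 10],
    'SUM': [50.0, 300.0, 1.0],  # assumed: area of the species range inside the protected area
})
lookup = pd.DataFrame({
    'SliceNumber': [10, 11],
    'range_area_km2': [1000, 200],
    'SPS': [95, 68],
    'conservation_target': [15, 15],
})

merged = sampled.merge(lookup, how='left', on='SliceNumber')

# Share of the global range in percent, rounded to two decimals.
merged['per_global'] = (merged['SUM'] / merged['range_area_km2'] * 100).round(2)
# Cap the value at 100, as format_df does.
merged['per_global'] = merged['per_global'].clip(upper=100)

per_global = merged['per_global'].tolist()
print(per_global)
```

The second row exceeds the global range (300 against 200), so it is capped at 100.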

<a id='esri'></a>
### Connect to ArcGIS API

In [4]:
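env_path = ".env"
env = {}
with open(env_path) as f:
    for line in f:
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so that values may contain "="
        env_key, env_value = line.split("=", 1)
        env[env_key] = env_value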

In [5]:
aol_password = env['ARCGIS_SOFIA_PASS']
aol_username = env['ARCGIS_SOFIA_USER']

In [6]:
gis = GIS("https://eowilson.maps.arcgis.com", aol_username, aol_password, profile = "eowilson")

Keyring backend being used (keyring.backends.OS_X.Keyring (priority: 5)) either failed to install or is not recommended by the keyring project (i.e. it is not secure). This means you can not use stored passwords through GIS's persistent profiles. Note that extra system-wide steps must be taken on a Linux machine to use the python keyring module securely. Read more about this at the keyring API doc (http://bit.ly/2EWDP7B) and the ArcGIS API for Python doc (http://bit.ly/2CK2wG8).


<a id='data'></a>
## Prepare data

In [7]:
path_in = '/Users/sofia/Documents/HE_Data/Precalculated/WDPA_Precalculated/Inputs'
path_out = '/Users/sofia/Documents/HE_Data/Precalculated/WDPA_Precalculated/Outputs'

In [12]:
# Import wdpa table
wdpa= gpd.read_file(f'{path_in}/WDPA_FILTERED_20210615_nomarine_wdpa_corrected_geometries/WDPA_FILTERED_20210615_nomarine_corrected_geometries.shp')
len(wdpa)

217429

In [13]:
wdpa.columns

Index(['WDPAID', 'WDPA_PID', 'PA_DEF', 'NAME', 'ORIG_NA', 'DESIG', 'DESIG_E',
       'DESIG_T', 'IUCN_CA', 'INT_CRI', 'MARINE', 'REP_M_A', 'GIS_M_A',
       'REP_ARE', 'GIS_ARE', 'NO_TAKE', 'NO_TK_A', 'STATUS', 'STATUS_',
       'GOV_TYP', 'OWN_TYP', 'MANG_AU', 'MANG_PL', 'VERIF', 'METADAT',
       'SUB_LOC', 'PARENT_', 'ISO3', 'SUPP_IN', 'CONS_OB', 'SORTER',
       'WDPA_PID_h', 'WDPA_PID__', 'hash_vl', 'AREA_KM', 'MOL_ID', 'geometry'],
      dtype='object')

In [14]:
# Create a dataframe without geometries (they will be published independently) and with only relevant columns
wdpa.rename(columns={'AREA_KM':'AREA_KM2'},inplace=True)
dff = wdpa[['MOL_ID', 'WDPAID', 'WDPA_PID', 'NAME', 'DESIG', 'DESIG_T', 'IUCN_CA',
       'STATUS', 'GOV_TYP', 'MANG_AU', 'ISO3', 'AREA_KM2', 'DESIG_E',
       'ORIG_NA', 'STATUS_']]
dff.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,ISO3,AREA_KM2,DESIG_E,ORIG_NA,STATUS_
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,1.136031,Stewardship Area,Boulder Beach / WWF Block,
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.748492,Scenic Reserve,Ferndale,
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.031907,Scenic Reserve,Broughton Bay,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.270855,Scenic Reserve,Kaipupu Point,
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,8.412168,Stewardship Area,Catlins Conservation Park,


<a id='biodiversity'></a>
## Calculate biodiversity data for WDPA

In [16]:
### IDs of the lookup tables for each taxon in ArcGIS Online (run by Tamara in 2021)

lookups = {'amphibians':'de2309ec6aa64223a8bea682c0200d34',
         'birds':'b5f5c8d693b74abd9b0d236915d8e739',
         'mammals':'1d3b50e3b8544730ae0e2a80f00b4119',
         'reptiles':'bc6de8b9b8df4fffb6aa4208f4bf1467'}

# Get data for all taxa (calculated in ArcGIS Pro with the Sample tool and saved locally)
amphibians = format_df(path_in, 'wdpa_amphibians_final_20211003.csv', lookups['amphibians'])
birds = format_df(path_in, 'wdpa_birds_final_20211003.csv', lookups['birds'])
mammals = format_df(path_in, 'wdpa_mammals_final_20211003.csv', lookups['mammals'])
reptiles = format_df(path_in, 'wdpa_reptiles_final_20211003.csv', lookups['reptiles'])

In [17]:
# Rename the column SPS to SPS_global to differentiate it from the SPS_aoi we'll calculate later
amphibians = amphibians.rename(columns = {'SPS': 'SPS_global'})
birds = birds.rename(columns = {'SPS': 'SPS_global'})
mammals = mammals.rename(columns = {'SPS': 'SPS_global'})
reptiles = reptiles.rename(columns = {'SPS': 'SPS_global'})

In [18]:
amphibians.head()

Unnamed: 0,OBJECTID,MOL_ID,X,Y,SUM,SliceNumber,Dimensions,range_area_km2,SPS_global,conservation_target,per_global
0,1,102,173.412225,-41.161038,1.0,3318,SliceNumber,296432,95,15,0.0
1,2,102,173.412225,-41.161038,1.0,3393,SliceNumber,471493,68,15,0.0
2,3,103,173.046567,-40.963196,1.0,3318,SliceNumber,296432,95,15,0.0
3,4,104,172.595189,-41.794691,1.0,3318,SliceNumber,296432,95,15,0.0
4,5,104,172.595189,-41.794691,1.0,3393,SliceNumber,471493,68,15,0.0


<a id='spsaoi'></a>
### Calculate SPS_aoi
The SPS_aoi in protected areas is, by definition, always 100%, because every species found within the AOI is protected there.

In [19]:
amphibians['SPS_aoi'] = 100
birds['SPS_aoi'] = 100
mammals['SPS_aoi'] = 100
reptiles['SPS_aoi'] = 100
amphibians.head()

Unnamed: 0,OBJECTID,MOL_ID,X,Y,SUM,SliceNumber,Dimensions,range_area_km2,SPS_global,conservation_target,per_global,SPS_aoi
0,1,102,173.412225,-41.161038,1.0,3318,SliceNumber,296432,95,15,0.0,100
1,2,102,173.412225,-41.161038,1.0,3393,SliceNumber,471493,68,15,0.0,100
2,3,103,173.046567,-40.963196,1.0,3318,SliceNumber,296432,95,15,0.0,100
3,4,104,172.595189,-41.794691,1.0,3318,SliceNumber,296432,95,15,0.0,100
4,5,104,172.595189,-41.794691,1.0,3393,SliceNumber,471493,68,15,0.0,100


<a id='biotable'></a>
### Format table with biodiversity data for WDPA

In [20]:
# Serialize the biodiversity data of each protected area as a JSON string
amphibians_bio = amphibians.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('amphibians').reset_index()
birds_bio = birds.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('birds').reset_index()
mammals_bio = mammals.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('mammals').reset_index()
reptiles_bio = reptiles.groupby('MOL_ID')[['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']].apply(lambda x: x.to_json(orient='records')).to_frame('reptiles').reset_index()


In [21]:
amphibians_bio

Unnamed: 0,MOL_ID,amphibians
0,2,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo..."
1,3,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo..."
2,4,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo..."
3,5,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo..."
4,6,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo..."
...,...,...
193873,217452,"[{""SliceNumber"":3222,""per_global"":0.0,""SPS_glo..."
193874,217453,"[{""SliceNumber"":3222,""per_global"":0.0,""SPS_glo..."
193875,217455,"[{""SliceNumber"":3222,""per_global"":0.0,""SPS_glo..."
193876,217457,"[{""SliceNumber"":168,""per_global"":0.0,""SPS_glob..."
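
To make the format of these strings concrete, here is a minimal sketch of the same `groupby` / `to_json` pattern on a toy table. The values are made up; only the column names follow the notebook.

```python
import json
import pandas as pd

# Toy per-species rows: two species in area 2, one in area 3 (made-up values).
toy = pd.DataFrame({
    'MOL_ID': [2, 2, 3],
    'SliceNumber': [3318, 3393, 3318],
    'per_global': [0.0, 0.01, 0.0],
    'SPS_global': [95, 68, 95],
    'SPS_aoi': [100, 100, 100],
})
cols = ['SliceNumber', 'per_global', 'SPS_global', 'SPS_aoi']

# One row per MOL_ID, with all species records packed into a JSON string.
toy_bio = (
    toy.groupby('MOL_ID')[cols]
    .apply(lambda g: g.to_json(orient='records'))
    .to_frame('amphibians')
    .reset_index()
)

# Decode the string of area 2 to check its content.
records = json.loads(toy_bio.loc[toy_bio['MOL_ID'] == 2, 'amphibians'].iloc[0])
print(len(records), records[0]['SliceNumber'])
```

Each protected area ends up with a single string holding a list of records, one per species.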


In [22]:
dff = pd.merge(dff, amphibians_bio, how='left', on = 'MOL_ID')
dff = pd.merge(dff, birds_bio, how='left', on = 'MOL_ID')
dff = pd.merge(dff, mammals_bio, how='left', on = 'MOL_ID')
dff = pd.merge(dff, reptiles_bio, how='left', on = 'MOL_ID')
dff.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,ISO3,AREA_KM2,DESIG_E,ORIG_NA,STATUS_,amphibians,birds,mammals,reptiles
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,1.136031,Stewardship Area,Boulder Beach / WWF Block,,,"[{""SliceNumber"":482.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.748492,Scenic Reserve,Ferndale,,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo..."
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.031907,Scenic Reserve,Broughton Bay,,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":1847.0,""per_global"":0.0,""SPS_g...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo..."
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.270855,Scenic Reserve,Kaipupu Point,,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo..."
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,8.412168,Stewardship Area,Catlins Conservation Park,,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.01,""SPS_glo...",,"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl..."


In [23]:
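# Taxa without species get an empty JSON list.
# Note: fillna applies to every column, so missing values in other columns (e.g. STATUS_) also become '[]'
dff = dff.fillna('[]')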
dff.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,ISO3,AREA_KM2,DESIG_E,ORIG_NA,STATUS_,amphibians,birds,mammals,reptiles
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,1.136031,Stewardship Area,Boulder Beach / WWF Block,[],[],"[{""SliceNumber"":482.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",[]
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.748492,Scenic Reserve,Ferndale,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo..."
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.031907,Scenic Reserve,Broughton Bay,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":1847.0,""per_global"":0.0,""SPS_g...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo..."
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,0.270855,Scenic Reserve,Kaipupu Point,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo..."
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,NZL,8.412168,Stewardship Area,Catlins Conservation Park,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.01,""SPS_glo...",[],"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl..."
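
As the `STATUS_` column in the output shows, a frame-wide `fillna('[]')` also writes `'[]'` into non-taxa columns. If that is not desired, one possible alternative is to restrict the fill to the taxa columns. This is a sketch on made-up data, not what the notebook does; `taxa_cols` is a hypothetical name.

```python
import pandas as pd

# Toy table with a missing STATUS_ value and missing taxa strings (made-up values).
toy = pd.DataFrame({
    'MOL_ID': [1, 2],
    'STATUS_': [None, 2016.0],
    'amphibians': [None, '[{"SliceNumber":3318}]'],
    'birds': ['[{"SliceNumber":482}]', None],
})

# Fill only the JSON columns; every other column keeps its missing values.
taxa_cols = ['amphibians', 'birds']
filled = toy.copy()
filled[taxa_cols] = filled[taxa_cols].fillna('[]')

print(filled)
```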


In [24]:
dff.loc[dff['MOL_ID']==121,'birds'].values[0]

'[{"SliceNumber":251.0,"per_global":0.0,"SPS_global":57,"SPS_aoi":100},{"SliceNumber":552.0,"per_global":0.0,"SPS_global":100,"SPS_aoi":100},{"SliceNumber":613.0,"per_global":0.0,"SPS_global":100,"SPS_aoi":100},{"SliceNumber":1301.0,"per_global":0.0,"SPS_global":40,"SPS_aoi":100},{"SliceNumber":1310.0,"per_global":0.0,"SPS_global":67,"SPS_aoi":100},{"SliceNumber":1321.0,"per_global":0.0,"SPS_global":100,"SPS_aoi":100},{"SliceNumber":1510.0,"per_global":0.0,"SPS_global":82,"SPS_aoi":100},{"SliceNumber":1511.0,"per_global":0.0,"SPS_global":95,"SPS_aoi":100},{"SliceNumber":1517.0,"per_global":0.0,"SPS_global":74,"SPS_aoi":100},{"SliceNumber":1572.0,"per_global":0.0,"SPS_global":58,"SPS_aoi":100},{"SliceNumber":1933.0,"per_global":0.0,"SPS_global":75,"SPS_aoi":100},{"SliceNumber":2017.0,"per_global":0.0,"SPS_global":85,"SPS_aoi":100},{"SliceNumber":3025.0,"per_global":0.0,"SPS_global":53,"SPS_aoi":100},{"SliceNumber":3027.0,"per_global":0.0,"SPS_global":88,"SPS_aoi":100},{"SliceNumber":308

In [25]:
len(dff)

217429

<a id='nspecies'></a>
### Add nspecies

In [26]:
# Get data for all taxa
a = pd.read_csv(f'{path_in}/wdpa_amphibians_final_20211003.csv')
b = pd.read_csv(f'{path_in}/wdpa_birds_final_20211003.csv')
m = pd.read_csv(f'{path_in}/wdpa_mammals_final_20211003.csv')
r = pd.read_csv(f'{path_in}/wdpa_reptiles_final_20211003.csv')

In [27]:
# Count the number of species per taxonomic group in each protected area
a_count = a.groupby('MOL_ID')['SliceNumber'].count().astype(int)
b_count = b.groupby('MOL_ID')['SliceNumber'].count().astype(int)
m_count = m.groupby('MOL_ID')['SliceNumber'].count().astype(int)
r_count = r.groupby('MOL_ID')['SliceNumber'].count().astype(int)

In [28]:
frame = { 'amph_nspecies': a_count, 'bird_nspecies': b_count, 'mamm_nspecies': m_count, 'rept_nspecies': r_count }
df = pd.DataFrame(frame).reset_index()
cols = ['amph_nspecies', 'bird_nspecies', 'mamm_nspecies', 'rept_nspecies']
df[cols] = df[cols].fillna(0)
df[cols] = df[cols].astype('int')
df['nspecies'] = df['amph_nspecies'] + df['bird_nspecies'] + df['mamm_nspecies'] + df['rept_nspecies']
df

Unnamed: 0,MOL_ID,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies
0,1,0,49,3,0,52
1,2,2,68,2,6,78
2,3,2,24,2,2,30
3,4,2,62,2,4,70
4,5,2,48,0,3,53
...,...,...,...,...,...,...
205893,217453,2,95,25,2,124
205894,217455,2,76,25,2,105
205895,217456,0,40,2,0,42
205896,217457,6,59,40,3,108
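
The `fillna(0)` in this step is needed because the per-taxon counts do not cover the same set of `MOL_ID` values. A minimal sketch with two made-up taxa shows how the counts align on the union of their indexes:

```python
import pandas as pd

# Toy per-species rows for two taxa (made-up values).
a = pd.DataFrame({'MOL_ID': [2, 2, 3], 'SliceNumber': [3318, 3393, 3318]})
b = pd.DataFrame({'MOL_ID': [1, 2, 2, 2], 'SliceNumber': [482, 8, 9, 10]})

a_count = a.groupby('MOL_ID')['SliceNumber'].count()
b_count = b.groupby('MOL_ID')['SliceNumber'].count()

# Building a frame from the two Series aligns them on the union of MOL_IDs;
# an area missing from one taxon gets NaN there, which is turned into 0.
counts = pd.DataFrame({'amph_nspecies': a_count, 'bird_nspecies': b_count})
counts = counts.fillna(0).astype(int).reset_index()
counts['nspecies'] = counts['amph_nspecies'] + counts['bird_nspecies']

print(counts)
```

Areas that appear in none of the taxa tables are absent from this frame altogether, which is why they end up without a value after the later left merge.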


In [29]:
# Merge nspecies into the dataframe
wdpa_nspecies = dff.merge(df, how='left', on = 'MOL_ID')
wdpa_nspecies

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,STATUS_,amphibians,birds,mammals,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],[],"[{""SliceNumber"":482.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",[],0.0,49.0,3.0,0.0,52.0
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,68.0,2.0,6.0,78.0
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":1847.0,""per_global"":0.0,""SPS_g...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo...",2.0,24.0,2.0,2.0,30.0
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,62.0,2.0,4.0,70.0
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.01,""SPS_glo...",[],"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl...",2.0,48.0,0.0,3.0,53.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
217424,217454,555682973.0,555682973,4.11 St Sampson's Marais / Ivy Castle,Sites of Special Significance,National,OECM,Designated,Federal or national ministry or agency,States of Guernsey and private land owners,...,2016.0,[],[],[],[],,,,,
217425,217455,555682979.0,555682979,4.7 Les Vicheries and Rue Rocheuse,Sites of Special Significance,National,OECM,Designated,Federal or national ministry or agency,Private land owners,...,2016.0,"[{""SliceNumber"":3222,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":120.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":624.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":4843,""per_global"":0.0,""SPS_glo...",2.0,76.0,25.0,2.0,105.0
217426,217456,555651689.0,555651689,"Les Demoiselles nursery (Plaisance Bay), Magda...",Other Effective Area-Based Conservation Measure,National,OECM,Designated,Federal or national ministry or agency,Fisheries And Oceans Canada,...,2016.0,[],"[{""SliceNumber"":142.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":2954.0,""per_global"":0.0,""SPS_g...",[],0.0,40.0,2.0,0.0,42.0
217427,217457,555651698.0,555651698,Strait Of Georgia And Howe Sound Glass Sponge ...,Other Effective Area-Based Conservation Measure,National,OECM,Designated,Federal or national ministry or agency,Fisheries And Oceans Canada,...,2019.0,"[{""SliceNumber"":168,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":142.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":556.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":2215,""per_global"":0.0,""SPS_glo...",6.0,59.0,40.0,3.0,108.0


In [30]:
len(wdpa_nspecies[wdpa_nspecies.nspecies.isna()])

11531
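
These are the protected areas that have no row in any of the taxa tables. One way to list them explicitly is the `indicator` option of `merge`. The sketch below uses made-up frames; `areas` and `species` are hypothetical names.

```python
import pandas as pd

# Toy tables: area 3 has no species counts (made-up values).
areas = pd.DataFrame({'MOL_ID': [1, 2, 3, 4]})
species = pd.DataFrame({'MOL_ID': [1, 2, 4], 'nspecies': [52, 78, 70]})

# indicator=True adds a '_merge' column telling where each row came from.
checked = areas.merge(species, how='left', on='MOL_ID', indicator=True)
missing_ids = checked.loc[checked['_merge'] == 'left_only', 'MOL_ID'].tolist()

print(missing_ids)
```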

**Save table with biodiversity data**

In [31]:
wdpa_nspecies.to_csv(f'{path_out}/wdpa_precalculated_SPS_biodiversity_only.csv')

<a id='contextual'></a>
## Add contextual data
Population and ELU come from previous calculations. In this iteration we include the new human modification time series.

<a id='othercontextual'></a>
### Population and ELU

In [33]:
pop = pd.read_csv(f'{path_in}/Pop.csv')
elu = pd.read_csv(f'{path_in}/ELU.csv')

In [34]:
## Add contextual data: ELU
ctx = wdpa_nspecies.merge(elu[['MOL_ID','MAJORITY']], how='left', on = 'MOL_ID').rename(columns={'MAJORITY':'majority_land_cover_climate_regime'})
ctx.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,amphibians,birds,mammals,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies,majority_land_cover_climate_regime
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":482.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",[],0.0,49.0,3.0,0.0,52.0,107.0
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,68.0,2.0,6.0,78.0,176.0
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":1847.0,""per_global"":0.0,""SPS_g...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo...",2.0,24.0,2.0,2.0,30.0,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.0,""SPS_glob...","[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,62.0,2.0,4.0,70.0,
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":3318,""per_global"":0.0,""SPS_glo...","[{""SliceNumber"":8.0,""per_global"":0.01,""SPS_glo...",[],"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl...",2.0,48.0,0.0,3.0,53.0,97.0


In [35]:
# Retrieve the ELU lookup table to see what each ELU code corresponds to
elu_lookup = getHTfromId('83802a7fa3d34c1fa40844fc14683966')
elu_lookup.head()

Unnamed: 0,elu_code,elu,lc_type,lf_type,cr_type,ObjectId
0,301,Sub Tropical Moist Forest on Plains,Forest,Plains,Sub Tropical Moist,1
1,201,Warm Temperate Dry Sparsley or Non vegetated o...,Sparsley or Non vegetated,Plains,Warm Temperate Dry,2
2,151,Cool Temperate Dry Sparsley or Non vegetated o...,Sparsley or Non vegetated,Plains,Cool Temperate Dry,3
3,302,Sub Tropical Moist Cropland on Tablelands,Cropland,Tablelands,Sub Tropical Moist,4
4,152,Cool Temperate Dry Sparsley or Non vegetated o...,Sparsley or Non vegetated,Tablelands,Cool Temperate Dry,5


In [36]:
# Merge the required fields from the lookup table into the dataset
ctx = ctx.merge(elu_lookup[['elu_code','lc_type','cr_type']], how='left', left_on = 'majority_land_cover_climate_regime', right_on = 'elu_code')\
    .drop(columns=['elu_code'])\
    .rename(columns={'lc_type':'land_cover_majority','cr_type':'climate_regime_majority'})
ctx.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,mammals,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",[],0.0,49.0,3.0,0.0,52.0,107.0,Grassland,Cool Temperate Moist
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,68.0,2.0,6.0,78.0,176.0,Forest,Warm Temperate Moist
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo...",2.0,24.0,2.0,2.0,30.0,,,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,62.0,2.0,4.0,70.0,,,
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl...",2.0,48.0,0.0,3.0,53.0,97.0,Forest,Cool Temperate Moist
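
The lookup join can be reproduced on a small scale. In this sketch the two ELU codes and their labels are taken from the output above, while the frames themselves are made-up stand-ins; an area without a majority code simply gets no labels.

```python
import numpy as np
import pandas as pd

# Toy areas: the third one has no majority ELU code.
areas = pd.DataFrame({
    'MOL_ID': [1, 2, 3],
    'majority_land_cover_climate_regime': [107.0, 176.0, np.nan],
})
# Reduced lookup table with the two codes seen in the output above.
elu_lookup = pd.DataFrame({
    'elu_code': [107, 176],
    'lc_type': ['Grassland', 'Forest'],
    'cr_type': ['Cool Temperate Moist', 'Warm Temperate Moist'],
})

joined = (
    areas.merge(elu_lookup, how='left',
                left_on='majority_land_cover_climate_regime', right_on='elu_code')
    .drop(columns=['elu_code'])
    .rename(columns={'lc_type': 'land_cover_majority',
                     'cr_type': 'climate_regime_majority'})
)

# Text columns get an empty string where there was no match.
label_cols = ['land_cover_majority', 'climate_regime_majority']
joined[label_cols] = joined[label_cols].fillna('')

print(joined)
```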


In [37]:
ctx['land_cover_majority'] = ctx['land_cover_majority'].fillna('')
ctx['climate_regime_majority'] = ctx['climate_regime_majority'].fillna('')
ctx = ctx.fillna(0)
ctx.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,mammals,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...",[],0.0,49.0,3.0,0.0,52.0,107.0,Grassland,Cool Temperate Moist
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,68.0,2.0,6.0,78.0,176.0,Forest,Warm Temperate Moist
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6499,""per_global"":0.0,""SPS_glo...",2.0,24.0,2.0,2.0,30.0,0.0,,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,"[{""SliceNumber"":303.0,""per_global"":0.0,""SPS_gl...","[{""SliceNumber"":6163,""per_global"":0.0,""SPS_glo...",2.0,62.0,2.0,4.0,70.0,0.0,,
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],"[{""SliceNumber"":6163,""per_global"":0.01,""SPS_gl...",2.0,48.0,0.0,3.0,53.0,97.0,Forest,Cool Temperate Moist


In [38]:
## Add contextual data: POP
ctx = ctx.merge(pop[['MOL_ID','SUM']],on ='MOL_ID',how='left')
ctx.head(1)

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,SUM
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],0.0,49.0,3.0,0.0,52.0,107.0,Grassland,Cool Temperate Moist,2.110001


In [39]:
ctx = ctx.rename(columns ={'SUM':'population_sum'})
ctx.head(1)

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,reptiles,amph_nspecies,bird_nspecies,mamm_nspecies,rept_nspecies,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,[],0.0,49.0,3.0,0.0,52.0,107.0,Grassland,Cool Temperate Moist,2.110001


<a id='pressures'></a>
### Human pressures 

In [40]:
# Load the new human pressure tables
agriculture = pd.read_csv(f'{path_in}/HP_wdpa_updated/HP_wdpa_agriculture_table_updated.csv')
builtup = pd.read_csv(f'{path_in}/HP_wdpa_updated/HP_wdpa_builtup_table_updated.csv')
extraction = pd.read_csv(f'{path_in}/HP_wdpa_updated/HP_wdpa_extraction_table_updated.csv')
intrusion = pd.read_csv(f'{path_in}/HP_wdpa_updated/HP_wdpa_intrusion_table_updated.csv')
transportation = pd.read_csv(f'{path_in}/HP_wdpa_updated/HP_wdpa_transportation_table_updated.csv')


#### Format human pressures

In [41]:
agriculture = agriculture[['MOL_ID', 'Year', 'percentage_land_encroachment']].astype({'Year':'int'})
builtup = builtup[['MOL_ID', 'Year', 'percentage_land_encroachment']].astype({'Year':'int'})
extraction = extraction[['MOL_ID', 'Year', 'percentage_land_encroachment']].astype({'Year':'int'})
intrusion = intrusion[['MOL_ID', 'Year', 'percentage_land_encroachment']].astype({'Year':'int'})
transportation = transportation[['MOL_ID', 'Year', 'percentage_land_encroachment']].astype({'Year':'int'})

In [42]:
# Serialize the time series of each protected area as a JSON string
# (the intrusion frame is named `intr` so that the built-in `int` is not overwritten)
agr = agriculture.groupby('MOL_ID')[['Year', 'percentage_land_encroachment']].apply(lambda x: x.to_json(orient='records')).to_frame('agriculture').reset_index()
bui = builtup.groupby('MOL_ID')[['Year', 'percentage_land_encroachment']].apply(lambda x: x.to_json(orient='records')).to_frame('builtup').reset_index()
ext = extraction.groupby('MOL_ID')[['Year', 'percentage_land_encroachment']].apply(lambda x: x.to_json(orient='records')).to_frame('extraction').reset_index()
intr = intrusion.groupby('MOL_ID')[['Year', 'percentage_land_encroachment']].apply(lambda x: x.to_json(orient='records')).to_frame('intrusion').reset_index()
tra = transportation.groupby('MOL_ID')[['Year', 'percentage_land_encroachment']].apply(lambda x: x.to_json(orient='records')).to_frame('transportation').reset_index()

In [43]:
agr.agriculture[0]

'[{"Year":1995,"percentage_land_encroachment":12.5},{"Year":2000,"percentage_land_encroachment":12.5},{"Year":2005,"percentage_land_encroachment":12.5},{"Year":2010,"percentage_land_encroachment":37.5},{"Year":2015,"percentage_land_encroachment":87.5},{"Year":2017,"percentage_land_encroachment":100.0}]'

In [44]:
ctx = pd.merge(ctx, agr, how='left', on = 'MOL_ID')
ctx = pd.merge(ctx, bui, how='left', on = 'MOL_ID')
ctx = pd.merge(ctx, ext, how='left', on = 'MOL_ID')
ctx = pd.merge(ctx, intr, how='left', on = 'MOL_ID')
ctx = pd.merge(ctx, tra, how='left', on = 'MOL_ID')
ctx.head(10)

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,52.0,107.0,Grassland,Cool Temperate Moist,2.110001,,,,"[{""Year"":1990,""percentage_land_encroachment"":1...",
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,78.0,176.0,Forest,Warm Temperate Moist,1.315837,,,,,
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,30.0,0.0,,,,,,,,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,70.0,0.0,,,,,,,,"[{""Year"":1990,""percentage_land_encroachment"":1..."
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,53.0,97.0,Forest,Cool Temperate Moist,3.103363,,,,,
5,6,555565660.0,555565660,Mt Aspiring/Tititea,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,53.0,97.0,Forest,Cool Temperate Moist,0.224189,,,,,
6,7,555564797.0,555564797,Kenepuru Sound,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,52.0,176.0,Forest,Warm Temperate Moist,1.076915,,,,,
7,8,310790.0,310790,Four Rivers Plain,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,42.0,171.0,Grassland,Warm Temperate Moist,0.472719,,,,,
8,9,555566679.0,555566679,Earnscleugh - Rough Creek,Conservation Covenant,National,IV,Designated,Collaborative governance,Collaborative,...,23.0,107.0,Grassland,Cool Temperate Moist,0.222667,,,,,
9,10,555564992.0,555564992,Fyfe Cottage Exchange,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,49.0,0.0,,,,,,,,
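
Each human pressure column holds one JSON string per protected area, or NaN when there is no data. The sketch below rebuilds that structure on a tiny, made-up table and shows one way to read a series back. The `agriculture` input frame and the `parse_series` helper are hypothetical names used only for illustration.

```python
import json
import pandas as pd

# Hypothetical long-format input: one row per protected area and year
agriculture = pd.DataFrame({
    'MOL_ID': [1, 1, 2],
    'Year': [1995, 2000, 1995],
    'percentage_land_encroachment': [12.5, 37.5, 100.0],
})

# Same pattern as in the cells above: one JSON string per MOL_ID
agr = (agriculture
       .groupby('MOL_ID')[['Year', 'percentage_land_encroachment']]
       .apply(lambda x: x.to_json(orient='records'))
       .to_frame('agriculture')
       .reset_index())

def parse_series(value):
    """Return the time series as a list of dicts; an empty list for missing data."""
    return json.loads(value) if isinstance(value, str) else []

series = parse_series(agr.loc[agr['MOL_ID'] == 1, 'agriculture'].iloc[0])
missing = parse_series(float('nan'))
```

Guarding on `isinstance(value, str)` matters because the left merges leave NaN (a float) in rows without data, and `json.loads` would fail on it.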


In [45]:
nulls = ctx[ctx[['agriculture', 'builtup','extraction', 'intrusion', 'transportation']].isna().all(axis=1)]
len(nulls)

44593

Of the 217,429 protected areas, 44,593 (about 20.5%) have no human pressures data in any of the five categories.
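
To see where the gaps are, the same check can be broken down per category. This is a minimal sketch on a made-up four-row table, so the values are illustrative only; the column names match those in `ctx`.

```python
import numpy as np
import pandas as pd

pressure_cols = ['agriculture', 'builtup', 'extraction', 'intrusion', 'transportation']

# Hypothetical stand-in for ctx: '[]' represents any non-null JSON string
ctx_demo = pd.DataFrame({
    'MOL_ID': [1, 2, 3, 4],
    'agriculture': [np.nan, np.nan, '[]', np.nan],
    'builtup': [np.nan, np.nan, np.nan, np.nan],
    'extraction': [np.nan, np.nan, np.nan, np.nan],
    'intrusion': ['[]', np.nan, np.nan, np.nan],
    'transportation': [np.nan, np.nan, '[]', np.nan],
})

# Rows where every pressure column is missing
missing_all = ctx_demo[pressure_cols].isna().all(axis=1)
n_missing = int(missing_all.sum())
share_missing = round(100 * n_missing / len(ctx_demo), 1)

# Missing values per category
missing_per_column = ctx_demo[pressure_cols].isna().sum()
```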

In [46]:
ctx.columns

Index(['MOL_ID', 'WDPAID', 'WDPA_PID', 'NAME', 'DESIG', 'DESIG_T', 'IUCN_CA',
       'STATUS', 'GOV_TYP', 'MANG_AU', 'ISO3', 'AREA_KM2', 'DESIG_E',
       'ORIG_NA', 'STATUS_', 'amphibians', 'birds', 'mammals', 'reptiles',
       'amph_nspecies', 'bird_nspecies', 'mamm_nspecies', 'rept_nspecies',
       'nspecies', 'majority_land_cover_climate_regime', 'land_cover_majority',
       'climate_regime_majority', 'population_sum', 'agriculture', 'builtup',
       'extraction', 'intrusion', 'transportation'],
      dtype='object')

In [47]:
# Save dataframe
ctx.to_csv(f'{path_out}/wdpa_precalculated_aoi_summaries_updated.csv')

### Correct WDPA names
The dataset used to generate the precalculated data for the WDPA has corrupted protected area names: some special characters were replaced by "?". We fix this by taking the names from this [WDPA dataset](https://eowilson.maps.arcgis.com/home/item.html?id=ef9262a20fbb41bc8dc5eefdc9b93691) and joining them on `WDPA_PID`.

In [48]:
wdpa_names = pd.read_csv('/Users/sofia/Documents/HE_Data/WDPA/WDPA_FILTERED_20210615_FILTERED_TERR01_missing1980_no_oecm_wdpa_pid_hash_20230322.csv')
wdpa = pd.read_csv('/Users/sofia/Documents/HE_Data/Precalculated/WDPA_Precalculated/Outputs/wdpa_precalculated_aoi_summaries_updated.csv')


In [49]:
# Drop the index column that was written when the CSV was saved
wdpa = wdpa.drop(columns=['Unnamed: 0'])
wdpa.head(1)

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,52.0,107.0,Grassland,Cool Temperate Moist,2.110001,,,,"[{""Year"":1990,""percentage_land_encroachment"":1...",


In [50]:
wdpa_names.head(1)

Unnamed: 0,WDPAID,WDPA_PID,PA_DEF,NAME,ORIG_NAME,DESIG,DESIG_ENG,DESIG_TYPE,IUCN_CAT,INT_CRIT,...,METADATAID,SUB_LOC,PARENT_ISO3,ISO3,SUPP_INFO,CONS_OBJ,SORTER,WDPA_PID_hash,WDPA_PID_hash_int,hash_value
0,555561621,555561621,1,"Lsg-Morsbachtal, Eschbachtal, Seitentaeler Und...","Lsg-Morsbachtal, Eschbachtal, Seitentaeler Und...",Landschaftschutzgebiet,Landscape Protection Area,National,V,Not Applicable,...,1839,Not Reported,DEU,DEU,Not Applicable,Not Applicable,1800.15,fe23b936,2016138816,2016138816


In [51]:
wdpa[wdpa.WDPA_PID=='555517476']

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation
165231,165232,555517476.0,555517476,U kapli?ky,Site of Community Importance (Habitats Directive),Regional,Not Reported,Designated,Federal or national ministry or agency,Krajský ú?ad Jihomoravského kraje,...,171.0,110.0,Cropland,Cool Temperate Moist,,"[{""Year"":1990,""percentage_land_encroachment"":1...","[{""Year"":1995,""percentage_land_encroachment"":5...",,"[{""Year"":1990,""percentage_land_encroachment"":1...","[{""Year"":1990,""percentage_land_encroachment"":1..."


In [52]:
wdpa_names[wdpa_names.WDPA_PID=='555517476']

Unnamed: 0,WDPAID,WDPA_PID,PA_DEF,NAME,ORIG_NAME,DESIG,DESIG_ENG,DESIG_TYPE,IUCN_CAT,INT_CRIT,...,METADATAID,SUB_LOC,PARENT_ISO3,ISO3,SUPP_INFO,CONS_OBJ,SORTER,WDPA_PID_hash,WDPA_PID_hash_int,hash_value
164715,555517476,555517476,1,U kapličky,U kapličky,Site of Community Importance (Habitats Directive),Site of Community Importance (Habitats Directive),Regional,Not Reported,Not Applicable,...,1832,Not Reported,CZE,CZE,Not Applicable,Not Applicable,2008.12,1119c377,284617844,284617844


In [53]:
names = wdpa_names[['WDPA_PID', 'NAME']].rename(columns={'NAME':'NAME_correct'})
names.head(1)

Unnamed: 0,WDPA_PID,NAME_correct
0,555561621,"Lsg-Morsbachtal, Eschbachtal, Seitentaeler Und..."


In [54]:
# Merge NAME columns from new dataset into the one with the precalculations
dff = pd.merge(wdpa, names, how='left', on='WDPA_PID')
dff.head(1)

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation,NAME_correct
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,107.0,Grassland,Cool Temperate Moist,2.110001,,,,"[{""Year"":1990,""percentage_land_encroachment"":1...",,Boulder Beach
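
Because the join key is compared as text elsewhere in this notebook (`WDPA_PID=='555517476'`), it can help to make sure both sides of the merge hold `WDPA_PID` as strings and to check the result. The following is a sketch on made-up rows, not the notebook's actual data; the `_demo` frames are hypothetical. It uses the `validate` and `indicator` options of `pd.merge`.

```python
import pandas as pd

# Hypothetical miniature tables with a mixed-type key, as can happen after read_csv
wdpa_demo = pd.DataFrame({
    'WDPA_PID': ['310492', 555517476, '999'],
    'NAME': ['Boulder Beach', 'U kapli?ky', 'Unmatched area'],
})
wdpa_names_demo = pd.DataFrame({
    'WDPA_PID': [310492, '555517476'],
    'NAME': ['Boulder Beach', 'U kapličky'],
})

# Normalise the key to string on both sides before joining
wdpa_demo['WDPA_PID'] = wdpa_demo['WDPA_PID'].astype(str)
names_demo = (wdpa_names_demo[['WDPA_PID', 'NAME']]
              .rename(columns={'NAME': 'NAME_correct'})
              .astype({'WDPA_PID': str}))

# validate='m:1' raises if a WDPA_PID is duplicated in the lookup;
# indicator=True adds a _merge column that marks rows without a match
merged = pd.merge(wdpa_demo, names_demo, how='left', on='WDPA_PID',
                  validate='m:1', indicator=True)
unmatched = merged[merged['_merge'] == 'left_only']
```

A duplicated key in the lookup would otherwise silently add rows to the merged table, so the `validate` check is a cheap safeguard.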


In [55]:
# List the protected areas whose name differs between the precalculated table and the corrected dataset (rows without a match show NaN)
dff2 = dff[dff.NAME!=dff.NAME_correct]
dff2[['NAME', 'NAME_correct']]

Unnamed: 0,NAME,NAME_correct
20,"Mt Cargill ""Scenic Reserve""","Mt Cargill """"Scenic Reserve"""""
1270,Hima Huraymila? National Park,Hima Huraymila’ National Park
1271,Al-Ha?ir Wetland,Al-Ha’ir Wetland
1272,Yanbu? Coastal Conservation Area,Yanbu‘ Coastal Conservation Area
2692,ORI VARNOUNTA ? EVRYTERI PERIOCHI,ORI VARNOUNTA – EVRYTERI PERIOCHI
...,...,...
217426,4.11 St Sampson's Marais / Ivy Castle,
217427,4.7 Les Vicheries and Rue Rocheuse,
217428,"Les Demoiselles nursery (Plaisance Bay), Magda...",
217429,Strait Of Georgia And Howe Sound Glass Sponge ...,


In [56]:
# Protected areas with no match in the corrected dataset (NaN) keep their original name
dff['NAME_correct'] = dff['NAME_correct'].fillna(dff['NAME'])

In [57]:
# Re-check after filling the NaN values: the remaining rows are the names that will change
dff2 = dff[dff.NAME!=dff.NAME_correct]
dff2[['NAME', 'NAME_correct']]

Unnamed: 0,NAME,NAME_correct
20,"Mt Cargill ""Scenic Reserve""","Mt Cargill """"Scenic Reserve"""""
1270,Hima Huraymila? National Park,Hima Huraymila’ National Park
1271,Al-Ha?ir Wetland,Al-Ha’ir Wetland
1272,Yanbu? Coastal Conservation Area,Yanbu‘ Coastal Conservation Area
2692,ORI VARNOUNTA ? EVRYTERI PERIOCHI,ORI VARNOUNTA – EVRYTERI PERIOCHI
...,...,...
216259,Complexe du Parc Urbain Bãngr ? Weoogo et du l...,Complexe du Parc Urbain Bãngr – Weoogo et du l...
216637,Ch?ihilii Chìk,Ch’ihilii Chìk
216689,5-10-77,5/10/1977
216784,"Centr ohrany prirody ""Zejskij""","Centr ohrany prirody """"Zejskij"""""
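
Not every difference listed above is a "?" repair: some corrected names appear to carry doubled quotation marks, and `5-10-77` becomes `5/10/1977`. If only the "?" placeholders should be fixed, one option is to take the corrected name only where the original contains "?". This is a sketch of that alternative on made-up rows, not the approach used in the next cell; `NAME_fixed` is a hypothetical column name.

```python
import pandas as pd

# Hypothetical rows mirroring the kinds of differences listed above
dff_demo = pd.DataFrame({
    'NAME': ['U kapli?ky', '5-10-77', 'Mt Cargill "Scenic Reserve"', 'Boulder Beach'],
    'NAME_correct': ['U kapličky', '5/10/1977', 'Mt Cargill ""Scenic Reserve""', None],
})

# Replace only where the original has a "?" and a corrected name exists
has_placeholder = dff_demo['NAME'].str.contains('?', regex=False, na=False)
has_match = dff_demo['NAME_correct'].notna()
replace_mask = has_placeholder & has_match

# Series.where keeps the original where the condition is True
dff_demo['NAME_fixed'] = dff_demo['NAME'].where(~replace_mask, dff_demo['NAME_correct'])
```

A name that legitimately contains a question mark would also be replaced under this rule, so the changed rows are still worth a manual look.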


In [58]:
# Overwrite NAME in the precalculated table with the corrected names
dff['NAME'] = dff['NAME_correct']
dff = dff.drop(columns=['NAME_correct'])
dff.head()

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation
0,1,310492.0,310492,Boulder Beach,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,52.0,107.0,Grassland,Cool Temperate Moist,2.110001,,,,"[{""Year"":1990,""percentage_land_encroachment"":1...",
1,2,307797.0,307797,Ferndale,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,78.0,176.0,Forest,Warm Temperate Moist,1.315837,,,,,
2,3,307745.0,307745,Broughton Bay,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,30.0,0.0,,,,,,,,
3,4,307867.0,307867,Kaipupu Point,Scenic Reserve,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,70.0,0.0,,,,,,,,"[{""Year"":1990,""percentage_land_encroachment"":1..."
4,5,303963.0,303963,Catlins Conservation Park,Stewardship Area,National,III,Designated,Federal or national ministry or agency,Department of Conservation,...,53.0,97.0,Forest,Cool Temperate Moist,3.103363,,,,,


In [59]:
dff[dff.WDPA_PID=='555517476']

Unnamed: 0,MOL_ID,WDPAID,WDPA_PID,NAME,DESIG,DESIG_T,IUCN_CA,STATUS,GOV_TYP,MANG_AU,...,nspecies,majority_land_cover_climate_regime,land_cover_majority,climate_regime_majority,population_sum,agriculture,builtup,extraction,intrusion,transportation
165231,165232,555517476.0,555517476,U kapličky,Site of Community Importance (Habitats Directive),Regional,Not Reported,Designated,Federal or national ministry or agency,Krajský ú?ad Jihomoravského kraje,...,171.0,110.0,Cropland,Cool Temperate Moist,,"[{""Year"":1990,""percentage_land_encroachment"":1...","[{""Year"":1995,""percentage_land_encroachment"":5...",,"[{""Year"":1990,""percentage_land_encroachment"":1...","[{""Year"":1990,""percentage_land_encroachment"":1..."


In [60]:
dff.columns

Index(['MOL_ID', 'WDPAID', 'WDPA_PID', 'NAME', 'DESIG', 'DESIG_T', 'IUCN_CA',
       'STATUS', 'GOV_TYP', 'MANG_AU', 'ISO3', 'AREA_KM2', 'DESIG_E',
       'ORIG_NA', 'STATUS_', 'amphibians', 'birds', 'mammals', 'reptiles',
       'amph_nspecies', 'bird_nspecies', 'mamm_nspecies', 'rept_nspecies',
       'nspecies', 'majority_land_cover_climate_regime', 'land_cover_majority',
       'climate_regime_majority', 'population_sum', 'agriculture', 'builtup',
       'extraction', 'intrusion', 'transportation'],
      dtype='object')

In [61]:
# Save dataframe
dff.to_csv(f'{path_out}/wdpa_precalculated_aoi_summaries_updated.csv')

Upload the saved CSV to AGOL manually as a new feature layer, then create another service from its URL so that it is whitelisted.