# Update gadm names for national and subnational boundaries from gadm 3.6 to gadm 4.0

This notebook updates the NAME_0 and NAME_1 fields of the gadm0_precalculated and gadm1_precalculated layers. Only the names change; the geometries remain those of gadm 3.6. The goal is to pick up the most recent country and region names included in gadm.
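
The update amounts to a keyed lookup: match each row to the newer table by its GID, take the new name where there is a match, and keep the old name otherwise. Here is a minimal sketch on toy data. The helper `update_names` and the frames `old` and `new` are illustrative assumptions and do not appear in the cells below, which reach the same result with `pd.merge` and `fillna`.

```python
import pandas as pd


def update_names(layer, reference, gid_col, name_col):
    """Return a copy of `layer` with `name_col` taken from `reference`.

    Rows whose ID is missing from `reference` keep their current name.
    (Hypothetical helper, for illustration only.)
    """
    # One name per ID in the reference table
    lookup = reference.drop_duplicates(gid_col).set_index(gid_col)[name_col]
    # Map each ID to its new name; unmatched IDs become NaN
    new_names = layer[gid_col].map(lookup)
    out = layer.copy()
    # Fall back to the existing name where the reference has no match
    out[name_col] = new_names.fillna(layer[name_col])
    return out


# Toy data standing in for the gadm 3.6 layer and the gadm 4.0 table
old = pd.DataFrame({"GID_0": ["CZE", "HKG", "ABW"],
                    "NAME_0": ["Czech Republic", "Hong Kong", "Aruba"]})
new = pd.DataFrame({"GID_0": ["CZE", "CZE", "ABW"],
                    "NAME_0": ["Czechia", "Czechia", "Aruba"]})

updated = update_names(old, new, "GID_0", "NAME_0")
print(updated["NAME_0"].tolist())
```

The helper never touches the geometry column, which matches the intent of changing names only.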

July 2022

## Table of contents
1. [Update gadm0_precalculated](#gadm0)
2. [Update gadm1_precalculated](#gadm1)

### Import packages

In [1]:
import pandas as pd
import numpy as np
import geopandas as gpd
import arcgis
from arcgis.gis import GIS
import json
from arcgis.features import FeatureLayerCollection
import requests  # no `re` alias, which would shadow the standard-library `re` module
from copy import deepcopy
from itertools import repeat

<a id='gadm0'></a>
## Update gadm0_precalculated
* The GeoJSON is downloaded from AGOL. Exporting the hosted feature layer gives us a file to edit and a backup in case something fails.

* This notebook was also used to update the country names in the NRC dataset. The cells below show that run; the line that reads gadm0_precalculated is commented out.

**Open layer with gadm3.6 names**

In [5]:
# gadm_precalc = gpd.read_file('/Users/sofia/Documents/HE_Data/Precalculated/gadm0/gadm0_precalculated_20220224_backup2.geojson')

gadm_nrc = gpd.read_file('/Users/sofia/Documents/HE_Data/NRC/NRC_2022/NRC_final_20220623.geojson')

In [6]:
gadm_nrc.head()

Unnamed: 0,GID_0,NAME_0,Area_Country,x,y,jpg_url,has_priority,has_raisg,GlobalID,max_highlited_sp,...,mammals_mar,fishes_mar,endemic_mammals_mar,endemic_fishes_mar,nspecies_mar,total_endemic_mar,Area_EEZ,Global_SPI_mar,filter_similar_mar,geometry
0,ABW,Aruba,181.94,-69.97,12.51,https://live.staticflickr.com/1952/31416683438...,1.0,0.0,fe9f6eb0-f4f8-4f29-875a-5cbb3219e4e5,4.0,...,30,1648,0,0,1678,0,29970.3,23.4,"{'filter_Area_Country': ['CUW', 'GRD', 'LVA', ...",POINT (-69.97000 12.51000)
1,AFG,Afghanistan,643857.48,66.03,33.83,https://p1.pxfuel.com/preview/967/12/53/afghan...,1.0,0.0,193ba976-0e5a-4cf6-9b09-d00bf83f4557,5.0,...,0,0,0,0,0,0,,,,POINT (66.03000 33.83000)
2,AGO,Angola,1247421.58,17.58,-12.34,https://live.staticflickr.com/3787/13698381215...,1.0,0.0,174ce788-4f67-4ae0-922f-d2ddac87f8c3,24.0,...,46,1372,0,2,1418,2,495859.76,23.4,"{'filter_Area_Country': ['NFK', 'BMU', 'CCK', ...",POINT (17.58000 -12.34000)
3,AIA,Anguilla,83.3,-63.05,18.21,https://live.staticflickr.com/8063/8194570372_...,1.0,0.0,9f5f24d8-8b21-49a8-8f55-90b47cf63e7b,2.0,...,25,1508,0,0,1533,0,90157.96,23.4,"{'filter_Area_Country': ['ERI', 'AZE', 'FIN', ...",POINT (-63.05000 18.21000)
4,ALA,Åland,1506.26,19.97,60.24,https://p1.pxfuel.com/preview/294/670/561/alan...,1.0,0.0,2b45351b-a335-490e-914e-7748d4f41f66,1.0,...,0,0,0,0,0,0,,,,POINT (19.97000 60.24000)


In [7]:
len(gadm_nrc)

254

In [8]:
type(gadm_nrc)

geopandas.geodataframe.GeoDataFrame

**Open gadm version 4.0**

In [9]:
gadm40 = gpd.read_file('/Users/sofia/Documents/HE_Data/gadm/gadm404-shp/gadm404.shp')


In [10]:
len(gadm40)

348904

**Create new table with gadm40 in which we only have GID_0 and NAME_0**

In [11]:
gadm40_GID = gadm40[['GID_0', 'NAME_0']]
gadm40_GID = gadm40_GID.groupby('GID_0')
gadm40_GID = gadm40_GID.first()
gadm40_GID = gadm40_GID.reset_index()
gadm40_GID

Unnamed: 0,GID_0,NAME_0
0,ABW,Aruba
1,AFG,Afghanistan
2,AGO,Angola
3,AIA,Anguilla
4,ALA,Åland
...,...,...
258,Z09,Sang
259,ZAF,South Africa
260,ZMB,Zambia
261,ZNC,Northern Cyprus
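
A note on the lookup table built above: `groupby(...).first()` returns the first non-null value of each column within a group, whereas `drop_duplicates` keeps the first row exactly as it is. For a table where every row of a country carries the same name, both give the same result. The toy frame below is an assumption, built only to show where they differ.

```python
import pandas as pd

# Toy table: the first AFG row has a missing name
toy = pd.DataFrame({"GID_0": ["ABW", "ABW", "AFG", "AFG"],
                    "NAME_0": ["Aruba", "Aruba", None, "Afghanistan"]})

# groupby().first() skips nulls and returns the first non-null name per group
by_group = toy.groupby("GID_0").first().reset_index()

# drop_duplicates() keeps the first row per ID, nulls included
by_row = toy.drop_duplicates("GID_0").reset_index(drop=True)

print(by_group["NAME_0"].tolist())
print(by_row["NAME_0"].isna().tolist())
```

So `groupby(...).first()` is the more forgiving choice when some rows of the source table have an empty name.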


In [12]:
gadm40_GID = gadm40_GID.rename(columns={'GID_0':'GID', 'NAME_0':'NAME'})
gadm40_GID

Unnamed: 0,GID,NAME
0,ABW,Aruba
1,AFG,Afghanistan
2,AGO,Angola
3,AIA,Anguilla
4,ALA,Åland
...,...,...
258,Z09,Sang
259,ZAF,South Africa
260,ZMB,Zambia
261,ZNC,Northern Cyprus


**Merge new columns in gadm3.6 dataset**

In [13]:
gadm = pd.merge(gadm_nrc, gadm40_GID, how='left', left_on='GID_0', right_on='GID')
gadm.head()

Unnamed: 0,GID_0,NAME_0,Area_Country,x,y,jpg_url,has_priority,has_raisg,GlobalID,max_highlited_sp,...,endemic_mammals_mar,endemic_fishes_mar,nspecies_mar,total_endemic_mar,Area_EEZ,Global_SPI_mar,filter_similar_mar,geometry,GID,NAME
0,ABW,Aruba,181.94,-69.97,12.51,https://live.staticflickr.com/1952/31416683438...,1.0,0.0,fe9f6eb0-f4f8-4f29-875a-5cbb3219e4e5,4.0,...,0,0,1678,0,29970.3,23.4,"{'filter_Area_Country': ['CUW', 'GRD', 'LVA', ...",POINT (-69.97000 12.51000),ABW,Aruba
1,AFG,Afghanistan,643857.48,66.03,33.83,https://p1.pxfuel.com/preview/967/12/53/afghan...,1.0,0.0,193ba976-0e5a-4cf6-9b09-d00bf83f4557,5.0,...,0,0,0,0,,,,POINT (66.03000 33.83000),AFG,Afghanistan
2,AGO,Angola,1247421.58,17.58,-12.34,https://live.staticflickr.com/3787/13698381215...,1.0,0.0,174ce788-4f67-4ae0-922f-d2ddac87f8c3,24.0,...,0,2,1418,2,495859.76,23.4,"{'filter_Area_Country': ['NFK', 'BMU', 'CCK', ...",POINT (17.58000 -12.34000),AGO,Angola
3,AIA,Anguilla,83.3,-63.05,18.21,https://live.staticflickr.com/8063/8194570372_...,1.0,0.0,9f5f24d8-8b21-49a8-8f55-90b47cf63e7b,2.0,...,0,0,1533,0,90157.96,23.4,"{'filter_Area_Country': ['ERI', 'AZE', 'FIN', ...",POINT (-63.05000 18.21000),AIA,Anguilla
4,ALA,Åland,1506.26,19.97,60.24,https://p1.pxfuel.com/preview/294/670/561/alan...,1.0,0.0,2b45351b-a335-490e-914e-7748d4f41f66,1.0,...,0,0,0,0,,,,POINT (19.97000 60.24000),ALA,Åland


In [16]:
gadm[['GID_0', 'NAME_0', 'GID', 'NAME']]

Unnamed: 0,GID_0,NAME_0,GID,NAME
0,ABW,Aruba,ABW,Aruba
1,AFG,Afghanistan,AFG,Afghanistan
2,AGO,Angola,AGO,Angola
3,AIA,Anguilla,AIA,Anguilla
4,ALA,Åland,ALA,Åland
...,...,...,...,...
249,TZA,Tanzania,TZA,Tanzania
250,UGA,Uganda,UGA,Uganda
251,UKR,Ukraine,UKR,Ukraine
252,UMI,United States Minor Outlying Islands,UMI,United States Minor Outlying Islands
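
A left merge keeps every row of the layer, but it is worth checking that the lookup has one row per GID and seeing which rows found no match. The sketch below uses the `validate` and `indicator` options of `pd.merge` on toy frames. The frames `layer` and `lookup` are assumptions for illustration, not the notebook's data.

```python
import pandas as pd

layer = pd.DataFrame({"GID_0": ["ABW", "HKG"],
                      "NAME_0": ["Aruba", "Hong Kong"]})
lookup = pd.DataFrame({"GID": ["ABW"], "NAME": ["Aruba"]})

# validate="m:1" raises MergeError if a GID appears twice in the lookup;
# indicator=True adds a `_merge` column recording where each row came from
merged = pd.merge(layer, lookup, how="left", left_on="GID_0", right_on="GID",
                  validate="m:1", indicator=True)

# A left merge against a unique key must not change the row count
assert len(merged) == len(layer)

unmatched = merged.loc[merged["_merge"] == "left_only", "GID_0"].tolist()
print(unmatched)  # ['HKG']
```

Remember to drop the `_merge` column before saving if you use this option.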


**Check countries that have different names in gadm36 and gadm40**

In [17]:
gadm2 = gadm[gadm.NAME_0!=gadm.NAME]
gadm2[['NAME_0', 'NAME']]

Unnamed: 0,NAME_0,NAME
39,Northern Cyprus,
67,Republic of Congo,Republic of the Congo
71,Cape Verde,Cabo Verde
78,Czech Republic,Czechia
116,Hong Kong,
155,Macao,
164,Macedonia,North Macedonia
204,Palestina,Palestine
207,Reunion,Réunion
216,Saint Helena,"Saint Helena, Ascension and Tristan da Cunha"
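
The table above mixes two cases: countries that were renamed and countries with no match in gadm 4.0 (empty NAME). Both show up because comparing a string with a missing value using `!=` evaluates to True. A small sketch, on an assumed toy frame, of how the two cases can be separated:

```python
import pandas as pd

df = pd.DataFrame({"NAME_0": ["Macao", "Cape Verde", "Aruba"],
                   "NAME": [None, "Cabo Verde", "Aruba"]})

# True for renamed rows AND for rows with a missing NAME
differs = df["NAME_0"] != df["NAME"]

# Rows with no match in the newer table
unmatched = df["NAME"].isna()

# Rows that have a match and a different name
renamed = differs & ~unmatched

print(df.loc[renamed, ["NAME_0", "NAME"]])
print(df.loc[unmatched, "NAME_0"].tolist())
```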


In [18]:
gadm.columns

Index(['GID_0', 'NAME_0', 'Area_Country', 'x', 'y', 'jpg_url', 'has_priority',
       'has_raisg', 'GlobalID', 'max_highlited_sp', 'continent', 'GNI_PPP',
       'sentence', 'hm_ter', 'hm_no_ter', 'hm_vh_ter', 'Global_SPI_ter',
       'Pop2020', 'SPI_ter', 'prop_protected_ter', 'protection_needed_ter',
       'amphibians', 'birds', 'mammals', 'reptiles', 'endemic_amphibians',
       'endemic_birds', 'endemic_mammals', 'endemic_reptiles', 'nspecies_ter',
       'total_endemic_ter', 'filter_similar_ter', 'Marine', 'Pop2020_EEZ',
       'hm_no_mar', 'hm_mar', 'hm_vh_mar', 'SPI_mar', 'protection_needed_mar',
       'prop_protected_mar', 'mammals_mar', 'fishes_mar',
       'endemic_mammals_mar', 'endemic_fishes_mar', 'nspecies_mar',
       'total_endemic_mar', 'Area_EEZ', 'Global_SPI_mar', 'filter_similar_mar',
       'geometry', 'GID', 'NAME'],
      dtype='object')

In [None]:
# Countries with no match in gadm 4.0 (NaN) keep the name they had in gadm 3.6
gadm['NAME'] = gadm['NAME'].fillna(gadm['NAME_0'])

In [21]:
# List the names that still differ; these are the ones that will change
gadm2 = gadm[gadm.NAME_0!=gadm.NAME]
gadm2[['NAME_0', 'NAME']]

Unnamed: 0,NAME_0,NAME
67,Republic of Congo,Republic of the Congo
71,Cape Verde,Cabo Verde
78,Czech Republic,Czechia
164,Macedonia,North Macedonia
204,Palestina,Palestine
207,Reunion,Réunion
216,Saint Helena,"Saint Helena, Ascension and Tristan da Cunha"


In [22]:
# Overwrite NAME_0 with the new names and drop the helper columns
gadm.NAME_0 = gadm.NAME
gadm = gadm.drop(columns=['NAME', 'GID'])

In [23]:
# Now Czech Republic is Czechia
gadm.NAME_0[gadm.GID_0=='CZE']

78    Czechia
Name: NAME_0, dtype: object

In [25]:
gadm_nrc.NAME_0[gadm_nrc.GID_0=='CZE']

78    Czech Republic
Name: NAME_0, dtype: object

In [29]:
gadm.columns

Index(['GID_0', 'NAME_0', 'Area_Country', 'x', 'y', 'jpg_url', 'has_priority',
       'has_raisg', 'GlobalID', 'max_highlited_sp', 'continent', 'GNI_PPP',
       'sentence', 'hm_ter', 'hm_no_ter', 'hm_vh_ter', 'Global_SPI_ter',
       'Pop2020', 'SPI_ter', 'prop_protected_ter', 'protection_needed_ter',
       'amphibians', 'birds', 'mammals', 'reptiles', 'endemic_amphibians',
       'endemic_birds', 'endemic_mammals', 'endemic_reptiles', 'nspecies_ter',
       'total_endemic_ter', 'filter_similar_ter', 'Marine', 'Pop2020_EEZ',
       'hm_no_mar', 'hm_mar', 'hm_vh_mar', 'SPI_mar', 'protection_needed_mar',
       'prop_protected_mar', 'mammals_mar', 'fishes_mar',
       'endemic_mammals_mar', 'endemic_fishes_mar', 'nspecies_mar',
       'total_endemic_mar', 'Area_EEZ', 'Global_SPI_mar', 'filter_similar_mar',
       'geometry'],
      dtype='object')

**Save file with the name of the original layer and use it to overwrite the current hosted table in ArcGIS Online**

In [30]:
gadm.to_file('/Users/sofia/Documents/HE_Data/NRC/NRC_2022/NRC_final_20220623_gadm4.geojson',driver='GeoJSON')

---
<a id='gadm1'></a>
## Update gadm1_precalculated

In the layer [gadm1_precalculated](https://eowilson.maps.arcgis.com/home/item.html?id=fe214eeebd21493eb2782a7ce1466606#data), many names contain non-ASCII characters that have been replaced by `?`. To fix this, let's update the subnational names with those in gadm 4.0.
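
To get a feel for how many names are affected, we can count the rows whose name contains a literal `?`. This is a sketch on toy data (the frame `regions` is an assumption). It treats any `?` as a damaged character, which seems reasonable for place names but is not guaranteed.

```python
import pandas as pd

regions = pd.DataFrame({
    "GID_1": ["LTU.7_1", "TUR.10_1", "ECU.6_1"],   # toy IDs, for illustration
    "NAME_1": ["?iauliai", "Bart?n", "Cotopaxi"],
})

# regex=False so that "?" is matched literally, not as a regex quantifier
damaged = regions["NAME_1"].str.contains("?", regex=False)

print(int(damaged.sum()))
print(regions.loc[damaged, "NAME_1"].tolist())
```

Running the same check after the update gives a quick measure of how many names were repaired.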

**Import packages**

**Bring dataset hosted in AGOL**

Download the layer from AGOL. The downloaded file also serves as a backup copy of the current layer in case something goes wrong.

In [6]:
gadm1 = gpd.read_file('/Users/sofia/Documents/HE_Data/precalculated/gadm1/gadm1_precalculated_range_area_0_backup.geojson')
gadm1

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry
0,ECU,Ecuador,ECU.6_1,Cotopaxi,801,6172.385,"[{""SliceNumber"":310,""per_global"":2.01,""per_aoi...","[{""SliceNumber"":555,""per_global"":24.8,""per_aoi...","[{""SliceNumber"":59,""per_global"":0.84,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.02,""per_aoi""...",...,8.01,62.57,,487626.1,176.0,Forest,Warm Temperate Moist,4,1,"MULTIPOLYGON (((-78.40904 -0.72033, -78.40891 ..."
1,LBN,Lebanon,LBN.5_1,Mount Lebanon,1601,1985.055,"[{""SliceNumber"":2,""per_global"":1.61,""per_aoi"":...","[{""SliceNumber"":955,""per_global"":0.06,""per_aoi...","[{""SliceNumber"":259,""per_global"":0.0,""per_aoi""...","[{""SliceNumber"":121,""per_global"":0.01,""per_aoi...",...,57.94,,20.09,4637642.0,175.0,Shrubland,Warm Temperate Moist,5,2,"POLYGON ((35.62627 33.49696, 35.62548 33.49446..."
2,ECU,Ecuador,ECU.7_1,El Oro,802,5868.456,"[{""SliceNumber"":310,""per_global"":3.75,""per_aoi...","[{""SliceNumber"":1010,""per_global"":1.45,""per_ao...","[{""SliceNumber"":56,""per_global"":0.38,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.05,""per_aoi""...",...,4.91,58.84,2.7,698379.8,262.0,Forest,Sub Tropical Moist,4,3,"MULTIPOLYGON (((-80.44117 -3.17687, -80.44184 ..."
3,LBN,Lebanon,LBN.6_1,Nabatiyeh,1602,1095.317,"[{""SliceNumber"":2,""per_global"":0.75,""per_aoi"":...","[{""SliceNumber"":947,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":33,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":97,""per_global"":0.0,""per_aoi"":...",...,92.29,,5.29,762029.5,173.0,Cropland,Warm Temperate Moist,5,4,"POLYGON ((35.59720 33.27736, 35.59016 33.28218..."
4,IDN,Indonesia,IDN.25_1,Sulawesi Barat,1201,16571.380,"[{""SliceNumber"":143,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":1700,""per_global"":0.01,""per_ao...","[{""SliceNumber"":23,""per_global"":8.61,""per_aoi""...","[{""SliceNumber"":43,""per_global"":9.85,""per_aoi""...",...,30.06,5.05,6.89,1661324.0,262.0,Forest,Sub Tropical Moist,2,5,"MULTIPOLYGON (((119.35876 -3.48674, 119.35515 ..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3605,ZWE,Zimbabwe,ZWE.6_1,Mashonaland West,3606,57396.730,"[{""SliceNumber"":40,""per_global"":0.09,""per_aoi""...","[{""SliceNumber"":212,""per_global"":1.28,""per_aoi...","[{""SliceNumber"":28,""per_global"":0.71,""per_aoi""...","[{""SliceNumber"":26,""per_global"":0.26,""per_aoi""...",...,3.24,77.84,0.32,1774267.0,290.0,Forest,Sub Tropical Dry,3,3606,"POLYGON ((30.37916 -18.83976, 30.36670 -18.835..."
3606,ZWE,Zimbabwe,ZWE.7_1,Masvingo,3607,56280.110,"[{""SliceNumber"":40,""per_global"":0.05,""per_aoi""...","[{""SliceNumber"":33,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":28,""per_global"":0.64,""per_aoi""...","[{""SliceNumber"":26,""per_global"":0.27,""per_aoi""...",...,0.43,95.45,0.19,1631915.0,290.0,Forest,Sub Tropical Dry,3,3607,"POLYGON ((31.06733 -22.34189, 31.11290 -22.336..."
3607,ZWE,Zimbabwe,ZWE.8_1,Matabeleland North,3608,75500.590,"[{""SliceNumber"":40,""per_global"":0.03,""per_aoi""...","[{""SliceNumber"":212,""per_global"":1.6,""per_aoi""...","[{""SliceNumber"":28,""per_global"":1.13,""per_aoi""...","[{""SliceNumber"":26,""per_global"":0.36,""per_aoi""...",...,1.04,50.68,0.15,791251.1,288.0,Forest,Sub Tropical Dry,3,3608,"POLYGON ((28.66857 -20.30021, 28.63305 -20.260..."
3608,ZWE,Zimbabwe,ZWE.9_1,Matabeleland South,3609,54675.750,"[{""SliceNumber"":40,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":212,""per_global"":1.18,""per_aoi...","[{""SliceNumber"":28,""per_global"":0.81,""per_aoi""...","[{""SliceNumber"":26,""per_global"":0.26,""per_aoi""...",...,0.37,92.22,,709477.6,285.0,Shrubland,Sub Tropical Dry,3,3609,"POLYGON ((30.99968 -22.31642, 30.98855 -22.327..."


In [7]:
gadm1.columns

Index(['GID_0', 'NAME_0', 'GID_1', 'NAME_1', 'MOL_ID', 'AREA_KM2', 'reptiles',
       'amphibians', 'mammals', 'birds', 'percentage_protected',
       'percent_irrigated', 'percent_rainfed', 'percent_rangeland',
       'percent_urban', 'population_sum', 'majority_land_cover_climate_reg',
       'land_cover_majority', 'climate_regime_majority', 'country_size',
       'ObjectId', 'geometry'],
      dtype='object')

**Read gadm4.0 and prepare it to update the field NAME_1**

In [8]:
# Read gadm 4.0
gadm40 = gpd.read_file('/Users/sofia/Documents/HE_Data/gadm/gadm404-shp/gadm404.shp')


In [9]:
gadm40_GID = gadm40[['GID_1', 'NAME_1']]
gadm40_GID = gadm40_GID.groupby('GID_1')
gadm40_GID = gadm40_GID.first()
gadm40_GID = gadm40_GID.reset_index()
gadm40_GID


Unnamed: 0,GID_1,NAME_1
0,?,?
1,AFG.10_1,Ghor
2,AFG.11_1,Hilmand
3,AFG.12_1,Hirat
4,AFG.13_1,Jawzjan
...,...,...
3651,ZWE.5_1,Mashonaland East
3652,ZWE.6_1,Mashonaland West
3653,ZWE.7_1,Masvingo
3654,ZWE.8_1,Matabeleland North


In [10]:
gadm40_GID = gadm40_GID.rename(columns={'GID_1':'GID', 'NAME_1':'NAME'})
gadm40_GID


Unnamed: 0,GID,NAME
0,?,?
1,AFG.10_1,Ghor
2,AFG.11_1,Hilmand
3,AFG.12_1,Hirat
4,AFG.13_1,Jawzjan
...,...,...
3651,ZWE.5_1,Mashonaland East
3652,ZWE.6_1,Mashonaland West
3653,ZWE.7_1,Masvingo
3654,ZWE.8_1,Matabeleland North


**Merge datasets**

In [11]:
# Merge both datasets by GID
gadm = pd.merge(gadm1, gadm40_GID, how='left', left_on='GID_1', right_on='GID')
gadm.head()

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry,GID,NAME
0,ECU,Ecuador,ECU.6_1,Cotopaxi,801,6172.385,"[{""SliceNumber"":310,""per_global"":2.01,""per_aoi...","[{""SliceNumber"":555,""per_global"":24.8,""per_aoi...","[{""SliceNumber"":59,""per_global"":0.84,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.02,""per_aoi""...",...,,487626.1,176.0,Forest,Warm Temperate Moist,4,1,"MULTIPOLYGON (((-78.40904 -0.72033, -78.40891 ...",ECU.6_1,Cotopaxi
1,LBN,Lebanon,LBN.5_1,Mount Lebanon,1601,1985.055,"[{""SliceNumber"":2,""per_global"":1.61,""per_aoi"":...","[{""SliceNumber"":955,""per_global"":0.06,""per_aoi...","[{""SliceNumber"":259,""per_global"":0.0,""per_aoi""...","[{""SliceNumber"":121,""per_global"":0.01,""per_aoi...",...,20.09,4637642.0,175.0,Shrubland,Warm Temperate Moist,5,2,"POLYGON ((35.62627 33.49696, 35.62548 33.49446...",LBN.5_1,Mount Lebanon
2,ECU,Ecuador,ECU.7_1,El Oro,802,5868.456,"[{""SliceNumber"":310,""per_global"":3.75,""per_aoi...","[{""SliceNumber"":1010,""per_global"":1.45,""per_ao...","[{""SliceNumber"":56,""per_global"":0.38,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.05,""per_aoi""...",...,2.7,698379.8,262.0,Forest,Sub Tropical Moist,4,3,"MULTIPOLYGON (((-80.44117 -3.17687, -80.44184 ...",ECU.7_1,El Oro
3,LBN,Lebanon,LBN.6_1,Nabatiyeh,1602,1095.317,"[{""SliceNumber"":2,""per_global"":0.75,""per_aoi"":...","[{""SliceNumber"":947,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":33,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":97,""per_global"":0.0,""per_aoi"":...",...,5.29,762029.5,173.0,Cropland,Warm Temperate Moist,5,4,"POLYGON ((35.59720 33.27736, 35.59016 33.28218...",LBN.6_1,Nabatiyeh
4,IDN,Indonesia,IDN.25_1,Sulawesi Barat,1201,16571.38,"[{""SliceNumber"":143,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":1700,""per_global"":0.01,""per_ao...","[{""SliceNumber"":23,""per_global"":8.61,""per_aoi""...","[{""SliceNumber"":43,""per_global"":9.85,""per_aoi""...",...,6.89,1661324.0,262.0,Forest,Sub Tropical Moist,2,5,"MULTIPOLYGON (((119.35876 -3.48674, 119.35515 ...",IDN.25_1,Sulawesi Barat


In [12]:
gadm[['GID_1', 'NAME_1', 'GID', 'NAME']]

Unnamed: 0,GID_1,NAME_1,GID,NAME
0,ECU.6_1,Cotopaxi,ECU.6_1,Cotopaxi
1,LBN.5_1,Mount Lebanon,LBN.5_1,Mount Lebanon
2,ECU.7_1,El Oro,ECU.7_1,El Oro
3,LBN.6_1,Nabatiyeh,LBN.6_1,Nabatiyeh
4,IDN.25_1,Sulawesi Barat,IDN.25_1,Sulawesi Barat
...,...,...,...,...
3605,ZWE.6_1,Mashonaland West,ZWE.6_1,Mashonaland West
3606,ZWE.7_1,Masvingo,ZWE.7_1,Masvingo
3607,ZWE.8_1,Matabeleland North,ZWE.8_1,Matabeleland North
3608,ZWE.9_1,Matabeleland South,ZWE.9_1,Matabeleland South


In [13]:
# Identify regions whose name differs between gadm 3.6 and gadm 4.0 (regions with no match show NaN)
gadm2 = gadm[gadm.NAME_1!=gadm.NAME]
gadm2[['NAME_1', 'NAME']]

Unnamed: 0,NAME_1,NAME
13,GrandBassa,Grand Bassa
14,GrandGedeh,Grand Gedeh
15,GrandKru,Grand Kru
21,River Cess,Rivercess
107,?iauliai,Šiauliai
...,...,...
3463,Chardzhou,Lebap
3465,Tashauz,Daşoguz
3535,Bart?n,Bartın
3551,Elaz??,Elazığ


In [14]:
len(gadm[gadm.NAME.isnull()]) # 112 regions have a name in gadm 3.6 but no match in gadm 4.0, so we keep their old names

112

In [15]:
# Regions with no match in gadm 4.0 (NaN) keep the name they had in gadm 3.6
gadm['NAME'] = gadm['NAME'].fillna(gadm['NAME_1'])

In [16]:
# With the gaps filled, list the regions whose name will actually change
gadm2 = gadm[gadm.NAME_1!=gadm.NAME]
gadm2[['NAME_1', 'NAME']]

Unnamed: 0,NAME_1,NAME
13,GrandBassa,Grand Bassa
14,GrandGedeh,Grand Gedeh
15,GrandKru,Grand Kru
21,River Cess,Rivercess
107,?iauliai,Šiauliai
...,...,...
3463,Chardzhou,Lebap
3465,Tashauz,Daşoguz
3535,Bart?n,Bartın
3551,Elaz??,Elazığ


**Change the field NAME_1 to include the new names**

In [17]:
gadm.NAME_1 = gadm.NAME
gadm = gadm.drop(columns=['NAME', 'GID'])

In [18]:
gadm[gadm.NAME_1=='Iğdır']

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry
3560,TUR,Turkey,TUR.38_1,Iğdır,3157,3929.341,"[{""SliceNumber"":1,""per_global"":0.19,""per_aoi"":...","[{""SliceNumber"":955,""per_global"":0.13,""per_aoi...","[{""SliceNumber"":259,""per_global"":0.05,""per_aoi...","[{""SliceNumber"":42,""per_global"":0.01,""per_aoi""...",...,31.59,34.73,,205832.0,148.0,Cropland,Cool Temperate Dry,3,3561,"POLYGON ((44.34463 40.02792, 44.37977 40.00528..."


**Change the field NAME_0**

In [19]:
gadm40_GID0 = gadm40[['GID_0', 'NAME_0']]
gadm40_GID0 = gadm40_GID0.groupby('GID_0')
gadm40_GID0 = gadm40_GID0.first()
gadm40_GID0 = gadm40_GID0.reset_index()
gadm40_GID0

Unnamed: 0,GID_0,NAME_0
0,ABW,Aruba
1,AFG,Afghanistan
2,AGO,Angola
3,AIA,Anguilla
4,ALA,Åland
...,...,...
258,Z09,Sang
259,ZAF,South Africa
260,ZMB,Zambia
261,ZNC,Northern Cyprus


In [20]:
gadm40_GID0 = gadm40_GID0.rename(columns={'GID_0':'GID', 'NAME_0':'NAME'})
gadm40_GID0

Unnamed: 0,GID,NAME
0,ABW,Aruba
1,AFG,Afghanistan
2,AGO,Angola
3,AIA,Anguilla
4,ALA,Åland
...,...,...
258,Z09,Sang
259,ZAF,South Africa
260,ZMB,Zambia
261,ZNC,Northern Cyprus


In [21]:
# Merge the layer with the gadm 4.0 country names by GID_0
gadm = pd.merge(gadm, gadm40_GID0, how='left', left_on='GID_0', right_on='GID')
gadm.head()

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry,GID,NAME
0,ECU,Ecuador,ECU.6_1,Cotopaxi,801,6172.385,"[{""SliceNumber"":310,""per_global"":2.01,""per_aoi...","[{""SliceNumber"":555,""per_global"":24.8,""per_aoi...","[{""SliceNumber"":59,""per_global"":0.84,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.02,""per_aoi""...",...,,487626.1,176.0,Forest,Warm Temperate Moist,4,1,"MULTIPOLYGON (((-78.40904 -0.72033, -78.40891 ...",ECU,Ecuador
1,LBN,Lebanon,LBN.5_1,Mount Lebanon,1601,1985.055,"[{""SliceNumber"":2,""per_global"":1.61,""per_aoi"":...","[{""SliceNumber"":955,""per_global"":0.06,""per_aoi...","[{""SliceNumber"":259,""per_global"":0.0,""per_aoi""...","[{""SliceNumber"":121,""per_global"":0.01,""per_aoi...",...,20.09,4637642.0,175.0,Shrubland,Warm Temperate Moist,5,2,"POLYGON ((35.62627 33.49696, 35.62548 33.49446...",LBN,Lebanon
2,ECU,Ecuador,ECU.7_1,El Oro,802,5868.456,"[{""SliceNumber"":310,""per_global"":3.75,""per_aoi...","[{""SliceNumber"":1010,""per_global"":1.45,""per_ao...","[{""SliceNumber"":56,""per_global"":0.38,""per_aoi""...","[{""SliceNumber"":27,""per_global"":0.05,""per_aoi""...",...,2.7,698379.8,262.0,Forest,Sub Tropical Moist,4,3,"MULTIPOLYGON (((-80.44117 -3.17687, -80.44184 ...",ECU,Ecuador
3,LBN,Lebanon,LBN.6_1,Nabatiyeh,1602,1095.317,"[{""SliceNumber"":2,""per_global"":0.75,""per_aoi"":...","[{""SliceNumber"":947,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":33,""per_global"":0.0,""per_aoi"":...","[{""SliceNumber"":97,""per_global"":0.0,""per_aoi"":...",...,5.29,762029.5,173.0,Cropland,Warm Temperate Moist,5,4,"POLYGON ((35.59720 33.27736, 35.59016 33.28218...",LBN,Lebanon
4,IDN,Indonesia,IDN.25_1,Sulawesi Barat,1201,16571.38,"[{""SliceNumber"":143,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":1700,""per_global"":0.01,""per_ao...","[{""SliceNumber"":23,""per_global"":8.61,""per_aoi""...","[{""SliceNumber"":43,""per_global"":9.85,""per_aoi""...",...,6.89,1661324.0,262.0,Forest,Sub Tropical Moist,2,5,"MULTIPOLYGON (((119.35876 -3.48674, 119.35515 ...",IDN,Indonesia


In [22]:
gadm[['GID_0', 'NAME_0', 'GID', 'NAME']]

Unnamed: 0,GID_0,NAME_0,GID,NAME
0,ECU,Ecuador,ECU,Ecuador
1,LBN,Lebanon,LBN,Lebanon
2,ECU,Ecuador,ECU,Ecuador
3,LBN,Lebanon,LBN,Lebanon
4,IDN,Indonesia,IDN,Indonesia
...,...,...,...,...
3605,ZWE,Zimbabwe,ZWE,Zimbabwe
3606,ZWE,Zimbabwe,ZWE,Zimbabwe
3607,ZWE,Zimbabwe,ZWE,Zimbabwe
3608,ZWE,Zimbabwe,ZWE,Zimbabwe


In [23]:
# Identify rows whose country name differs between gadm 3.6 and gadm 4.0 (countries with no match show NaN)
gadm2 = gadm[gadm.NAME_0!=gadm.NAME]
gadm2[['NAME_0', 'NAME']]

Unnamed: 0,NAME_0,NAME
120,Macao,
121,Macao,
212,Macedonia,North Macedonia
213,Macedonia,North Macedonia
214,Macedonia,North Macedonia
...,...,...
2761,Northern Cyprus,
2762,Northern Cyprus,
2763,Northern Cyprus,
2764,Northern Cyprus,


In [24]:
len(gadm[gadm.NAME.isnull()]) # 25 rows (regions) belong to countries that have no match in gadm 4.0, so we keep their old country names

25
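
Since this layer has one row per region, `len(...)` here counts regions, not countries. To see how many distinct countries are behind the missing names, count the unique GID_0 values. A sketch on toy data (the frame `merged` is an assumption):

```python
import pandas as pd

merged = pd.DataFrame({
    "GID_0": ["MAC", "MAC", "HKG", "CZE"],
    "NAME": [None, None, None, "Czechia"],
})

missing = merged[merged["NAME"].isna()]

n_rows = len(missing)                      # regions without a country name
n_countries = missing["GID_0"].nunique()   # distinct countries behind them

print(n_rows, n_countries)  # 3 2
```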

In [25]:
# Rows whose country has no match in gadm 4.0 (NaN) keep the country name they had in gadm 3.6
gadm['NAME'] = gadm['NAME'].fillna(gadm['NAME_0'])

In [26]:
# With the gaps filled, list the rows whose country name will actually change
gadm2 = gadm[gadm.NAME_0!=gadm.NAME]
gadm2[['NAME_0', 'NAME']]

Unnamed: 0,NAME_0,NAME
212,Macedonia,North Macedonia
213,Macedonia,North Macedonia
214,Macedonia,North Macedonia
215,Macedonia,North Macedonia
216,Macedonia,North Macedonia
...,...,...
2165,Reunion,Réunion
2166,Reunion,Réunion
2347,Saint Helena,"Saint Helena, Ascension and Tristan da Cunha"
2348,Saint Helena,"Saint Helena, Ascension and Tristan da Cunha"


In [27]:
gadm.NAME_0 = gadm.NAME
gadm = gadm.drop(columns=['NAME', 'GID'])

In [28]:
gadm[gadm.NAME_0=='Cabo Verde'].head(2)

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry
1808,CPV,Cabo Verde,CPV.1_1,Boa Vista,609,630.0752,"[{""SliceNumber"":2175,""per_global"":31.64,""per_a...","[{""SliceNumber"":252,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":2335,""per_global"":0.01,""per_ao...","[{""SliceNumber"":102,""per_global"":8.39,""per_aoi...",...,0.91,,,19314.74,312.0,Sparsley or Non vegetated,Sub Tropical Desert,5,1809,"MULTIPOLYGON (((-22.96042 16.05542, -22.95903 ..."
1809,CPV,Cabo Verde,CPV.2_1,Brava,610,69.45991,"[{""SliceNumber"":2171,""per_global"":5.1,""per_aoi...","[{""SliceNumber"":252,""per_global"":0.0,""per_aoi""...",[],"[{""SliceNumber"":102,""per_global"":2.03,""per_aoi...",...,,,,4717.327,286.0,Shrubland,Sub Tropical Dry,5,1810,"MULTIPOLYGON (((-24.63486 14.98486, -24.62958 ..."


In [29]:
len(gadm)

3610

In [50]:
# See differences between original dataset and new dataset
gadm1.NAME_0[gadm1.NAME_0!=gadm.NAME_0].unique()

array(['Macedonia', 'Republic of Congo', 'Cape Verde', 'Czech Republic',
       'Palestina', 'Reunion', 'Saint Helena'], dtype=object)
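
The comparison above lists only the old names. To see each old name next to its replacement, the two columns can be put side by side and deduplicated. This sketch uses toy series (`before` and `after` are assumptions) and, like the cell above, relies on both having the same row order and index.

```python
import pandas as pd

before = pd.Series(["Macedonia", "Macedonia", "Aruba", "Reunion"], name="old")
after = pd.Series(["North Macedonia", "North Macedonia", "Aruba", "Réunion"],
                  name="new")

changes = (
    pd.concat([before, after], axis=1)      # align old and new names by index
      .loc[lambda d: d["old"] != d["new"]]  # keep only the rows that changed
      .drop_duplicates()                    # one line per renamed country
      .reset_index(drop=True)
)

print(changes)
```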

In [53]:
gadm[gadm.NAME_0=='Czechia'].head(2)

Unnamed: 0,GID_0,NAME_0,GID_1,NAME_1,MOL_ID,AREA_KM2,reptiles,amphibians,mammals,birds,...,percent_rainfed,percent_rangeland,percent_urban,population_sum,majority_land_cover_climate_reg,land_cover_majority,climate_regime_majority,country_size,ObjectId,geometry
1865,CZE,Czechia,CZE.1_1,Jihočeský,666,9999.675,"[{""SliceNumber"":574,""per_global"":0.45,""per_aoi...","[{""SliceNumber"":815,""per_global"":0.32,""per_aoi...","[{""SliceNumber"":129,""per_global"":0.04,""per_aoi...","[{""SliceNumber"":42,""per_global"":0.03,""per_aoi""...",...,74.34,0.03,1.77,639129.8,110.0,Cropland,Cool Temperate Moist,5,1866,"POLYGON ((14.63085 48.60770, 14.62451 48.60737..."
1866,CZE,Czechia,CZE.2_1,Jihomoravský,667,7092.675,"[{""SliceNumber"":573,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":815,""per_global"":0.25,""per_aoi...","[{""SliceNumber"":129,""per_global"":0.01,""per_aoi...","[{""SliceNumber"":42,""per_global"":0.01,""per_aoi""...",...,89.48,,2.01,1199884.0,110.0,Cropland,Cool Temperate Moist,5,1867,"POLYGON ((16.76474 48.71827, 16.76360 48.71934..."


In [55]:
gadm.columns

Index(['GID_0', 'NAME_0', 'GID_1', 'NAME_1', 'MOL_ID', 'AREA_KM2', 'reptiles',
       'amphibians', 'mammals', 'birds', 'percentage_protected',
       'percent_irrigated', 'percent_rainfed', 'percent_rangeland',
       'percent_urban', 'population_sum', 'majority_land_cover_climate_reg',
       'land_cover_majority', 'climate_regime_majority', 'country_size',
       'ObjectId', 'geometry'],
      dtype='object')

**Save file**

Save the file under the same name as the original one and use it to manually overwrite the hosted feature service with the new dataset. We tried to do this through the API, but it raised an error, probably due to the size of the layer.

In [68]:
gadm.to_file('/Users/sofia/Documents/HE_Data/precalculated/gadm1/gadm1_precalculated_updated/gadm1_precalculated_range.geojson', driver='GeoJSON')
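
Before uploading, it may be worth reading the saved file back to confirm that the special characters survived the export. The sketch below uses only the standard library and a toy GeoJSON written to a temporary folder. The file name and the point coordinates are assumptions, and the real file would be the one saved above.

```python
import json
import os
import tempfile

# Toy FeatureCollection standing in for the exported layer
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"GID_1": "TUR.38_1", "NAME_1": "Iğdır"},
         "geometry": {"type": "Point", "coordinates": [44.0, 39.9]}},
    ],
}

# Write the file as UTF-8 without escaping non-ASCII characters
path = os.path.join(tempfile.mkdtemp(), "toy_layer.geojson")
with open(path, "w", encoding="utf-8") as f:
    json.dump(collection, f, ensure_ascii=False)

# Read it back and look for names that still contain "?"
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

names = [feat["properties"]["NAME_1"] for feat in loaded["features"]]
suspicious = [n for n in names if "?" in n]

print(len(names), suspicious)
```

An empty `suspicious` list and the expected feature count are a reasonable signal that the file is ready for the manual overwrite.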