Notebook to check the geographical granularity of the different datasets and harmonize them at the municipality level.

**Requirements**: numpy pandas xlrd geopandas openpyxl

## Content
* Data ingestion
* Comparison
* Corrections:
    * Correction of PCF counties to correspond to the ones in the shapefile
    * Disaggregation of pre-PCF adoption:
        * Match of regions with municipalities
        * Disaggregation of adopted area based on pasture area in each municipality
      
**NOTE:** in the last section on disaggregation, it is possible to choose between the two possibilities for the disaggregation of pre-PCF adoption.

In [847]:
import numpy as np
import pandas as pd

# Data ingestion

## SBP adoption previous to the PCF project

In [848]:
path_to_adoption_pre_PCF = "./adoption/Pastures before 2009.xlsx"

In [849]:
adoption_pre_PCF = pd.read_excel(path_to_adoption_pre_PCF, header=2, index_col=1)
adoption_pre_PCF = adoption_pre_PCF.drop('Unnamed: 0', axis=1)
adoption_pre_PCF.head()

Region,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008
Minho,0,1.0,2,0.0,30.0,29,12,27.5,46,46,79,90,455
Aveiro,0,6.0,5,0.0,3.0,0,10,30.0,37,35,36,138,50
Coimbra,5,2.5,0,1.5,4.5,9,2,26.5,12,7,7,18,15
Trás-os-Montes,2,5.0,32,35.0,30.0,41,76,89.0,110,69,84,120,97
Guarda,18,73.0,135,128.0,35.0,16,92,100.0,142,108,119,134,223


In [850]:
regions_prePCF = adoption_pre_PCF.index.tolist()
len(regions_prePCF)

24

## PCF project

In [851]:
path_to_PCF_data = "./adoption/20160729_RelCampo_Pastagens_Chave_RT.xlsx"

In [852]:
cols_to_fetch = ['Parcel_ID', 'Farmer_ID', 'Year that the pasture was installed', 'County', 'Area_Total_SIG_PPI_2009_ha',
                 'Area_Total_SIG_PPI_2010_ha', 'Area_Total_SIG_PPII_2011_ha', 'Area_Total_SIG_PPII_2012_ha']
PCF_data = pd.read_excel(path_to_PCF_data, sheet_name='Data_table', usecols=cols_to_fetch, header=1, index_col=0)
PCF_data = PCF_data.fillna(0)
PCF_data.head()

Parcel_ID,Farmer_ID,Year that the pasture was installed,County,Area_Total_SIG_PPI_2009_ha,Area_Total_SIG_PPI_2010_ha,Area_Total_SIG_PPII_2011_ha,Area_Total_SIG_PPII_2012_ha
55,1,2009,Idanha-a-Nova,9.28,0.0,0.0,0.0
410,2,2010,Évora,0.0,3.21,0.0,0.0
681,2,2011,Évora,0.0,0.0,6.83,0.0
1068,3,2012,Reguengos de Monsaraz,0.0,0.0,0.0,38.17
584,4,2010,Avis,0.0,9.57,0.0,0.0


In [853]:
counties_PCF = PCF_data['County'].unique().tolist()
len(counties_PCF)

86

## Census data 2009

In [854]:
import os

In [855]:
path_to_pastures_data = "./census/extracted/municipalities_permanent_pastures_area.csv"
municipalities_pastures_area = pd.read_csv(path_to_pastures_data, index_col = "Municipality")

In [856]:
municipalities_census = municipalities_pastures_area.index.values
municipalities_census[:5]

array(['Arouca', 'Castelo de Paiva', 'Espinho', 'Santa Maria da Feira',
       'Oliveira de Azeméis'], dtype=object)

In [857]:
len(municipalities_census)

278

## Shapefile

In [858]:
import geopandas as gpd

In [859]:
path_to_shapefile = "./counties_shp/mod_concelhos.shp"

In [860]:
shapefile_data = gpd.read_file(path_to_shapefile)

In [861]:
shapefile_data.head()

Unnamed: 0,CCA_2,District,Municipali,geometry
0,705,Évora,Évora,"POLYGON ((-7.79291 38.76507, -7.79287 38.76506..."
1,701,Évora,Alandroal,"POLYGON ((-7.25937 38.77351, -7.25921 38.77343..."
2,702,Évora,Arraiolos,"POLYGON ((-7.88611 38.92495, -7.88580 38.92472..."
3,703,Évora,Borba,"POLYGON ((-7.46362 38.92344, -7.46344 38.92329..."
4,704,Évora,Estremoz,"POLYGON ((-7.52770 39.00080, -7.52765 39.00066..."


In [862]:
shapefile_data.columns=['CCA_2', 'District', 'Municipality', 'geometry']

In [863]:
municipalities_shapefile = shapefile_data['Municipality'].tolist()
len(municipalities_shapefile)

308

In [864]:
districts_shapefile = shapefile_data['District'].value_counts().keys().tolist()
len(districts_shapefile)

20

In [865]:
shapefile_data.set_index('Municipality', inplace=True)

In [866]:
shapefile_data.head()

Municipality,CCA_2,District,geometry
Évora,705,Évora,"POLYGON ((-7.79291 38.76507, -7.79287 38.76506..."
Alandroal,701,Évora,"POLYGON ((-7.25937 38.77351, -7.25921 38.77343..."
Arraiolos,702,Évora,"POLYGON ((-7.88611 38.92495, -7.88580 38.92472..."
Borba,703,Évora,"POLYGON ((-7.46362 38.92344, -7.46344 38.92329..."
Estremoz,704,Évora,"POLYGON ((-7.52770 39.00080, -7.52765 39.00066..."


# Comparison

## Aggregated adoption pre-PCF and during PCF

In [867]:
common_regions = [reg for reg in regions_prePCF if reg in counties_PCF]
common_regions

['Guarda',
 'Castelo Branco',
 'Abrantes',
 'Santarém',
 'Coruche',
 'Évora',
 'Elvas',
 'Portalegre',
 'Estremoz',
 'Ponte de Sor',
 'Odemira',
 'Beja']

Matching names do not guarantee that the regions coincide: a region in the pre-PCF dataset could be larger than the PCF county of the same name and include neighbouring municipalities.

## Census - PCF adoption

In [868]:
munic_in_PCF_not_in_census = [munic for munic in counties_PCF if munic not in municipalities_census]
munic_in_PCF_not_in_census

['Vila Velha de Rodão',
 'Alcácer do Sal - Torrão/Alvito-V.N. Baronia',
 'Ourique ',
 'Santiago do cacém',
 'Alcácer do Sal - Santa Susana',
 'Évora / Montemor-o-Novo',
 'Moura e Serpa',
 'Albernoa',
 'Benavente/Porto Alto',
 'Elvas e Campo Maior',
 'Alcácer do Sal - Torrão',
 'Lisboa - Serpa',
 'Ferreira do Alentejo /Figueira dos Cavaleiros',
 'Ponte de Sor / Montargil']

In [869]:
len(munic_in_PCF_not_in_census)

14

Some name conflicts need to be resolved, but 72 of the 86 counties in the PCF data also appear in the census data, so the geographical division broadly coincides.
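The same comparison can be written with sets, which makes it easy to check both directions of the mismatch. This is a minimal sketch on toy names; `unmatched_names` is a hypothetical helper and not part of the notebook.

```python
def unmatched_names(source, reference):
    """Return the names in `source` with no exact match in `reference`, sorted."""
    return sorted(set(source) - set(reference))


# Toy stand-ins for counties_PCF and municipalities_census (assumed values)
toy_pcf = ['Beja', 'Ourique ', 'Albernoa', 'Serpa']
toy_census = ['Beja', 'Ourique', 'Serpa', 'Moura']

missing_from_census = unmatched_names(toy_pcf, toy_census)
missing_from_pcf = unmatched_names(toy_census, toy_pcf)
print(missing_from_census)
print(missing_from_pcf)
```

Exact string comparison treats `'Ourique '` (trailing space) and `'Ourique'` as different names, which is the kind of conflict listed above.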

## Shapefile - Census

In [870]:
munic_in_census_not_in_shapefile = [munic for munic in municipalities_census if munic not in municipalities_shapefile]
munic_in_census_not_in_shapefile

[]

In [871]:
munic_in_shapefile_not_in_census = [munic for munic in municipalities_shapefile if munic not in municipalities_census]
len(munic_in_shapefile_not_in_census)

29

No municipality in the census is missing from the shapefile (Meda was the exception, and the shapefile was changed accordingly).

Many municipalities in the shapefile are missing from the census; most of them likely belong to Azores and Madeira, for which we have no data. Let's check whether any others are missing.

### Check excluding Madeira and Azores districts from shapefile


In [872]:
len(shapefile_data)

308

In [873]:
shapefile_data_excl = shapefile_data.loc[(shapefile_data['District'] != 'Azores') 
                                         & (shapefile_data['District'] != 'Madeira')]

In [874]:
len(shapefile_data_excl)

278

In [875]:
municipalities_shapefile_excl = shapefile_data_excl.index.to_list()

In [876]:
munic_in_shapefile_excl_not_in_census = [munic for munic in municipalities_shapefile_excl if munic not in municipalities_census]
len(munic_in_shapefile_excl_not_in_census)

0

All the remaining municipalities are available in the census
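Filtering on the district rather than on the name matters when two municipalities share a name, as Lagoa does (one in Faro district, one in the Azores). A minimal sketch of the same exclusion written with `isin`, using a toy table as an assumed stand-in for `shapefile_data`:

```python
import pandas as pd

# Toy stand-in for shapefile_data: two different municipalities are called 'Lagoa'
toy_shapefile = pd.DataFrame(
    {'District': ['Évora', 'Azores', 'Madeira', 'Faro']},
    index=pd.Index(['Évora', 'Lagoa', 'Funchal', 'Lagoa'], name='Municipality'),
)

island_districts = ['Azores', 'Madeira']
# Keep only the rows whose district is not one of the island districts
toy_mainland = toy_shapefile.loc[~toy_shapefile['District'].isin(island_districts)]
print(toy_mainland.index.tolist())
```

The mainland Lagoa is kept and the Azorean one is dropped, which a name-based filter could not do.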

## Shapefile - pre-PCF adoption

In [877]:
regions_as_districts = [reg for reg in regions_prePCF if reg in districts_shapefile]
len(regions_as_districts)

10

In [878]:
regions_as_municipalities = [reg for reg in regions_prePCF if reg in municipalities_shapefile]
len(regions_as_municipalities)

17

In [879]:
regions_in_shapefile = regions_as_districts + regions_as_municipalities
regions_as_nothing = [region for region in regions_prePCF if region not in regions_in_shapefile]
len(regions_as_nothing)

7

This means that, of the 24 regions in the pre-PCF dataset:
* 7 have the name of a municipality only
* 10 could be either districts or municipalities
* 7 do not appear in the shapefile at all
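The three list comprehensions above overlap (a name can be both a district and a municipality), so the groups are easier to read when each region is put in exactly one of them. A minimal sketch on toy names; `classify_regions` is a hypothetical helper introduced only for illustration.

```python
def classify_regions(regions, districts, municipalities):
    """Put each region name in exactly one group, based on where the name appears."""
    districts, municipalities = set(districts), set(municipalities)
    groups = {
        'district_or_municipality': [],
        'municipality_only': [],
        'district_only': [],
        'unmatched': [],
    }
    for reg in regions:
        is_district = reg in districts
        is_municipality = reg in municipalities
        if is_district and is_municipality:
            groups['district_or_municipality'].append(reg)
        elif is_municipality:
            groups['municipality_only'].append(reg)
        elif is_district:
            groups['district_only'].append(reg)
        else:
            groups['unmatched'].append(reg)
    return groups


# Toy stand-ins for regions_prePCF, districts_shapefile and municipalities_shapefile
toy_groups = classify_regions(
    regions=['Beja', 'Odemira', 'Minho'],
    districts=['Beja', 'Faro'],
    municipalities=['Beja', 'Odemira', 'Faro'],
)
print({name: len(members) for name, members in toy_groups.items()})
```

Because every region lands in one group only, the group sizes add up to the number of regions.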

## Shapefile - PCF data

In [880]:
munic_in_PCF_not_in_shapefile = [munic for munic in counties_PCF if munic not in municipalities_shapefile]
munic_in_PCF_not_in_shapefile

['Vila Velha de Rodão',
 'Alcácer do Sal - Torrão/Alvito-V.N. Baronia',
 'Ourique ',
 'Santiago do cacém',
 'Alcácer do Sal - Santa Susana',
 'Évora / Montemor-o-Novo',
 'Moura e Serpa',
 'Albernoa',
 'Benavente/Porto Alto',
 'Elvas e Campo Maior',
 'Alcácer do Sal - Torrão',
 'Lisboa - Serpa',
 'Ferreira do Alentejo /Figueira dos Cavaleiros',
 'Ponte de Sor / Montargil']

In [881]:
munic_in_PCF_not_in_shapefile == munic_in_PCF_not_in_census

True

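The two lists are identical, as the `True` output shows. The only earlier discrepancy concerned 'Ponte de Sor', which the shapefile spelled 'Ponte de Sôr' before it was changed.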

## Conclusions

It makes sense to use municipality-level aggregation. The only dataset with a coarser granularity is the adoption previous to the PCF, which will have to be disaggregated, probably based on the pasture area in each municipality (available from the census). For the rest, some adjustments are required.

# Correction of PCF counties to correspond to the ones in the shapefile

In [882]:
munic_in_PCF_not_in_shapefile = [munic for munic in counties_PCF if munic not in municipalities_shapefile]
munic_in_PCF_not_in_shapefile

['Vila Velha de Rodão',
 'Alcácer do Sal - Torrão/Alvito-V.N. Baronia',
 'Ourique ',
 'Santiago do cacém',
 'Alcácer do Sal - Santa Susana',
 'Évora / Montemor-o-Novo',
 'Moura e Serpa',
 'Albernoa',
 'Benavente/Porto Alto',
 'Elvas e Campo Maior',
 'Alcácer do Sal - Torrão',
 'Lisboa - Serpa',
 'Ferreira do Alentejo /Figueira dos Cavaleiros',
 'Ponte de Sor / Montargil']

In [883]:
replacements = {}

#### Albernoa

It's a civil parish in the municipality of Beja

In [884]:
[munic for munic in counties_PCF if 'Beja' in munic]

['Beja']

In [885]:
[munic for munic in municipalities_shapefile if 'Beja' in munic]

['Beja']

Since both datasets contain Beja, the values reported for Albernoa are assigned to Beja.

In [886]:
replacements['Albernoa'] = 'Beja'

####  Alcácer do Sal - Santa Susana, Alcácer do Sal - Torrão, Alcácer do Sal - Torrão/Alvito-V.N. Baronia

In [887]:
'Alcácer do Sal' in municipalities_shapefile

True

In [888]:
[munic in municipalities_shapefile for munic in ['Santa Susana', 'Torrão', 'Alvito']]

[False, False, True]

In [889]:
'Alvito' in counties_PCF

True

Since the shapefile contains the municipality of Alcácer do Sal but not the more specific areas, the three PCF entries can be grouped under Alcácer do Sal. (Alvito is a municipality, but it already has its own separate entry in the PCF data.)


In [890]:
replacements['Alcácer do Sal - Santa Susana'] = 'Alcácer do Sal'
replacements['Alcácer do Sal - Torrão'] = 'Alcácer do Sal'
replacements['Alcácer do Sal - Torrão/Alvito-V.N. Baronia'] = 'Alcácer do Sal'

#### Benavente/Porto Alto

In [891]:
[mun in counties_PCF for mun in ['Benavente', 'Porto Alto']]

[True, False]

In [892]:
[mun in municipalities_shapefile for mun in ['Benavente', 'Porto Alto']]

[True, False]

While the shapefile has a single entry for Benavente, the PCF project dataset has two, Benavente and Benavente/Porto Alto, both referring to the Benavente municipality.

In [893]:
replacements['Benavente/Porto Alto'] = 'Benavente'

#### Elvas e Campo Maior

In [894]:
[mun in counties_PCF for mun in ['Elvas', 'Campo Maior']]

[True, True]

In [895]:
[mun in municipalities_shapefile for mun in ['Elvas', 'Campo Maior']]

[True, True]

Elvas and Campo Maior are two distinct municipalities, each with its own entry in the PCF project database. The spreadsheet shows four parcels with County value 'Elvas e Campo Maior', all belonging to the same farmer. Since no further information is available to decide which municipality this area belongs to, it is arbitrarily assigned to the one with the larger surface, Elvas.

In [896]:
replacements['Elvas e Campo Maior'] = 'Elvas'

#### Ferreira do Alentejo /Figueira dos Cavaleiros

In [897]:
[mun in counties_PCF for mun in ['Ferreira do Alentejo', 'Figueira dos Cavaleiros']]

[True, False]

In [898]:
[mun in municipalities_shapefile for mun in ['Ferreira do Alentejo', 'Figueira dos Cavaleiros']]

[True, False]

As for Benavente/Porto Alto

In [899]:
replacements['Ferreira do Alentejo /Figueira dos Cavaleiros'] = 'Ferreira do Alentejo'

#### Lisboa - Serpa

In [900]:
[mun in counties_PCF for mun in ['Lisboa', 'Serpa']]

[False, True]

Serpa is a separate municipality, far from Lisbon, with its own entry in the PCF project database. Since the municipality of Lisboa is mainly urban, adoption there is unlikely, so the area is assigned to Serpa.

In [901]:
replacements['Lisboa - Serpa'] = 'Serpa'

#### Moura e Serpa

In [902]:
[mun in counties_PCF for mun in ['Moura', 'Serpa']]

[True, True]

In [903]:
[mun in municipalities_shapefile for mun in ['Moura', 'Serpa']]

[True, True]

As with Elvas e Campo Maior, both exist as separate municipalities.

In [904]:
replacements['Moura e Serpa'] = 'Serpa'

#### Ourique

In [905]:
[mun in counties_PCF for mun in ['Ourique ', 'Ourique']]

[True, True]

This is a typing error: one entry for Ourique has a trailing blank space after the name.

In [906]:
replacements['Ourique '] = 'Ourique'
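Trailing or leading spaces like this one can also be removed in bulk before any comparison. A minimal sketch on a toy Series (the values are assumed, not taken from the PCF spreadsheet):

```python
import pandas as pd

# Toy county names with stray whitespace
toy_counties = pd.Series(['Ourique ', 'Ourique', ' Beja', 'Serpa'])

# str.strip() removes leading and trailing whitespace from every value
toy_cleaned = toy_counties.str.strip()
print(toy_cleaned.unique().tolist())
```

Applied to the `County` column right after loading, this would make the explicit `'Ourique '` replacement unnecessary, though keeping it in the dictionary documents the issue.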

#### Ponte de Sor, Ponte de Sor / Montargil

In [907]:
[mun in counties_PCF for mun in ['Ponte de Sor', 'Montargil']]

[True, False]

In [908]:
[mun in municipalities_shapefile for mun in ['Ponte de Sor', 'Montargil']]

[True, False]

In [909]:
[mun in municipalities_census for mun in ['Ponte de Sor', 'Montargil']]

[True, False]

The shapefile originally spelled the Ponte de Sor municipality as 'Ponte de Sôr'; the shapefile was changed to match. Since Montargil is not a municipality, 'Ponte de Sor / Montargil' is assigned to Ponte de Sor.

In [910]:
replacements['Ponte de Sor / Montargil'] = 'Ponte de Sor'

#### Santiago do cacém

In [911]:
'Santiago do Cacém' in municipalities_shapefile

True

In [912]:
'Santiago do Cacém' in counties_PCF

True

The name is sometimes not capitalized in the database.

In [913]:
replacements['Santiago do cacém'] = 'Santiago do Cacém'

#### Vila Velha de Rodão

In [914]:
'Vila Velha de Ródão' in municipalities_shapefile

True

In [915]:
'Vila Velha de Ródão' in counties_PCF

False

In the PCF project data, the accent on the first o of Ródão is missing.

In [916]:
replacements['Vila Velha de Rodão'] = 'Vila Velha de Ródão'
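Spelling variants such as a missing accent, different capitalization or a stray space can be detected automatically by comparing "folded" versions of the names. The sketch below uses only the standard library; `fold_name` and `suggest_matches` are hypothetical helpers, and the suggestions would still need to be reviewed by hand before being added to `replacements`.

```python
import unicodedata


def fold_name(name):
    """Strip whitespace, remove diacritics and ignore case."""
    decomposed = unicodedata.normalize('NFKD', name.strip())
    without_marks = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def suggest_matches(names, reference):
    """Map each unmatched name to a reference name with the same folded form, if any."""
    folded_reference = {fold_name(ref): ref for ref in reference}
    return {
        name: folded_reference[fold_name(name)]
        for name in names
        if name not in reference and fold_name(name) in folded_reference
    }


# Toy stand-ins for the shapefile names and the unmatched PCF names
toy_reference = ['Vila Velha de Ródão', 'Santiago do Cacém', 'Ourique']
toy_names = ['Vila Velha de Rodão', 'Santiago do cacém', 'Ourique ', 'Albernoa']

suggestions = suggest_matches(toy_names, toy_reference)
print(suggestions)
```

Names that differ in substance, such as a parish reported in place of its municipality, are not matched by this approach and still need a manual decision.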

#### Évora / Montemor-o-Novo

In [917]:
[mun in counties_PCF for mun in ['Évora', 'Montemor-o-Novo']]

[True, True]

In [918]:
[mun in municipalities_shapefile for mun in ['Évora', 'Montemor-o-Novo']]

[True, True]

As with Elvas e Campo Maior, both exist as separate municipalities.

In [919]:
replacements['Évora / Montemor-o-Novo'] = 'Évora'

### Create the PCF project database with the corrected counties and save it

In [920]:
new_PCF_data = PCF_data.replace(replacements)
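`DataFrame.replace` with a flat dictionary looks for the keys in every column. An alternative is to restrict the replacement to the `County` column, so no other column can be touched by accident. A minimal sketch on a toy table (assumed values):

```python
import pandas as pd

# Toy stand-in for PCF_data
toy_pcf = pd.DataFrame(
    {'Farmer_ID': [1, 2, 3], 'County': ['Albernoa', 'Beja', 'Ourique ']},
    index=pd.Index([10, 11, 12], name='Parcel_ID'),
)
toy_replacements = {'Albernoa': 'Beja', 'Ourique ': 'Ourique'}

# Replace values in the County column only, leaving the other columns untouched
toy_fixed = toy_pcf.assign(County=toy_pcf['County'].replace(toy_replacements))
print(toy_fixed['County'].tolist())
```

For this dataset both forms should give the same result, since the keys are county names that are unlikely to appear in the numeric columns.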

In [921]:
new_PCF_data.to_excel("./adoption/PCF project data_Corrected counties.xlsx")

### Check everything replaced properly

In [922]:
corrected_counties_PCF = new_PCF_data['County'].unique().tolist()
len(corrected_counties_PCF)

73

In [923]:
to_still_fix_PCF = [munic for munic in corrected_counties_PCF if munic not in municipalities_shapefile]
to_still_fix_PCF

[]

## Are there municipalities that adopted during the PCF project in Azores and Madeira?

In [924]:
# Keep the municipality names (the index): `in` on a DataFrame tests column labels, not rows
munic_in_azores = shapefile_data.loc[shapefile_data['District'] == 'Azores'].index
munic_in_madeira = shapefile_data.loc[shapefile_data['District'] == 'Madeira'].index

for munic in corrected_counties_PCF:
    if munic in munic_in_azores:
        print(munic, 'in Azores')
    if munic in munic_in_madeira:
        print(munic, 'in Madeira')

There was no adoption during the PCF programme in Azores and Madeira
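For membership checks like this one, it matters which pandas object is on the right-hand side of `in`: on a DataFrame the operator tests the column labels, while on an Index it tests the row labels. A minimal sketch on a toy table (assumed values):

```python
import pandas as pd

toy_shapefile = pd.DataFrame(
    {'District': ['Azores', 'Faro']},
    index=pd.Index(['Horta', 'Faro'], name='Municipality'),
)
toy_azores = toy_shapefile.loc[toy_shapefile['District'] == 'Azores']

in_dataframe = 'Horta' in toy_azores          # False: 'Horta' is not a column label
column_check = 'District' in toy_azores       # True: 'District' is a column label
in_index = 'Horta' in toy_azores.index        # True: 'Horta' is a row label
print(in_dataframe, column_check, in_index)
```

Checking against the index (or a list of names) is therefore the form that answers the question asked in this section.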

# Disaggregation of pre-PCF adoption

## Matching of pre-PCF adoption regions with municipalities

The result of this part is the **mapping** dictionary, which maps each region in the pre-PCF dataset to the corresponding municipalities.

In [925]:
regions_as_districts

['Aveiro',
 'Coimbra',
 'Guarda',
 'Viseu',
 'Castelo Branco',
 'Leiria',
 'Santarém',
 'Évora',
 'Portalegre',
 'Beja']

In [926]:
regions_as_municipalities

['Aveiro',
 'Coimbra',
 'Guarda',
 'Viseu',
 'Castelo Branco',
 'Leiria',
 'Abrantes',
 'Santarém',
 'Tomar',
 'Coruche',
 'Évora',
 'Elvas',
 'Portalegre',
 'Estremoz',
 'Ponte de Sor',
 'Odemira',
 'Beja']

In [927]:
regions_as_nothing

['Minho',
 'Trás-os-Montes',
 'Oeste',
 'Montemor',
 'Ferreira',
 'Algarve',
 'Madeira + Azores']

Regions: (https://en.wikipedia.org/wiki/Districts_of_Portugal) 
* Algarve --> Faro (16 munic)
* Trás-os-Montes --> Bragança + Vila Real + partly Viseu and Guarda, 31 munic. (https://en.wikipedia.org/wiki/Tr%C3%A1s-os-Montes_e_Alto_Douro_Province)
* Oeste --> 12 municip. (https://en.wikipedia.org/wiki/Oeste_(intermunicipal_community)), in the districts of Lisboa and Leiria
* Madeira + Azores --> Madeira + Azores 

Agric. regions:
* Minho --> Braga + Viana do Castelo (14 + 10 munic)

Could be districts or municipalities:
* Aveiro
* Coimbra
* Guarda
* Viseu
* Leiria
* Portalegre
* Beja
* Castelo Branco
* Évora
* Santarém

Corresponding only to municipalities:
* Odemira (in Beja)
* Estremoz (in Évora)
* Elvas (in Portalegre)
* Ponte de Sor (in Portalegre)
* Coruche (in Santarém)
* Tomar (in Santarém)
* Abrantes (in Santarém)

Unclear ones:
* Montemor: can be assumed to be Montemor-o-Novo, in the Évora district and with a much bigger area than Montemor-o-Velho
* Ferreira: can be assumed to be Ferreira do Alentejo, also present in the PCF project and in Beja district, bigger than Ferreira do Zêzere that is in Santarém district

In [928]:
mapping = {}

Disaggregation rationale: assign adoption only to the municipalities that adopted during the PCF, proportionally to their pasture area. If a region has none, assign it to all its municipalities, proportionally to pasture area.
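The selection part of this rationale can be summarized in a small function. This is only a sketch of the rule as stated; `select_municipalities` and its arguments are hypothetical names, and the notebook cells that follow implement the same idea with their own variables.

```python
def select_municipalities(candidates, adopters, already_mapped=()):
    """Prefer the candidates that adopted during the PCF; otherwise keep all candidates.

    Municipalities already assigned to another region are skipped in both cases.
    """
    already_mapped = set(already_mapped)
    adopters = set(adopters)
    available = [m for m in candidates if m not in already_mapped]
    preferred = [m for m in available if m in adopters]
    return preferred if preferred else available


# Toy region with three municipalities
with_adopter = select_municipalities(['A', 'B', 'C'], adopters=['B'])
without_adopter = select_municipalities(['A', 'B', 'C'], adopters=[])
skipping_mapped = select_municipalities(['A', 'B', 'C'], adopters=[], already_mapped=['A'])
print(with_adopter, without_adopter, skipping_mapped)
```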

### Unclear ones

In [929]:
mapping['Montemor'] = ['Montemor-o-Novo', 'Alcácer do Sal', 'Palmela']

In [930]:
mapping['Ferreira'] = ['Ferreira do Alentejo', 'Grândola', 'Santiago do Cacém']

### Single municipalities

In [931]:
single_municipalities = ['Odemira', 'Estremoz', 'Elvas', 'Ponte de Sor', 'Coruche', 'Tomar', 'Abrantes']

In [932]:
for munic in single_municipalities:
    mapping[munic] = [munic]

### Districts

In [933]:
districts = ['Aveiro', 'Coimbra', 'Guarda', 'Viseu', 'Portalegre', 'Beja', 'Castelo Branco', 'Évora', 'Santarém', 'Leiria']

In [934]:
munic_already_included = single_municipalities + ['Montemor-o-Novo', 'Ferreira do Alentejo']
for dist in districts:
    munic_in_distr = shapefile_data.loc[shapefile_data['District'] == dist].index.to_list()
    munic_to_include = [munic for munic in munic_in_distr if (munic not in munic_already_included) and (munic in corrected_counties_PCF)]
    mapping[dist] = munic_to_include
    if len(mapping[dist]) == 0:
        munic_to_include = [munic for munic in munic_in_distr if (munic not in munic_already_included)]
        mapping[dist] = munic_to_include

### Regions

In [935]:
# Avoid inserting duplicates when adding the regions
municipalities_already_in_mapping = [munic for mapped in mapping.values() for munic in mapped]

In [936]:
def include_in_mapping(region):
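    """
    Assign to `region` the municipalities of the global `munic_in_region` that adopted during the PCF
    project and are not yet in the mapping; if there are none, fall back to the global `munic_to_include`
    (all municipalities of the region not yet in the mapping). Prints the number assigned at each step.
    """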
    munic_to_include_2 = [munic for munic in munic_in_region if 
                          (munic not in municipalities_already_in_mapping) and (munic in corrected_counties_PCF)]
    mapping[region] = munic_to_include_2
    print(len(mapping[region]))
    if len(mapping[region]) == 0:
        mapping[region] = munic_to_include
        print(len(mapping[region]))

In [937]:
shapefile_data.loc[shapefile_data['District'] == 'Leiria'].index.unique()

Index(['Óbidos', 'Alcobaça', 'Alvaiázere', 'Ansião', 'Batalha', 'Bombarral',
       'Caldas da Rainha', 'Castanheira de Pêra', 'Figueiró dos Vinhos',
       'Leiria', 'Marinha Grande', 'Nazaré', 'Pedrógão Grande', 'Peniche',
       'Pombal', 'Porto de Mós'],
      dtype='object', name='Municipality')

In [938]:
munic_in_oeste = ['Alcobaça', 'Alenquer', 'Arruda dos Vinhos', 'Bombarral', 'Cadaval', 'Caldas da Rainha', 'Lourinhã', 
                    'Nazaré', 'Óbidos', 'Peniche', 'Sobral de Monte Agraço', 'Torres Vedras']
munic_in_ribatejo = ['Azambuja', 'Vila Franca de Xira', 'Alcochete', 'Montijo', 'Moita']
munic_in_region = munic_in_oeste + munic_in_ribatejo

munic_to_include = [munic for munic in munic_in_region if munic not in municipalities_already_in_mapping]
mapping['Oeste'] = munic_to_include

In [939]:
include_in_mapping('Oeste')

6


In [940]:
munic_leir = shapefile_data.loc[shapefile_data['District'] == 'Leiria'].index.to_list()
munic_oeste_in_leir=[munic for munic in munic_in_region if munic in munic_leir]
munic_oeste_in_leir

['Alcobaça', 'Bombarral', 'Caldas da Rainha', 'Nazaré', 'Óbidos', 'Peniche']

In [941]:
faro_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Faro'].index.to_list()

munic_in_region = faro_municipalities

munic_to_include = [munic for munic in munic_in_region if munic not in municipalities_already_in_mapping]
mapping['Algarve'] = munic_to_include

In [942]:
include_in_mapping('Algarve')

0
16


In [943]:
braganca_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Bragança'].index.to_list()
vilareal_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Vila Real'].index.to_list()
viseu_municipalities_in_tom = ['Armamar', 'Lamego', 'Tabuaço', 'São João da Pesqueira']
guarda_municipalities_in_tom = ['Vila Nova de Foz Côa']

munic_in_region = (braganca_municipalities + vilareal_municipalities 
                    + viseu_municipalities_in_tom + guarda_municipalities_in_tom)

munic_to_include = [munic for munic in munic_in_region if munic not in municipalities_already_in_mapping]
mapping['Trás-os-Montes'] = munic_to_include

In [944]:
include_in_mapping('Trás-os-Montes')

2


In [945]:
madeira_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Madeira'].index.to_list()
azores_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Azores'].index.to_list()

munic_in_region = madeira_municipalities + azores_municipalities
munic_to_include = [munic for munic in munic_in_region if munic not in municipalities_already_in_mapping]
mapping['Madeira + Azores'] = munic_in_region

In [946]:
include_in_mapping('Madeira + Azores')

0
30


In [947]:
braga_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Braga'].index.to_list()
vianadocastelo_municipalities = shapefile_data.loc[shapefile_data['District'] == 'Viana do Castelo'].index.to_list()

munic_in_region = braga_municipalities + vianadocastelo_municipalities
munic_to_include = [munic for munic in munic_in_region if munic not in municipalities_already_in_mapping]
mapping['Minho'] = munic_in_region

In [948]:
include_in_mapping('Minho')

0
24


In [949]:
mapping

{'Montemor': ['Montemor-o-Novo', 'Alcácer do Sal', 'Palmela'],
 'Ferreira': ['Ferreira do Alentejo', 'Grândola', 'Santiago do Cacém'],
 'Odemira': ['Odemira'],
 'Estremoz': ['Estremoz'],
 'Elvas': ['Elvas'],
 'Ponte de Sor': ['Ponte de Sor'],
 'Coruche': ['Coruche'],
 'Tomar': ['Tomar'],
 'Abrantes': ['Abrantes'],
 'Aveiro': ['Águeda',
  'Ílhavo',
  'Albergaria-a-Velha',
  'Anadia',
  'Arouca',
  'Aveiro',
  'Castelo de Paiva',
  'Espinho',
  'Estarreja',
  'Mealhada',
  'Murtosa',
  'Oliveira de Azeméis',
  'Oliveira do Bairro',
  'Ovar',
  'São João da Madeira',
  'Santa Maria da Feira',
  'Sever do Vouga',
  'Vagos',
  'Vale de Cambra'],
 'Coimbra': ['Arganil',
  'Cantanhede',
  'Coimbra',
  'Condeixa-a-Nova',
  'Figueira da Foz',
  'Góis',
  'Lousã',
  'Mira',
  'Miranda do Corvo',
  'Montemor-o-Velho',
  'Oliveira do Hospital',
  'Pampilhosa da Serra',
  'Penacova',
  'Penela',
  'Soure',
  'Tábua',
  'Vila Nova de Poiares'],
 'Guarda': ['Almeida', 'Guarda', 'Pinhel', 'Sabugal'],


### Checks

Check that the keys of the mapping coincide with the original regions

In [950]:
mapping_keys = list(mapping.keys())
set(mapping_keys) == set(regions_prePCF)

True

Check on municipalities: duplicates inserted with the regions

In [951]:
municipalities_in_mapping = [munic for mapped in mapping.values() for munic in mapped]

In [952]:
len(municipalities_in_mapping)

220

In [953]:
len(set(municipalities_in_mapping))

218

In [954]:
import collections

In [955]:
counter = collections.Counter(municipalities_in_mapping)
duplicates = {k: v for k, v in counter.items() if v > 1}
duplicates

{'Lagoa': 2, 'Calheta': 2}

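Lagoa and Calheta are fine, since there are actually two distinct municipalities with each name (Lagoa in Faro district and in the Azores, Calheta in Madeira and in the Azores). Therefore no real duplicates are present.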

## Disaggregation of adopted area based on pasture area in each municipality

Since Azores and Madeira are not considered in the following analysis, the corresponding entries need to be removed.

In [956]:
mapping_restr = mapping.copy()

mapping_restr.pop('Madeira + Azores')

['Câmara de Lobos',
 'Calheta',
 'Funchal',
 'Machico',
 'Ponta do Sol',
 'Porto Moniz',
 'Porto Santo',
 'Ribeira Brava',
 'São Vicente',
 'Santa Cruz',
 'Santana',
 'Angra do Heroísmo',
 'Calheta',
 'Corvo',
 'Horta',
 'Lagoa',
 'Lajes das Flores',
 'Lajes do Pico',
 'Madalena',
 'Nordeste',
 'Ponta Delgada',
 'Povoação',
 'Praia da Vitória',
 'Ribeira Grande',
 'São Roque do Pico',
 'Santa Cruz da Graciosa',
 'Santa Cruz das Flores',
 'Velas',
 'Vila do Porto',
 'Vila Franca do Campo']

In [957]:
municipalities_remained_in_mapping = [munic for mapped in mapping_restr.values() for munic in mapped]
len(municipalities_remained_in_mapping)

190

### Disaggregation

In [958]:
mapping = mapping_restr
out_path_adoption_pre_PCF = (
    "./adoption/SBP adoption previous to 2009 per municipality_PCF mapped.xlsx"
    )

Get total pastures area for the municipalities in each region

In [959]:
# Build the column from a Series so each region keeps its list of municipalities as a single object
calc_area = pd.DataFrame({'Municipalities': pd.Series(mapping)})
calc_area.index.name = 'Region'


In [960]:
for region in calc_area.index.tolist():
    calc_area.loc[region, 'Total pasture area'] = sum([municipalities_pastures_area.loc[munic, 'pastures_area_munic_ha'] 
                                                 for munic in calc_area.loc[region, 'Municipalities']])

Append a column with the region corresponding to the municipality to "municipalities_pastures_area"

In [961]:
municipalities_pastures_area['Region'] = ''
for munic in municipalities_pastures_area.index.tolist():
    mask = calc_area['Municipalities'].apply(lambda x: munic in x)
    try:
        region = calc_area[mask].index.values[0]
    except IndexError:
        region = float('NaN')
    # Assign with a single .loc call: chained indexing may write to a temporary copy
    municipalities_pastures_area.loc[munic, 'Region'] = region


In [962]:
len(municipalities_pastures_area[municipalities_pastures_area['Region'].notnull()])

190

This is consistent: of the 278 municipalities considered (excluding Madeira and Azores), 190 are mapped to a region of the pre-PCF dataset and can receive a share of its adoption (190 is also the number of municipalities left in the mapping). The remaining 88 will not have any adoption.

Calculate coefficients of area percentage for each municipality

In [963]:
# Initialize as float so the column stays numeric
municipalities_pastures_area['Area coefficient'] = 0.0
for munic in municipalities_pastures_area.index.tolist():
    if pd.isnull(municipalities_pastures_area.loc[munic, 'Region']):
        coef = 0
    else:
        coef = (municipalities_pastures_area.loc[munic, 'pastures_area_munic_ha'] 
                / calc_area.loc[municipalities_pastures_area.loc[munic, 'Region'], 'Total pasture area'])
    # Assign with a single .loc call instead of chained indexing
    municipalities_pastures_area.loc[munic, 'Area coefficient'] = coef

In [964]:
# Check that the coefficients sum to 1 for every region
(abs(municipalities_pastures_area[['Region', 'Area coefficient']].groupby('Region').sum() - 1) < 0.0001).all()

Area coefficient    True
dtype: bool

Disaggregate the adoption per year with the coefficients

In [965]:
indexes = municipalities_pastures_area.index

In [966]:
adoption_pre_PCF_munic = pd.DataFrame(index=indexes, columns=adoption_pre_PCF.columns)

In [967]:
for col in adoption_pre_PCF.columns.tolist():
    for munic in adoption_pre_PCF_munic.index.tolist():
        if municipalities_pastures_area.loc[munic, 'Area coefficient'] == 0:
            adoption_pre_PCF_munic.loc[munic, col] = 0
        else:       
            adoption_pre_PCF_munic.loc[munic, col] = (adoption_pre_PCF.loc[municipalities_pastures_area.loc[munic, 'Region'], col] 
                                                      * municipalities_pastures_area.loc[munic, 'Area coefficient'])
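The proportional disaggregation can also be expressed without explicit loops. The sketch below reproduces the idea on toy data (all values and names prefixed with `toy_` are assumptions): each municipality receives its region's adoption multiplied by its share of the region's pasture area, and unmapped municipalities receive zero.

```python
import pandas as pd

# Toy pasture areas (ha) and region assignment; None marks an unmapped municipality
toy_munic = pd.DataFrame(
    {'pastures_area_munic_ha': [30.0, 10.0, 50.0, 20.0],
     'Region': ['R1', 'R1', 'R2', None]},
    index=pd.Index(['A', 'B', 'C', 'D'], name='Municipality'),
)
# Toy regional adoption (ha) per year
toy_adoption = pd.DataFrame(
    {1996: [8.0, 5.0], 1997: [4.0, 0.0]},
    index=pd.Index(['R1', 'R2'], name='Region'),
)

# Share of each municipality in its region's total pasture area (0 if unmapped)
toy_totals = toy_munic.groupby('Region')['pastures_area_munic_ha'].sum()
toy_coef = (toy_munic['pastures_area_munic_ha']
            / toy_munic['Region'].map(toy_totals)).fillna(0.0)

# Regional adoption looked up for each municipality, one column per year
toy_regional = pd.DataFrame(
    {year: toy_munic['Region'].map(toy_adoption[year]) for year in toy_adoption.columns}
).fillna(0.0)

# Scale every row by the municipality's coefficient
toy_result = toy_regional.mul(toy_coef, axis=0)
print(toy_result)
```

Since the coefficients within each region sum to 1, the grand total of `toy_result` equals the grand total of `toy_adoption`, which is the same conservation property verified in the next cells.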

Check if total area remained the same

In [968]:
(adoption_pre_PCF.sum()-adoption_pre_PCF.loc[['Madeira + Azores']].sum()).sum()

83302.5

In [969]:
adoption_pre_PCF_munic.sum().sum()

83302.5

## Save disaggregated dataset

In [970]:
adoption_pre_PCF_munic.index.name = 'Municipality'

In [971]:
adoption_pre_PCF_munic.to_excel(out_path_adoption_pre_PCF)

# Export of shapefile for ABM

In [972]:
munic_to_excl_more = ['São João da Madeira']

In [973]:
# Reassign instead of dropping in place: shapefile_data_excl was created as a filtered slice of shapefile_data
shapefile_data_excl = shapefile_data_excl.drop(munic_to_excl_more)


In [974]:
out_path = "./counties_shp/Shapefile for ABM/shapefile_for_munic_abm.shp"

In [975]:
shapefile_data_excl.to_file(out_path)