# GMW: International status data processing.  
This notebook documents the data processing pipeline for the upcoming international status widget.  
This widget consists of a series of sentences detailing the agreements and targets that the countries involved may have taken up.  

  
For this, the application will require the data in the following format:  

**data**  
```pledge_type``` *string*  
```base_years``` *string*  
```target_years``` *string*  
```ndc_target``` *string*  
```ndc_target_url``` *string*  
```ndc_reduction_target``` *string*  
```ndc_blurb``` *string*  
```ndc_updated``` *boolean*  
```ndc``` *boolean*  
```ndc_mitigation``` *boolean*  
```ndc_adaptation``` *boolean*  
```ipcc_wetlands_suplement``` *string*  
   
**metadata**  
   ```location_id``` *number* 
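
As a rough illustration of the format above, the sketch below builds one record as a plain Python dictionary and checks the field types. The nested `data` / `metadata` layout, the `EXPECTED_TYPES` mapping, the `validate_record` helper and every example value (including the URL and the `location_id`) are assumptions made for illustration; the application's actual schema check is not part of this notebook.

```python
# Hypothetical type check for one widget record (illustrative only).
EXPECTED_TYPES = {
    'pledge_type': str,
    'base_years': str,
    'target_years': str,
    'ndc_target': str,
    'ndc_target_url': str,
    'ndc_reduction_target': str,
    'ndc_blurb': str,
    'ndc_updated': bool,
    'ndc': bool,
    'ndc_mitigation': bool,
    'ndc_adaptation': bool,
    'ipcc_wetlands_suplement': str,
}


def validate_record(record):
    """Return a list of problems found in record['data'] and record['metadata']."""
    problems = []
    data = record.get('data', {})
    for field, expected in EXPECTED_TYPES.items():
        if field not in data:
            problems.append(f'missing field: {field}')
        # None is accepted as "no value"; anything else must match the type
        elif data[field] is not None and not isinstance(data[field], expected):
            problems.append(f'{field} should be {expected.__name__}')
    location_id = record.get('metadata', {}).get('location_id')
    # bool is a subclass of int, so it is excluded explicitly
    if isinstance(location_id, bool) or not isinstance(location_id, (int, float)):
        problems.append('location_id should be a number')
    return problems


example = {
    'data': {
        'pledge_type': 'a GHG target',
        'base_years': '2005',
        'target_years': '2030',
        'ndc_target': '133.74',
        'ndc_target_url': 'https://example.org/ndc',  # placeholder URL
        'ndc_reduction_target': '43',
        'ndc_blurb': 'Example blurb',
        'ndc_updated': True,
        'ndc': True,
        'ndc_mitigation': True,
        'ndc_adaptation': False,
        'ipcc_wetlands_suplement': 'has',
    },
    'metadata': {'location_id': 1},  # made-up identifier
}

print(validate_record(example))
```

An empty list means every field is present with the expected type.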

 

In [2]:
import pandas as pd
import numpy as np
import geopandas as gpd
import os


## 1) Data loading  

Data can be retrieved directly from the project's Google Cloud Storage bucket.

In [29]:
data_url = 'https://storage.googleapis.com/mangrove_atlas/widget_data/Climate_Policy_Tool_Data_prep.csv'
original_data = pd.read_csv(data_url)
#original_data = pd.read_csv('../../../../data/Climate_Policy_Tool_Data_prep.csv')

original_data.columns = original_data.columns.str.replace(' ', '_').str.replace('%', 'pct').str.lower()
original_data.columns

Index(['id', 'type', 'iso', 'name', 'total_organic_carbon',
       '2016_mangrove_extent_(ha)', 'ndc_pct_reduction_target',
       'ndc_target_(mtco2e/yr)', 'ncs_submission_language', 'pledge_type',
       'base_year', 'target_year', 'ndc_target_url', 'pledge_summary',
       'ndc_blurb', 'climate_vulnerability_rank_(index)', 'ndc:_first/updated',
       'ndc:includes/doesn't_include', 'ndc_mitigation_(y/n)',
       'ndc_adaptation_(y/n)', 'frel', 'forest_or_wetland',
       'investible_blue_carbon_extent_@_$5/ton',
       'investible_blue_carbon_extent_@_$5/ton.1',
       'investible_blue_carbon_extent_@_$10/ton',
       'investible_blue_carbon_extent_@_$10/ton.1',
       'investible_blue_carbon_extent_(diff_btw._$10/ton_and_$5/ton)',
       'extent_within_pa', 'remaining_extent', 'wetland_supplement',
       'ghg_inventories', 'mangrove_considerations_in_national_policies',
       'emissions_by_land_sector_by_type',
       'avoided_loss_(emissions_from_mangrove_loss)_(mtco2e_yr-1)',


In [31]:
selected_data = original_data[['iso', 'name',
                             '2016_mangrove_extent_(ha)', 'ndc_pct_reduction_target',
                             'ndc_target_(mtco2e/yr)', 'ncs_submission_language', 'pledge_type',
                             'base_year', 'target_year', 'ndc_target_url', 'pledge_summary',
                             'ndc_blurb', 'climate_vulnerability_rank_(index)', 'ndc:_first/updated',
                             'ndc:includes/doesn\'t_include', 'ndc_mitigation_(y/n)',
                             'ndc_adaptation_(y/n)', 'frel', 'forest_or_wetland', 'wetland_supplement',
                             'ghg_inventories']].copy()



selected_data.columns = ['iso', 'country',
                      '2016_mangrove_extent', 'ndc_pct_reduction_target',
                      'ndc_target', 'ncs_submission_language', 'pledge_type',
                      'base_year', 'target_year', 'ndc_target_url', 'pledge_summary',
                      'ndc_blurb', 'climate_vulnerability_rank', 'ndc_first_updated',
                      'ndc_includes', 'ndc_mitigation_y/n',
                      'ndc_adaptation_y/n', 'frel', 'forest_or_wetland', 'wetland_supplement',
                      'ghg_inventories']
selected_data.head()

Unnamed: 0,iso,country,2016_mangrove_extent,ndc_pct_reduction_target,ndc_target,ncs_submission_language,pledge_type,base_year,target_year,ndc_target_url,...,ndc_blurb,climate_vulnerability_rank,ndc_first_updated,ndc_includes,ndc_mitigation_y/n,ndc_adaptation_y/n,frel,forest_or_wetland,wetland_supplement,ghg_inventories
0,AGO,Angola,36527.73,14.0,96.65,,a GHG target,2005.0,2025.0,https://www4.unfccc.int/sites/NDCStaging/pages...,...,"""Angola plans to reduce GHG emissions up to 35...",106.67,Updated,Includes,No,Yes,,,Has not,
1,ATG,Antigua & Barbuda,889.68,25.0,,,Non-GHG target and actions,1990.0,2030.0,https://www4.unfccc.int/sites/NDCStaging/pages...,...,"Conditional Adaptation Targets "" (1) By 2025, ...",74.5,,Doesn't include,No,No,,,Has not,
2,AUS,Australia,967262.58,43.0,133.74,,a GHG Target,2005.0,2030.0,https://www4.unfccc.int/sites/NDCStaging/pages...,...,"""Under a Paris Agreement applicable to all, Au...",52.0,Updated,Includes,Yes,No,,not specified,Has,
3,BHR,Bahrain,82.03,,,,Actions only,,,https://www4.unfccc.int/sites/NDCStaging/pages...,...,The Kingdom of Bahrain communicated in its NDC...,125.0,,Doesn't include,No,No,,,Has not,
4,BGD,Bangladesh,404613.41,6.73,27.56,,a GHG target,2030.0,2030.0,https://www4.unfccc.int/sites/NDCStaging/pages...,...,"""The NDC of Bangladesh consists of the followi...",25.0,,Doesn't include,No,No,374253.0,,Has not,
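
The cell above renames columns by assigning a new list, which relies on the list order matching the selection order exactly. A possible alternative, sketched here on a toy frame, is an explicit old-to-new mapping passed to `DataFrame.rename`. The `toy` frame and the `RENAMES` dictionary are illustrative stand-ins, not part of the pipeline.

```python
import pandas as pd

# Toy frame standing in for the real CSV; values copied from the preview above.
toy = pd.DataFrame({
    'iso': ['AGO', 'AUS'],
    'name': ['Angola', 'Australia'],
    '2016_mangrove_extent_(ha)': [36527.73, 967262.58],
    'ndc_mitigation_(y/n)': ['No', 'Yes'],
})

# Explicit old -> new mapping: each rename is tied to a column name,
# so reordering or dropping columns cannot silently mislabel data.
RENAMES = {
    'name': 'country',
    '2016_mangrove_extent_(ha)': '2016_mangrove_extent',
    'ndc_mitigation_(y/n)': 'ndc_mitigation_y/n',
}

# errors='raise' makes a typo in the mapping fail loudly instead of being ignored.
renamed = toy.rename(columns=RENAMES, errors='raise')
print(list(renamed.columns))
```

Columns missing from the mapping (here `iso`) keep their original name.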


In [32]:
selected_data.dtypes

iso                            object
country                        object
2016_mangrove_extent          float64
ndc_pct_reduction_target       object
ndc_target                     object
ncs_submission_language        object
pledge_type                    object
base_year                      object
target_year                    object
ndc_target_url                 object
pledge_summary                 object
ndc_blurb                      object
climate_vulnerability_rank    float64
ndc_first_updated              object
ndc_includes                   object
ndc_mitigation_y/n             object
ndc_adaptation_y/n             object
frel                          float64
forest_or_wetland              object
wetland_supplement             object
ghg_inventories               float64
dtype: object
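
Several columns that look numeric in the preview (`ndc_pct_reduction_target`, `ndc_target`, `base_year`, `target_year`) are reported as `object`. One way to find the cells responsible is to coerce a column with `pd.to_numeric` and list the values that fail to parse. The toy series below is an assumption: the `'37 & 50'` entry only imitates the kind of multi-value cell that the later `&` replacements suggest exists in the real data.

```python
import pandas as pd

# Toy column; '37 & 50' is an assumed example of a multi-value cell.
pct = pd.Series(['14', '25', '37 & 50', None], name='ndc_pct_reduction_target')

# errors='coerce' turns anything that is not a single number into NaN.
as_number = pd.to_numeric(pct, errors='coerce')

# Cells that held a value but failed to parse explain the object dtype.
not_numeric = pct[pct.notna() & as_number.isna()]
print(not_numeric.tolist())
```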

## Populate data model

In [52]:
data_model = selected_data[['iso', 'country']].copy()
data_model.head(3)

Unnamed: 0,iso,country
0,AGO,Angola
1,ATG,Antigua & Barbuda
2,AUS,Australia


### Pledges and NDC info  
These fields are taken directly from the source table, with only light string clean-up.  

```pledge_type``` *string*  
```base_years``` *string*  
```target_years``` *string*  
```ndc_target``` *string*  
```ndc_target_url``` *string*  
```ndc_reduction_target``` *string*  
```ndc_pct_reduction_target``` *string*   
```ndc_blurb``` *string*  

In [54]:
# Normalise case, keeping the acronym GHG in capitals
data_model['pledge_type'] = selected_data['pledge_type'].str.lower().str.replace('ghg', 'GHG')
# Cells can hold several values joined by '&'; spell the separator out as 'and'
data_model['base_years'] = selected_data['base_year'].str.lower().str.replace('&', 'and')
data_model['target_years'] = selected_data['target_year'].str.lower().str.replace('&', 'and')
data_model['ndc_target'] = selected_data['ndc_target'].str.lower()
# Add '%' before ' and'; the sentence template appends the final '%'
data_model['ndc_reduction_target'] = selected_data['ndc_pct_reduction_target'].str.lower().str.replace('&', 'and').str.replace(' and', '% and')
# NOTE: lower-casing also changes the URL path, which some servers treat as case-sensitive
data_model['ndc_target_url'] = selected_data['ndc_target_url'].str.lower()
data_model['pledge_summary'] = selected_data['pledge_summary']
data_model['ndc_blurb'] = selected_data['ndc_blurb']

data_model.head(3)

Unnamed: 0,iso,country,pledge_type,base_years,target_years,ndc_target,ndc_reduction_target,ndc_target_url,pledge_summary,ndc_blurb
0,AGO,Angola,a GHG target,2005,2025,96.65,14,https://www4.unfccc.int/sites/ndcstaging/pages...,"""Angola plans to reduce GHG emissions up to 35...","""Angola plans to reduce GHG emissions up to 35..."
1,ATG,Antigua & Barbuda,non-GHG target and actions,1990,2030,,25,https://www4.unfccc.int/sites/ndcstaging/pages...,Antigua and Barbuda communicated that it would...,"Conditional Adaptation Targets "" (1) By 2025, ..."
2,AUS,Australia,a GHG target,2005,2030,133.74,43,https://www4.unfccc.int/sites/ndcstaging/pages...,Australia communicated a target of 5 per cent ...,"""Under a Paris Agreement applicable to all, Au..."


### First sentence
"`Country` NDC pledge contains `pledge_type`"  
If `pledge_type` is empty, ignore this sentence.

In [44]:
def sentence_1(df, country):
    # Keep only the row for the requested country
    df = df[df['country'] == country]
    if df['pledge_type'].notnull().any():
        return f'{country} NDC pledge contains {df["pledge_type"].values[0]}'
    else:
        return 'No NDC pledge'

In [125]:
for country in data_model['country'].values[0:12]:
    print(sentence_1(data_model, country))

Angola NDC pledge contains a GHG target
Antigua & Barbuda NDC pledge contains non-GHG target and actions
Australia NDC pledge contains a GHG target
Bahrain NDC pledge contains actions only
Bangladesh NDC pledge contains a GHG target
Belize NDC pledge contains a non-GHG target and actions
Benin NDC pledge contains actions
No NDC pledge
Brazil NDC pledge contains a GHG target
No NDC pledge
Cambodia NDC pledge contains a GHG target and actions
No NDC pledge


### Second sentence  
Multiple possibilities based on the info from:  
```ndc_target``` *string*  
```ndc_reduction_target``` *string*  
```base_years``` *string*  
```target_years``` *string* 
    
      
- "The GHG target is a `ndc_reduction_target`% reduction from a baseline in `base_years` by target year `target_years`. This represents a reduction of `ndc_target` mtCO2e/yr."  
  
- "The GHG target is a `ndc_reduction_target`% reduction from a baseline in `base_years` by target year `target_years`."
  
- "The GHG target represents a reduction of `ndc_target` mtCO2e/yr from a baseline in `base_years` by target year `target_years`."  
(If `base_years`, `target_years`, or both are missing, just delete the associated clause(s).)
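
The delete-the-clause rule can also be read as sentence assembly: start from whichever figure is available and append only the clauses that have data. The sketch below takes that reading. `is_missing` and `compose_sentence_2` are hypothetical helpers, and this is not a drop-in replacement for the notebook's `sentence_2`, since it applies the rule to both the percentage and the mtCO2e/yr variants.

```python
import math


def is_missing(value):
    """True for None and for float NaN (the usual pandas missing marker)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def compose_sentence_2(ndc_target, ndc_reduction_target, base_years, target_years):
    """Assemble the second sentence from whichever clauses have data."""
    # Optional clauses shared by both sentence variants
    clauses = ''
    if not is_missing(base_years):
        clauses += f' from a baseline in {base_years}'
    if not is_missing(target_years):
        clauses += f' by target year {target_years}'

    if not is_missing(ndc_reduction_target):
        text = f'The GHG target is a {ndc_reduction_target}% reduction{clauses}.'
        if not is_missing(ndc_target):
            text += f' This represents a reduction of {ndc_target} mtCO2e/yr.'
        return text
    if not is_missing(ndc_target):
        return f'The GHG target represents a reduction of {ndc_target} mtCO2e/yr{clauses}.'
    return 'No data'


print(compose_sentence_2('96.65', '14', '2005', '2025'))
print(compose_sentence_2('9.11', float('nan'), None, '2030'))
```

Building the optional clauses once keeps the number of branches fixed, however many fields are missing.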

In [83]:
def sentence_2(df, country):
    df = df[df['country'] == country]
    ndc_target = df['ndc_target'].values[0]
    ndc_reduction_target = df['ndc_reduction_target'].values[0]
    base_years = df['base_years'].values[0]
    target_years = df['target_years'].values[0]

    def present(*values):
        # True when none of the values is missing (NaN / None)
        return all(pd.notna(v) for v in values)

    if present(ndc_target, ndc_reduction_target, base_years, target_years):
        return f'The GHG target is a {ndc_reduction_target}% reduction from a baseline in {base_years} by target year/s {target_years}. This represents a reduction of {ndc_target} mtCO2e/yr.'
    elif present(ndc_reduction_target, base_years, target_years):
        return f'The GHG target is a {ndc_reduction_target}% reduction from a baseline in {base_years} by target year/s {target_years}'
    elif present(ndc_target, base_years, target_years):
        return f'The GHG target represents a reduction of {ndc_target} mtCO2e/yr from a baseline in {base_years} by target year/s {target_years}.'
    elif present(ndc_target, base_years):
        return f'The GHG target represents a reduction of {ndc_target} mtCO2e/yr from a baseline in {base_years}.'
    elif present(ndc_target, target_years):
        return f'The GHG target represents a reduction of {ndc_target} mtCO2e/yr by target year/s {target_years}.'
    elif pd.isna(ndc_target):
        # pd.isna accepts strings; np.isnan raises TypeError on them
        return 'No data'
    else:
        return f'The GHG target represents a reduction of {ndc_target} mtCO2e/yr'


In [88]:
for country in data_model.country.values[0:10]:
   
    print(country + ": " + sentence_2(data_model, country))
    print(' ')

Angola: The GHG target is a 14% reduction from a baseline in 2005 by target year/s 2025. This represents a reduction of 96.65 mtCO2e/yr.
 
Antigua & Barbuda: The GHG target is a 25% reduction from a baseline in 1990 by target year/s 2030
 
Australia: The GHG target is a 43% reduction from a baseline in 2005 by target year/s 2030. This represents a reduction of 133.74 mtCO2e/yr.
 
Bahrain: No data
 
Bangladesh: The GHG target is a 6.73% reduction from a baseline in 2030 by target year/s 2030. This represents a reduction of 27.56 mtCO2e/yr.
 
Belize: No data
 
Benin: The GHG target is a 21.4% reduction from a baseline in 2030 by target year/s 2030. This represents a reduction of 4.82 mtCO2e/yr.
 
Bonaire, Sint-Eustasius, Saba: No data
 
Brazil: The GHG target is a 37% and 50% reduction from a baseline in 2005 by target year/s 2025 and 2030. This represents a reduction of 220 and 480 mtCO2e/yr.
 
Brunei: No data
 


In [86]:
sentence_2(data_model, 'Costa Rica')

'The GHG target represents a reduction of 9.11 mtCO2e/yr by target year/s 2030.'

### Add NDC first/updated data and other related info  
```ndc_updated``` *boolean*  
```ndc``` *boolean*  
```ndc_mitigation``` *boolean*  
```ndc_adaptation``` *boolean*  
```ipcc_wetlands_suplement``` *string* 

Related source columns in `selected_data`: `ndc_first_updated`, `ndc_includes`, `ndc_mitigation_y/n`, `ndc_adaptation_y/n`, `frel`, `forest_or_wetland`, `wetland_supplement`.

In [107]:
# A country counts as having an NDC when a pledge type was recorded
has_pledge = data_model['pledge_type'].notnull()
data_model['ndc'] = has_pledge
# The remaining flags can only be True for countries that have an NDC
data_model['ndc_updated'] = has_pledge & (selected_data['ndc_first_updated'] == 'Updated')
data_model['ndc_mitigation'] = has_pledge & (selected_data['ndc_mitigation_y/n'] == 'Yes')
data_model['ndc_adaptation'] = has_pledge & (selected_data['ndc_adaptation_y/n'] == 'Yes')
data_model['ipcc_wetlands_suplement'] = selected_data['wetland_supplement'].str.lower()


data_model[['country','ndc_blurb', 'ndc', 'ndc_updated', 'ndc_mitigation',
       'ndc_adaptation', 'ipcc_wetlands_suplement']].head(15)

Unnamed: 0,country,ndc_blurb,ndc,ndc_updated,ndc_mitigation,ndc_adaptation,ipcc_wetlands_suplement
0,Angola,"""Angola plans to reduce GHG emissions up to 35...",True,True,False,True,has not
1,Antigua & Barbuda,"Conditional Adaptation Targets "" (1) By 2025, ...",True,False,False,False,has not
2,Australia,"""Under a Paris Agreement applicable to all, Au...",True,True,True,False,has
3,Bahrain,The Kingdom of Bahrain communicated in its NDC...,True,False,False,False,has not
4,Bangladesh,"""The NDC of Bangladesh consists of the followi...",True,False,False,False,has not
5,Belize,Belize mitigation potential is framed on an ac...,True,False,False,False,has not
6,Benin,The Benin plans to reduce overall cumulative g...,True,False,False,False,has not
7,"Bonaire, Sint-Eustasius, Saba",,False,False,False,False,has not
8,Brazil,"""Brazil intends to commit to reduce greenhouse...",True,False,False,False,has not
9,Brunei,"""Brunei Darussalams Intended Nationally Determ...",False,False,False,False,has not
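
Because the three dependent flags are only meant to be `True` when `ndc` is `True`, a quick consistency check can catch mistakes if the derivation is ever edited. The sketch below runs that check on a toy frame whose two rows copy the Angola and Brunei values shown above; the `toy` frame and the `dependent` list are illustrative names only.

```python
import pandas as pd

# Two rows copied from the preview above
toy = pd.DataFrame({
    'country': ['Angola', 'Brunei'],
    'ndc': [True, False],
    'ndc_updated': [True, False],
    'ndc_mitigation': [False, False],
    'ndc_adaptation': [True, False],
})

dependent = ['ndc_updated', 'ndc_mitigation', 'ndc_adaptation']

# Rows where some dependent flag is True although ndc is False
inconsistent = toy[~toy['ndc'] & toy[dependent].any(axis=1)]
print(len(inconsistent))
```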


### Third sentence
Various combinations, depending on the True / False values of `ndc`, `ndc_updated`, `ndc_adaptation` and `ndc_mitigation`:

`country` `first`/`updated` NDC pledge...  
- includes coastal and marine NBS for `both mitigation and adaptation`
- includes coastal and marine NBS for `mitigation`/`adaptation`
- `doesn't include` coastal and marine NBS
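
Since the wording depends only on the pair (`ndc_mitigation`, `ndc_adaptation`), the combinations listed above can be written as a small lookup table. This is a minimal sketch of that idea; `INCLUSION_PHRASES` and `compose_sentence_3` are hypothetical names, and the function takes plain values rather than a DataFrame to stay self-contained.

```python
# Phrase for each (ndc_mitigation, ndc_adaptation) combination
INCLUSION_PHRASES = {
    (True, True): 'includes coastal and marine NBS for both mitigation and adaptation',
    (True, False): 'includes coastal and marine NBS for mitigation',
    (False, True): 'includes coastal and marine NBS for adaptation',
    (False, False): "doesn't include coastal and marine NBS",
}


def compose_sentence_3(country, ndc, ndc_updated, ndc_mitigation, ndc_adaptation):
    """Build the third sentence from the four boolean flags."""
    if not ndc:
        return 'No NDC pledge'
    version = 'updated' if ndc_updated else 'first'
    # bool() converts NumPy booleans to plain Python booleans for the lookup
    phrase = INCLUSION_PHRASES[(bool(ndc_mitigation), bool(ndc_adaptation))]
    return f'{country} {version} NDC pledge {phrase}.'


print(compose_sentence_3('Angola', True, True, False, True))
```

A table makes it easy to confirm that all four combinations are covered.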

In [117]:
def sentence_3(df, country):
    df = df[df['country'] == country]

    ndc = df['ndc'].values[0]
    ndc_updated = df['ndc_updated'].values[0]
    ndc_mitigation = df['ndc_mitigation'].values[0]
    ndc_adaptation = df['ndc_adaptation'].values[0]

    # Without an NDC there is nothing to describe
    if not ndc:
        return 'No NDC pledge'

    fst_upd = 'updated' if ndc_updated else 'first'

    if ndc_mitigation and ndc_adaptation:
        inc = 'includes coastal and marine NBS for both mitigation and adaptation'
    elif ndc_mitigation:
        inc = 'includes coastal and marine NBS for mitigation'
    elif ndc_adaptation:
        inc = 'includes coastal and marine NBS for adaptation'
    else:
        inc = "doesn't include coastal and marine NBS"

    return f'{country} {fst_upd} NDC pledge {inc}.'

    

In [111]:
for country in data_model.country.values[0:11]:
    print(sentence_3(data_model, country))
    print(' ')

Angola updated NDC pledge includes coastal and marine NBS for adaptation.
 
Antigua & Barbuda first NDC pledge doesn't include coastal and marine NBS.
 
Australia updated NDC pledge includes coastal and marine NBS for mitigation.
 
Bahrain first NDC pledge doesn't include coastal and marine NBS.
 
Bangladesh first NDC pledge doesn't include coastal and marine NBS.
 
Belize first NDC pledge doesn't include coastal and marine NBS.
 
Benin first NDC pledge doesn't include coastal and marine NBS.
 
No NDC pledge
 
Brazil first NDC pledge doesn't include coastal and marine NBS.
 
No NDC pledge
 
Cambodia updated NDC pledge includes coastal and marine NBS for both mitigation and adaptation.
 


### Fourth sentence
Based on content from `ipcc_wetlands_suplement`  
`country` `has` / `has not` implemented the IPCC wetlands supplement.
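
The template assumes the field always holds `has` or `has not`. A defensive variant, sketched below, returns `None` for missing or unexpected values so the caller can skip the sentence. `compose_sentence_4` is a hypothetical helper, and treating any other value as "skip" is an assumption rather than documented behaviour.

```python
import math


def compose_sentence_4(country, supplement_status):
    """Return the fourth sentence, or None when the status is unusable."""
    # NaN is how pandas usually represents a missing cell
    if supplement_status is None or (
        isinstance(supplement_status, float) and math.isnan(supplement_status)
    ):
        return None
    status = str(supplement_status).strip().lower()
    if status not in ('has', 'has not'):
        return None
    return f'{country} {status} implemented the IPCC Wetlands Supplement.'


print(compose_sentence_4('Australia', 'Has'))
```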

In [114]:
def sentence_4(data_model, country):
    df = data_model[data_model['country'] == country]
    return f'{country} {df["ipcc_wetlands_suplement"].values[0]} implemented the IPCC Wetlands Supplement.'

In [116]:
for country in data_model.country.values[0:5]:
    print(sentence_4(data_model, country))
    print(' ')

Angola has not implemented the IPCC Wetlands Supplement.
 
Antigua & Barbuda has not implemented the IPCC Wetlands Supplement.
 
Australia has implemented the IPCC Wetlands Supplement.
 
Bahrain has not implemented the IPCC Wetlands Supplement.
 
Bangladesh has not implemented the IPCC Wetlands Supplement.
 


## Final format  
Review of the final data (excluding the country / location info preparation needed for API ingestion) and of the generated text for example countries.

### Data Model

In [118]:
data_model.head(6)

Unnamed: 0,iso,country,pledge_type,base_years,target_years,ndc_target,ndc_reduction_target,ndc_target_url,pledge_summary,ndc_blurb,ndc,ndc_updated,ndc_mitigation,ndc_adaptation,ipcc_wetlands_suplement
0,AGO,Angola,a GHG target,2005.0,2025.0,96.65,14.0,https://www4.unfccc.int/sites/ndcstaging/pages...,"""Angola plans to reduce GHG emissions up to 35...","""Angola plans to reduce GHG emissions up to 35...",True,True,False,True,has not
1,ATG,Antigua & Barbuda,non-GHG target and actions,1990.0,2030.0,,25.0,https://www4.unfccc.int/sites/ndcstaging/pages...,Antigua and Barbuda communicated that it would...,"Conditional Adaptation Targets "" (1) By 2025, ...",True,False,False,False,has not
2,AUS,Australia,a GHG target,2005.0,2030.0,133.74,43.0,https://www4.unfccc.int/sites/ndcstaging/pages...,Australia communicated a target of 5 per cent ...,"""Under a Paris Agreement applicable to all, Au...",True,True,True,False,has
3,BHR,Bahrain,actions only,,,,,https://www4.unfccc.int/sites/ndcstaging/pages...,,The Kingdom of Bahrain communicated in its NDC...,True,False,False,False,has not
4,BGD,Bangladesh,a GHG target,2030.0,2030.0,27.56,6.73,https://www4.unfccc.int/sites/ndcstaging/pages...,"""The NDC of Bangladesh consists of the followi...","""The NDC of Bangladesh consists of the followi...",True,False,False,False,has not
5,BLZ,Belize,a non-GHG target and actions,,2030.0,,,https://www4.unfccc.int/sites/ndcstaging/pages...,,Belize mitigation potential is framed on an ac...,True,False,False,False,has not


### Full text 

In [119]:
def full_text(df, country):
    df = df[df['country'] == country]
    s1 = sentence_1(df, country)
    s2 = sentence_2(df, country)
    s3 = sentence_3(df, country)
    s4 = sentence_4(df, country)

    print(s1)
    print(s2)
    print(' ')
    print(s3)
    print(' ')
    print(s4)

In [120]:
full_text(data_model, 'Brazil')

Brazil NDC pledge contains a GHG target
The GHG target is a 37% and 50% reduction from a baseline in 2005 by target year/s 2025 and 2030. This represents a reduction of 220 and 480 mtCO2e/yr.
 
Brazil first NDC pledge doesn't include coastal and marine NBS.
 
Brazil has not implemented the IPCC Wetlands Supplement.


In [121]:
full_text(data_model, 'Indonesia')

Indonesia NDC pledge contains a GHG target and actions
The GHG target represents a reduction of 1181.21 mtCO2e/yr by target year/s 2020.
 
Indonesia first NDC pledge doesn't include coastal and marine NBS.
 
Indonesia has not implemented the IPCC Wetlands Supplement.


In [123]:
full_text(data_model, 'Australia')

Australia NDC pledge contains a GHG target
The GHG target is a 43% reduction from a baseline in 2005 by target year/s 2030. This represents a reduction of 133.74 mtCO2e/yr.
 
Australia updated NDC pledge includes coastal and marine NBS for mitigation.
 
Australia has implemented the IPCC Wetlands Supplement.
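
`full_text` prints its result, which is convenient for review but not for passing the text on to another component. A possible companion, sketched below, joins already generated sentences into one string. `join_sentences` is a hypothetical helper; dropping the `'No data'` placeholder and repeated sentences is an assumption about the desired output, not something this notebook specifies.

```python
def join_sentences(sentences, skip=('No data',)):
    """Join sentence strings into one paragraph.

    Drops empty values, placeholders listed in `skip`, and exact repeats,
    and appends a full stop where one is missing.
    """
    seen = set()
    parts = []
    for sentence in sentences:
        if not sentence or sentence in skip or sentence in seen:
            continue
        seen.add(sentence)
        parts.append(sentence if sentence.endswith('.') else sentence + '.')
    return ' '.join(parts)


# The four strings the sentence functions would produce for Brunei
print(join_sentences([
    'No NDC pledge',
    'No data',
    'No NDC pledge',
    'Brunei has not implemented the IPCC Wetlands Supplement.',
]))
```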
