# Introduction 

1. **Notebook Description** In this notebook, I conduct exploratory data analysis (EDA) on ENCORE.
2. **Author** Filippo Radice Fossati
3. **Date** 09/22/2023

# Notebook Structure 

This notebook is structured as follows:

1. **Section 1**: Importing necessary libraries and datasets.
2. **Section 2**: EDA on ENCORE.

# Section 1 - Import Libraries & Files

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings

import rasterio
import osmnx as ox
import geopandas as gpd

import support_function as sf

%load_ext autoreload
%autoreload 2
# to suppress scientific notation

plt.style.use('ggplot')
plt.rcParams.update({'font.size': 16, 'axes.labelweight': 'bold', 'figure.figsize': (6, 6), 'axes.edgecolor': '0.2'})

In [2]:
# Load the ENCORE dependency materiality dataset
materiality = pd.read_csv("./_data/ENCORE_data/ENCORE dependency materialities.csv")


# I ended up not using these files, so I commented them out and deleted them from the _data/ENCORE folder
# materiality_db = pd.read_excel("./_data/ENCORE_data/ENCORE dependencies database.xlsx")
# asset_driver = pd.read_csv("./_data/ENCORE_data/assets_driver_of_environmental_change_join.csv",encoding='ISO-8859-1')
# asset = pd.read_csv("./_data/ENCORE_data/assets.csv")
# benefit = pd.read_csv("./_data/ENCORE_data/benefits.csv")
# driver_of_change = pd.read_csv("./_data/ENCORE_data/drivers of environmental change.csv",encoding='ISO-8859-1')
# asset_ecosystem_services = pd.read_csv("./_data/ENCORE_data/ecosystem_services_assets_join.csv",encoding='ISO-8859-1')
# ecosystem_services = pd.read_csv("./_data/ENCORE_data/ecosystem_services.csv",encoding='ISO-8859-1')

# sector_subindustries_and_processes = pd.read_csv("./_data/ENCORE_data/sectors_subindustries_and_processes.csv")


# Section 2 - EDA

## 2.1 - Encore Dependency Materialities

**ENCORE Dependency Materialities** The materiality rating component of the dependency scores is taken from the ENCORE knowledge base (Natural Capital Finance Alliance 2022). The ENCORE knowledge base assesses the links between each sector of the global economy, the ecosystem services that support its production processes, and the natural capital assets that support those services. The reliance of production processes on ecosystem services is scored through qualitative materiality ratings (Very Low to Very High). These qualitative ratings are then converted to numeric values.



In [3]:
# Rename columns to snake_case: spaces become underscores, letters become lowercase
materiality.columns = materiality.columns.str.replace(' ', '_', regex=False).str.lower()
materiality.columns

Index(['process', 'ecosystem_service', 'rating', 'justification'], dtype='object')

In [4]:
# Checking general info of the dataset
materiality.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 652 entries, 0 to 651
Data columns (total 4 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   process            652 non-null    object
 1   ecosystem_service  652 non-null    object
 2   rating             652 non-null    object
 3   justification      652 non-null    object
dtypes: object(4)
memory usage: 20.5+ KB


In [5]:
#Checking Null Values
materiality.isnull().mean()

process              0.0
ecosystem_service    0.0
rating               0.0
justification        0.0
dtype: float64

In [6]:
"""
Mapping list of activities of different assets to processes

'Biofuel Plant': 'Biomass energy production'
, 'Coal Mine': 'Mining'
, 'Company Headquarters':
,'Industrial Plant'
, 'Instn Address'
, 'Mining Property':'Mining'
,'Oil and Gas Exploration and Production'
, 'Port'
,'Power Generation'
, 'Power Plant'
, 'Power Plants'
,'Refinery and Chemicals Plant'
, 'Regulated Industrial Site': ''
,'Steel Plant':'Steel production'
, 'Sugar Refinery':'Biomass energy production'
""";

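The mapping drafted in the string above could also be held in a dictionary so that it can be applied to a column directly. The sketch below is only an illustration: it contains just the pairs already filled in above, leaves every other activity type unmapped, and the `asset_type` column name and the sample rows are hypothetical.

```python
import pandas as pd

# Only the pairs that are filled in in the draft above; all other activity types stay unmapped.
asset_type_to_process = {
    'Biofuel Plant': 'Biomass energy production',
    'Coal Mine': 'Mining',
    'Mining Property': 'Mining',
    'Steel Plant': 'Steel production',
    'Sugar Refinery': 'Biomass energy production',
}

# Hypothetical sample of asset records; 'asset_type' is an assumed column name.
assets_example = pd.DataFrame({'asset_type': ['Coal Mine', 'Port', 'Steel Plant']})

# Series.map returns NaN for activity types that have no process assigned yet.
assets_example['process'] = assets_example['asset_type'].map(asset_type_to_process)

# List the activity types that still need a process.
unmapped_types = assets_example.loc[assets_example['process'].isna(), 'asset_type'].tolist()
print(unmapped_types)
```

Listing the unmapped types this way makes it easy to see which entries of the draft still need a decision.
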
In [7]:
#checking unique values of all columns
materiality['process'].sort_values().unique()

array(['Airport services', 'Alcoholic fermentation and distilling',
       'Alumina refining', 'Aquaculture', 'Biomass energy production',
       'Cable and satellite installations on land',
       'Catalytic cracking, fractional distillation and crystallization',
       'Construction', 'Construction materials production',
       'Cruise line provision', 'Cryogenic air separation',
       'Distribution',
       'Electric/nuclear power transmission and distribution',
       'Electronics and hardware production',
       'Environmental and facilities services',
       'Fibre-optic cable installation (marine)', 'Financial services',
       'Footwear production', 'Freshwater wild-caught fish',
       'Gas adsorption', 'Gas distribution', 'Gas retail',
       'Geothermal energy production', 'Glass making',
       'Hotels and resorts provision',
       'Houseware and specialities production', 'Hydropower production',
       'Incomplete combustion', 'Infrastructure builds',
       'Infrastruct

In [8]:
materiality['ecosystem_service'].unique()

array(['Mass stabilisation and erosion control', 'Ground water',
       'Surface water', 'Climate regulation',
       'Flood and storm protection', 'Filtration',
       'Dilution by atmosphere and ecosystems', 'Genetic materials',
       'Water flow maintenance', 'Water quality', 'Soil quality',
       'Pest control', 'Disease control', 'Ventilation',
       'Fibres and other materials',
       'Buffering and attenuation of mass flows', 'Bio-remediation',
       'Maintain nursery habitats', 'Mediation of sensory impacts',
       'Animal-based energy', 'Pollination'], dtype=object)

In [9]:
materiality['rating'].unique()

array(['M', 'H', 'VH', 'L', 'VL'], dtype=object)
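
Since the ratings run from Very Low to Very High, they can be treated as ordinal. The following is a minimal sketch, on a hypothetical Series rather than the real column, of how an ordered categorical dtype would allow sorting and threshold comparisons on the labels themselves.

```python
import pandas as pd

# Order of the labels from lowest to highest materiality.
rating_order = ['VL', 'L', 'M', 'H', 'VH']

# Hypothetical sample holding the five labels found above.
ratings = pd.Series(['M', 'H', 'VH', 'L', 'VL'])

# An ordered categorical keeps the labels but gives them a rank.
ordered = ratings.astype(pd.CategoricalDtype(categories=rating_order, ordered=True))

sorted_labels = ordered.sort_values().tolist()
at_least_high = (ordered >= 'H').tolist()
print(sorted_labels)
print(at_least_high)
```
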

In [10]:
materiality['justification'].sort_values().unique()

array(['Although less practical, production process can take place without the ecosystem service due to availability of impact mitigation solutions',
       'Although less practical, production process can take place without the ecosystem service due to availability of protection substitutes',
       'Although less practical, production process can take place without the ecosystem service due to availability of substitutes',
       'Ecosystem service is critical and irreplaceable in production process',
       'Ecosystem service is critical and irreplaceable in production process ',
       'Ecosystem service is critical and irreplaceable in production process. Production process can take place with some disruption of the ecosystem service, but the high quantity of the ecosystem service required for the production process makes service this a high risk\n',
       'Ecosystem service is critical and irreplaceable in production process. Production process can take place with some disruptio
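
The output above contains entries that differ only by a trailing space or a trailing newline. A minimal sketch of how such near-duplicates could be collapsed, using a small hypothetical sample instead of the real column:

```python
import pandas as pd

# Hypothetical sample reproducing the kind of whitespace variants seen above.
sample = pd.Series([
    'Ecosystem service is critical and irreplaceable in production process',
    'Ecosystem service is critical and irreplaceable in production process ',
    'Ecosystem service is critical and irreplaceable in production process\n',
])

# Collapse any run of whitespace (including newlines) to one space, then trim both ends.
cleaned = sample.str.replace(r'\s+', ' ', regex=True).str.strip()

print(sample.nunique(), cleaned.nunique())
```
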

In [11]:
materiality

|  | process | ecosystem_service | rating | justification |
|---|---|---|---|---|
| 0 | Airport services | Mass stabilisation and erosion control | M | Although less practical, production process ca... |
| 1 | Airport services | Ground water | M | Although less practical, production process ca... |
| 2 | Airport services | Surface water | M | Although less practical, production process ca... |
| 3 | Airport services | Climate regulation | M | Most of the time the production process can ta... |
| 4 | Airport services | Flood and storm protection | H | The production process is extremely vulnerable... |
| ... | ... | ... | ... | ... |
| 647 | Water services (e.g. waste water, treatment an... | Surface water | VH | The production process is extremely vulnerable... |
| 648 | Water services (e.g. waste water, treatment an... | Water flow maintenance | VH | The production process is extremely vulnerable... |
| 649 | Wind energy provision | Mass stabilisation and erosion control | M | Although less practical, production process ca... |
| 650 | Wind energy provision | Flood and storm protection | M | Most of the time the production process can ta... |
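
A long table like the one above can also be viewed as a process-by-service matrix. The sketch below uses three rows copied from the output and assumes each (process, ecosystem service) pair appears at most once, which `DataFrame.pivot` requires; that assumption has not been checked against the full file here.

```python
import pandas as pd

# Three rows copied from the output above; the real DataFrame has 652 rows.
rows = pd.DataFrame({
    'process': ['Airport services', 'Airport services', 'Wind energy provision'],
    'ecosystem_service': ['Ground water', 'Flood and storm protection', 'Flood and storm protection'],
    'rating': ['M', 'H', 'M'],
})

# One row per process, one column per ecosystem service.
# NaN marks pairs that do not occur in the long table.
matrix = rows.pivot(index='process', columns='ecosystem_service', values='rating')
print(matrix)
```
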


In [12]:
# Subset the DataFrame to the rows where the process is Mining
mining_materiality = materiality[(materiality['process']=='Mining')].copy()
mining_materiality

|  | process | ecosystem_service | rating | justification |
|---|---|---|---|---|
| 364 | Mining | Mass stabilisation and erosion control | M | Although less practical, production process ca... |
| 365 | Mining | Surface water | H | Ecosystem service is critical and irreplaceabl... |
| 366 | Mining | Water flow maintenance | H | Ecosystem service is critical and irreplaceabl... |
| 367 | Mining | Ground water | H | Ecosystem service is critical and irreplaceabl... |
| 368 | Mining | Climate regulation | H | The production process is extremely vulnerable... |


In [13]:
# Map each qualitative rating label to a numeric score between 0 and 1
rating_to_numeric = {
    'No dependency': 0,
    'VL': 0.2,
    'L': 0.4,
    'M': 0.6,
    'H': 0.8,
    'VH': 1,
}
mining_materiality['rating_numeric'] = mining_materiality['rating'].map(rating_to_numeric)
mining_materiality.reset_index(drop=True, inplace=True)
mining_materiality

|  | process | ecosystem_service | rating | justification | rating_numeric |
|---|---|---|---|---|---|
| 0 | Mining | Mass stabilisation and erosion control | M | Although less practical, production process ca... | 0.6 |
| 1 | Mining | Surface water | H | Ecosystem service is critical and irreplaceabl... | 0.8 |
| 2 | Mining | Water flow maintenance | H | Ecosystem service is critical and irreplaceabl... | 0.8 |
| 3 | Mining | Ground water | H | Ecosystem service is critical and irreplaceabl... | 0.8 |
| 4 | Mining | Climate regulation | H | The production process is extremely vulnerable... | 0.8 |
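
The conversion above relies on `Series.map`, which yields NaN without any warning for a label that is missing from the dictionary. A small sketch of a coverage check follows; the ratings Series is hypothetical and deliberately includes one label (`'N/A'`) that the dictionary does not cover.

```python
import pandas as pd

rating_to_numeric = {'No dependency': 0, 'VL': 0.2, 'L': 0.4, 'M': 0.6, 'H': 0.8, 'VH': 1}

# Hypothetical ratings; 'N/A' is an invented label used only to trigger the check.
ratings = pd.Series(['M', 'H', 'VH', 'N/A'])

numeric = ratings.map(rating_to_numeric)

# Labels that became NaN are the ones missing from the dictionary.
missing_labels = sorted(set(ratings[numeric.isna()]))
print(missing_labels)
```

Running a check like this before using `rating_numeric` would catch unexpected labels if the same mapping is applied to processes other than Mining.
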


In [14]:
mining_materiality['justification'].loc[1]

'Ecosystem service is critical and irreplaceable in production process'

In [15]:
# Checking sub-industries, sectors and processes
# Within the asset df, the GICS sectors are Diversified Metals & Mining and Copper, which both relate to the Mining process,
# but they belong to different categories; a more granular map is needed
# sector_subindustries_and_processes[sector_subindustries_and_processes['Process']=='Mining']

In [16]:
# sector_subindustries_and_processes

## 2.2 - Other Datasets

In [17]:
# driver_of_change.loc[25][0], driver_of_change.loc[25][1],  driver_of_change.loc[25][2]

# asset.loc[7][0], asset.loc[7][1], asset.loc[7][2]

# water = asset_ecosystem_services[asset_ecosystem_services['Ecosystem Service'].str.contains('water')]
# water.loc[22][0], water.loc[22][1], water.loc[22][2]

# water.loc[45][0], water.loc[45][1], water.loc[45][2]

# ecosystem_services.loc[10][0],ecosystem_services.loc[10][1],ecosystem_services.loc[10][2]

# ecosystem_services.loc[17][0],ecosystem_services.loc[17][1],ecosystem_services.loc[17][2]