# 1. Calculation of indicator record value

This notebook is a QA of the queries implemented for the impact calculation in the main LG application. Its main purpose is to update these calculations to reflect the improvements to the methodology: https://docs.google.com/document/d/1IDuYWOllQ2fTf2ZeBUmOtEZpqht2rqMot3G9W3CjEjE/edit#

As part of the new strategy, the indicator record entity will include:

    - Indicator record value: value of impact in my geometry
    - Indicator record scaler: equivalent to the total commodity production in my location
    - Pointer: h3 table and column name used to distribute the impact
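
As an illustration of how these three pieces fit together, here is a minimal sketch. The `IndicatorRecord` class and the `distribute` helper are hypothetical names, and the distribution rule (each hexagon receives the record value weighted by its share of the scaler) is our reading of the description above, not something taken from the application code.

```python
from dataclasses import dataclass


@dataclass
class IndicatorRecord:
    """Hypothetical container for the three pieces listed above."""
    value: float          # impact value in the geometry
    scaler: float         # total commodity production in the location
    pointer_table: str    # h3 table used to distribute the impact
    pointer_column: str   # h3 column used to distribute the impact


def distribute(record, production_by_hex):
    """Assumed rule: each hexagon gets value * (its production / scaler)."""
    return {h3index: record.value * prod / record.scaler
            for h3index, prod in production_by_hex.items()}


record = IndicatorRecord(
    value=650.0,
    scaler=1000.0,
    pointer_table="h3_grid_earthstat2000_global_prod",
    pointer_column="earthstat2000GlobalRubberProduction",
)
# toy production values per hexagon; they add up to the scaler
shares = distribute(record, {"hex_a": 250.0, "hex_b": 750.0})
```

With this rule the per-hexagon shares add back up to the record value, which is what a map view would need.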

In this notebook we will also cover two approaches to compute the indicator record value:

    a. Get the total indicator record value in my geometry by summing the impact in all the hexagons within my geometry.
    b. Get the indicator record value in my geometry by computing the average risk in all the hexagons within my geometry and multiplying that value by the total volume.
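
To make the difference between the two approaches concrete, the sketch below applies both to two toy hexagons. The numbers and the names `hexagons`, `risk` and `volume` are illustrative assumptions; in approach (a) the volume is assumed to be split across hexagons by harvested-area share, as in the formulas used later in this notebook.

```python
# toy hexagons: harvested area (ha) and risk (impact per tonne)
hexagons = [
    {"ha": 10.0, "risk": 0.5},
    {"ha": 30.0, "risk": 1.0},
]
volume = 100.0  # purchased volume in tonnes

total_ha = sum(h["ha"] for h in hexagons)

# a. sum of the impact in every hexagon, with the volume split by area share
impact_sum = sum(volume * (h["ha"] / total_ha) * h["risk"] for h in hexagons)

# b. average risk over the hexagons multiplied by the total volume
average_risk = sum(h["risk"] for h in hexagons) / len(hexagons)
impact_avg = average_risk * volume
```

The two values differ whenever risk and harvested area are unevenly distributed: approach (a) gives more weight to hexagons with more harvested area, while approach (b) weights every hexagon equally.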

In [1]:
# import libraries
from psycopg2.pool import ThreadedConnectionPool

import pandas as pd
from tqdm import tqdm
import json

In [2]:
#set env
## env file for gcs upload
env_path = ".env"
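with open(env_path) as f:
    env = {}
    for line in f:
        line = line.strip()
        # skip blank lines and comments, which carry no KEY=VALUE pair
        if not line or line.startswith("#") or "=" not in line:
            continue
        env_key, env_value = line.split("=", 1)
        env[env_key] = env_value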
        
#list(env.keys())

# set up the connection pool to the local database
postgres_thread_pool = ThreadedConnectionPool(1, 50,
                                              host=env['API_POSTGRES_HOST'],
                                              port=env['API_POSTGRES_PORT'],
                                              user=env['API_POSTGRES_USERNAME'],
                                              password=env['API_POSTGRES_PASSWORD']
                                              )

#get list of sourcing records to iterate:
conn = postgres_thread_pool.getconn()
cursor = conn.cursor()

## Compute indicator record value as sum of impacts in area

In summary, the formulas used to compute each of the LandGriffon impact indicators are shown below:

    Probability purchase area (ppa) = (ha / total ha) * Purchase Volume (tonnes)
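
    A small worked example of this formula, as a sketch: the function name `probability_purchase_area` and the toy values are assumptions made for illustration.

```python
def probability_purchase_area(ha_by_hex, volume):
    """Split the purchased volume (tonnes) across hexagons by harvested-area share."""
    total_ha = sum(ha_by_hex.values())
    return {h3index: (ha / total_ha) * volume for h3index, ha in ha_by_hex.items()}


# three toy hexagons with 10, 30 and 60 ha of harvested area
ppa = probability_purchase_area({"hex_a": 10.0, "hex_b": 30.0, "hex_c": 60.0}, volume=500.0)
```

    By construction the ppa values add up to the purchased volume, so the split only decides where the volume is assumed to come from.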
    
### Water impacts 

        water impact (m3/yr) = (BWF * 0.001 / Prod all crops) * ppa
    
    equal to:
    
        water impact (m3/yr) = (BWF * 0.001 / Prod all crops) * (ha / total ha) * Volume
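
    The water formula can be sketched in plain Python as follows. The function `water_impact`, the dictionary keys and the toy values are hypothetical; the filter on positive values is an assumption that mirrors the `> 0` conditions of the SQL query used in this notebook.

```python
def water_impact(hexagons, volume):
    """Sum (BWF * 0.001 / prod of all crops) * (ha / total ha) * volume over hexagons."""
    # keep only hexagons with positive area, water footprint and production
    valid = [h for h in hexagons
             if h["ha"] > 0 and h["bwf"] > 0 and h["prod_all_crops"] > 0]
    total_ha = sum(h["ha"] for h in valid)
    impact = 0.0
    for h in valid:
        ppa = (h["ha"] / total_ha) * volume
        impact += (h["bwf"] * 0.001 / h["prod_all_crops"]) * ppa
    return impact


toy_hexagons = [
    {"ha": 20.0, "bwf": 400.0, "prod_all_crops": 1000.0},
    {"ha": 80.0, "bwf": 100.0, "prod_all_crops": 500.0},
    {"ha": 5.0, "bwf": 0.0, "prod_all_crops": 200.0},   # dropped: no blue water footprint
]
water_total = water_impact(toy_hexagons, volume=100.0)
```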
    

### Land impact:

        Land impact (ha/yr)  = (Harvested area (ha) / Production) *  ppa
        
     equal to:
   
        Land impact (ha/yr)  = (Harvested area (ha) / Production) *  (Ha / total ha) * Volume
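
    A sketch of the land formula on toy data. `land_impact`, the dictionary keys and the numbers are assumptions for illustration; hexagons with no harvested area or no production are dropped before the area shares are computed, as an approximation of the filters in the SQL query.

```python
def land_impact(hexagons, volume):
    """Sum (ha / production) * (ha / total ha) * volume over the valid hexagons."""
    valid = [h for h in hexagons if h["ha"] > 0 and h["prod"] > 0]
    total_ha = sum(h["ha"] for h in valid)
    return sum((h["ha"] / h["prod"]) * (h["ha"] / total_ha) * volume for h in valid)


toy_hexagons = [
    {"ha": 50.0, "prod": 100.0},  # 0.5 ha per tonne
    {"ha": 50.0, "prod": 25.0},   # 2.0 ha per tonne
    {"ha": 40.0, "prod": 0.0},    # dropped: no production
]
land_total = land_impact(toy_hexagons, volume=10.0)
```

    Each valid hexagon receives half of the 10 tonnes here, so the total is 0.5 * 5 + 2.0 * 5 hectares.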
    
### Deforestation:

        Deforestation impact (ha/yr) = land impact (ha) * deforestation mask (unitless)
    
    equal to:
    
        Deforestation impact (ha/yr) = (Harvested area (ha) / Production) *  ppa * deforestation mask (unitless)
        Deforestation impact (ha/yr) = (Harvested area (ha) / Production) *  (Ha / total ha) * Volume * deforestation mask (unitless)
        
### Carbon:

        Carbon impact (tCO2e/yr) = net forest carbon emissions (t CO2e) * Deforestation impact (ha)
    
    equal to: 
    
        Carbon impact (tCO2e/yr) = net forest carbon emissions (t CO2e) * (Harvested area (ha) / Production) *  ppa * deforestation mask (unitless)
        Carbon impact (tCO2e/yr) = net forest carbon emissions (t CO2e) * (Harvested area (ha) / Production) *  (Ha / total ha) * Volume * deforestation mask (unitless)
     
    
### Biodiversity:

        Biodiversity  impact  (PDF/yr)  = PSL(PDF m⁻²) * 10⁴(m² ha⁻¹) * Deforestation impact (ha)
    
    equal to: 
    
        Biodiversity  impact  (PDF/yr)  = PSL(PDF m⁻²) * 10⁴(m² ha⁻¹) * (Harvested area (ha) / Production) *  ppa * deforestation mask (unitless)
        Biodiversity  impact  (PDF/yr)  = PSL(PDF m⁻²) * 10⁴(m² ha⁻¹) * (Harvested area (ha) / Production) *   (Ha / total ha) * Volume * deforestation mask (unitless)
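
Deforestation, carbon and biodiversity build on each other: each one multiplies the previous impact by an extra layer. The sketch below chains them for toy hexagons. The function `chained_impacts`, the dictionary keys and the values are hypothetical, and the sketch simplifies the filtering: it uses one set of valid hexagons for all three indicators, whereas the queries in this notebook filter each indicator separately.

```python
def chained_impacts(hexagons, volume):
    """Compute deforestation, carbon and biodiversity impacts from the land impact."""
    valid = [h for h in hexagons if h["ha"] > 0 and h["prod"] > 0]
    total_ha = sum(h["ha"] for h in valid)
    totals = {"deforestation_ha": 0.0, "carbon_tCO2e": 0.0, "biodiversity_PDF": 0.0}
    for h in valid:
        land = (h["ha"] / h["prod"]) * (h["ha"] / total_ha) * volume
        deforestation = land * h["def_mask"]          # deforestation mask (unitless)
        totals["deforestation_ha"] += deforestation
        totals["carbon_tCO2e"] += h["carbon"] * deforestation
        # PSL is per m2, so 10^4 converts it to a per-hectare factor
        totals["biodiversity_PDF"] += h["psl"] * 1e4 * deforestation
    return totals


toy = [{"ha": 10.0, "prod": 5.0, "def_mask": 0.5, "carbon": 2.0, "psl": 0.001}]
impacts = chained_impacts(toy, volume=10.0)
```

With a single hexagon the land impact is 2 ha per tonne times 10 tonnes, half of which counts as deforestation.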
         

In [39]:
def getIndicatorRecordValues(sr_id, material_id):
    """Calculate the indicator record values for a sourcing record.

    Parameters
    -------------
        - sr_id: Sourcing record id
        - material_id: Material id
    
    Returns
    ------------
    
        - indicator record values: total impact for a sourcing record in relation to an indicator
        - scaler: value for distributing the total impact within a geometry in a map view
        - pointer_table: h3 table needed for distributing the impact in a geometry
        - pointer_column: h3 column needed for distributing the impact in a geometry"""
    
    #get production tables for materials
    sql_prod_tables = f"""select hd."h3tableName", hd."h3columnName" from h3_data hd where hd.id in (
    select mth."h3DataId" from material_to_h3 mth where mth."materialId" = '{material_id}' and mth."type" ='producer')"""
    cursor.execute(sql_prod_tables)
    response_prodtables = cursor.fetchall()
    prod_table= response_prodtables[0][0]
    prod_column =  response_prodtables[0][1]

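    ## get harvest tables
    # NOTE: this filter is identical to the production query above (type 'producer'),
    # so ha_table/ha_column resolve to the same layer as prod_table/prod_column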
    sql_ha_tables = f"""select hd."h3tableName", hd."h3columnName" from h3_data hd where hd.id in (
    select mth."h3DataId" from material_to_h3 mth where mth."materialId" = '{material_id}' and mth."type" ='producer')"""
    cursor.execute(sql_ha_tables)
    response_hatables = cursor.fetchall()
    ha_table= response_hatables[0][0]
    ha_column =  response_hatables[0][1]
    
    #GET SCALER:
    
    sql_scaler = f"""select sum(totalProd.prod) from
    (select production.prod from
        (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index from sourcing_records sr 
        left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
        left join geo_region gr on sl."geoRegionId" =gr.id
        where sr.id='{sr_id}') locations
    left join
        (select prodTable.h3index, prodTable."{prod_column}" prod from {prod_table} prodTable
        where prodTable."{prod_column}"> 0) production
    on production.h3index=locations.h3index
    where production.prod > 0) totalProd"""
    cursor.execute(sql_scaler)
    response_scaler = cursor.fetchall()
    
    ### WATER IMPACT
    
    sql_waterImpact = f"""select sum(totalImpact.impact)  from
        (select (locations.tonnage * (materials.ha / sum(materials.ha) over())) * ((materials.bwf * 0.001)/ materials.totalProd) as impact from 	
            (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index, sr.tonnage from sourcing_records sr 
            left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
            left join geo_region gr on sl."geoRegionId" =gr.id
            where sr.id='{sr_id}') locations
        left join 
            (select hatable.h3index, hatable."{ha_column}" as ha, watertable."wfBltotMmyr" as bwf, totalprod.totalProd
            from {ha_table} hatable
            left join h3_grid_wf_global watertable on watertable.h3index =hatable.h3index 
            left join 
                (select spam_prod.h3index, sum(spam_prod."spam2010V2R0GlobalPPotaA" + spam_prod."spam2010V2R0GlobalPRapeA" + spam_prod."spam2010V2R0GlobalPCocoA" + spam_prod."spam2010V2R0GlobalPBarlA" + spam_prod."spam2010V2R0GlobalPOilpA" + spam_prod."spam2010V2R0GlobalPSunfA" + spam_prod."spam2010V2R0GlobalPSorgA" + spam_prod."spam2010V2R0GlobalPMaizA" + spam_prod."spam2010V2R0GlobalPRiceA" + spam_prod."spam2010V2R0GlobalPSugbA" + spam_prod."spam2010V2R0GlobalPBanaA" + spam_prod."spam2010V2R0GlobalPWheaA" + spam_prod."spam2010V2R0GlobalPTobaA" + spam_prod."spam2010V2R0GlobalPSwpoA" + spam_prod."spam2010V2R0GlobalPSoybA" + spam_prod."spam2010V2R0GlobalPOcerA" + spam_prod."spam2010V2R0GlobalPSugcA" + spam_prod."spam2010V2R0GlobalPCottA" + spam_prod."spam2010V2R0GlobalPTeasA" + spam_prod."spam2010V2R0GlobalPOoilA") as totalProd
                    from h3_grid_spam2010v2r0_global_prod spam_prod
                    group by spam_prod.h3index) totalprod
            on totalprod.h3index = hatable.h3index
            where hatable."{ha_column}"> 0
            and watertable."wfBltotMmyr" > 0) materials
        on locations.h3index= materials.h3index
        where materials.totalProd > 0) totalImpact"""
    cursor.execute(sql_waterImpact)
    response_wi_ir = cursor.fetchall()

    ### LAND IMPACT SUM:

    sql_landImpact = f"""select sum(totalImpact.impact)  from
        (select ((locations.tonnage * (materials.ha / sum(materials.ha) over()))* (materials.ha/materials.prod)) as impact from 	
            (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index, sr.tonnage from sourcing_records sr 
            left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
            left join geo_region gr on sl."geoRegionId" =gr.id
            where sr.id='{sr_id}') locations
        left join 
            (select hatable.h3index, hatable."{ha_column}" as ha, prodtable."{prod_column}" as prod
            from {ha_table} hatable
            left join {prod_table} prodtable on prodtable.h3index= hatable.h3index
            where hatable."{ha_column}"> 0
            and prodtable."{prod_column}">0) materials
        on locations.h3index= materials.h3index
        where materials.prod > 0) totalImpact"""
    cursor.execute(sql_landImpact)
    response_li_ir = cursor.fetchall()


    #### DEFORESTATION IMPACT 

    sql_defImpact = f"""select sum(totalImpact.impact)  from
        (select ((locations.tonnage * (materials.ha / sum(materials.ha) over()))* (materials.ha/materials.prod))* materials.def as impact from 	
            (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index, sr.tonnage from sourcing_records sr 
            left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
            left join geo_region gr on sl."geoRegionId" =gr.id
            where sr.id='{sr_id}') locations
        left join 
            (select hatable.h3index, hatable."{ha_column}" as ha, prodtable."{prod_column}" as prod, deftable."hansenLoss2019" as def 
            from {ha_table} hatable
            left join {prod_table} prodtable on prodtable.h3index= hatable.h3index
            left join h3_grid_deforestation_global deftable on deftable.h3index=hatable.h3index
            where hatable."{ha_column}"> 0
            and prodtable."{prod_column}">0
            and deftable."hansenLoss2019">0) materials
        on locations.h3index= materials.h3index
        where materials.prod > 0) totalImpact"""
    cursor.execute(sql_defImpact)
    response_di_ir = cursor.fetchall()

    ### CARBON IMPACT

    sql_carbImpact = f"""select sum(totalImpact.impact)  from
        (select ((locations.tonnage * (materials.ha / sum(materials.ha) over()))* (materials.ha/materials.prod))* materials.def * materials.carb as impact from 	
            (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index, sr.tonnage from sourcing_records sr 
            left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
            left join geo_region gr on sl."geoRegionId" =gr.id
            where sr.id='{sr_id}') locations
        left join 
            (select hatable.h3index, hatable."{ha_column}" as ha, prodtable."{prod_column}" as prod, deftable."hansenLoss2019" as def,
            carbtable."earthstat2000GlobalHectareEmissions" as carb
            from {ha_table} hatable
            left join {prod_table} prodtable on prodtable.h3index= hatable.h3index
            left join h3_grid_deforestation_global deftable on deftable.h3index=hatable.h3index
            left join h3_grid_carbon_global carbtable on carbtable.h3index =hatable.h3index 
            where hatable."{ha_column}"> 0
            and prodtable."{prod_column}">0
            and deftable."hansenLoss2019">0
            and carbtable."earthstat2000GlobalHectareEmissions" > 0) materials
        on locations.h3index= materials.h3index
        where materials.prod > 0) totalImpact"""
    cursor.execute(sql_carbImpact)
    response_ci_ir = cursor.fetchall()

    ### BIODIVERSITY IMPACT

    sql_bioImpact = f"""select sum(totalImpact.impact)  from
        (select ((locations.tonnage * (materials.ha / sum(materials.ha) over()))* (materials.ha/materials.prod))* materials.def * materials.bio * (1/0.0001) as impact from 	
            (select h3_uncompact(gr."h3Compact"::h3index[],6) h3index, sr.tonnage from sourcing_records sr 
            left join sourcing_location sl on sl.id = sr."sourcingLocationId" 
            left join geo_region gr on sl."geoRegionId" =gr.id
            where sr.id='{sr_id}') locations
        left join 
            (select hatable.h3index, hatable."{ha_column}" as ha, prodtable."{prod_column}" as prod, deftable."hansenLoss2019" as def,
            biotable."lciaPslRPermanentCrops" as bio
            from {ha_table} hatable
            left join {prod_table} prodtable on prodtable.h3index= hatable.h3index
            left join h3_grid_deforestation_global deftable on deftable.h3index=hatable.h3index
            left join h3_grid_bio_global biotable on biotable.h3index =hatable.h3index 
            where hatable."{ha_column}"> 0
            and prodtable."{prod_column}">0
            and deftable."hansenLoss2019">0
            and biotable."lciaPslRPermanentCrops" > 0) materials
        on locations.h3index= materials.h3index
        where materials.prod > 0) totalImpact"""
    cursor.execute(sql_bioImpact)
    response_bi_ir = cursor.fetchall()
    
    
    return { 
        'value': {
            'carbon_emissions_tCO2e': response_ci_ir[0][0],#carbon_emissions_tCO2e_t #c71eb531-2c8e-40d2-ae49-1049543be4d1
            'land_ha': response_li_ir[0][0],
            'deforestation_ha':response_di_ir[0][0], #deforestation_ha #633cf928-7c4f-41a3-99c5-e8c1bda0b323
            'biodiversity_impact_PDF': response_bi_ir[0][0], #biodiversity_impact_PDF #0594aba7-70a5-460c-9b58-fc1802d264ea
            'water_use_m3':response_wi_ir[0][0] #water_use_m3 #e2c00251-fe31-4330-8c38-604535d795dc
        },
        'scaler':response_scaler[0][0] ,
        'pointer_table':prod_table,
        'pointer_column': prod_column
    }
    
    

In [40]:


conn = postgres_thread_pool.getconn()
cursor = conn.cursor()

sql_sr = """select sr.id, sl."materialId" from sourcing_records sr 
    left join sourcing_location sl on sl.id=sr."sourcingLocationId" """
cursor.execute(sql_sr)
response_srIds = cursor.fetchall()

indicator_record = []
for sr in tqdm(response_srIds):
    
    sr_id = sr[0]
    material_id = sr[1]
    res = getIndicatorRecordValues(sr_id, material_id) 
    indicator_record.append({sr_id:res})
        

100%|██████████| 495/495 [10:25<00:00,  1.26s/it]


In [41]:
indicator_record

[{'4057312f-131b-44a3-86cb-4ff6253026ce': {'value': {'carbon_emissions_tCO2e': 1155.6997766810225,
    'land_ha': 649.9999029403361,
    'deforestation_ha': 649.9999466340014,
    'biodiversity_impact_PDF': 90439209.93097399,
    'water_use_m3': 0.0017795614703469925},
   'scaler': 31973.941,
   'pointer_table': 'h3_grid_earthstat2000_global_prod',
   'pointer_column': 'earthstat2000GlobalRubberProduction'}},
 {'2280336e-f744-4ab0-a1e6-9c3a9e83c62b': {'value': {'carbon_emissions_tCO2e': 1168.1457742760495,
    'land_ha': 656.9999018950782,
    'deforestation_ha': 656.9999460592906,
    'biodiversity_impact_PDF': 91413170.65330754,
    'water_use_m3': 0.0017987259784891909},
   'scaler': 31973.941,
   'pointer_table': 'h3_grid_earthstat2000_global_prod',
   'pointer_column': 'earthstat2000GlobalRubberProduction'}},
 {'1926c1dc-caf5-467b-93da-b958ca512865': {'value': {'carbon_emissions_tCO2e': 1180.591771871076,
    'land_ha': 663.9999008498204,
    'deforestation_ha': 663.9999454845799,

In [42]:
## compare results with values in the database

# flatten the list of {sr_id: result} dicts into (sr_id, result) pairs
records = [(sr_id, res) for record in indicator_record for sr_id, res in record.items()]
ir_id_list = [sr_id for sr_id, _ in records]

water_val_es = [res['value']['water_use_m3'] for _, res in records]
land_val_es = [res['value']['land_ha'] for _, res in records]
def_val_es = [res['value']['deforestation_ha'] for _, res in records]
carb_val_es = [res['value']['carbon_emissions_tCO2e'] for _, res in records]
bio_val_es = [res['value']['biodiversity_impact_PDF'] for _, res in records]
scaler = [res['scaler'] for _, res in records]
pointer_table = [res['pointer_table'] for _, res in records]
pointer_column = [res['pointer_column'] for _, res in records]



# columns should be a list (a set is unordered)
df_ = pd.DataFrame(ir_id_list, columns=['IndicatorRecord_id'])
df_['es_water_use_m3']=water_val_es
df_['es_land_ha']=land_val_es
df_['es_deforestation_ha']=def_val_es
df_['es_carbon_emissions_tCO2e']=carb_val_es
df_['es-biodiversity_impact_PDF']=bio_val_es
df_['escaler']=scaler
df_['pointer_table']=pointer_table
df_['pointer_column']=pointer_column
df_.head()



,IndicatorRecord_id,es_water_use_m3,es_land_ha,es_deforestation_ha,es_carbon_emissions_tCO2e,es-biodiversity_impact_PDF,escaler,pointer_table,pointer_column
0,4057312f-131b-44a3-86cb-4ff6253026ce,0.00178,649.999903,649.999947,1155.699777,90439210.0,31973.941,h3_grid_earthstat2000_global_prod,earthstat2000GlobalRubberProduction
1,2280336e-f744-4ab0-a1e6-9c3a9e83c62b,0.001799,656.999902,656.999946,1168.145774,91413170.0,31973.941,h3_grid_earthstat2000_global_prod,earthstat2000GlobalRubberProduction
2,1926c1dc-caf5-467b-93da-b958ca512865,0.001818,663.999901,663.999945,1180.591772,92387130.0,31973.941,h3_grid_earthstat2000_global_prod,earthstat2000GlobalRubberProduction
3,c8266f82-5b40-4ffb-a6c8-42cf445064f4,0.001837,670.9999,670.999945,1193.037769,93361090.0,31973.941,h3_grid_earthstat2000_global_prod,earthstat2000GlobalRubberProduction
4,b6488db3-165a-4123-9313-abc91da76381,0.001856,677.999899,677.999944,1205.483767,94335050.0,31973.941,h3_grid_earthstat2000_global_prod,earthstat2000GlobalRubberProduction
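
As an alternative to building each column list by hand, the nested results can be flattened into one row per sourcing record. This is only a sketch: `flatten_records` is a hypothetical helper, the `es_` prefix is an assumed naming choice, and the sample input reuses the structure and a few values of the output shown earlier.

```python
def flatten_records(indicator_record):
    """Turn [{sr_id: {'value': {...}, 'scaler': ..., ...}}, ...] into flat row dicts."""
    rows = []
    for record in indicator_record:
        for sr_id, res in record.items():
            row = {"IndicatorRecord_id": sr_id}
            # one column per indicator, prefixed to mark it as an estimated value
            row.update({f"es_{name}": val for name, val in res["value"].items()})
            row["scaler"] = res["scaler"]
            row["pointer_table"] = res["pointer_table"]
            row["pointer_column"] = res["pointer_column"]
            rows.append(row)
    return rows


sample = [{
    "4057312f-131b-44a3-86cb-4ff6253026ce": {
        "value": {"land_ha": 649.9999029403361, "water_use_m3": 0.0017795614703469925},
        "scaler": 31973.941,
        "pointer_table": "h3_grid_earthstat2000_global_prod",
        "pointer_column": "earthstat2000GlobalRubberProduction",
    }
}]
rows = flatten_records(sample)
```

The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` to obtain a table with one column per key.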
