# 7_Tables_&_Charts_prototype_v1

This notebook prototypes the requests that we will need to implement to get the data that populates the tables and charts.

The filters that the client can apply act directly on the sourcing record entity. Each sourcing record is linked to indicator records, and each indicator record holds the final impact value for one sourcing record and one indicator.

The ```client payload``` would be something like:

```
{
  "indicators": [1, 2, 3], // ids of indicators, required
  "groupBy": "material", // material, business-unit, region, supplier, required
  "start_year": 2020, // required
  "end_year": 2030, // required
  "materials": [1, 2, 3], // optional
  "origins": [1, 2, 3], // optional
  "suppliers": [1, 2, 3] // optional
}

```
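As a rough illustration of the required and optional fields above, the payload could be checked before any query is built. This is only a sketch: the function name `validate_payload` and the error messages are hypothetical, and the real API may validate differently.

```python
# Hypothetical helper: validates the client payload described above.
ALLOWED_GROUP_BY = {"material", "business-unit", "region", "supplier"}
REQUIRED_FIELDS = ("indicators", "groupBy", "start_year", "end_year")
OPTIONAL_FIELDS = ("materials", "origins", "suppliers")


def validate_payload(payload: dict) -> dict:
    """Return a normalized copy of the payload or raise ValueError."""
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not payload["indicators"]:
        raise ValueError("indicators must contain at least one id")
    if payload["groupBy"] not in ALLOWED_GROUP_BY:
        raise ValueError(f"groupBy must be one of {sorted(ALLOWED_GROUP_BY)}")
    if payload["start_year"] > payload["end_year"]:
        raise ValueError("start_year must not be greater than end_year")

    normalized = {key: payload[key] for key in REQUIRED_FIELDS}
    # A missing or empty optional filter means "do not filter by this entity".
    for key in OPTIONAL_FIELDS:
        normalized[key] = payload.get(key) or None
    return normalized
```

Normalizing empty optional lists to `None` keeps the "no filter" case explicit for the code that builds the query.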

The ```response example``` would be something like:

```
/api?indicators[]=water&groupBy=material&start_year=2020&end_year=2021

{
  data: [
    {
      indicatorShortName: "water", // name of the indicator, indicator.shortname
      indicatorId: 2, // id of the indicator, indicator.id
      groupBy: "material", // value of the groupBy query param
      rows: [ // one array element per material (group by entity)
        {
          name: "Wood", // name of the "group by" entity
          values: [
            {
              year: 2020,
              value: X, // calculated by Elena's magic formula
              isProjected: bool // false if based on actual data, true if estimated
            },
            {
              year: 2021,
              value: Y, // calculated by Elena's magic formula
              isProjected: bool // false if based on actual data, true if estimated
            }
          ]
        }
      ],
      yearSums: [
        {
          year: 2020,
          value: // sum of all values for this year across all rows (group by entities)
        }
      ]
    }
  ],
  metadata: {
    unit: // indicator.unit.symbol
  }
}

```
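The response shape above can also be written down as Python types, which makes the nesting easier to check while prototyping. This is a sketch only: the class names are hypothetical, and the numbers and the unit string in `example` are placeholders, not real results.

```python
from typing import List, TypedDict, Union


class YearValue(TypedDict):
    year: int
    value: float
    isProjected: bool  # True when the value is an estimate


class Row(TypedDict):
    name: str  # name of the "group by" entity
    values: List[YearValue]


class YearSum(TypedDict):
    year: int
    value: float


class IndicatorBlock(TypedDict):
    indicatorShortName: str
    indicatorId: Union[int, str]  # the prototype database uses UUID strings
    groupBy: str
    rows: List[Row]
    yearSums: List[YearSum]


class Metadata(TypedDict):
    unit: str


class ImpactTableResponse(TypedDict):
    data: List[IndicatorBlock]
    metadata: Metadata


# Placeholder instance: values are illustrative only.
example: ImpactTableResponse = {
    "data": [
        {
            "indicatorShortName": "water",
            "indicatorId": 2,
            "groupBy": "material",
            "rows": [
                {
                    "name": "Wood",
                    "values": [
                        {"year": 2020, "value": 1.0, "isProjected": False},
                        {"year": 2021, "value": 2.0, "isProjected": True},
                    ],
                }
            ],
            "yearSums": [{"year": 2020, "value": 1.0}],
        }
    ],
    "metadata": {"unit": "placeholder-unit"},
}
```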


## 1. Import libraries

In [1]:
from psycopg2.pool import ThreadedConnectionPool

## 2. Access the local API for the prototype

In [2]:
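## load the settings from the local .env file into a dict
env_path = ".env"
env = {}
with open(env_path) as f:
    for line in f:
        line = line.strip()
        # skip blank lines, comments and entries without a key=value shape
        if not line or line.startswith("#") or "=" not in line:
            continue
        env_key, env_value = line.split("=", 1)
        env[env_key] = env_value

list(env.keys())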

['API_SERVICE_PORT',
 'API_POSTGRES_HOST',
 'API_POSTGRES_PORT',
 'API_POSTGRES_USERNAME',
 'API_POSTGRES_PASSWORD',
 'API_POSTGRES_DATABASE',
 'CLIENT_SERVICE_PORT']

In [3]:
postgres_thread_pool = ThreadedConnectionPool(1, 50,
                                              host=env['API_POSTGRES_HOST'],
                                              port=env['API_POSTGRES_PORT'],
                                              user=env['API_POSTGRES_USERNAME'],
                                              password=env['API_POSTGRES_PASSWORD']
                                              )
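When borrowing connections from a pool, each connection should be handed back once the work is done. A minimal sketch of one way to do that follows; the helper name `pooled_cursor` is hypothetical, and it only relies on the pool exposing `getconn()` and `putconn()`, as psycopg2 pools do.

```python
from contextlib import contextmanager


@contextmanager
def pooled_cursor(pool):
    """Yield a cursor from a pooled connection and always return the connection."""
    conn = pool.getconn()
    try:
        cursor = conn.cursor()
        try:
            yield cursor
            conn.commit()  # commit only if the block finished without errors
        except Exception:
            conn.rollback()  # leave the connection clean for the next user
            raise
        finally:
            cursor.close()
    finally:
        pool.putconn(conn)  # hand the connection back to the pool
```

With the pool created above, this would be used as `with pooled_cursor(postgres_thread_pool) as cursor: cursor.execute(...)`.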

## 3. Request data from the database

The client can send the following parameters to the endpoint:

- Array of indicators (required)
- Group by element (required) - one of the following values: material, business-unit, region, supplier
- start_year (required)
- end_year (required)
- Array of materials (optional) - if no materials are provided, we retrieve all sourcing records without filtering by material
- Array of origins (optional) - if no origins are provided, we retrieve all sourcing records without filtering by origin
- Array of suppliers (optional) - if no suppliers are provided, we retrieve all sourcing records without filtering by supplier
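One way to handle the optional filters is to add a condition only when the client actually sends a value, and to pass the values as query parameters. The sketch below assumes psycopg2-style named placeholders (`%(name)s`); the function name `build_filters` is hypothetical, and mapping the supplier filter to `sl."t1SupplierId"` alone is an assumption (the prototype query also mentions `sl."producerId"`).

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def build_filters(
    indicators: Sequence[str],
    start_year: int,
    end_year: int,
    materials: Optional[Sequence[str]] = None,
    origins: Optional[Sequence[str]] = None,
    suppliers: Optional[Sequence[str]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Build the WHERE conditions and the matching query parameters."""
    clauses = [
        'sr."year" between %(start_year)s and %(end_year)s',
        "i.id in %(indicators)s",
    ]
    params: Dict[str, Any] = {
        "start_year": start_year,
        "end_year": end_year,
        "indicators": tuple(indicators),
    }
    # Assumed mapping between optional filters and sourcing location columns.
    optional = {
        "materials": ('sl."materialId"', materials),
        "origins": ('sl."adminRegionId"', origins),
        "suppliers": ('sl."t1SupplierId"', suppliers),
    }
    for name, (column, values) in optional.items():
        if values:  # None or empty means "do not filter"
            clauses.append(f"{column} in %({name})s")
            params[name] = tuple(values)
    return " and ".join(clauses), params
```

The returned string would go after `where` in the query, and the dictionary would be passed as the second argument to `cursor.execute`.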


Below is an example of a client request:

In [123]:
# EXAMPLE OF FILTERS THAT THE CLIENT CAN SEND:

# array of indicator ids - required
indicators = ('0594aba7-70a5-460c-9b58-fc1802d264ea', '633cf928-7c4f-41a3-99c5-e8c1bda0b323', 'c71eb531-2c8e-40d2-ae49-1049543be4d1', 'e2c00251-fe31-4330-8c38-604535d795dc')
# group by key - required - material, business-unit, region, supplier
groupBy = 'material'
start_year = 2019 # required
end_year = 2022 # required
# OPTIONAL FIELDS
materials = ('41822942-3957-4526-9dc5-a80d94419a1e', '80d52237-bb4a-4f25-9133-cbbebaa68734') # optional - if not provided, we don't filter by material
origins = ('05bd7ca9-6687-4df2-a46d-31e12f0f01bf', 'fd4b4fc0-6640-47e6-ba45-f65dd34072c5') # optional - if not provided, we don't filter by origin
# suppliers = [1, 2, 3] # optional - if not provided, we don't filter by supplier


# connect to the database
conn = postgres_thread_pool.getconn()
cursor = conn.cursor()


## NOTE: The same logic used for indicators, materials and admin regions would apply to suppliers.
# As all the supplier data is null, I'm not filtering by supplier in this case.
# The filter values are passed as query parameters (psycopg2 adapts tuples to SQL lists)
# instead of being interpolated into the SQL string.
cursor.execute("""
    select sr."year", sum(sr.tonnage) tonnes, sum(ir.value) impact, i.id, i."shortName", m."name"
    from sourcing_records sr -- select sourcing records
    left join sourcing_location sl on sl.id = sr."sourcingLocationId" -- join sourcing locations for filtering data
    left join indicator_record ir on ir."sourcingRecordId" = sr.id -- join with indicator records
    left join "indicator" i on i.id = ir."indicatorId" -- join with indicators
    left join material m on m.id = sl."materialId" -- join with the group by entity (materials, origins, suppliers)
    where sr."year" between %(start_year)s and %(end_year)s -- filter by range to get all non-projected values for each year
    and i.id in %(indicators)s
    and sl."materialId" in %(materials)s -- filter by materials - optional
    and sl."adminRegionId" in %(origins)s -- filter by admin regions - optional
    -- and sl."t1SupplierId" in (list) if filter selected
    -- and sl."producerId" in (list) if filter selected
    group by sl."materialId", sr."year", i.id, i."shortName", m."name" -- group by value to get the sum of impacts
""", {
    'start_year': start_year,
    'end_year': end_year,
    'indicators': indicators,
    'materials': materials,
    'origins': origins,
})

response = cursor.fetchall()

In [124]:
## example of the response - response that you will get with the query provided above
response

[(2019,
  Decimal('1150'),
  126460.35616648903,
  '0594aba7-70a5-460c-9b58-fc1802d264ea',
  'biodiversity loss',
  'Raw hides, skins and leather'),
 (2019,
  Decimal('1150'),
  12292.629895980881,
  '633cf928-7c4f-41a3-99c5-e8c1bda0b323',
  'Deforestation loss',
  'Raw hides, skins and leather'),
 (2019,
  Decimal('1150'),
  101462.66568489585,
  'c71eb531-2c8e-40d2-ae49-1049543be4d1',
  'GHG emissions',
  'Raw hides, skins and leather'),
 (2019,
  Decimal('1150'),
  313727.30810887297,
  'e2c00251-fe31-4330-8c38-604535d795dc',
  'Unsustainable water use',
  'Raw hides, skins and leather'),
 (2020,
  Decimal('1162'),
  127782.23627783079,
  '0594aba7-70a5-460c-9b58-fc1802d264ea',
  'biodiversity loss',
  'Raw hides, skins and leather'),
 (2020,
  Decimal('1162'),
  12421.123890302677,
  '633cf928-7c4f-41a3-99c5-e8c1bda0b323',
  'Deforestation loss',
  'Raw hides, skins and leather'),
 (2020,
  Decimal('1162'),
  102523.25077822336,
  'c71eb531-2c8e-40d2-ae49-1049543be4d1',
  'GHG emis

## 4. Parse data

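Parse the response into the format expected by the client. Years without records in the database are projected from the latest available value using a default annual growth rate.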
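The projection rule used in this step can be sketched on its own: a year with a record keeps its value, and a year without one takes the previous value increased by the growth rate. The function name `fill_and_project` is hypothetical, and emitting `None` for years that come before the first record is an assumption of this sketch, not something the prototype handles.

```python
from typing import Dict, List, Optional


def fill_and_project(
    known: Dict[int, float],
    start_year: int,
    end_year: int,
    growth_rate_pct: float = 1.5,
) -> List[dict]:
    """Return one entry per year, projecting years that have no record."""
    values: List[dict] = []
    previous: Optional[float] = None
    for year in range(start_year, end_year + 1):
        if year in known:
            previous = float(known[year])
            values.append({"year": year, "value": previous, "isProjected": False})
        elif previous is not None:
            # Same formula as the prototype: value + value * rate / 100
            previous = previous + previous * growth_rate_pct / 100
            values.append({"year": year, "value": previous, "isProjected": True})
        else:
            # Assumption: nothing to project from yet, so the value is unknown.
            values.append({"year": year, "value": None, "isProjected": True})
    return values
```

Because each projected year builds on the previous one, the growth compounds over consecutive missing years.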


In [202]:
data = []
default_agr = 1.5 # default annual growth rate (in %) - used for projecting the data
# iterate over the indicators that the client has provided
for idx, indicatorId in enumerate(indicators):
    # append data by indicator
    data.append({
        'indicatorShortName':[el[4] for el in response if el[3]==indicatorId][0], # indicator short name returned by the query above
        'indicatorId':indicatorId, # set the indicator id
        'groupBy':groupBy, # set the group by key
        'rows':[], # filled below with the data by year and by group by value
        'yearSum':[] # filled below with the total impact by year for this indicator
    })
    # populate rows
    response_by_indicator = [el for el in response if el[3]==indicatorId] # filter the response by the indicatorId
    unique_names = set(el[5] for el in response_by_indicator) # unique group by names for this indicator id
    for name in unique_names:
        data[idx]['rows'].append({
            'name':name, # name of the group by entity
            'values':[] # values by year are appended below
        })

        i = len(data[idx]['rows'])-1 # index of the row we just appended
        for year in range(start_year, end_year+1): # iterate over each year between start_year and end_year
            value = [el[2] for el in response_by_indicator if el[0]==year and el[5]==name] # impact for the years that have records in the database
            if len(value): # if we have data, we append the value
                data[idx]['rows'][i]['values'].append({
                    'year':year,
                    'value':value[0],
                    'isProjected':False
                })
                value = value[0]
            else: # if we don't have data, we project
                # take the latest value and project it with the default annual growth rate
                value_to_project = data[idx]['rows'][i]['values'][-1]['value']
                value = value_to_project + value_to_project*default_agr/100
                data[idx]['rows'][i]['values'].append({
                    'year':year,
                    'value':value,
                    'isProjected':True
                })

    # append the total sum of impact by year by indicator
    for i,year in enumerate(range(start_year, end_year+1)):
        # add sum of impact by indicator
        if len(data[idx]['rows']):
            data[idx]['yearSum'].append({
                'year':year,
                'value':sum([el['values'][i]['value'] for el in data[idx]['rows'] if el['values'][i]['year']==year])
            })
# ONCE WE HAVE ALL THE DATA BY INDICATOR, THE CLIENT WILL ALSO NEED THE TOTAL PURCHASED VOLUME BY YEAR
# add total sum of purchased tonnes
data.append({
    'name':'purchaseTonnes',
    'values':[]
})

# tonnes are in el[1] (el[2] is the impact). The same tonnes repeat once per indicator,
# so keep a single value per (year, group by name) and cast Decimal to float
tonnes_by_key = {(el[0], el[5]): float(el[1]) for el in response}

for year in range(start_year, end_year+1):
    purchase_tonnes = sum(t for (y, _), t in tonnes_by_key.items() if y==year)
    if purchase_tonnes!=0:
        data[-1]['values'].append({
            'year':year,
            'value':purchase_tonnes,
            'isProjected': False
        })
    else:
        # no records for this year: project from the latest value
        tonnes_to_project=data[-1]['values'][-1]['value']
        purchase_tonnes = tonnes_to_project + tonnes_to_project*default_agr/100
        data[-1]['values'].append({
            'year':year,
            'value':purchase_tonnes,
            'isProjected': True
        })


data

[{'indicatorShortName': 'biodiversity loss',
  'indicatorId': '0594aba7-70a5-460c-9b58-fc1802d264ea',
  'groupBy': 'material',
  'rows': [{'name': 'Raw hides, skins and leather',
    'values': [{'year': 2019,
      'value': 126460.35616648903,
      'isProjected': False},
     {'year': 2020, 'value': 127782.23627783079, 'isProjected': False},
     {'year': 2021, 'value': 129698.96982199825, 'isProjected': True},
     {'year': 2022, 'value': 131644.45436932822, 'isProjected': True}]},
   {'name': 'Cotton',
    'values': [{'year': 2019,
      'value': 4901.394723953627,
      'isProjected': False},
     {'year': 2020, 'value': 4949.21320906537, 'isProjected': False},
     {'year': 2021, 'value': 5023.451407201351, 'isProjected': True},
     {'year': 2022, 'value': 5098.803178309371, 'isProjected': True}]}],
  'yearSum': [{'year': 2019, 'value': 131361.75089044266},
   {'year': 2020, 'value': 132731.44948689616},
   {'year': 2021, 'value': 134722.4212291996},
   {'year': 2022, 'value': 13