# Data Preprocess 2

This journal uses the PV ICE data that we previously generated in [Data Preprocess 1 Section 1](./data_preprocess_1.ipynb) to obtain the tonnes of material processed per tonne of PV Module.

Ideally, we should be able to do this with the time series waste data that PV ICE generates; unfortunately, RELOG does not have this capability.

To work around this, I will do the following:
1. Load the waste material data.
2. Get the total PV module waste generated from 2023 to 2050. *Note: The 2023 files include waste accumulated from 2010 to 2023*.
3. Calculate the percentage of each material per tonne of total PV module waste.
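
Step 3 comes down to a single ratio: total material waste divided by total module waste. Here is a minimal sketch with invented numbers (the series names and values are illustrative only, not PV ICE output):

```python
import pandas as pd

# Illustrative totals in tonnes for two locations; the numbers are invented.
module_total = pd.Series([100.0, 250.0], name="Module")
glass_total = pd.Series([70.0, 180.0], name="glass")

# Tonnes of glass per tonne of PV module waste, per location.
glass_fraction = glass_total / module_total

# The same share expressed as a percentage.
glass_percent = 100 * glass_fraction
print(glass_percent.round(1).tolist())
```

The division is element-wise, so each location gets its own share as long as both series use the same row order.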

## 0. Load libraries

In [1]:
import numpy as np
import pandas as pd
import os,sys
import matplotlib.pyplot as plt
from pathlib import Path

## 1. Load waste material data

There are a lot of files here, so let's write some code that loads every file into a variable named after it.

In [2]:
mats = ['csi', 'cdte']
material_list_csi = ['glass', 'silicon', 'silver', 'copper', 'aluminium_frames', 'encapsulant', 'backsheet', 'Module']
material_list_cdte = ['cadmium', 'tellurium', 'glass_cdte', 'aluminium_frames_cdte', 'Module', 'copper_cdte', 'encapsulant_cdte']

There are a bunch of columns that we do not want, so let's ignore them before loading all the files. To do this, I load one of the files (it does not matter which one) and assign its column names to a variable. The column names are read as strings, and the ones I want to ignore are `['Unnamed: 0', '0', '2010', 'longitude', 'latitude', 'FIPS', '45', 'total waste']`.

**TO DO:** Get rid of the `'Unnamed: 0'` and `0` columns in data_preprocess_1 when I save the files in the first place, since they are useless. Once I fix this and re-generate the files, I will have to rewrite part of this code.
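
A likely source of `'Unnamed: 0'`, assuming it is the DataFrame index that `to_csv` writes by default (I have not verified this against data_preprocess_1): the sketch below uses an invented toy table to show that passing `index=False` avoids the extra column. It does not explain where the `0` column comes from.

```python
import io
import pandas as pd

# Toy frame standing in for one of the waste tables (values are invented).
df = pd.DataFrame({"2011": [0.1, 0.2], "2012": [0.3, 0.4]})

# Default behaviour: the index is written as an unlabeled first column,
# which comes back as 'Unnamed: 0' on the next read.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index, nrows=0))

# index=False skips the index, so no extra column appears.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index, nrows=0))

print(cols_with_index)
print(cols_without_index)
```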

In [76]:
cols = list(pd.read_csv("csi_wasteEOL_Module.csv", nrows=1)) # Read a single row; only the column names are needed.

In [77]:
rem_cols = ['Unnamed: 0', '0','2010','longitude', 'latitude', 'FIPS', '45', 'total waste']

In [79]:
cols = [c for c in cols if c not in rem_cols] # Keep only the wanted columns; safe to run more than once.
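
As a possible alternative to building the list of wanted columns first, `usecols` in `pd.read_csv` also accepts a callable that is evaluated on each column name. A minimal sketch on an invented one-row CSV that mimics the file layout (the values are made up):

```python
import io
import pandas as pd

# Toy CSV mimicking the layout of the waste files (values are invented).
csv_text = (
    "Unnamed: 0,0,2010,2011,2012,longitude,latitude,FIPS,45,total waste\n"
    "0,0,0.0,1.5,2.5,-100.0,40.0,1001,0,4.0\n"
)

rem_cols = ['Unnamed: 0', '0', '2010', 'longitude', 'latitude', 'FIPS', '45', 'total waste']

# Only columns for which the callable returns True are loaded.
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name not in rem_cols)

# Convert the remaining year labels from strings to integers.
df.columns = df.columns.astype('int')
print(list(df.columns))
```

With this approach there is no list to mutate, so the cell can be re-run freely.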

Load the files, set the column names as int for easy access, and make a list of the variables we are creating.

In [156]:
materials = []
material_lists = {'csi': material_list_csi, 'cdte': material_list_cdte}
for y in mats:
    for x in material_lists[y]:
        name = '%s_%s' % (y, x) # Variable name, e.g. csi_glass
        df = pd.read_csv('{}_wasteEOL_{}.csv'.format(y, x), usecols=cols) # Load file
        df.columns = df.columns.astype('int') # Year labels as int
        globals()[name] = df # Create the variable
        materials.append(df)
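
A possible alternative to creating variables through `globals()` is to keep the tables in a dictionary keyed by the same names. The helper `load_waste_tables` and the dictionary `tables` below are hypothetical names for illustration, and the demo writes two tiny invented files to a temporary folder so that it runs on its own:

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_waste_tables(folder, tech, material_names):
    """Hypothetical helper: return {'<tech>_<material>': DataFrame}."""
    tables = {}
    for name in material_names:
        path = Path(folder) / f"{tech}_wasteEOL_{name}.csv"
        df = pd.read_csv(path)
        df.columns = df.columns.astype(int)  # year labels as integers
        tables[f"{tech}_{name}"] = df
    return tables


# Demo with two tiny invented files written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["glass", "Module"]:
        pd.DataFrame({"2011": [1.0], "2012": [2.0]}).to_csv(
            Path(tmp) / f"csi_wasteEOL_{name}.csv", index=False
        )
    tables = load_waste_tables(tmp, "csi", ["glass", "Module"])

print(sorted(tables))
print(list(tables["csi_glass"].columns))
```

A table is then accessed as `tables["csi_glass"]`, and `tables.values()` plays the role of the `materials` list.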

Sum all years to generate a `total waste` column.

In [173]:
for material in materials:
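    material['total waste'] = material.drop(columns='total waste', errors='ignore').sum(axis=1) # Sum the year columns only, so re-running does not double count.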
    print(material)

             2011          2012      2013          2014          2015  \
0    3.399019e-10  1.421339e-07  0.000011  1.544716e-04  1.008534e-03   
1    1.068942e-05  1.022619e-03  0.014485  9.371007e-02  3.957827e-01   
2    0.000000e+00  0.000000e+00  0.000000  1.950208e-10  3.082218e-08   
3    5.235890e-07  5.024856e-05  0.000725  4.809267e-03  2.081444e-02   
4    3.616357e-07  3.444818e-05  0.000476  2.965408e-03  1.203511e-02   
..            ...           ...       ...           ...           ...   
129  0.000000e+00  0.000000e+00  0.000000  2.126408e-10  3.360693e-08   
130  0.000000e+00  0.000000e+00  0.000000  1.679899e-09  2.655006e-07   
131  1.931643e-07  1.901291e-05  0.000313  2.430768e-03  1.199809e-02   
132  5.228492e-08  6.369230e-06  0.000202  2.348187e-03  1.438681e-02   
133  0.000000e+00  0.000000e+00  0.000000  2.867643e-09  4.532181e-07   

             2016      2017      2018       2019       2020  ...  \
0    4.268900e-03  0.013715  0.036505   0.084899   0.17
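
If only a range of years is wanted, for example 2023 to 2050 as in step 2 of the plan, the integer column labels make this a simple label slice. A minimal sketch on an invented toy table, assuming the year columns are sorted:

```python
import pandas as pd

# Toy table with integer year columns (values are invented).
df = pd.DataFrame({2022: [1.0, 2.0], 2023: [3.0, 4.0], 2050: [5.0, 6.0]})

# Label-based slicing on sorted integer columns includes both endpoints,
# so 2022 is left out and 2023 and 2050 are both summed.
total_2023_2050 = df.loc[:, 2023:2050].sum(axis=1)
print(total_2023_2050.tolist())
```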

In [172]:
csi_aluminium_frames

Unnamed: 0,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,...,2042,2043,2044,2045,2046,2047,2048,2049,2050,total waste
0,8.442955e-11,3.502291e-08,0.000003,3.795028e-05,2.475105e-04,1.046257e-03,3.355504e-03,0.008910,0.020656,0.043265,...,51.815539,136.648797,125.897009,66.205465,137.907532,90.188257,114.476473,173.411644,171.588988,5832.986207
1,2.655185e-06,2.539936e-04,0.003596,2.324920e-02,9.810008e-02,3.163885e-01,8.488368e-01,1.993765,4.238123,8.338673,...,4863.147777,17177.569201,14026.349831,2742.199221,7242.539444,2481.002258,2765.769905,5205.985916,2541.851469,495343.574801
2,0.000000e+00,0.000000e+00,0.000000,3.870442e-11,6.116183e-09,1.166693e-07,9.381437e-07,0.000005,0.000018,0.000060,...,7.756283,10.971644,11.974833,11.696267,38.251406,7.935125,8.637789,22.709807,5.705275,616.058670
3,1.300563e-07,1.248011e-05,0.000180,1.192629e-03,5.156218e-03,1.694719e-02,4.599901e-02,0.108522,0.230308,0.450433,...,937.325208,1883.130752,1844.113686,1285.197609,2814.305605,1336.054606,1484.999319,2318.414123,1478.616908,77877.878254
4,8.982807e-08,8.556466e-06,0.000118,7.362171e-04,2.986789e-03,9.272806e-03,2.397790e-02,0.054296,0.111193,0.210538,...,336.530694,476.311811,576.766414,666.640675,1466.873291,855.187400,1068.514115,1662.657647,1511.163906,39732.033818
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129,0.000000e+00,0.000000e+00,0.000000,4.220133e-11,6.668774e-09,1.284657e-07,1.221506e-06,0.000009,0.000051,0.000226,...,211.155312,341.528576,460.191200,600.854918,925.501957,1131.719948,1540.147990,2109.085347,2703.840622,41765.858736
130,0.000000e+00,0.000000e+00,0.000000,3.333978e-10,5.268448e-08,1.008339e-06,8.611994e-06,0.000051,0.000243,0.000997,...,561.687620,807.181753,928.413172,987.131386,1917.855446,1150.849665,1315.064778,1875.500361,1494.986025,50777.875238
131,4.798081e-08,4.720980e-06,0.000078,6.012483e-04,2.961958e-03,1.075461e-02,3.149118e-02,0.078871,0.175823,0.358623,...,447.166517,1362.438237,1150.760478,344.193745,983.352608,271.497143,290.670507,619.985770,202.850634,36031.411521
132,1.298725e-08,1.578473e-06,0.000050,5.779088e-04,3.533801e-03,1.457813e-02,4.652512e-02,0.124529,0.294090,0.633331,...,1780.175629,5696.776863,4739.268874,1208.256078,3691.762637,853.748408,890.216229,2143.711818,452.928563,136329.419792
