# Data Preprocess 2

This journal uses the PV ICE data that we previously generated in [Data Preprocess 1 Section 1](./data_preprocess_1.ipynb) to obtain the tonnes of material processed per tonne of PV Module.

Ideally, we should be able to do this with the time series waste data that PV ICE generates; unfortunately, RELOG does not have this capability.

To work around this, I will do the following:
1. Load the waste material data.
2. Get the total PV module waste generated from 2025 to 2050. *Note: The 2025 files include waste accumulated from 2010 to 2025*.
3. Calculate % of specific material per tonne of total PV waste.
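
The three steps above can be sketched on toy numbers before touching the real files. The two small dataframes below are hypothetical stand-ins (rows are locations, columns are years), not PV ICE output.

```python
import pandas as pd

# Hypothetical toy data: two locations (rows), yearly waste in tonnes (columns).
glass = pd.DataFrame({2025: [6.0, 3.0], 2026: [1.0, 2.0]})
module = pd.DataFrame({2025: [8.0, 4.0], 2026: [2.0, 2.0]})

# Step 2: total waste over all years and all locations.
glass_total = glass.to_numpy().sum()    # 12.0
module_total = module.to_numpy().sum()  # 16.0

# Step 3: tonnes of material per tonne of PV module waste.
glass_share = glass_total / module_total
print(glass_share)  # 0.75
```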

***NOTE:** All quantities are given in **metric tonnes**.*

## 0. Load necessary libraries

In [96]:
import numpy as np
import pandas as pd
import os,sys
import matplotlib.pyplot as plt
from pathlib import Path

## 1. Load waste material data

There are a lot of files here, so let's write some code that loads all of them, each under its own variable name.

In [97]:
mats = ['csi', 'cdte']
material_list_csi = ['glass', 'silicon', 'silver', 'copper', 'aluminium_frames', 'encapsulant', 'backsheet', 'Module']
material_list_cdte = ['cadmium', 'tellurium', 'glass_cdte', 'aluminium_frames_cdte', 'Module', 'copper_cdte', 'encapsulant_cdte']

There are a bunch of columns that we do not want, so let's ignore them before loading all the files. To do this, I load one of the files (it does not matter which one) and assign the column names to a variable. Here I want to ignore `['0', '2010', 'longitude', 'latitude', 'FIPS', '45', '46', 'total waste']`.
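
As a side note, `pd.read_csv` also accepts a callable for `usecols`, so the unwanted columns can be filtered while loading. This is only a sketch: the CSV text below is a made-up miniature of one output file, not real data.

```python
import io
import pandas as pd

# Hypothetical miniature of one PV ICE output file (header + one row).
csv_text = (
    "0,2010,2011,2012,longitude,latitude,FIPS\n"
    "0,0.0,1.5,2.5,-105.0,39.7,8001\n"
)

unwanted = {'0', '2010', 'longitude', 'latitude', 'FIPS', '45', '46', 'total waste'}

# The callable is evaluated on each column name; True means "keep this column".
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name not in unwanted)
df.columns = df.columns.astype(int)  # year labels as int, as in the rest of the notebook
print(df)
```

Because nothing is mutated, this cell can be re-run as many times as needed.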

**TO DO:** Get rid of the `'Unnamed: 0'` and `0` columns in data_preprocess_1 when I save the files in the first place, since they are useless. Once I fix this and re-generate the files, I will have to rewrite part of this code.

In [98]:
simulation = 'Method3' # Change this one for the files you wish to load

In [99]:
cwd = os.getcwd() #current folder
pv_ice_output = os.path.join(cwd, 'PV_ICE_clean_outputs', simulation)

In [100]:
cols = list(pd.read_csv(os.path.join(pv_ice_output, "csi_wasteEOL_Module.csv"), nrows =1))

In [101]:
cols

['0',
 '2010',
 '2011',
 '2012',
 '2013',
 '2014',
 '2015',
 '2016',
 '2017',
 '2018',
 '2019',
 '2020',
 '2021',
 '2022',
 '2023',
 '2024',
 '2025',
 '2026',
 '2027',
 '2028',
 '2029',
 '2030',
 '2031',
 '2032',
 '2033',
 '2034',
 '2035',
 '2036',
 '2037',
 '2038',
 '2039',
 '2040',
 '2041',
 '2042',
 '2043',
 '2044',
 '2045',
 '2046',
 '2047',
 '2048',
 '2049',
 '2050',
 'longitude',
 'latitude',
 'FIPS',
 '45',
 '46',
 'total waste']

In [102]:
rem_cols = ['0','2010','longitude', 'latitude', 'FIPS', '45', '46', 'total waste']

In [103]:
cols = [c for c in cols if c not in rem_cols] # Keep only the wanted columns; safe to run more than once.

Load the files, set the column names as int for easy access, and make a list of the variables we are creating.

In [104]:
materials = []
material_lists = {'csi': material_list_csi, 'cdte': material_list_cdte}
for y in mats:
    for x in material_lists[y]:
        df = pd.read_csv(os.path.join(pv_ice_output, '{}_wasteEOL_{}.csv'.format(y, x)), usecols=cols) # Load file
        df.columns = df.columns.astype('int') # Year labels as int for easy access
        globals()['%s_%s' % (y, x)] = df # Expose as a variable, e.g. csi_glass
        materials.append(df)
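
A dictionary keyed by name is one alternative to creating variables dynamically. The sketch below uses in-memory strings as hypothetical stand-ins for the CSV files; the `frames` dictionary and `fake_files` mapping are my own names, not part of this notebook.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the CSV files, keyed by (technology, material).
fake_files = {
    ('csi', 'glass'): "2011,2012\n1.0,2.0\n3.0,4.0\n",
    ('cdte', 'cadmium'): "2011,2012\n0.1,0.2\n0.3,0.4\n",
}

frames = {}
for (tech, material), text in fake_files.items():
    df = pd.read_csv(io.StringIO(text))
    df.columns = df.columns.astype(int)   # year labels as int
    frames[f'{tech}_{material}'] = df     # e.g. frames['csi_glass']

print(sorted(frames))                     # names of everything that was loaded
print(frames['csi_glass'][2012].sum())    # 6.0
```

With this layout the list of loaded names comes for free from `frames.keys()`.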

## 2. Calculate total waste per material

Sum all years to generate a `total waste` column.

In [105]:
cdte_cadmium

Unnamed: 0,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,...,2041,2042,2043,2044,2045,2046,2047,2048,2049,2050
0,5.752115e-14,1.080357e-11,5.875183e-10,7.918871e-09,5.153066e-08,2.195134e-07,7.143339e-07,1.938872e-06,4.626485e-06,1.003050e-05,...,0.023216,0.030126,0.061145,0.078896,0.049779,0.085559,0.076219,0.097107,0.135560,0.147769
1,1.808956e-09,1.721805e-07,2.367338e-06,1.465366e-05,5.908301e-05,1.827862e-04,4.737170e-04,1.084313e-03,2.266552e-03,4.421215e-03,...,1.858654,2.245478,6.232513,7.842521,1.690841,3.629486,1.963367,2.312896,3.687166,2.761815
2,0.000000e+00,0.000000e+00,0.000000e+00,1.694838e-14,2.724909e-12,5.842517e-11,5.452053e-10,3.086637e-09,1.262148e-08,4.191722e-08,...,0.006371,0.008912,0.012808,0.017421,0.022241,0.039590,0.036733,0.048599,0.069535,0.079487
3,8.860626e-11,8.441458e-09,1.166983e-07,7.285446e-07,2.964698e-06,9.237581e-06,2.398433e-05,5.457889e-05,1.124952e-04,2.148577e-04,...,0.521375,0.690327,1.131361,1.465759,1.353174,2.245920,2.035623,2.555140,3.492940,3.779274
4,6.119912e-11,5.817866e-09,7.939714e-08,4.856679e-07,1.931009e-06,5.879997e-06,1.494215e-05,3.332803e-05,6.737701e-05,1.261748e-04,...,0.164780,0.224648,0.315238,0.417723,0.515903,0.921054,0.797188,1.027222,1.463323,1.592708
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129,0.000000e+00,0.000000e+00,0.000000e+00,1.847965e-14,2.971102e-12,6.409312e-11,6.563110e-10,4.591207e-09,2.422456e-08,1.021354e-07,...,0.108610,0.159872,0.242277,0.343570,0.454603,0.662445,0.846283,1.135543,1.519654,1.936906
130,0.000000e+00,0.000000e+00,0.000000e+00,1.459924e-13,2.347222e-11,5.043119e-10,4.861690e-09,2.987128e-08,1.377668e-07,5.329233e-07,...,0.285892,0.377021,0.520767,0.664370,0.759606,1.234537,1.065555,1.287140,1.717347,1.745162
131,3.268893e-11,3.137344e-09,4.527256e-08,3.010765e-07,1.309196e-06,4.340044e-06,1.193483e-05,2.870804e-05,6.255202e-05,1.264075e-04,...,0.238674,0.310977,0.661133,0.858144,0.514980,0.910334,0.818297,1.064822,1.513529,1.687112
132,8.848106e-12,9.086496e-10,1.796881e-08,1.646552e-07,9.107926e-07,3.612491e-06,1.153433e-05,3.188977e-05,7.959991e-05,1.835526e-04,...,0.663763,0.800102,2.067755,2.583117,0.648358,1.668397,0.606052,0.680005,1.321706,0.626148


In [106]:
for material in materials:
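    # Row-wise sum over the year columns; dropping any earlier total keeps re-runs from double counting.
    material['total waste'] = material.drop(columns='total waste', errors='ignore').sum(axis=1)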

In [107]:
cdte_tellurium

Unnamed: 0,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,...,2042,2043,2044,2045,2046,2047,2048,2049,2050,total waste
0,6.529342e-14,1.226335e-11,6.669039e-10,8.988870e-09,5.849349e-08,2.491741e-07,8.108549e-07,2.200853e-06,5.251617e-06,1.138583e-05,...,0.035342,0.071160,0.092177,0.060340,0.102618,0.094250,0.120906,0.168358,0.187022,1.196831
1,2.053382e-09,1.954456e-07,2.687213e-06,1.663367e-05,6.706632e-05,2.074843e-04,5.377257e-04,1.230825e-03,2.572809e-03,5.018610e-03,...,2.563033,7.096308,8.934583,1.966675,4.187813,2.324175,2.757320,4.364268,3.373239,96.237594
2,0.000000e+00,0.000000e+00,0.000000e+00,1.923845e-14,3.093099e-12,6.631959e-11,6.188736e-10,3.503704e-09,1.432689e-08,4.758109e-08,...,0.010822,0.015619,0.021391,0.027610,0.048328,0.046462,0.061748,0.087857,0.102116,0.450552
3,1.005788e-10,9.582070e-09,1.324666e-07,8.269857e-07,3.365289e-06,1.048577e-05,2.722510e-05,6.195361e-05,1.276955e-04,2.438893e-04,...,0.810297,1.325095,1.724908,1.625393,2.677527,2.490910,3.149283,4.302459,4.739481,25.670384
4,6.946836e-11,6.603978e-09,9.012531e-08,5.512914e-07,2.191927e-06,6.674503e-06,1.696114e-05,3.783132e-05,7.648101e-05,1.432236e-04,...,0.267725,0.377310,0.503285,0.628212,1.106581,0.990807,1.284648,1.821932,2.022181,9.928777
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
129,0.000000e+00,0.000000e+00,0.000000e+00,2.097662e-14,3.372557e-12,7.275340e-11,7.449919e-10,5.211572e-09,2.749779e-08,1.159359e-07,...,0.198685,0.301361,0.429386,0.573658,0.834575,1.076843,1.449458,1.942635,2.488480,9.674381
130,0.000000e+00,0.000000e+00,0.000000e+00,1.657190e-13,2.664379e-11,5.724547e-10,5.518603e-09,3.390750e-08,1.563819e-07,6.049321e-07,...,0.437687,0.606019,0.776394,0.894801,1.448023,1.275184,1.551719,2.072351,2.144719,12.394876
131,3.710587e-11,3.561263e-09,5.138980e-08,3.417580e-07,1.486094e-06,4.926472e-06,1.354747e-05,3.258708e-05,7.100407e-05,1.434877e-04,...,0.367218,0.772238,1.006649,0.632186,1.101612,1.024897,1.341314,1.897889,2.154601,12.764027
132,1.004366e-11,1.031427e-09,2.039676e-08,1.869034e-07,1.033859e-06,4.100612e-06,1.309285e-05,3.619872e-05,9.035547e-05,2.083543e-04,...,0.911121,2.351605,2.938809,0.745707,1.907800,0.707590,0.799020,1.537093,0.759760,22.278561


Get the names of the created variables, create a list and then generate a dataframe with the total generated waste by material.

The following cell prints all the variables generated in this notebook; I copied and pasted the ones I am interested in. Automating this step would be helpful.


In [108]:
vnames = [name for name in globals()] 
print(vnames)

['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__builtin__', '__builtins__', '_ih', '_oh', '_dh', 'In', 'Out', 'get_ipython', 'exit', 'quit', '_', '__', '___', '_i', '_ii', '_iii', '_i1', 'np', 'pd', 'os', 'sys', 'plt', 'Path', '_i2', 'mats', 'material_list_csi', 'material_list_cdte', '_i3', 'simulation', '_i4', 'cwd', 'pv_ice_output', '_i5', 'cols', '_i6', '_6', '_i7', 'rem_cols', '_i8', '_8', '_i9', 'materials', 'y', 'x', 'csi_glass', 'csi_silicon', 'csi_silver', 'csi_copper', 'csi_aluminium_frames', 'csi_encapsulant', 'csi_backsheet', 'csi_Module', 'cdte_cadmium', 'cdte_tellurium', 'cdte_glass_cdte', 'cdte_aluminium_frames_cdte', 'cdte_Module', 'cdte_copper_cdte', 'cdte_encapsulant_cdte', '_i10', '_10', '_i11', 'material', '_i12', '_12', '_i13', '_13', '_i14', 'vnames', '_i15', 'material_vars', '_i16', 'total_wastes', '_i17', '_i18', '_18', '_i19', 'total_waste', '_i20', '_i21', '_21', '_i22', '_22', '_i23', 'total_pv_waste', '_i24', '_24', '_i25', '_i26', '_i27',

In [109]:
material_vars = ['csi_glass', 'csi_silicon', 'csi_silver', 'csi_copper', 'csi_aluminium_frames', 'csi_encapsulant', 'csi_backsheet', 'csi_Module', 'cdte_cadmium', 'cdte_tellurium', 'cdte_glass_cdte', 'cdte_aluminium_frames_cdte', 'cdte_Module', 'cdte_copper_cdte', 'cdte_encapsulant_cdte']
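
The same list can be built from the technology and material lists defined in Section 1, which avoids the copy and paste. A minimal sketch (the `material_lists` dictionary is a name I introduce here):

```python
mats = ['csi', 'cdte']
material_lists = {
    'csi': ['glass', 'silicon', 'silver', 'copper', 'aluminium_frames',
            'encapsulant', 'backsheet', 'Module'],
    'cdte': ['cadmium', 'tellurium', 'glass_cdte', 'aluminium_frames_cdte',
             'Module', 'copper_cdte', 'encapsulant_cdte'],
}

# Same order as the loading loop: all cSi materials first, then all CdTe materials.
material_vars = [f'{tech}_{name}' for tech in mats for name in material_lists[tech]]

print(len(material_vars))  # 15
print(material_vars[1])    # csi_silicon
```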

In [110]:
total_wastes = pd.DataFrame()

In [111]:
total_wastes['Material'] = material_vars

In [112]:
material_vars[1]

'csi_silicon'

In [113]:
total_waste = []
for i in range(len(material_vars)): # materials is in the same order as material_vars; avoid reusing the name mats
    total_waste.append(materials[i]['total waste'].sum(axis=0))

In [114]:
total_wastes['Total waste'] = total_waste

## 3. Calculate tonnes of recycled material per tonne of PV processed

This section shows how the Inputs & Outputs section of the PV Recycling plant was calculated.

Calculate total PV input (cSi + CdTe).

In [115]:
total_wastes[total_wastes['Material'] == 'csi_Module']['Total waste']

7    6.430749e+06
Name: Total waste, dtype: float64

In [116]:
total_wastes[total_wastes['Material'] == 'cdte_Module']['Total waste']

12    1.345260e+06
Name: Total waste, dtype: float64

In [117]:
module_rows = total_wastes['Material'].isin(['csi_Module', 'cdte_Module'])
total_pv_waste = np.array([total_wastes.loc[module_rows, 'Total waste'].sum()]) # cSi + CdTe, kept as a 1-element array

In [118]:
total_pv_waste

array([7776008.40186959])
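
An index-based lookup is another way to pull out the two module totals. The sketch below copies the two values as displayed in the outputs above, so its result differs from the full-precision figure by display rounding; the `demo` dataframe is a standalone stand-in for `total_wastes`.

```python
import pandas as pd

# Module totals as displayed above (tonnes, rounded for display).
demo = pd.DataFrame({
    'Material': ['csi_Module', 'cdte_Module'],
    'Total waste': [6430749.0, 1345260.0],
})

# Index by material name, then select both module rows and add them up.
totals = demo.set_index('Material')['Total waste']
total_pv = totals.loc[['csi_Module', 'cdte_Module']].sum()
print(total_pv)  # 7776009.0
```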

In [119]:
total_wastes['Tonnes of waste per tonne of PV'] = total_wastes['Total waste'].divide(total_pv_waste[0])

Now, we add a new column with the material-specific recycling efficiencies. Luckily we made a dictionary of these values in [Data Preprocess 1 Section 4.3.](./data_preprocess_1.ipynb)

In [120]:
total_wastes['Recycling Efficiencies'] = [0.98, 0.95, 0.95, 0.95, 1, 1, 1, 0, 0.95, 0.95, 0.9, 1, 0, 0.95, 0.9]
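
The list above depends on the row order of `total_wastes`. A name-based mapping is a little safer if rows ever get reordered or dropped. In this sketch the dictionary pairs the values above with the material names in order; its keys and the small `demo` dataframe are my assumption, not the dictionary from Data Preprocess 1.

```python
import pandas as pd

# Values from the list above, paired with the material names in the same order.
efficiencies = {
    'csi_glass': 0.98, 'csi_silicon': 0.95, 'csi_silver': 0.95, 'csi_copper': 0.95,
    'csi_aluminium_frames': 1.0, 'csi_encapsulant': 1.0, 'csi_backsheet': 1.0,
    'csi_Module': 0.0,
    'cdte_cadmium': 0.95, 'cdte_tellurium': 0.95, 'cdte_glass_cdte': 0.9,
    'cdte_aluminium_frames_cdte': 1.0, 'cdte_Module': 0.0,
    'cdte_copper_cdte': 0.95, 'cdte_encapsulant_cdte': 0.9,
}

# Hypothetical dataframe with rows in a different order than the list.
demo = pd.DataFrame({'Material': ['csi_glass', 'cdte_Module', 'csi_silver']})
demo['Recycling Efficiencies'] = demo['Material'].map(efficiencies)

# A missing name would show up as NaN instead of silently shifting values.
assert demo['Recycling Efficiencies'].notna().all()
print(demo)
```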

In [121]:
total_wastes['Tonnes of recycled material per tonne of PV'] =\
                            total_wastes['Tonnes of waste per tonne of PV'] *\
                            total_wastes['Recycling Efficiencies']

In [122]:
total_wastes

| | Material | Total waste | Tonnes of waste per tonne of PV | Recycling Efficiencies | Tonnes of recycled material per tonne of PV |
|---|---|---|---|---|---|
| 0 | csi_glass | 4739718.0 | 0.609531 | 0.98 | 0.59734 |
| 1 | csi_silicon | 219877.1 | 0.028276 | 0.95 | 0.026863 |
| 2 | csi_silver | 2418.875 | 0.000311 | 0.95 | 0.000296 |
| 3 | csi_copper | 3865.457 | 0.000497 | 0.95 | 0.000472 |
| 4 | csi_aluminium_frames | 773519.8 | 0.099475 | 1.0 | 0.099475 |
| 5 | csi_encapsulant | 450461.5 | 0.05793 | 1.0 | 0.05793 |
| 6 | csi_backsheet | 240887.6 | 0.030978 | 1.0 | 0.030978 |
| 7 | csi_Module | 6430749.0 | 0.826999 | 0.0 | 0.0 |
| 8 | cdte_cadmium | 498.1058 | 6.4e-05 | 0.95 | 6.1e-05 |
| 9 | cdte_tellurium | 587.702 | 7.6e-05 | 0.95 | 7.2e-05 |


We assume that the material that is not recycled is landfilled. So let's calculate that!

In [123]:
total_wastes['Tonnes of landfilled material per PV processed'] =\
                        total_wastes['Tonnes of waste per tonne of PV'] -\
                        total_wastes['Tonnes of recycled material per tonne of PV']
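
As a quick hand check of the three derived columns, here is the arithmetic for `csi_glass` alone, using the total waste and efficiency shown above and the `total_pv_waste` value printed earlier. The variable names are only for this check.

```python
total_glass = 4739718.0       # csi_glass total waste in tonnes (as displayed)
total_pv = 7776008.40186959   # total_pv_waste in tonnes
efficiency = 0.98             # csi_glass recycling efficiency

waste_per_tonne = total_glass / total_pv                      # share of PV waste that is cSi glass
recycled_per_tonne = waste_per_tonne * efficiency             # recovered by recycling
landfilled_per_tonne = waste_per_tonne - recycled_per_tonne   # what is left over

print(round(waste_per_tonne, 6), round(recycled_per_tonne, 5), round(landfilled_per_tonne, 6))
# 0.609531 0.59734 0.012191
```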

Ignore the `csi_Module` and `cdte_Module` rows. I should probably drop them from the dataframe, but for now I am keeping them as a sanity check.

In [124]:
total_wastes 

| | Material | Total waste | Tonnes of waste per tonne of PV | Recycling Efficiencies | Tonnes of recycled material per tonne of PV | Tonnes of landfilled material per PV processed |
|---|---|---|---|---|---|---|
| 0 | csi_glass | 4739718.0 | 0.609531 | 0.98 | 0.59734 | 0.012191 |
| 1 | csi_silicon | 219877.1 | 0.028276 | 0.95 | 0.026863 | 0.001414 |
| 2 | csi_silver | 2418.875 | 0.000311 | 0.95 | 0.000296 | 1.6e-05 |
| 3 | csi_copper | 3865.457 | 0.000497 | 0.95 | 0.000472 | 2.5e-05 |
| 4 | csi_aluminium_frames | 773519.8 | 0.099475 | 1.0 | 0.099475 | 0.0 |
| 5 | csi_encapsulant | 450461.5 | 0.05793 | 1.0 | 0.05793 | 0.0 |
| 6 | csi_backsheet | 240887.6 | 0.030978 | 1.0 | 0.030978 | 0.0 |
| 7 | csi_Module | 6430749.0 | 0.826999 | 0.0 | 0.0 | 0.826999 |
| 8 | cdte_cadmium | 498.1058 | 6.4e-05 | 0.95 | 6.1e-05 | 3e-06 |
| 9 | cdte_tellurium | 587.702 | 7.6e-05 | 0.95 | 7.2e-05 | 4e-06 |


Contaminated glass has to go to special waste management, so let's put it in its own bin.

In [125]:
glass_rows = total_wastes['Material'].isin(['csi_glass', 'cdte_glass_cdte'])
tt_contaminated_glass = np.array([total_wastes.loc[glass_rows, 'Tonnes of landfilled material per PV processed'].sum()]) # cSi + CdTe glass
tt_contaminated_glass

array([0.02858578])

Let's add this to the dataframe.

In [126]:
total_wastes.loc[len(total_wastes.index)] = ['tt_contaminated_glass', 0, 0, 0, 0, tt_contaminated_glass[0]]

Same goes with cadmium waste.

In [127]:
tt_cadmium_waste = np.array(total_wastes[total_wastes['Material'] == 'cdte_cadmium']['Tonnes of landfilled material per PV processed'])

In [128]:
tt_cadmium_waste

array([3.20283742e-06])

In [129]:
total_wastes.loc[len(total_wastes.index)] = ['tt_cadmium_waste', 0, 0, 0, 0, tt_cadmium_waste[0]]

Now let's calculate the rest of the waste:

In [140]:
csi_module_fakewaste = total_wastes.loc[total_wastes['Material'] == 'csi_Module']['Tonnes of landfilled material per PV processed'].values[0]
cdte_module_fakewaste = total_wastes.loc[total_wastes['Material'] == 'cdte_Module']['Tonnes of landfilled material per PV processed'].values[0]

In [141]:
tt_all_waste = np.array(total_wastes['Tonnes of landfilled material per PV processed'].sum(axis=0)) \
                - tt_cadmium_waste - tt_contaminated_glass - csi_module_fakewaste - cdte_module_fakewaste

In [142]:
tt_all_waste

array([0.0304897])
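
Another way to do this bookkeeping is to assign every material to a waste bin and sum per bin, so nothing has to be subtracted afterwards. This is only a sketch: the landfilled values are made up, and treating everything outside the glass and cadmium bins as `tt_other_waste` is my assumption.

```python
import pandas as pd

# Hypothetical landfilled tonnes per tonne of PV processed (not the real values).
landfilled = pd.Series({
    'csi_glass': 0.012,
    'cdte_glass_cdte': 0.016,
    'cdte_cadmium': 0.000003,
    'csi_silicon': 0.0014,
})

# Materials that need their own bin; everything else falls into 'tt_other_waste'.
bins = {
    'csi_glass': 'tt_contaminated_glass',
    'cdte_glass_cdte': 'tt_contaminated_glass',
    'cdte_cadmium': 'tt_cadmium_waste',
}

# groupby with a function calls it on each index label to get the group key.
per_bin = landfilled.groupby(lambda name: bins.get(name, 'tt_other_waste')).sum()
print(per_bin)
```

The per-bin totals add up to the overall landfilled amount, which makes for an easy sanity check.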

In [86]:
total_wastes.loc[len(total_wastes.index)] = ['tt_all_waste', 0, 0, 0, 0, tt_all_waste[0]]

In [89]:
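# tt_other_waste is not defined in any of the cells shown in this notebook; it has to exist before this cell runs.
total_wastes.loc[len(total_wastes.index)] = ['tt_other_waste', 0, 0, 0, 0, tt_other_waste[0]]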

In [90]:
total_wastes

| | Material | Total waste | Tonnes of waste per tonne of PV | Recycling Efficiencies | Tonnes of recycled material per tonne of PV | Tonnes of landfilled material per PV processed |
|---|---|---|---|---|---|---|
| 0 | csi_glass | 4739718.0 | 0.609531 | 0.98 | 0.59734 | 0.012191 |
| 1 | csi_silicon | 219877.1 | 0.028276 | 0.95 | 0.026863 | 0.001414 |
| 2 | csi_silver | 2418.875 | 0.000311 | 0.95 | 0.000296 | 1.6e-05 |
| 3 | csi_copper | 3865.457 | 0.000497 | 0.95 | 0.000472 | 2.5e-05 |
| 4 | csi_aluminium_frames | 773519.8 | 0.099475 | 1.0 | 0.099475 | 0.0 |
| 5 | csi_encapsulant | 450461.5 | 0.05793 | 1.0 | 0.05793 | 0.0 |
| 6 | csi_backsheet | 240887.6 | 0.030978 | 1.0 | 0.030978 | 0.0 |
| 7 | csi_Module | 6430749.0 | 0.826999 | 0.0 | 0.0 | 0.826999 |
| 8 | cdte_cadmium | 498.1058 | 6.4e-05 | 0.95 | 6.1e-05 | 3e-06 |
| 9 | cdte_tellurium | 587.702 | 7.6e-05 | 0.95 | 7.2e-05 | 4e-06 |


In [92]:
cwd = os.getcwd()


In [93]:
total_wastes.to_csv(os.path.join(cwd, 'RELOG_import_data', 'RELOG_case_builder_io.csv'), index = False)
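
Note that `to_csv` does not create missing folders, so the `RELOG_import_data` directory has to exist first. A minimal sketch of creating it with `pathlib` before saving, written against a temporary directory and a made-up one-row dataframe so it can run anywhere:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical one-row stand-in for total_wastes.
demo = pd.DataFrame({'Material': ['csi_glass'], 'Total waste': [1.0]})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / 'RELOG_import_data'
    out_dir.mkdir(parents=True, exist_ok=True)  # no error if the folder already exists
    out_file = out_dir / 'RELOG_case_builder_io.csv'
    demo.to_csv(out_file, index=False)
    round_trip = pd.read_csv(out_file)          # read it back as a quick check

print(round_trip)
```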