## Energy saved from recycling 

Did you know that recycling saves energy by reducing or eliminating the need to make materials from scratch? For example, aluminum can manufacturers can skip the energy-costly process of producing aluminum from ore by cleaning and melting recycled cans. Aluminum is classified as a non-ferrous metal.

Singapore has an ambitious goal of becoming a zero-waste nation. The amount of waste disposed of in Singapore has increased seven-fold over the last 40 years. At this rate, Semakau Landfill, Singapore’s only landfill, will run out of space by 2035. Making matters worse, Singapore has limited land for building new incineration plants or landfills.

The government would like to motivate citizens by sharing the total energy that the combined recycling efforts have saved every year. They have asked you to help them. You have been provided with three datasets. The data come from different teams, so the names of waste types may differ.

**datasets/wastestats.csv - Recycling statistics per waste type for the period 2003 to 2017**

Source: [Singapore National Environment Agency](https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling)

*   **waste\_type:** The type of waste recycled.
*   **waste\_disposed\_of\_tonne:** The amount of waste that could not be recycled (in metric tonnes).
*   **total\_waste\_recycle\_tonne:** The amount of waste that could be recycled (in metric tonnes).
*   **total\_waste\_generated:** The total amount of waste collected before recycling (in metric tonnes).
*   **recycling\_rate:** The amount of waste recycled per tonne of waste generated.
*   **year:** The recycling year.
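As a quick sanity check on these columns, the recycling rate can be recomputed from the recycled and generated tonnages. The sketch below uses made-up tonnages, and the column name `recycling_rate_check` is hypothetical. Whether the file stores the rate as a fraction or as a percentage is not stated here, so the result may need scaling before comparing.

```python
import pandas as pd

# Hypothetical rows; the tonnages are invented for illustration only.
sample = pd.DataFrame({
    "waste_type": ["Glass", "Plastics"],
    "total_waste_recycle_tonne": [200.0, 50.0],
    "total_waste_generated": [1000.0, 500.0],
})

# Recycling rate = tonnes recycled / tonnes generated (as a fraction).
sample["recycling_rate_check"] = (
    sample["total_waste_recycle_tonne"] / sample["total_waste_generated"]
)
print(sample)
```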

**datasets/2018\_2019\_waste.csv - Recycling statistics per waste type for the period 2018 to 2019**

Source: [Singapore National Environment Agency](https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling)

*   **Waste Type:** The type of waste recycled.
*   **Total Generated:** The total amount of waste collected before recycling (in thousands of metric tonnes).
*   **Total Recycled:** The amount of waste that could be recycled (in thousands of metric tonnes).
*   **Year:** The recycling year.
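Because this file reports thousands of tonnes and has no disposed or rate columns, it has to be converted before it can sit next to the 2003 to 2017 data. A minimal sketch with one invented row, assuming that disposed waste equals generated minus recycled:

```python
import pandas as pd

# Hypothetical row in the 2018-2019 layout; the values are illustrative only.
raw = pd.DataFrame({
    "Waste Type": ["Glass"],
    "Total Generated": [64.0],  # thousands of tonnes
    "Total Recycled": [12.0],   # thousands of tonnes
    "Year": [2018],
})

# Rename to the 2003-2017 column names and convert to tonnes.
tidy = pd.DataFrame({
    "waste_type": raw["Waste Type"].str.lower(),
    "total_waste_generated": raw["Total Generated"] * 1000,
    "total_waste_recycle_tonne": raw["Total Recycled"] * 1000,
    "year": raw["Year"],
})

# Assumption: whatever was generated but not recycled was disposed of.
tidy["waste_disposed_of_tonne"] = (
    tidy["total_waste_generated"] - tidy["total_waste_recycle_tonne"]
)
tidy["recycling_rate"] = (
    tidy["total_waste_recycle_tonne"] / tidy["total_waste_generated"]
)
print(tidy)
```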

**datasets/energy\_saved.csv - Estimations of the amount of energy saved per waste type in kWh**

*   **material:** The type of waste recycled.
*   **energy\_saved:** An estimate of the energy saved (in kilowatt-hours) by recycling a metric tonne of waste.
*   **crude\_oil\_saved:** An estimate of the number of barrels of oil saved by recycling a metric tonne of waste.
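Since the three files label waste types differently, one way to line them up is to lower-case the labels and map the known variants to a single name. The sketch below is one possible approach, not the only one. The helper `normalize_waste_type` and the `CANONICAL` table are hypothetical names, and the sample labels are invented.

```python
import pandas as pd

# Variant spellings mapped to one canonical label (assumed list of variants).
CANONICAL = {
    "plastics": "plastic",
    "ferrous metal": "ferrous",
    "ferrous metals": "ferrous",
    "non-ferrous metal": "non-ferrous",
    "non-ferrous metals": "non-ferrous",
}

def normalize_waste_type(names: pd.Series) -> pd.Series:
    """Trim and lower-case labels, then replace known variants."""
    cleaned = names.str.strip().str.lower()
    return cleaned.replace(CANONICAL)

# Invented labels in the styles the different teams might use.
labels = pd.Series(["Plastics", "Ferrous Metal", "Non-ferrous metals", "Glass"])
print(normalize_waste_type(labels).tolist())
```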

In [23]:
#Load everything to the working space

import pandas as pd

#Load each csv; pd.read_csv already returns a DataFrame, so no extra wrapping is needed

stats_1 = pd.read_csv('datasets/wastestats.csv')        #2003 to 2017
stats_2 = pd.read_csv('datasets/2018_2019_waste.csv')   #2018 to 2019
stats_3 = pd.read_csv('datasets/energy_saved.csv')      #energy saved per material

#Set the workspace
pd.set_option('display.max_columns',85)
pd.set_option('display.max_rows',85)

#Prepare the merge
#Uniform columns name
stats_1.columns = ['waste_type', 'waste_cant_recycle', 'waste_can_recycle', 'total_waste_generated', 'recycling_rate', 'year']
stats_2.columns = ['waste_type', 'total_waste_generated_thousands', 'waste_can_recycle_thousands', 'year']

#Clean stats_1
stats_1['waste_type'] = stats_1['waste_type'].str.lower()
stats_1['waste_type'] = stats_1['waste_type'].replace(['plastics', 'ferrous metals','ferrous metal','non-ferrous metals', 'non-ferrous metal'],['plastic', 'ferrous','ferrous', 'non-ferrous', 'non-ferrous'])

#Create a filter of the words I need: cat
cat = ['plastic', 'ferrous', 'non-ferrous', 'glass']
filt = stats_1['waste_type'].isin(cat)

#Create stats_1_clean
stats_1_clean = stats_1.loc[filt].copy()
stats_1_clean.head()

#Clean stats_2: stats_2_clean
stats_2['waste_type'] = stats_2['waste_type'].str.lower()
stats_2['waste_type'] = stats_2['waste_type'].replace(['ferrous metal', 'plastics', 'non-ferrous metal'],['ferrous', 'plastic', 'non-ferrous'])
#Convert thousands of tonnes to tonnes
stats_2['total_waste_generated'] = stats_2['total_waste_generated_thousands'] * 1000
stats_2['waste_can_recycle'] = stats_2['waste_can_recycle_thousands'] * 1000
filt2 = stats_2['waste_type'].isin(cat)
#Take an explicit copy so the assignments below do not trigger SettingWithCopyWarning
stats_2_clean = stats_2.loc[filt2].copy()

#Drop the columns that are no longer needed
stats_2_clean = stats_2_clean.drop(columns=['total_waste_generated_thousands', 'waste_can_recycle_thousands'])

#Uniform stats_2_clean to stats_1 data
stats_2_clean['waste_cant_recycle'] = stats_2_clean['total_waste_generated'] - stats_2_clean['waste_can_recycle']
#Recycling rate = recycled / generated (a fraction; scale it if stats_1 stores percentages)
stats_2_clean['recycling_rate'] = stats_2_clean['waste_can_recycle'] / stats_2_clean['total_waste_generated']
stats_2_clean = stats_2_clean[['waste_type', 'waste_cant_recycle','waste_can_recycle','total_waste_generated','recycling_rate', 'year']]

#Clean stats_3
stats_3.columns = ['nat', 'Plastic', 'Glass', 'Ferrous', 'Non-ferrous', 'Paper']
stats_3.drop(columns=['nat'],index=[0,1,2,4], inplace= True)


In [24]:
# Clean and complete df: all_stats

df_list_I_need = [stats_1_clean, stats_2_clean]
all_stats = pd.concat(df_list_I_need)
all_stats.sort_values(by=['waste_type', 'year'], ascending=[True, True], inplace=True)
all_stats['kwh'] = 0

In [25]:
all_stats.set_index('waste_type', inplace=True)

In [26]:
all_stats.loc['ferrous','kwh'] = 642
all_stats.loc['plastic','kwh'] = 5774
all_stats.loc['glass','kwh'] = 42
all_stats.loc['non-ferrous','kwh'] = 14000
all_stats['prod'] = all_stats['waste_can_recycle'] * all_stats['kwh']
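The per-tonne figures hard-coded in this cell (642, 5774, 42 and 14000 kWh) could also be kept in a single dictionary and applied with `Series.map`, which avoids setting the index first. A minimal sketch on invented tonnages, where `KWH_PER_TONNE` and `demo` are hypothetical names:

```python
import pandas as pd

# kWh saved per tonne recycled, as used in the cell above.
KWH_PER_TONNE = {"ferrous": 642, "plastic": 5774, "glass": 42, "non-ferrous": 14000}

# Hypothetical tonnages, for illustration only.
demo = pd.DataFrame({
    "waste_type": ["glass", "plastic"],
    "waste_can_recycle": [1000.0, 10.0],
    "year": [2015, 2015],
})

# Look up the per-tonne saving for each row, then multiply by tonnes recycled.
demo["kwh"] = demo["waste_type"].map(KWH_PER_TONNE)
demo["prod"] = demo["waste_can_recycle"] * demo["kwh"]
print(demo)
```

A waste type missing from the dictionary would come back as `NaN` here, which makes unmatched labels easy to spot.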

In [27]:
year_grp = all_stats.groupby('year')
y15 = year_grp.get_group(2015)
y16 = year_grp.get_group(2016)
y17 = year_grp.get_group(2017)
y18 = year_grp.get_group(2018)
y19 = year_grp.get_group(2019)


In [28]:
sum15 = y15['prod'].sum()
sum16 = y16['prod'].sum()
sum17 = y17['prod'].sum()
sum18 = y18['prod'].sum()
sum19 = y19['prod'].sum()

In [29]:
df = pd.DataFrame(columns=['year', 'total_energy_saved'])
df['year'] = [2015, 2016, 2017, 2018, 2019]
df['total_energy_saved'] = [sum15,sum16,sum17,sum18,sum19]
df.set_index('year',inplace=True)
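The per-year groups and sums built in the last three cells can also be collapsed into one `groupby` call. This is an alternative sketch on invented numbers, assuming the same `year` and `prod` column names; `demo`, `recent` and `annual` are hypothetical names.

```python
import pandas as pd

# Hypothetical energy figures in kWh, for illustration only.
demo = pd.DataFrame({
    "year": [2015, 2015, 2016, 2014],
    "prod": [100.0, 50.0, 70.0, 999.0],
})

# Keep only 2015 to 2019, then total the energy saved within each year.
recent = demo[demo["year"].between(2015, 2019)]
annual = (
    recent.groupby("year")["prod"]
    .sum()
    .rename("total_energy_saved")
    .to_frame()
)
print(annual)
```

The result is already indexed by `year` with a single `total_energy_saved` column, so it has the same shape as `df` above.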

In [30]:
df.to_csv('datasets/results.csv')