## Energy saved from recycling
<p>Did you know that recycling saves energy by reducing or eliminating the need to make materials from scratch? For example, aluminum can manufacturers can skip the energy-costly process of producing aluminum from ore by cleaning and melting recycled cans. Aluminum is classified as a non-ferrous metal.</p>
<p>Singapore has an ambitious goal of becoming a zero-waste nation. The amount of waste disposed of in Singapore has increased seven-fold over the last 40 years. At this rate, Semakau Landfill, Singapore’s only landfill, will run out of space by 2035. Making matters worse, Singapore has limited land for building new incineration plants or landfills.</p>
<p>The government would like to motivate citizens by sharing the total energy that the combined recycling efforts have saved every year. They have asked you to help them.</p>
<p>You have been provided with three datasets. The data come from different teams, so the names of waste types may differ.</p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:16px"><b>datasets/wastestats.csv - Recycling statistics per waste type for the period 2003 to 2017</b>
    </div>
    <div>Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>waste_type: </b>The type of waste recycled.</li>
    <li><b>waste_disposed_of_tonne: </b>The amount of waste that could not be recycled (in metric tonnes).</li>
    <li><b>total_waste_recycled_tonne: </b>The amount of waste that could be recycled (in metric tonnes).</li>
    <li><b>total_waste_generated_tonne: </b>The total amount of waste collected before recycling (in metric tonnes).</li>
    <li><b>recycling_rate: </b>The amount of waste recycled per tonne of waste generated.</li>
    <li><b>year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/2018_2019_waste.csv - Recycling statistics per waste type for the period 2018 to 2019</b>
    </div>
    <div> Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>Waste Type: </b>The type of waste recycled.</li>
    <li><b>Total Generated: </b>The total amount of waste collected before recycling (in thousands of metric tonnes).</li>
    <li><b>Total Recycled: </b>The amount of waste that could be recycled (in thousands of metric tonnes).</li>
    <li><b>Year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/energy_saved.csv - Estimates of the amount of energy saved per waste type in kWh</b>
    </div>
<ul>
    <li><b>material: </b>The type of waste recycled.</li>
    <li><b>energy_saved: </b>An estimate of the energy saved (in kilowatt-hours) by recycling a metric tonne of waste.</li>
    <li><b>crude_oil_saved: </b>An estimate of the number of barrels of oil saved by recycling a metric tonne of waste.</li>
</ul>

</div>

This project estimates the energy (in kWh) saved in Singapore each year by recycling **glass, plastic, ferrous, and non-ferrous metals** between **2015 and 2019**. To do so, we import the data, then select the materials and the years of interest.

In [72]:
# Import libraries
import numpy as np
import pandas as pd

Let's start with the first data set, which covers recycling between 2003 and 2017.

In [73]:
# Import data from the period 2003-2017
df1 = pd.read_csv('datasets/wastestats.csv')

df1.info()  # No missing values, 225 total rows
df1.tail()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225 entries, 0 to 224
Data columns (total 6 columns):
waste_type                     225 non-null object
waste_disposed_of_tonne        225 non-null int64
total_waste_recycled_tonne     225 non-null float64
total_waste_generated_tonne    225 non-null int64
recycling_rate                 225 non-null float64
year                           225 non-null int64
dtypes: float64(2), int64(3), object(1)
memory usage: 10.6+ KB


Unnamed: 0,waste_type,waste_disposed_of_tonne,total_waste_recycled_tonne,total_waste_generated_tonne,recycling_rate,year
220,Ash and sludge,214800,28600.0,243400,0.12,2017
221,Plastic,763400,51800.0,815200,0.06,2017
222,Textile/Leather,141200,9600.0,150800,0.06,2017
223,"Others (stones, ceramic, rubber, etc.)",319300,7100.0,326400,0.02,2017
224,Total,2980000,4724300.0,7704300,0.61,2017


In [74]:
# Years of interest covered by this data set
years = [2015, 2016, 2017]

# Filter rows by specific years (copy so later edits do not touch df1)
filtered_df1 = df1.loc[df1['year'].isin(years)].copy()

In [75]:
# Types of materials
print("==== There are {} types of materials ====".format(filtered_df1['waste_type'].nunique()))
filtered_df1['waste_type'].unique()

==== There are 21 types of materials ====


array(['Food', 'Paper/Cardboard', 'Plastics', 'C&D',
       'Horticultural waste', 'Wood', 'Ferrous metal',
       'Non-ferrous metal', 'Used slag', 'Ash & Sludge', 'Glass',
       'Textile/Leather', 'Scrap tyres',
       'Others (stones, ceramics & rubber etc.)', 'Total',
       'Others (stones, ceramics & rubber etc)', 'Construction debris',
       'Non-ferrous metals', 'Ash and sludge', 'Plastic',
       'Others (stones, ceramic, rubber, etc.)'], dtype=object)

In [76]:
# Replace material names (reassign instead of modifying a filtered slice in place)
filtered_df1 = filtered_df1.replace({'Plastics': 'Plastic', 'Ferrous metal': 'Ferrous Metal',
                                     'Non-ferrous metal': 'Non-Ferrous Metal',
                                     'Non-ferrous metals': 'Non-Ferrous Metal'})

In [77]:
# Types of materials, after replacement
print("==== There are {} types of materials ====".format(filtered_df1['waste_type'].nunique()))
filtered_df1['waste_type'].unique()

==== There are 19 types of materials ====


array(['Food', 'Paper/Cardboard', 'Plastic', 'C&D', 'Horticultural waste',
       'Wood', 'Ferrous Metal', 'Non-Ferrous Metal', 'Used slag',
       'Ash & Sludge', 'Glass', 'Textile/Leather', 'Scrap tyres',
       'Others (stones, ceramics & rubber etc.)', 'Total',
       'Others (stones, ceramics & rubber etc)', 'Construction debris',
       'Ash and sludge', 'Others (stones, ceramic, rubber, etc.)'],
      dtype=object)
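
The manual replacements above can also be written as a reusable mapping applied only to the label column. The sketch below is a minimal illustration on a small hand-made frame; `CANONICAL_NAMES` and `normalize_waste_type` are hypothetical names that do not appear in the original notebook.

```python
import pandas as pd

# Hypothetical mapping from label variants seen in the data to one canonical label
CANONICAL_NAMES = {
    'Plastics': 'Plastic',
    'Ferrous metal': 'Ferrous Metal',
    'Non-ferrous metal': 'Non-Ferrous Metal',
    'Non-ferrous metals': 'Non-Ferrous Metal',
}

def normalize_waste_type(df, column='waste_type'):
    """Return a copy of df with label variants in `column` mapped to canonical names."""
    out = df.copy()
    # Strip stray whitespace first, then replace exact matches only
    out[column] = out[column].str.strip().replace(CANONICAL_NAMES)
    return out

# Hand-made sample, not the real data set
sample = pd.DataFrame({'waste_type': ['Plastics', 'Non-ferrous metals', 'Glass', ' Ferrous metal']})
normalized = normalize_waste_type(sample)
print(normalized['waste_type'].tolist())
```

Restricting the replacement to one column avoids accidentally rewriting matching values elsewhere in the frame, and returning a copy leaves the input untouched.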

In [78]:
# Materials list
materials = ['Glass', 'Plastic', 'Ferrous Metal', 'Non-Ferrous Metal']

# Filter rows by specific materials
final_df1 = filtered_df1.loc[filtered_df1['waste_type'].isin(materials)].sort_values(by=['year', 'waste_type'])

final_df1

Unnamed: 0,waste_type,waste_disposed_of_tonne,total_waste_recycled_tonne,total_waste_generated_tonne,recycling_rate,year
21,Ferrous Metal,15200,1333300.0,1348500,0.99,2015
25,Glass,60600,14600.0,75200,0.19,2015
22,Non-Ferrous Metal,19600,160400.0,180000,0.89,2015
17,Plastic,766800,57800.0,824600,0.07,2015
6,Ferrous Metal,6000,1351500.0,1357500,0.99,2016
10,Glass,57600,14700.0,72300,0.2,2016
7,Non-Ferrous Metal,1300,95900.0,97200,0.99,2016
2,Plastic,762700,59500.0,822200,0.07,2016
211,Ferrous Metal,7800,1371000.0,1378800,0.99,2017
218,Glass,58900,12400.0,71300,0.17,2017


In [79]:
# Last filter of columns
df1_recycled = final_df1.loc[:, ['waste_type', 'total_waste_recycled_tonne', 'year']]
df1_recycled

Unnamed: 0,waste_type,total_waste_recycled_tonne,year
21,Ferrous Metal,1333300.0,2015
25,Glass,14600.0,2015
22,Non-Ferrous Metal,160400.0,2015
17,Plastic,57800.0,2015
6,Ferrous Metal,1351500.0,2016
10,Glass,14700.0,2016
7,Non-Ferrous Metal,95900.0,2016
2,Plastic,59500.0,2016
211,Ferrous Metal,1371000.0,2017
218,Glass,12400.0,2017


Now, we are going to work with the second data set, which covers recycling between 2018 and 2019.

In [80]:
# Import data from 2018 to 2019
df2 = pd.read_csv('datasets/2018_2019_waste.csv')

df2.info()
df2.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 4 columns):
Waste Type                       30 non-null object
Total Generated ('000 tonnes)    30 non-null int64
Total Recycled ('000 tonnes)     30 non-null int64
Year                             30 non-null int64
dtypes: int64(3), object(1)
memory usage: 1.0+ KB


Unnamed: 0,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),Year
0,Construction& Demolition,1440,1434,2019
1,Ferrous Metal,1278,1270,2019
2,Paper/Cardboard,1011,449,2019
3,Plastics,930,37,2019
4,Food,7440,136,2019


In [81]:
# Replace material names
df2.replace({'Plastics': 'Plastic'}, inplace = True)

In [82]:
# Types of materials
print("==== There are {} types of materials ====".format(df2['Waste Type'].nunique()))
df2['Waste Type'].unique()

==== There are 15 types of materials ====


array(['Construction& Demolition', 'Ferrous Metal', 'Paper/Cardboard',
       'Plastic', 'Food', 'Wood', 'Horticultural', 'Ash & Sludge',
       'Textile/Leather', 'Used Slag', 'Non-Ferrous Metal', 'Glass',
       'Scrap Tyres', 'Others (stones, ceramic, rubber, ect)', 'Overall'],
      dtype=object)

In [83]:
# Filter rows by specific materials
final_df2 = df2.loc[df2['Waste Type'].isin(materials)].sort_values(by='Year')

final_df2

Unnamed: 0,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),Year
16,Ferrous Metal,1269,126,2018
18,Plastic,949,41,2018
25,Non-Ferrous Metal,171,170,2018
26,Glass,64,12,2018
1,Ferrous Metal,1278,1270,2019
3,Plastic,930,37,2019
10,Non-Ferrous Metal,126,124,2019
11,Glass,75,11,2019


In [84]:
# Select columns of interest
df2_recycled = final_df2.loc[:, ['Waste Type', "Total Recycled ('000 tonnes)", 'Year']]

df2_recycled

Unnamed: 0,Waste Type,Total Recycled ('000 tonnes),Year
16,Ferrous Metal,126,2018
18,Plastic,41,2018
25,Non-Ferrous Metal,170,2018
26,Glass,12,2018
1,Ferrous Metal,1270,2019
3,Plastic,37,2019
10,Non-Ferrous Metal,124,2019
11,Glass,11,2019


In [85]:
# Convert total recycled from thousands of tonnes to tonnes

df2_recycled = df2_recycled.copy()  # work on a copy to avoid SettingWithCopyWarning
df2_recycled["Total Recycled ('000 tonnes)"] = df2_recycled["Total Recycled ('000 tonnes)"] * 1000

df2_recycled

Unnamed: 0,Waste Type,Total Recycled ('000 tonnes),Year
16,Ferrous Metal,126000,2018
18,Plastic,41000,2018
25,Non-Ferrous Metal,170000,2018
26,Glass,12000,2018
1,Ferrous Metal,1270000,2019
3,Plastic,37000,2019
10,Non-Ferrous Metal,124000,2019
11,Glass,11000,2019


In [86]:
# Change column names and sort by year and materials
df2_recycled = df2_recycled.rename(columns = {'Waste Type': 'waste_type',
                                              "Total Recycled ('000 tonnes)": 'total_waste_recycled_tonne',
                                              'Year': 'year'}).sort_values(by=['year', 'waste_type'])
df2_recycled

Unnamed: 0,waste_type,total_waste_recycled_tonne,year
16,Ferrous Metal,126000,2018
26,Glass,12000,2018
25,Non-Ferrous Metal,170000,2018
18,Plastic,41000,2018
1,Ferrous Metal,1270000,2019
11,Glass,11000,2019
10,Non-Ferrous Metal,124000,2019
3,Plastic,37000,2019
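
The unit conversion and renaming steps above could also be bundled into one helper, so the 2018-2019 table always arrives in the same schema as the 2003-2017 one. This is only a sketch on hand-made rows; `to_tonne_schema` is a hypothetical name, and the column labels are assumed to match those shown in `df2.info()`.

```python
import pandas as pd

def to_tonne_schema(df):
    """Rename the 2018-2019 columns to the 2003-2017 names and convert '000 tonnes to tonnes."""
    out = df.rename(columns={
        'Waste Type': 'waste_type',
        "Total Recycled ('000 tonnes)": 'total_waste_recycled_tonne',
        'Year': 'year',
    })
    # rename() returned a new frame, so this assignment does not touch the input
    out['total_waste_recycled_tonne'] = out['total_waste_recycled_tonne'] * 1000
    return out[['waste_type', 'total_waste_recycled_tonne', 'year']]

# Two rows copied from the 2019 output above
sample = pd.DataFrame({
    'Waste Type': ['Glass', 'Plastic'],
    "Total Recycled ('000 tonnes)": [11, 37],
    'Year': [2019, 2019],
})
converted = to_tonne_schema(sample)
print(converted)
```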


In [87]:
# Combine the data sets (DataFrame.append was removed in pandas 2.0; use pd.concat)
df_recycled = pd.concat([df1_recycled, df2_recycled], ignore_index = True)

df_recycled

Unnamed: 0,waste_type,total_waste_recycled_tonne,year
0,Ferrous Metal,1333300.0,2015
1,Glass,14600.0,2015
2,Non-Ferrous Metal,160400.0,2015
3,Plastic,57800.0,2015
4,Ferrous Metal,1351500.0,2016
5,Glass,14700.0,2016
6,Non-Ferrous Metal,95900.0,2016
7,Plastic,59500.0,2016
8,Ferrous Metal,1371000.0,2017
9,Glass,12400.0,2017
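
After combining tables from different sources, it can be worth checking that every year and material pair is present, since a label that was not normalized would silently drop out of the filter. The sketch below uses a hand-made frame with rows deliberately left out; `missing_pairs` is a hypothetical helper, not part of the original notebook.

```python
from itertools import product

import pandas as pd

def missing_pairs(df, years, materials):
    """Return the (year, waste_type) pairs expected but absent from df, sorted."""
    expected = set(product(years, materials))
    present = set(zip(df['year'], df['waste_type']))
    return sorted(expected - present)

materials = ['Glass', 'Plastic', 'Ferrous Metal', 'Non-Ferrous Metal']

# Hand-made sample: 2015 is complete, 2017 deliberately lacks two materials
sample = pd.DataFrame({
    'waste_type': ['Ferrous Metal', 'Glass', 'Non-Ferrous Metal', 'Plastic',
                   'Ferrous Metal', 'Glass'],
    'year': [2015, 2015, 2015, 2015, 2017, 2017],
})
gaps = missing_pairs(sample, [2015, 2017], materials)
print(gaps)
```

An empty list means every combination is covered; anything else points at a year and material to inspect before aggregating.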


Next, extract the energy saved per tonne of each material from the third data set.

In [88]:
# Import data
df3 = pd.read_csv('datasets/energy_saved.csv')
df3

Unnamed: 0,The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram) per waste type,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5
0,1 barrel oil is approximately 159 litres of oil,,,,,
1,,,,,,
2,material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
3,energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
4,crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


In [89]:
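# Reshape the dataframe: keep the 'material' and 'energy_saved' rows and transpose them into columns
material_energy = df3.loc[2:3, :].T.reset_index(drop = True)
# Drop the first row, which holds the row labels rather than data
material_energy = material_energy.iloc[1:]
material_energy.columns = ['waste_type', 'energy_saved']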

material_energy

Unnamed: 0,waste_type,energy_saved
1,Plastic,5774 Kwh
2,Glass,42 Kwh
3,Ferrous Metal,642 Kwh
4,Non-Ferrous Metal,14000 Kwh
5,Paper,4000 kWh
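
An alternative way to express the same reshape is to treat the first cell of each row as its label, set it as the index, and transpose. The sketch below rebuilds the two relevant rows by hand rather than reading the CSV, so the variable names `raw` and `tidy` are illustrative assumptions.

```python
import pandas as pd

# Hand-built frame mimicking rows 2 and 3 of df3 as displayed above
raw = pd.DataFrame([
    ['material', 'Plastic', 'Glass', 'Ferrous Metal', 'Non-Ferrous Metal', 'Paper'],
    ['energy_saved', '5774 Kwh', '42 Kwh', '642 Kwh', '14000 Kwh', '4000 kWh'],
])

# Use the first column as row labels, then transpose so each label becomes a column
tidy = raw.set_index(0).T.reset_index(drop=True)
tidy.columns.name = None
tidy = tidy.rename(columns={'material': 'waste_type'})
print(tidy)
```

Because the labels come from the data itself, this version does not rely on dropping a row by position.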


In [90]:
# Extract the number in energy saved
material_energy['energy_saved'] = material_energy['energy_saved'].str.split(' ').str[0]

material_energy

Unnamed: 0,waste_type,energy_saved
1,Plastic,5774
2,Glass,42
3,Ferrous Metal,642
4,Non-Ferrous Metal,14000
5,Paper,4000


In [91]:
# Convert to integer
material_energy['energy_saved'] = material_energy['energy_saved'].astype(int)

material_energy.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 1 to 5
Data columns (total 2 columns):
waste_type      5 non-null object
energy_saved    5 non-null int64
dtypes: int64(1), object(1)
memory usage: 164.0+ bytes
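
Splitting on a space works for the values shown, but it would pass through any malformed entry unchanged. As a hedged alternative, the sketch below uses a regular expression that accepts both `Kwh` and `kWh` and turns anything else into a missing value; the series `raw` is a hand-made copy of the values displayed above.

```python
import re

import pandas as pd

# Values copied from the energy_saved row shown above
raw = pd.Series(['5774 Kwh', '42 Kwh', '642 Kwh', '14000 Kwh', '4000 kWh'])

# Capture the number; match the unit case-insensitively
pattern = r'^\s*(\d+(?:\.\d+)?)\s*kwh\s*$'
parsed = raw.str.extract(pattern, flags=re.IGNORECASE)[0]

# Entries that do not match the pattern become NaN instead of raising
energy = pd.to_numeric(parsed, errors='coerce')
print(energy.tolist())
```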


Now it's time to merge the recycled tonnage with the energy saved per tonne.

In [92]:
# Merge
recycled_energy = df_recycled.merge(material_energy, on = ['waste_type'], how = 'left')

recycled_energy

Unnamed: 0,waste_type,total_waste_recycled_tonne,year,energy_saved
0,Ferrous Metal,1333300.0,2015,642
1,Glass,14600.0,2015,42
2,Non-Ferrous Metal,160400.0,2015,14000
3,Plastic,57800.0,2015,5774
4,Ferrous Metal,1351500.0,2016,642
5,Glass,14700.0,2016,42
6,Non-Ferrous Metal,95900.0,2016,14000
7,Plastic,59500.0,2016,5774
8,Ferrous Metal,1371000.0,2017,642
9,Glass,12400.0,2017,42
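
A left merge keeps rows whose material has no match and fills `energy_saved` with `NaN`, which would then drop out of the yearly sums. The sketch below shows one way to surface such rows using the `validate` and `indicator` parameters of `DataFrame.merge`; the frames are hand-made and the `'Plastics'` label is deliberately left unnormalized for illustration.

```python
import pandas as pd

# Hand-made sample; 'Plastics' is a deliberately unnormalized label
recycled = pd.DataFrame({
    'waste_type': ['Glass', 'Plastic', 'Plastics'],
    'total_waste_recycled_tonne': [14600.0, 57800.0, 59500.0],
    'year': [2015, 2015, 2016],
})
energy = pd.DataFrame({'waste_type': ['Plastic', 'Glass'], 'energy_saved': [5774, 42]})

# validate raises if a material appears twice in `energy`;
# indicator adds a '_merge' column showing where each row came from
merged = recycled.merge(energy, on='waste_type', how='left',
                        validate='many_to_one', indicator=True)

unmatched = merged.loc[merged['_merge'] == 'left_only', 'waste_type'].tolist()
print(unmatched)
```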


In [93]:
# Create total_energy_saved column (kWh) = tonnes recycled * kWh saved per tonne
recycled_energy['total_energy_saved'] = recycled_energy['total_waste_recycled_tonne'] * recycled_energy['energy_saved']

recycled_energy

Unnamed: 0,waste_type,total_waste_recycled_tonne,year,energy_saved,total_energy_saved
0,Ferrous Metal,1333300.0,2015,642,855978600.0
1,Glass,14600.0,2015,42,613200.0
2,Non-Ferrous Metal,160400.0,2015,14000,2245600000.0
3,Plastic,57800.0,2015,5774,333737200.0
4,Ferrous Metal,1351500.0,2016,642,867663000.0
5,Glass,14700.0,2016,42,617400.0
6,Non-Ferrous Metal,95900.0,2016,14000,1342600000.0
7,Plastic,59500.0,2016,5774,343553000.0
8,Ferrous Metal,1371000.0,2017,642,880182000.0
9,Glass,12400.0,2017,42,520800.0


In [94]:
# Total amount of energy saved per year
recycled_energy_year = recycled_energy[['year', 'total_energy_saved']].groupby('year').sum().reset_index()

recycled_energy_year

Unnamed: 0,year,total_energy_saved
0,2015,3435929000.0
1,2016,2554433000.0
2,2017,2470596000.0
3,2018,2698130000.0
4,2019,2765440000.0


In [95]:
# Set year as index
annual_energy_savings = recycled_energy_year.set_index('year')

annual_energy_savings

Unnamed: 0_level_0,total_energy_saved
year,Unnamed: 1_level_1
2015,3435929000.0
2016,2554433000.0
2017,2470596000.0
2018,2698130000.0
2019,2765440000.0
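
As a sanity check, the 2019 figure can be recomputed by hand from the tonnages and the per-tonne energy factors shown earlier. The snippet below is a small standalone verification using plain dictionaries; the variable names are illustrative.

```python
# Tonnes recycled in 2019, taken from the converted table above
tonnes_2019 = {
    'Ferrous Metal': 1270000,
    'Glass': 11000,
    'Non-Ferrous Metal': 124000,
    'Plastic': 37000,
}

# kWh saved per tonne, taken from energy_saved.csv as displayed above
kwh_per_tonne = {
    'Ferrous Metal': 642,
    'Glass': 42,
    'Non-Ferrous Metal': 14000,
    'Plastic': 5774,
}

# Sum of tonnes * kWh per tonne across the four materials
total_2019 = sum(tonnes_2019[m] * kwh_per_tonne[m] for m in tonnes_2019)
print(total_2019)  # 2765440000
```

The result agrees with the 2019 row of `annual_energy_savings`.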
