## Energy saved from recycling
<p>Did you know that recycling saves energy by reducing or eliminating the need to make materials from scratch? For example, aluminum can manufacturers can skip the energy-costly process of producing aluminum from ore by cleaning and melting recycled cans. Aluminum is classified as a non-ferrous metal.</p>
<p>Singapore has an ambitious goal of becoming a zero-waste nation. The amount of waste disposed of in Singapore has increased seven-fold over the last 40 years. At this rate, Semakau Landfill, Singapore’s only landfill, will run out of space by 2035. Making matters worse, Singapore has limited land for building new incineration plants or landfills.</p>
<p>The government would like to motivate citizens by sharing the total energy that the combined recycling efforts have saved every year. They have asked you to help them.</p>
<p>You have been provided with three datasets. The data come from different teams, so the names of waste types may differ.</p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:16px"><b>datasets/wastestats.csv - Recycling statistics per waste type for the period 2003 to 2017</b>
    </div>
    <div>Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>waste_type: </b>The type of waste recycled.</li>
    <li><b>waste_disposed_of_tonne: </b>The amount of waste that could not be recycled (in metric tonnes).</li>
    <li><b>total_waste_recycled_tonne: </b>The amount of waste that could be recycled (in metric tonnes).</li>
    <li><b>total_waste_generated_tonne: </b>The total amount of waste collected before recycling (in metric tonnes).</li>
    <li><b>recycling_rate: </b>The amount of waste recycled per tonne of waste generated.</li>
    <li><b>year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/2018_2019_waste.csv - Recycling statistics per waste type for the period 2018 to 2019</b>
    </div>
    <div> Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>Waste Type: </b>The type of waste recycled.</li>
    <li><b>Total Generated: </b>The total amount of waste collected before recycling (in thousands of metric tonnes).</li> 
    <li><b>Total Recycled: </b>The amount of waste that could be recycled (in thousands of metric tonnes).</li>
    <li><b>Year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/energy_saved.csv - Estimates of the energy saved per waste type in kWh</b>
    </div>
<ul>
    <li><b>material: </b>The type of waste recycled.</li>
    <li><b>energy_saved: </b>An estimate of the energy saved (in kilowatt-hours) by recycling a metric tonne of waste.</li>
    <li><b>crude_oil_saved: </b>An estimate of the number of barrels of oil saved by recycling a metric tonne of waste.</li>
</ul>

</div>

In [2]:
import pandas as pd 
import numpy as np

In [3]:
waste17 = pd.read_csv("datasets/wastestats.csv")
waste19 = pd.read_csv("datasets/2018_2019_waste.csv")
energy = pd.read_csv("datasets/energy_saved.csv")

# We'll start by wrangling our first dataset 

In [4]:
waste17.head()

Unnamed: 0,waste_type,waste_disposed_of_tonne,total_waste_recycled_tonne,total_waste_generated_tonne,recycling_rate,year
0,Food,679900,111100.0,791000,0.14,2016
1,Paper/Cardboard,576000,607100.0,1183100,0.51,2016
2,Plastics,762700,59500.0,822200,0.07,2016
3,C&D,9700,1585700.0,1595400,0.99,2016
4,Horticultural waste,111500,209000.0,320500,0.65,2016


In [5]:
waste17.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225 entries, 0 to 224
Data columns (total 6 columns):
waste_type                     225 non-null object
waste_disposed_of_tonne        225 non-null int64
total_waste_recycled_tonne     225 non-null float64
total_waste_generated_tonne    225 non-null int64
recycling_rate                 225 non-null float64
year                           225 non-null int64
dtypes: float64(2), int64(3), object(1)
memory usage: 10.6+ KB


In [6]:
waste17.drop(['waste_disposed_of_tonne', 'total_waste_generated_tonne', 'recycling_rate'], axis=1, inplace = True)

In [7]:
waste17.head()

Unnamed: 0,waste_type,total_waste_recycled_tonne,year
0,Food,111100.0,2016
1,Paper/Cardboard,607100.0,2016
2,Plastics,59500.0,2016
3,C&D,1585700.0,2016
4,Horticultural waste,209000.0,2016


In [8]:
waste17 = waste17[['waste_type', 'year', 'total_waste_recycled_tonne']] 
waste17.head()

Unnamed: 0,waste_type,year,total_waste_recycled_tonne
0,Food,2016,111100.0
1,Paper/Cardboard,2016,607100.0
2,Plastics,2016,59500.0
3,C&D,2016,1585700.0
4,Horticultural waste,2016,209000.0


In [9]:
waste17.columns = ['Waste_type', 'Year', 'Total_recycled']

In [10]:
# Keep only the four materials that appear in the energy table
x = waste17[waste17['Waste_type'].isin(['Plastics', 'Glass', 'Ferrous Metal', 'Non-Ferrous Metal'])]
x

Unnamed: 0,Waste_type,Year,Total_recycled
2,Plastics,2016,59500.0
10,Glass,2016,14700.0
17,Plastics,2015,57800.0
25,Glass,2015,14600.0
32,Plastics,2014,80000.0
40,Glass,2014,15700.0
47,Plastics,2013,91100.0
51,Ferrous Metal,2013,1369200.0
55,Glass,2013,14600.0
62,Plastics,2012,82100.0


In [11]:
x.reset_index(drop=True)

Unnamed: 0,Waste_type,Year,Total_recycled
0,Plastics,2016,59500.0
1,Glass,2016,14700.0
2,Plastics,2015,57800.0
3,Glass,2015,14600.0
4,Plastics,2014,80000.0
5,Glass,2014,15700.0
6,Plastics,2013,91100.0
7,Ferrous Metal,2013,1369200.0
8,Glass,2013,14600.0
9,Plastics,2012,82100.0


In [12]:
x.shape

(36, 3)
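
The filter above matches exact spellings only. Since the intro warns that waste type names may differ between teams, a variant such as "Ferrous metal" would be dropped silently. Below is a minimal sketch of a more forgiving filter. The sample rows, the variant spellings, and the helper name `filter_materials` are all made up for illustration and are not taken from the real file.

```python
import pandas as pd

# Hypothetical rows: the variant spellings are invented to show the problem.
sample = pd.DataFrame({
    "Waste_type": ["Plastics", "Glass", "Ferrous metal", "Non-Ferrous Metal ", "Food"],
    "Year": [2016, 2016, 2016, 2016, 2016],
})

# Target materials, written in lower case for comparison.
TARGETS = {"plastics", "glass", "ferrous metal", "non-ferrous metal"}

def filter_materials(df, targets=TARGETS):
    """Keep rows whose waste type matches a target, ignoring case and outer spaces."""
    key = df["Waste_type"].str.strip().str.lower()
    return df[key.isin(targets)].reset_index(drop=True)

filtered = filter_materials(sample)
print(filtered)
```

Comparing `waste17['Waste_type'].unique()` against the target list first is a cheap way to see whether any spellings actually differ before changing the filter.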

# Let's move on to our second dataset

In [13]:
waste19.head()

Unnamed: 0,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),Year
0,Construction& Demolition,1440,1434,2019
1,Ferrous Metal,1278,1270,2019
2,Paper/Cardboard,1011,449,2019
3,Plastics,930,37,2019
4,Food,7440,136,2019


In [14]:
waste19.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 4 columns):
Waste Type                       30 non-null object
Total Generated ('000 tonnes)    30 non-null int64
Total Recycled ('000 tonnes)     30 non-null int64
Year                             30 non-null int64
dtypes: int64(3), object(1)
memory usage: 1.0+ KB


In [15]:
waste19.columns = ['Waste_type', 'Total_generated', 'Total_recycled', 'Year']

In [16]:
waste19.drop('Total_generated', axis =1, inplace = True )

In [17]:
waste19.Total_recycled = waste19.Total_recycled * 1000

In [18]:
waste19.head()

Unnamed: 0,Waste_type,Total_recycled,Year
0,Construction& Demolition,1434000,2019
1,Ferrous Metal,1270000,2019
2,Paper/Cardboard,449000,2019
3,Plastics,37000,2019
4,Food,136000,2019
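
The rename, drop, and unit conversion above could also be written as one chain that names columns explicitly instead of relying on their position. This is only a sketch of that alternative. It uses two rows copied from the `head()` output above, and the variable names `raw` and `tidy` are placeholders.

```python
import pandas as pd

# Two rows copied from the head() output of the 2018-2019 file.
raw = pd.DataFrame({
    "Waste Type": ["Ferrous Metal", "Plastics"],
    "Total Generated ('000 tonnes)": [1278, 930],
    "Total Recycled ('000 tonnes)": [1270, 37],
    "Year": [2019, 2019],
})

tidy = (
    raw.rename(columns={
        "Waste Type": "Waste_type",
        "Total Recycled ('000 tonnes)": "Total_recycled",
    })
    .drop(columns=["Total Generated ('000 tonnes)"])
    # The source reports thousands of tonnes; convert to tonnes to match the 2003-2017 file.
    .assign(Total_recycled=lambda d: d["Total_recycled"] * 1000)
)
print(tidy)
```

Renaming by mapping fails loudly if a column is missing from the drop list, whereas assigning a list to `.columns` will mislabel data without complaint if the file layout ever changes.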


In [19]:
# Same four materials as for the first dataset
y = waste19[waste19['Waste_type'].isin(['Plastics', 'Glass', 'Ferrous Metal', 'Non-Ferrous Metal'])]
y

Unnamed: 0,Waste_type,Total_recycled,Year
1,Ferrous Metal,1270000,2019
3,Plastics,37000,2019
10,Non-Ferrous Metal,124000,2019
11,Glass,11000,2019
16,Ferrous Metal,126000,2018
18,Plastics,41000,2018
25,Non-Ferrous Metal,170000,2018
26,Glass,12000,2018


In [20]:
# reset_index returns a new DataFrame, so assign it back to keep the result
y = y.reset_index(drop=True)
y.shape

(8, 3)

> Now that both datasets are wrangled, it's time to combine them and put them in order

In [21]:
waste = pd.concat([x, y], join = 'outer', ignore_index = True)
waste.head()

Unnamed: 0,Total_recycled,Waste_type,Year
0,59500.0,Plastics,2016
1,14700.0,Glass,2016
2,57800.0,Plastics,2015
3,14600.0,Glass,2015
4,80000.0,Plastics,2014
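
`pd.concat` stacks the two frames by column name, so the different column order in `x` and `y` does not matter, but it does not sort the rows. If ordered rows are wanted, a sort can be chained on. A small sketch with a few rows copied from the outputs above (`x_part`, `y_part`, and `combined` are placeholder names):

```python
import pandas as pd

# A few rows from each wrangled dataset, with columns in their original order.
x_part = pd.DataFrame({
    "Waste_type": ["Plastics", "Glass"],
    "Year": [2016, 2016],
    "Total_recycled": [59500.0, 14700.0],
})
y_part = pd.DataFrame({
    "Waste_type": ["Ferrous Metal", "Plastics"],
    "Total_recycled": [1270000, 37000],
    "Year": [2019, 2019],
})

combined = (
    pd.concat([x_part, y_part], ignore_index=True)  # columns are aligned by name
    .sort_values(["Year", "Waste_type"])            # order rows by year, then material
    .reset_index(drop=True)
)
print(combined)
```

Mixing the float column from the first file with the integer column from the second yields a float column, which is why `Total_recycled` shows as `float64` after the concat.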


In [22]:
waste.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 44 entries, 0 to 43
Data columns (total 3 columns):
Total_recycled    44 non-null float64
Waste_type        44 non-null object
Year              44 non-null int64
dtypes: float64(1), int64(1), object(1)
memory usage: 1.1+ KB


In [23]:
waste['Year'] = waste.Year.astype('int64')

In [24]:
waste = waste[waste.Year > 2014]

In [25]:
waste.shape

(13, 3)

In [26]:
waste

Unnamed: 0,Total_recycled,Waste_type,Year
0,59500.0,Plastics,2016
1,14700.0,Glass,2016
2,57800.0,Plastics,2015
3,14600.0,Glass,2015
35,12400.0,Glass,2017
36,1270000.0,Ferrous Metal,2019
37,37000.0,Plastics,2019
38,124000.0,Non-Ferrous Metal,2019
39,11000.0,Glass,2019
40,126000.0,Ferrous Metal,2018


In [27]:
waste = waste.groupby(['Year','Waste_type'])[ 'Total_recycled'].sum()


In [28]:
waste = pd.DataFrame(waste)

In [29]:
waste

Unnamed: 0_level_0,Unnamed: 1_level_0,Total_recycled
Year,Waste_type,Unnamed: 2_level_1
2015,Glass,14600.0
2015,Plastics,57800.0
2016,Glass,14700.0
2016,Plastics,59500.0
2017,Glass,12400.0
2018,Ferrous Metal,126000.0
2018,Glass,12000.0
2018,Non-Ferrous Metal,170000.0
2018,Plastics,41000.0
2019,Ferrous Metal,1270000.0
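
The grouped result above has `Year` and `Waste_type` in a MultiIndex. For a later merge on `Waste_type` it may be handier to keep them as ordinary columns, which `as_index=False` does in one step. A sketch using four rows copied from the combined table (`rows` and `totals` are placeholder names):

```python
import pandas as pd

# Rows copied from the combined waste table above.
rows = pd.DataFrame({
    "Total_recycled": [59500.0, 14700.0, 57800.0, 14600.0],
    "Waste_type": ["Plastics", "Glass", "Plastics", "Glass"],
    "Year": [2016, 2016, 2015, 2015],
})

# as_index=False keeps the group keys as regular columns instead of a MultiIndex.
totals = rows.groupby(["Year", "Waste_type"], as_index=False)["Total_recycled"].sum()
print(totals)
```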


# energy_saved dataset

In [30]:
energy.head()

Unnamed: 0,The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram) per waste type,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5
0,1 barrel oil is approximately 159 litres of oil,,,,,
1,,,,,,
2,material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
3,energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
4,crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


In [31]:
energy.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 6 columns):
The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram)  per waste type    4 non-null object
Unnamed: 1                                                                                                                                                             3 non-null object
Unnamed: 2                                                                                                                                                             2 non-null object
Unnamed: 3                                                                                                                                                             3 non-null object
Unnamed: 4                                                                                                                                                      

In [32]:
energy.columns = ['0', '1', '2', '3', '4', '5']

In [33]:
energy.head()

Unnamed: 0,0,1,2,3,4,5
0,1 barrel oil is approximately 159 litres of oil,,,,,
1,,,,,,
2,material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
3,energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
4,crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


In [34]:
energy.set_index('0', inplace =True)
energy.head()

Unnamed: 0_level_0,1,2,3,4,5
0,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1 barrel oil is approximately 159 litres of oil,,,,,
,,,,,
material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


In [35]:
energy = energy.T
energy.head()

0,1 barrel oil is approximately 159 litres of oil,nan,material,energy_saved,crude_oil saved
1,,,Plastic,5774 Kwh,16 barrels
2,,,Glass,42 Kwh,
3,,,Ferrous Metal,642 Kwh,1.8 barrels
4,,,Non-Ferrous Metal,14000 Kwh,40 barrels
5,,,Paper,4000 kWh,1.7 barrels
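
The rename, `set_index`, and transpose steps above could probably be folded into the read itself by skipping the title and note rows. The sketch below assumes the file is laid out exactly as the `head()` output suggests (one title row, one note row, one empty row, then three data rows). The inline text is a stand-in for the real file, with the title shortened.

```python
import io
import pandas as pd

# Stand-in for datasets/energy_saved.csv; the layout is inferred from the output above.
csv_text = """Title row (shortened here),,,,,
1 barrel oil is approximately 159 litres of oil,,,,,
,,,,,
material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels
"""

energy_alt = (
    pd.read_csv(io.StringIO(csv_text), skiprows=3, header=None, index_col=0)
    .T  # materials become rows, measures become columns
)
energy_alt.columns = ["Waste_type", "Energy_saved", "Oil_saved"]
print(energy_alt)
```

If the real file has a different number of leading rows, `skiprows` would need adjusting, so it is worth checking the raw file before relying on this.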


In [36]:
energy.columns = ['1 barrel oil is approximately 159 litres of oil', 'nan', 'Waste_type', 'Energy_saved', 'Oil_saved']

In [37]:
energy.drop(['1 barrel oil is approximately 159 litres of oil', 'nan'], axis=1, inplace=True)

In [38]:
energy.head()

Unnamed: 0,Waste_type,Energy_saved,Oil_saved
1,Plastic,5774 Kwh,16 barrels
2,Glass,42 Kwh,
3,Ferrous Metal,642 Kwh,1.8 barrels
4,Non-Ferrous Metal,14000 Kwh,40 barrels
5,Paper,4000 kWh,1.7 barrels


In [39]:
# .copy() makes the filtered frame independent, so the in-place drop below does not warn
energy = energy[energy['Waste_type'].isin(['Plastic', 'Glass', 'Ferrous Metal', 'Non-Ferrous Metal'])].copy()
energy

Unnamed: 0,Waste_type,Energy_saved,Oil_saved
1,Plastic,5774 Kwh,16 barrels
2,Glass,42 Kwh,
3,Ferrous Metal,642 Kwh,1.8 barrels
4,Non-Ferrous Metal,14000 Kwh,40 barrels


In [40]:
energy.drop("Oil_saved",axis=1, inplace = True)

In [41]:
energy

Unnamed: 0,Waste_type,Energy_saved
1,Plastic,5774 Kwh
2,Glass,42 Kwh
3,Ferrous Metal,642 Kwh
4,Non-Ferrous Metal,14000 Kwh
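
`Energy_saved` is still text ("5774 Kwh"), so it cannot be multiplied by tonnages yet. One way to pull out the number is a regular expression, sketched below with the four rows shown above. The new column name `Energy_saved_kwh` and the frame name `energy_rates` are placeholders.

```python
import pandas as pd

# Rows copied from the energy table above.
energy_rates = pd.DataFrame({
    "Waste_type": ["Plastic", "Glass", "Ferrous Metal", "Non-Ferrous Metal"],
    "Energy_saved": ["5774 Kwh", "42 Kwh", "642 Kwh", "14000 Kwh"],
})

# Extract the leading number (with optional decimals) and convert it to float.
energy_rates["Energy_saved_kwh"] = (
    energy_rates["Energy_saved"]
    .str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    .astype(float)
)
print(energy_rates)
```

Matching only the digits sidesteps the inconsistent unit spelling ("Kwh" versus "kWh") in the source file.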


# Combining the waste dataset with the energy dataset

In [42]:
waste

Unnamed: 0_level_0,Unnamed: 1_level_0,Total_recycled
Year,Waste_type,Unnamed: 2_level_1
2015,Glass,14600.0
2015,Plastics,57800.0
2016,Glass,14700.0
2016,Plastics,59500.0
2017,Glass,12400.0
2018,Ferrous Metal,126000.0
2018,Glass,12000.0
2018,Non-Ferrous Metal,170000.0
2018,Plastics,41000.0
2019,Ferrous Metal,1270000.0


In [43]:
energy

Unnamed: 0,Waste_type,Energy_saved
1,Plastic,5774 Kwh
2,Glass,42 Kwh
3,Ferrous Metal,642 Kwh
4,Non-Ferrous Metal,14000 Kwh


> We can drop the columns that will not be used
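
Note that the two tables spell one material differently ("Plastics" in the waste data, "Plastic" in the energy data), so a direct merge on `Waste_type` would leave plastics unmatched. The sketch below shows one possible way to line the names up and compute energy saved per year. It assumes the energy figures have already been converted to numbers in a column called `Energy_saved_kwh` (a hypothetical name), and it uses only four rows from the grouped table above, so its totals are illustrative and not the final answer.

```python
import pandas as pd

# Four rows copied from the grouped waste table above (plastics and glass only).
waste_rows = pd.DataFrame({
    "Year": [2015, 2015, 2016, 2016],
    "Waste_type": ["Glass", "Plastics", "Glass", "Plastics"],
    "Total_recycled": [14600.0, 57800.0, 14700.0, 59500.0],
})
# Energy saved per tonne, as numbers (kWh); values taken from the energy table above.
rates = pd.DataFrame({
    "Waste_type": ["Plastic", "Glass", "Ferrous Metal", "Non-Ferrous Metal"],
    "Energy_saved_kwh": [5774.0, 42.0, 642.0, 14000.0],
})

# Align the one spelling that differs between the two tables.
waste_rows["Waste_type"] = waste_rows["Waste_type"].replace({"Plastics": "Plastic"})

# Left merge keeps every waste row; validate guards against duplicate rate rows.
merged = waste_rows.merge(rates, on="Waste_type", how="left", validate="many_to_one")
merged["Energy_kwh"] = merged["Total_recycled"] * merged["Energy_saved_kwh"]

annual = merged.groupby("Year")["Energy_kwh"].sum()
print(annual)
```

Checking `merged["Energy_saved_kwh"].isna().any()` after the merge is a quick way to catch any waste type that still failed to match.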