## Energy saved from recycling
<p>Did you know that recycling saves energy by reducing or eliminating the need to make materials from scratch? For example, aluminum can manufacturers can skip the energy-costly process of producing aluminum from ore by cleaning and melting recycled cans. Aluminum is classified as a non-ferrous metal.</p>
<p>Singapore has an ambitious goal of becoming a zero-waste nation. The amount of waste disposed of in Singapore has increased seven-fold over the last 40 years. At this rate, Semakau Landfill, Singapore’s only landfill, will run out of space by 2035. Making matters worse, Singapore has limited land for building new incineration plants or landfills.</p>
<p>The government would like to motivate citizens by sharing the total energy that the combined recycling efforts have saved every year. They have asked you to help them.</p>
<p>You have been provided with three datasets. The data come from different teams, so the names of waste types may differ.</p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:16px"><b>datasets/wastestats.csv - Recycling statistics per waste type for the period 2003 to 2017</b>
    </div>
    <div>Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>waste_type: </b>The type of waste recycled.</li>
    <li><b>waste_disposed_of_tonne: </b>The amount of waste that could not be recycled (in metric tonnes).</li>
    <li><b>total_waste_recycled_tonne: </b>The amount of waste that could be recycled (in metric tonnes).</li>
    <li><b>total_waste_generated_tonne: </b>The total amount of waste collected before recycling (in metric tonnes).</li>
    <li><b>recycling_rate: </b>The amount of waste recycled per tonne of waste generated.</li>
    <li><b>year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/2018_2019_waste.csv - Recycling statistics per waste type for the period 2018 to 2019</b>
    </div>
    <div> Source: <a href="https://www.nea.gov.sg/our-services/waste-management/waste-statistics-and-overall-recycling">Singapore National Environment Agency</a></div>
<ul>
    <li><b>Waste Type: </b>The type of waste recycled.</li>
    <li><b>Total Generated: </b>The total amount of waste collected before recycling (in thousands of metric tonnes).</li> 
    <li><b>Total Recycled: </b>The amount of waste that could be recycled. (in thousands of metric tonnes).</li>
    <li><b>Year: </b>The recycling year.</li>
</ul>
    </div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6; margin-top: 17px;">
    <div style="font-size:16px"><b>datasets/energy_saved.csv -  Estimations of the amount of energy saved per waste type in kWh</b>
    </div>
<ul>
    <li><b>material: </b>The type of waste recycled.</li>
    <li><b>energy_saved: </b>An estimate of the energy saved (in kiloWatt hour) by recycling a metric tonne of waste.</li> 
    <li><b>crude_oil_saved: </b>An estimate of the number of barrels of oil saved by recycling a metric tonne of waste.</li>
</ul>

</div>

# Instructions
The Singapore government has asked you to help them determine how much energy they have saved per year by recycling. You need to answer the following question:

How much energy in kiloWatt hour (kWh) has Singapore saved per year by recycling glass, plastic, ferrous, and non-ferrous metals between 2015 and 2019?

Save your answer as a DataFrame named annual_energy_savings with an index labelled year. Your DataFrame should consist of one column, total_energy_saved, which contains the total amount of energy in kWh saved per year across the four materials described above. It should resemble the following table:

| year | total_energy_saved |
|------|--------------------|
| 2015 | xxxx |
| 2016 | xxxx |
| 2017 | xxxx |
| 2018 | xxxx |
| 2019 | xxxx |

Note: Unlike the Guided and Unguided Projects that exist on our platform, if you get stuck in this task, you will not have access to any hints, nor will you be able to request a solution. Similarly, our testing process is focused on your answers and will not provide feedback to help you towards your solution. All steps required, including importing, exploration, cleaning, and analysis, will be up to you!

# Project: Energy Savings: Python Certification

In [None]:
#List files ONLY in the current directory
import os

files = [f for f in os.listdir('.') if os.path.isfile(f)]
print(files)

['.profile', '.bashrc', '.bash_logout', '.startup.py']


In [None]:
#Import packages
import pandas as pd

In [None]:
#Read dataframes
df_recycle_before_18 = pd.read_csv('datasets/wastestats.csv', index_col='year')
df_recycle_from_18 = pd.read_csv('datasets/2018_2019_waste.csv', index_col='Year')
df_energy_saved = pd.read_csv('datasets/energy_saved.csv')

## Understand columns and data types in each dataframe

In [None]:
df_recycle_before_18.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 225 entries, 2016 to 2017
Data columns (total 5 columns):
waste_type                     225 non-null object
waste_disposed_of_tonne        225 non-null int64
total_waste_recycled_tonne     225 non-null float64
total_waste_generated_tonne    225 non-null int64
recycling_rate                 225 non-null float64
dtypes: float64(2), int64(2), object(1)
memory usage: 10.5+ KB


No NULL values - a good sign  
waste_type - 225 non-null object - string  
  
The other data types look alright  
-waste_disposed_of_tonne - 225 non-null int64  
-total_waste_recycled_tonne - 225 non-null float64  
-total_waste_generated_tonne - 225 non-null int64  
-recycling_rate - 225 non-null float64  
-year - 225 non-null int64

In [None]:
df_recycle_from_18.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 30 entries, 2019 to 2018
Data columns (total 3 columns):
Waste Type                       30 non-null object
Total Generated ('000 tonnes)    30 non-null int64
Total Recycled ('000 tonnes)     30 non-null int64
dtypes: int64(2), object(1)
memory usage: 960.0+ bytes


NO NULL values - a good sign  
Waste Type - 30 non-null object - string  

The other data types look alright  
-Total Generated ('000 tonnes) - 30 non-null int64  
-Total Recycled ('000 tonnes) - 30 non-null int64  
-Year - 30 non-null int64  

In [None]:
df_energy_saved.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 6 columns):
The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram)  per waste type    4 non-null object
Unnamed: 1                                                                                                                                                             3 non-null object
Unnamed: 2                                                                                                                                                             2 non-null object
Unnamed: 3                                                                                                                                                             3 non-null object
Unnamed: 4                                                                                                                                                      

The "Energy Saved" dataframe contains some NULL values in its columns.  
Each field also seems to contain string values.

## Explore value distribution and sample values in each dataframe

#### i. Recycling from 2003 to 2017

In [None]:
df_recycle_before_18.describe()

Unnamed: 0,waste_disposed_of_tonne,total_waste_recycled_tonne,total_waste_generated_tonne,recycling_rate
count,225.0,225.0,225.0,225.0
mean,369719.1,489698.7,859417.3,0.481778
std,684247.0,960767.8,1579112.0,0.365106
min,1300.0,0.0,14400.0,0.0
25%,24600.0,18300.0,118400.0,0.11
50%,106200.0,91100.0,332400.0,0.49
75%,500000.0,520000.0,809800.0,0.85
max,3045200.0,4825900.0,7851500.0,0.99


In [None]:
df_recycle_before_18.head()

year,waste_type,waste_disposed_of_tonne,total_waste_recycled_tonne,total_waste_generated_tonne,recycling_rate
2016,Food,679900,111100.0,791000,0.14
2016,Paper/Cardboard,576000,607100.0,1183100,0.51
2016,Plastics,762700,59500.0,822200,0.07
2016,C&D,9700,1585700.0,1595400,0.99
2016,Horticultural waste,111500,209000.0,320500,0.65


#### ii. Recycling from 2018 to 2019

In [None]:
df_recycle_from_18.describe()

Unnamed: 0,Total Generated ('000 tonnes),Total Recycled ('000 tonnes)
count,30.0,30.0
mean,1218.433333,560.4
std,2165.170833,1149.760683
min,32.0,6.0
25%,173.5,26.0
50%,360.0,126.5
75%,1043.25,394.25
max,7695.0,4726.0


In [None]:
df_recycle_from_18.head()

Year,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes)
2019,Construction& Demolition,1440,1434
2019,Ferrous Metal,1278,1270
2019,Paper/Cardboard,1011,449
2019,Plastics,930,37
2019,Food,7440,136


#### iii. Energy Saved

The values in the 2 dataframes 'df_recycle_before_18' and 'df_recycle_from_18' look as expected, consistent with our earlier interpretation of the data types.

In [None]:
df_energy_saved.describe()

Unnamed: 0,The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram) per waste type,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5
count,4,3,2,3,3,3
unique,4,3,2,3,3,3
top,1 barrel oil is approximately 159 litres of oil,Plastic,Glass,1.8 barrels,Non-Ferrous Metal,Paper
freq,1,1,1,1,1,1


The summary statistics of the "Energy Saved" dataframe look a little like gibberish, such as the 'top' row.  
Let us find out why below.

In [None]:
df_energy_saved.head()

Unnamed: 0,The table gives the amount of energy saved in kilowatt hour (kWh) and the amount of crude oil (barrels) by recycling 1 metric tonne (1000 kilogram) per waste type,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5
0,1 barrel oil is approximately 159 litres of oil,,,,,
1,,,,,,
2,material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
3,energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
4,crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


It seems that the "Energy Saved" dataframe has 3 rows in the header which can be removed.  
The 3 fields we are interested in are "material", "energy_saved" and "crude_oil saved".

Since we are at the start, we can reimport a cleaned "Energy Saved" dataframe to extract the 3 fields we are interested in.  
Once we extract these 3 fields, we need to transpose the dataset. 

#### iv. Fix the 'Energy Saved' Dataframe

In [None]:
#pandas read_csv drops blank lines by default (skip_blank_lines=True),
#so skiprows=2 only needs to cover the 2 non-blank lines above the table
df_energy_saved = pd.read_csv('datasets/energy_saved.csv', skiprows=2, index_col=0)
#df_energy_saved = pd.read_csv('datasets/energy_saved.csv', header=1, skiprows=2, index_col=0)

In [None]:
df_energy_saved.head()

Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5
material,Plastic,Glass,Ferrous Metal,Non-Ferrous Metal,Paper
energy_saved,5774 Kwh,42 Kwh,642 Kwh,14000 Kwh,4000 kWh
crude_oil saved,16 barrels,,1.8 barrels,40 barrels,1.7 barrels


In [None]:
#try using melt
#df_energy_saved_melted = df_energy_saved.melt(id_vars='material')
#df_energy_saved_melted.head()

In [None]:
df_energy_saved_transposed = df_energy_saved.T
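
To make the effect of the transpose concrete, here is a minimal sketch on a two-material sample. The frame `wide` is built by hand to mimic the layout shown above (it is not read from the real file), and `reset_index(drop=True)` is an optional extra step that swaps the leftover 'Unnamed: n' labels for a plain 0..n-1 index.

```python
import pandas as pd

# Hand-built sample mimicking the re-imported layout: one row per field,
# one 'Unnamed: n' column per material (assumed shape, for illustration only)
wide = pd.DataFrame(
    [["Plastic", "Glass"], ["5774 Kwh", "42 Kwh"]],
    index=["material", "energy_saved"],
    columns=["Unnamed: 1", "Unnamed: 2"],
)

# Transpose so each material becomes a row and each field becomes a column
tall = wide.T.reset_index(drop=True)
print(tall)
```

After the transpose, 'material' is an ordinary column, which is what the later join needs.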

In [None]:
df_energy_saved_transposed.head()

Unnamed: 0,material,energy_saved,crude_oil saved
Unnamed: 1,Plastic,5774 Kwh,16 barrels
Unnamed: 2,Glass,42 Kwh,
Unnamed: 3,Ferrous Metal,642 Kwh,1.8 barrels
Unnamed: 4,Non-Ferrous Metal,14000 Kwh,40 barrels
Unnamed: 5,Paper,4000 kWh,1.7 barrels


In [None]:
df_energy_saved_transposed.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, Unnamed: 1 to Unnamed: 5
Data columns (total 3 columns):
material           5 non-null object
energy_saved       5 non-null object
crude_oil saved    4 non-null object
dtypes: object(3)
memory usage: 320.0+ bytes


Now that the dataframe has been transposed, we can create a column of integers from the 'energy_saved' column without the 'Kwh' suffix.

In [None]:
df_energy_saved_transposed['energy_saved_int'] = [5774,42,645,14000,4000]
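
As an alternative to typing the numbers by hand, the integers could be parsed from the strings themselves. The sketch below uses a sample Series copied from the 'energy_saved' values shown earlier; the variable names are only illustrative. Note that the source value for Ferrous Metal is '642 Kwh', whereas the hand-typed list above contains 645, so parsing would also catch that kind of slip.

```python
import pandas as pd

# Sample values copied from the 'energy_saved' column shown earlier
energy_saved = pd.Series(
    ["5774 Kwh", "42 Kwh", "642 Kwh", "14000 Kwh", "4000 kWh"],
    index=["Plastic", "Glass", "Ferrous Metal", "Non-Ferrous Metal", "Paper"],
)

# Extract the run of digits from each string and convert it to an integer;
# this handles both the 'Kwh' and 'kWh' spellings of the suffix
energy_saved_int = energy_saved.str.extract(r"(\d+)", expand=False).astype(int)
print(energy_saved_int.tolist())
```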

In [None]:
df_energy_saved_transposed.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, Unnamed: 1 to Unnamed: 5
Data columns (total 4 columns):
material            5 non-null object
energy_saved        5 non-null object
crude_oil saved     4 non-null object
energy_saved_int    5 non-null int64
dtypes: int64(1), object(3)
memory usage: 360.0+ bytes


Do not use 'material' values as the index column. The 'material' values will need to be in a column instead of being in the index for a join between dataframes later.

In [None]:
df_energy_saved_transposed.head()

Unnamed: 0,material,energy_saved,crude_oil saved,energy_saved_int
Unnamed: 1,Plastic,5774 Kwh,16 barrels,5774
Unnamed: 2,Glass,42 Kwh,,42
Unnamed: 3,Ferrous Metal,642 Kwh,1.8 barrels,645
Unnamed: 4,Non-Ferrous Metal,14000 Kwh,40 barrels,14000
Unnamed: 5,Paper,4000 kWh,1.7 barrels,4000


We have fixed the "Energy Saved" dataframe by   
-reading in **'datasets/energy_saved.csv'** again, skipping the 2 non-blank header rows and 1 blank header row  
-transposing the dataframe so that **'material'**, **'energy_saved'** and **'crude_oil saved'** become columns, with one row per material  
-creating a column of integers from the **'energy_saved' column** without the 'Kwh'/'kWh' suffix  

  
Now df_energy_saved_transposed is ready for analysis.

## Combine the 2 Recycling Statistics Dataframes

Next, let us combine the 2 Recycling Statistics Dataframes to simplify our analysis.  

To do so, we need to extract the relevant columns from the "Recycling from 2003 to 2017" dataframe, because it has more columns than the "Recycling from 2018 to 2019" dataframe and the extra ones are irrelevant to our analysis.  

Once done, let us append the "Recycling from 2018 to 2019" dataframe to the trimmed "Recycling from 2003 to 2017" dataframe.

In [None]:
#extract the relevant columns (.copy() so the edits below do not trigger a SettingWithCopyWarning)
df_recycle_before_18 = df_recycle_before_18[['waste_type', 'total_waste_generated_tonne', 'total_waste_recycled_tonne']].copy()

#convert the units for total waste generated and recycled from tonne to 1000 tonnes
df_recycle_before_18['total_waste_generated_tonne'] = df_recycle_before_18['total_waste_generated_tonne']/1000
df_recycle_before_18['total_waste_recycled_tonne'] = df_recycle_before_18['total_waste_recycled_tonne']/1000

#then adjust the column labels to follow the more informative labels in df_recycle_from_18
df_recycle_before_18.rename(columns={"waste_type": "Waste Type",
                                     "total_waste_generated_tonne": "Total Generated ('000 tonnes)",
                                     "total_waste_recycled_tonne": "Total Recycled ('000 tonnes)"},
                            inplace=True)

In [None]:
#Check that column labels have been replaced as expected
df_recycle_before_18.columns

Index(['Waste Type', 'Total Generated ('000 tonnes)',
       'Total Recycled ('000 tonnes)'],
      dtype='object')

In [None]:
#below df_recycle_before_18, append records from df_recycle_from_18
#(pd.concat is used because DataFrame.append was removed in pandas 2.0)
df_recycle_all = pd.concat([df_recycle_before_18, df_recycle_from_18])
df_recycle_all.sort_index()
#df_recycle_all.sort_values(by=['Waste Type'])

Unnamed: 0,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes)
2003,Horticultural Waste,304.6,119.3
2003,Paper/Cardboard,1084.7,466.2
2003,Plastics,579.9,39.1
2003,Construction Debris,422.9,398.3
2003,Wood/Timber,213.4,40.8
2003,Ferrous Metals,856.7,799.0
2003,Food waste,548.0,32.9
2003,Non-ferrous Metals,93.9,75.8
2003,Sludge,88.5,0.0
2003,Glass,65.5,6.2


## Combine the full Recycling Statistics Dataframe with the Energy Saved Dataframe

In [None]:
#do an INNER JOIN, matching 'Waste Type' values in the Recycling Statistics Dataframe against the 'material' column of the Energy Saved Dataframe
#with df_recycle_all on the left and df_energy_saved_transposed on the right
full_energy_saved = df_recycle_all.reset_index().merge(df_energy_saved_transposed, 
                                                        how="inner",
                                                        left_on="Waste Type",
                                                        right_on="material",
                                                        sort=True,
                                                        copy=True).set_index('index')
                                                    
#we also need to remove paper, which the "Instructions" did not want
full_energy_saved = full_energy_saved[full_energy_saved["Waste Type"] != "Paper"]

#left_index=False,
#right_index=True,
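
One thing to keep in mind with this join: the task description warns that waste type names may differ between teams, and the outputs above do show several spellings ('Plastics' and 'Plastic', 'Ferrous Metals' and 'Ferrous Metal', 'Non-ferrous Metals' and 'Non-Ferrous Metal'). An inner join on exact strings silently drops every row whose spelling is not in the 'material' column. A possible way around it, sketched on a small sample, is to normalise the names first. The lookup `name_map` is hypothetical and would need to be extended to cover every variant actually present in the files.

```python
import pandas as pd

# Small sample using spellings that appear in the outputs above
recycled = pd.DataFrame({
    "year": [2003, 2003, 2016, 2019, 2019],
    "Waste Type": ["Ferrous Metals", "Non-ferrous Metals", "Plastics",
                   "Ferrous Metal", "Glass"],
})
energy = pd.DataFrame({
    "material": ["Plastic", "Glass", "Ferrous Metal", "Non-Ferrous Metal"],
    "energy_saved_int": [5774, 42, 642, 14000],
})

# Joining on the raw strings keeps only the exact matches
raw = recycled.merge(energy, how="inner", left_on="Waste Type", right_on="material")

# Hypothetical lookup: lower-cased spelling variant -> name used in the energy table
name_map = {
    "plastic": "Plastic", "plastics": "Plastic",
    "glass": "Glass",
    "ferrous metal": "Ferrous Metal", "ferrous metals": "Ferrous Metal",
    "non-ferrous metal": "Non-Ferrous Metal", "non-ferrous metals": "Non-Ferrous Metal",
}
recycled["material"] = recycled["Waste Type"].str.strip().str.lower().map(name_map)

# Joining on the normalised name keeps every row of the sample
merged = recycled.merge(energy, how="inner", on="material")
print(len(raw), len(merged))
```

Waste types that are absent from the lookup (food, paper and so on) map to NaN and drop out of the inner join, which is the intended behaviour here.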

In [None]:
full_energy_saved.head()

index,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),material,energy_saved,crude_oil saved,energy_saved_int
2013,Ferrous Metal,1416.0,1369.2,Ferrous Metal,642 Kwh,1.8 barrels,645
2012,Ferrous Metal,1386.0,1331.2,Ferrous Metal,642 Kwh,1.8 barrels,645
2011,Ferrous Metal,1239.2,1171.6,Ferrous Metal,642 Kwh,1.8 barrels,645
2010,Ferrous Metal,1194.6,1127.5,Ferrous Metal,642 Kwh,1.8 barrels,645
2009,Ferrous Metal,872.0,806.2,Ferrous Metal,642 Kwh,1.8 barrels,645


In [None]:
#this was used when I performed a LEFT JOIN
#to subset for only the materials found in both dataframes

#with the INNER JOIN approach above, I no longer need to perform this step
#materialList = []

#for material in df_energy_saved_transposed['material']:
#    materialList.append(material)

In [None]:
#materialList
#full_energy_saved[full_energy_saved['Waste Type'].isin(materialList)]

In [None]:
full_energy_saved.tail()

index,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),material,energy_saved,crude_oil saved,energy_saved_int
2019,Glass,75.0,11.0,Glass,42 Kwh,,42
2018,Glass,64.0,12.0,Glass,42 Kwh,,42
2019,Non-Ferrous Metal,126.0,124.0,Non-Ferrous Metal,14000 Kwh,40 barrels,14000
2018,Non-Ferrous Metal,171.0,170.0,Non-Ferrous Metal,14000 Kwh,40 barrels,14000
2017,Plastic,815.2,51.8,Plastic,5774 Kwh,16 barrels,5774


From the merged dataframe, we have:  
-**"Waste Type" (and "material")** values  
-**"Total Recycled ('000 tonnes)"** values  
-**"energy_saved_int"** integer values -> kWh saved per tonne  

For each row, compute the total energy saved in kWh: multiply **"Total Recycled ('000 tonnes)"** by **1000** to convert thousands of tonnes back to tonnes, then multiply the result by **"energy_saved_int"**.  
Save the computed total energy saved in kWh for each row in a **new column "total_energy_saved"**.

In [None]:
full_energy_saved['total_energy_saved'] = full_energy_saved["Total Recycled ('000 tonnes)"] * 1000 * full_energy_saved['energy_saved_int']
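
As a quick sanity check of the unit conversion, the same arithmetic can be worked by hand for one row. The sketch below uses the 2019 Glass row shown above (11.0 thousand tonnes recycled, 42 kWh saved per tonne); the variable names are only illustrative.

```python
# 2019 Glass row from the merged dataframe shown above
recycled_thousand_tonnes = 11.0
kwh_saved_per_tonne = 42

# Thousands of tonnes -> tonnes
recycled_tonnes = recycled_thousand_tonnes * 1000

# Tonnes * kWh per tonne -> kWh
total_energy_saved_kwh = recycled_tonnes * kwh_saved_per_tonne
print(total_energy_saved_kwh)  # 462000.0
```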

In [None]:
full_energy_saved.head()

index,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),material,energy_saved,crude_oil saved,energy_saved_int,total_energy_saved
2013,Ferrous Metal,1416.0,1369.2,Ferrous Metal,642 Kwh,1.8 barrels,645,883134000.0
2012,Ferrous Metal,1386.0,1331.2,Ferrous Metal,642 Kwh,1.8 barrels,645,858624000.0
2011,Ferrous Metal,1239.2,1171.6,Ferrous Metal,642 Kwh,1.8 barrels,645,755682000.0
2010,Ferrous Metal,1194.6,1127.5,Ferrous Metal,642 Kwh,1.8 barrels,645,727237500.0
2009,Ferrous Metal,872.0,806.2,Ferrous Metal,642 Kwh,1.8 barrels,645,519999000.0


In [None]:
full_energy_saved.reset_index().head()

Unnamed: 0,index,Waste Type,Total Generated ('000 tonnes),Total Recycled ('000 tonnes),material,energy_saved,crude_oil saved,energy_saved_int,total_energy_saved
0,2013,Ferrous Metal,1416.0,1369.2,Ferrous Metal,642 Kwh,1.8 barrels,645,883134000.0
1,2012,Ferrous Metal,1386.0,1331.2,Ferrous Metal,642 Kwh,1.8 barrels,645,858624000.0
2,2011,Ferrous Metal,1239.2,1171.6,Ferrous Metal,642 Kwh,1.8 barrels,645,755682000.0
3,2010,Ferrous Metal,1194.6,1127.5,Ferrous Metal,642 Kwh,1.8 barrels,645,727237500.0
4,2009,Ferrous Metal,872.0,806.2,Ferrous Metal,642 Kwh,1.8 barrels,645,519999000.0


## Aggregate total energy saved per year from 2015 to 2019

Perform a SUM of the **'total_energy_saved' values** across each year.

In [None]:
annual_energy_savings = full_energy_saved.groupby('index')['total_energy_saved'].sum()

Inspect the "annual_energy_savings" variable to check whether its contents look similar to the expected output requested in the Instructions panel.

In [None]:
print(annual_energy_savings)

index
2003    2.604000e+05
2004    2.058000e+05
2005    1.596000e+05
2006    2.688000e+05
2007    4.311036e+08
2008    4.744950e+08
2009    5.206542e+08
2010    7.280439e+08
2011    7.565808e+08
2012    8.594430e+08
2013    8.837472e+08
2014    6.594000e+05
2015    6.132000e+05
2016    6.174000e+05
2017    2.996140e+08
2018    2.461774e+09
2019    2.555612e+09
Name: total_energy_saved, dtype: float64
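
The yearly totals above differ by several orders of magnitude (around 6e5 kWh in 2015 and 2016 against around 2.5e9 kWh in 2018 and 2019). That would be consistent with some materials dropping out of the inner join in certain years because their names do not match exactly. A rough way to check, sketched here on a hand-built sample rather than the real data, is to count how many of the four target materials are matched in each year and to list the names that were not matched.

```python
import pandas as pd

# Hand-built sample; the spellings are taken from the outputs shown earlier
recycled = pd.DataFrame({
    "year": [2016, 2016, 2019, 2019, 2019, 2019],
    "Waste Type": ["Plastics", "Glass",
                   "Ferrous Metal", "Plastics", "Glass", "Non-Ferrous Metal"],
})
materials = ["Plastic", "Glass", "Ferrous Metal", "Non-Ferrous Metal"]

# True where the waste type would survive an exact-match inner join
matched = recycled["Waste Type"].isin(materials)

# Number of distinct target materials matched per year (4 would be complete)
coverage = recycled[matched].groupby("year")["Waste Type"].nunique()

# Names present in the recycling data that found no partner
unmatched = sorted(recycled.loc[~matched, "Waste Type"].unique())

print(coverage)
print(unmatched)
```

Any year with fewer than four matched materials would understate the total for that year.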


Subset the dataframe to contain only the 'total_energy_saved' values from 2015 to 2019.

In [None]:
#annual_energy_savings[-5:]
annual_energy_savings = annual_energy_savings.tail(5)
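
Taking the last five rows works only as long as 2015 to 2019 happen to be the final five entries. A slightly more explicit alternative, sketched on a dummy Series with placeholder values, is to select the years by label with `.loc`, which includes both endpoints on a sorted index.

```python
import pandas as pd

# Dummy yearly totals (placeholder values, not the real results)
totals = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    index=pd.Index([2014, 2015, 2016, 2017, 2018, 2019], name="index"),
    name="total_energy_saved",
)

# Label-based slice: keeps 2015 through 2019, endpoints included
selected = totals.loc[2015:2019]
print(selected.index.tolist())
```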

In [None]:
print(annual_energy_savings)

index
2015    6.132000e+05
2016    6.174000e+05
2017    2.996140e+08
2018    2.461774e+09
2019    2.555612e+09
Name: total_energy_saved, dtype: float64


Set the index column name to 'year'.

In [None]:
annual_energy_savings.index.name = 'year'

Set the value column name to 'total_energy_saved'.

In [None]:
annual_energy_savings.columns = ["total_energy_saved"]

Inspect the variable 'annual_energy_savings'.  
It seems that the 'total_energy_saved' column name is not appearing after checking the last 5 rows.  
After checking the data type of the variable 'annual_energy_savings', it seems that this is currently a pandas Series.

In [None]:
annual_energy_savings.tail()

year
2015    6.132000e+05
2016    6.174000e+05
2017    2.996140e+08
2018    2.461774e+09
2019    2.555612e+09
Name: total_energy_saved, dtype: float64

In [None]:
type(annual_energy_savings)

pandas.core.series.Series

Therefore, let me change the variable 'annual_energy_savings' from a pandas Series to a dataframe.

In [None]:
annual_energy_savings = annual_energy_savings.to_frame()

In [None]:
type(annual_energy_savings)

pandas.core.frame.DataFrame
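
For reference, the Series-to-DataFrame conversion can be avoided by selecting the column with double brackets before summing, which makes `groupby` return a DataFrame directly. The sketch below uses three sample rows computed from the 2018 and 2019 Glass and 2018 Non-Ferrous Metal figures shown earlier; `rename_axis` sets the index label in the same chain.

```python
import pandas as pd

# Sample per-row totals in kWh:
# Glass 2018 (12 * 1000 * 42), Non-Ferrous Metal 2018 (170 * 1000 * 14000),
# Glass 2019 (11 * 1000 * 42)
rows = pd.DataFrame(
    {"total_energy_saved": [504000.0, 2380000000.0, 462000.0]},
    index=pd.Index([2018, 2018, 2019], name="index"),
)

# Double brackets keep the result as a one-column DataFrame
annual = rows.groupby(level=0)[["total_energy_saved"]].sum().rename_axis("year")
print(annual)
```

This also explains why assigning to `.columns` had no visible effect earlier: a Series has a name rather than columns.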

We now have:
<ol> 
<li>an 'annual_energy_savings' dataframe</li>
<li>a 'year' index column</li>
<li>a 'total_energy_saved' column holding the values in kWh</li>
<li>only the values from 2015 to 2019</li>
</ol>

which follows what the Instructions panel wanted.

In [None]:
print(annual_energy_savings)

      total_energy_saved
year                    
2015        6.132000e+05
2016        6.174000e+05
2017        2.996140e+08
2018        2.461774e+09
2019        2.555612e+09


Yet I am encountering an issue after submitting this Jupyter notebook by clicking on "Check Project".  
  
  
Some tests failed  
TEST 1  
There was an error while testing your code. Please double-check your submission.  