## Introduction 

The domain for this project is energy and sustainability in Calgary. Energy plays a fundamental role in our lives; everything requires energy in one form or another. Canada is among the top five natural gas producers in the world, and two-thirds of that production comes from Alberta. In 2017, the energy sector made up 9.2%, or $175 billion, of Canada's Gross Domestic Product (GDP), whereas in Alberta the energy sector contributed 21.61% of provincial GDP. This is significantly more than in the rest of Canada; the oil and gas sector makes up a major part of economic activity in Alberta. As one of the major cities in Alberta, Calgary has long been known as an energy city and has taken on many initiatives to become more energy efficient. In 2008, the City of Calgary developed the Sustainable Buildings Partnership Program to improve the performance of existing city infrastructure and support the sustainable building policy. The program's purpose is to identify and improve the efficiency of existing corporate infrastructure through audits, alternative energy technologies, conservation, and energy efficiency upgrades. Against this background, we investigate energy consumption in the City of Calgary.

## Objectives  

In this project, we analyze energy consumption at different structures and facilities in Calgary. The goal is to understand how buildings use energy and to investigate the factors that affect that use. We use Python and SQL to analyze the energy use of buildings and structures covered by the sustainable building policy. We performed this investigation on each individual dataset as well as on merged datasets. The study is important for assessing whether the energy use of buildings and structures is in alignment with the sustainable building policy. Through this investigation, we aim to better understand energy efficiency and to provide new insights into whether it needs to be improved.

## Individual Datasets 

We selected four datasets for this project, each representing energy consumption at a different level within the City of Calgary.

### Dataset 1: Building energy benchmarking

This is an annual dataset covering 2019 to 2021 that gives the energy breakdown for individual buildings in the City of Calgary. It has 297 rows and 23 columns. Each property has a property ID, a property type (for example “Office,” “Residential,” or “Fire Station”), the year it was built, and the area of the building. For each property, the dataset records the total site energy use, electricity use, natural gas use, and district hot water use, among others.

### Dataset 2: Corporate Energy Consumption – City of Calgary

We wanted to explore energy consumption at city facilities further and chose the Corporate Energy Consumption dataset, which is open data available on the City of Calgary website. It is in tabular format with 300k rows and 9 columns, collected over the period 2014 to 2022. Total energy consumption is recorded every month for 20 different business units, e.g., Calgary Housing, Calgary Transit, Fire Department, Parks and Open Spaces, and Waste and Recycling Services.

### Dataset 3: Energy consumption of 6 Building Data - University of Calgary

The data was retrieved through the Office of Sustainability's Campus as a Learning Lab initiative and records daily energy consumption at different buildings. Because of a non-disclosure agreement, we cannot provide a link to this dataset. It covers several energy sources, namely heating, cooling, electricity, natural gas, and domestic water. The data is available for six buildings across the campus for the period 2018 to 2021.

### Dataset 4: Current and Historical Alberta Weather Station Data – ACIS

The dataset was retrieved from the Alberta Climate Information Service (ACIS) of Alberta Agriculture, Forestry and Rural Economic Development. Under the “ACIS Data Disclaimer & Terms of Use” (https://acis.alberta.ca/acis/data-disclaimer.jsp), we may use the dataset provided it is fully acknowledged and cited. In this part of the study, we use “Current and Historical Alberta Weather Station Data” (ACIS, November 2022). ACIS collects weather and climate data from a variety of meteorological stations operated by various government agencies. For our weather normalization analysis, we focus on the variables Date, Air Temp. Avg. (°C), Heating Degree Days (DD), and Cooling Degree Days (DD). The dataset records daily temperature from 2014 to 2022 at the Calgary International CS weather station.
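To make the degree-day variables concrete, here is a minimal sketch of how heating and cooling degree days can be computed from daily average temperatures. The base temperature of 18 °C is an assumption; it is consistent with the January 2018 row of the monthly temperature table in this report (31 days at an average of about -6.87 °C gives about 770.9 heating degree days). The function name `degree_days` and the sample values are hypothetical and not taken from the ACIS data.

```python
BASE_TEMP_C = 18.0  # assumed base temperature, see the note above


def degree_days(daily_avg_temps_c, base=BASE_TEMP_C):
    """Return (heating_dd, cooling_dd) summed over daily average temperatures."""
    # A day contributes heating degree days when it is colder than the base,
    # and cooling degree days when it is warmer than the base.
    heating = sum(max(0.0, base - t) for t in daily_avg_temps_c)
    cooling = sum(max(0.0, t - base) for t in daily_avg_temps_c)
    return heating, cooling


# Hypothetical week of daily average temperatures in degrees Celsius
sample_week = [-5.0, 0.0, 10.0, 18.0, 20.5, 25.0, 12.0]
hdd, cdd = degree_days(sample_week)
print(f"HDD = {hdd:.1f}, CDD = {cdd:.1f}")
```

Summing the daily values over a month gives monthly totals comparable to the Monthly_Heating_Degree_Days and Monthly_Cooling_Degree_Days columns used later.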

## Guiding Questions 

The research questions that we address in this report are:

1. What is the correlation between natural gas and electricity consumption in the annual, monthly, and campus data?

We chose this guiding question to explore energy consumption from the two main sources; we used datasets 1, 2, and 3 to answer it.

2. What is the correlation between normalized energy consumption in the annual, monthly, and campus data?

The purpose of this guiding question is to better understand how energy consumption relates to heating degree days and cooling degree days, using datasets 1, 2, and 4.

3. What is the correlation between temperature and energy consumption in each dataset for every month?

We wanted to observe whether energy consumption depends on outside temperature and which property or building type uses the most energy in given weather; we also investigated which energy source is used most for heating or cooling.

4. How did the COVID-19 pandemic impact energy use in buildings on campus?

During the COVID-19 lockdowns and restrictions, most people stayed at home. We wished to observe the effect of COVID-19 on the University of Calgary campus by examining energy use during that time and comparing it with the pre-COVID and post-COVID periods.

5. Has the efficiency of heating energy usage in campus buildings improved from 2018 to 2022?

We focused on energy consumption at different campus buildings because the University of Calgary is known as a leading university in sustainability and is taking steps to reduce its dependency on carbon-based fuel and improve energy efficiency.

## Methodology 

### Data Cleaning, Wrangling and SQL queries 

### Dataset 1:   
The ‘Energy Benchmark of Calgary’ dataset was chosen because it contains detailed data on individual buildings across Calgary, along with other fields that are useful for joining it with other datasets. We merged this dataset with the weather station data and compared its values with the corporate energy data and the campus data.

Individual exploration included checking for null values and cleaning and structuring the data using Excel and SQL. The ‘District Hot Water’ column was dropped because it contained many null values. An exploratory analysis of the dataset answered a few questions for us. Using queries, we found that the property type that uses the most energy is “Office.” We analyzed how the age of a building affects energy use and found no effect; a further analysis of how the area of a building affects energy use also found no effect. We also analyzed how dependent each building is on natural gas and electricity, and found that buildings are about 50% dependent on natural gas and 20%-45% dependent on electricity. After this, we worked with the merged datasets.
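The null check and the "most energy by property type" query described above could look roughly like the following sketch. It uses an in-memory SQLite database as a stand-in for the SQL environment used in the project, and the table name `benchmarking`, the column names, and all row values are hypothetical toy data rather than the real benchmarking dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE benchmarking (
        property_id INTEGER,
        property_type TEXT,
        site_energy_gj REAL,
        district_hot_water_gj REAL
    )
""")
# Toy rows (made up for illustration only)
toy_rows = [
    (1, "Office", 5000.0, None),
    (2, "Office", 7000.0, None),
    (3, "Fire Station", 900.0, None),
    (4, "Residential", 1200.0, 300.0),
]
conn.executemany("INSERT INTO benchmarking VALUES (?, ?, ?, ?)", toy_rows)

# COUNT(*) counts every row while COUNT(column) skips NULLs,
# so the difference is the number of NULL values in that column.
null_count = conn.execute(
    "SELECT COUNT(*) - COUNT(district_hot_water_gj) FROM benchmarking"
).fetchone()[0]

# Property type with the largest total site energy
top_type, top_energy = conn.execute("""
    SELECT property_type, SUM(site_energy_gj) AS total_energy
    FROM benchmarking
    GROUP BY property_type
    ORDER BY total_energy DESC
    LIMIT 1
""").fetchone()

print(null_count, top_type, top_energy)
```

A column whose null count is close to the total row count, as with the toy hot-water column here, is a natural candidate for dropping.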


### Dataset 2:   
At first, we imported the ‘Corporate Energy Consumption’ dataset into both JupyterLab and MySQL and checked the data for any missing or NaN values.

In [1]:
import pandas as pd
import sqlalchemy as sq
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns

The check for missing or NaN values was carried out in both Excel and SQL. Since there were no missing values, we moved forward.

In [2]:
corporate_energy_consumption = pd.read_csv("corporate_energy_consumption.csv")
display(corporate_energy_consumption.head())



Unnamed: 0,Business_Unit_Desc,Facility_Name,Site_ID,Facility_Address,Energy_Description,Year,Month,Total_Consumption,Unit
0,Calgary Fire Department,ATCO VILLAGE (HOUSE),20003498361,6015 23 AV SE,Electricity,2014,1,1883,kWh
1,Calgary Fire Department,ATCO VILLAGE (HOUSE),20003498361,6015 23 AV SE,Electricity,2014,2,2320,kWh
2,Calgary Fire Department,ATCO VILLAGE (HOUSE),20003498361,6015 23 AV SE,Electricity,2014,3,1657,kWh
3,Calgary Fire Department,ATCO VILLAGE (HOUSE),20003498361,6015 23 AV SE,Electricity,2014,4,1107,kWh
4,Calgary Fire Department,ATCO VILLAGE (HOUSE),20003498361,6015 23 AV SE,Electricity,2014,5,972,kWh


In [3]:
temperature_data = pd.read_csv("Temp_data.csv")
# Show the first and last rows to confirm the date range that was loaded
display(temperature_data.head())
display(temperature_data.tail())

Unnamed: 0,Station_Name,Year,Month,Monthly_Air_Temp,Monthly_Heating_Degree_Days,Monthly_Cooling_Degree_Days
0,Calgary Int'L CS,2018,1,-6.867742,770.9,0.0
1,Calgary Int'L CS,2018,2,-12.4,851.2,0.0
2,Calgary Int'L CS,2018,3,-5.806452,738.0,0.0
3,Calgary Int'L CS,2018,4,0.976667,510.7,0.0
4,Calgary Int'L CS,2018,5,14.122581,131.4,11.2


Unnamed: 0,Station_Name,Year,Month,Monthly_Air_Temp,Monthly_Heating_Degree_Days,Monthly_Cooling_Degree_Days
43,Calgary Int'L CS,2021,8,16.609677,82.3,39.2
44,Calgary Int'L CS,2021,9,12.993333,152.6,2.4
45,Calgary Int'L CS,2021,10,5.277419,394.4,0.0
46,Calgary Int'L CS,2021,11,1.27,501.9,0.0
47,Calgary Int'L CS,2021,12,-12.909677,958.2,0.0


In [4]:
engine = sq.create_engine('mysql+mysqlconnector://jannatul_naeema:2TIU75GLP@datasciencedb.ucalgary.ca/jannatul_naeema')
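Connection strings like the one above embed the password in the notebook. A common alternative is to read the credentials from environment variables and build the URL at run time. The sketch below is only one way to do that; the helper name `build_db_url` and the variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, and `DB_NAME` are assumptions, and the example values are placeholders rather than real credentials.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ):
    """Assemble a MySQL connection URL from environment-style settings."""
    user = env["DB_USER"]
    # Escape characters such as '@' or '/' that would otherwise break the URL
    password = quote_plus(env["DB_PASSWORD"])
    host = env["DB_HOST"]
    name = env["DB_NAME"]
    return f"mysql+mysqlconnector://{user}:{password}@{host}/{name}"


# Example with placeholder values (hypothetical, not real credentials)
example_env = {
    "DB_USER": "alice",
    "DB_PASSWORD": "p@ss/word",
    "DB_HOST": "db.example.org",
    "DB_NAME": "energy",
}
db_url = build_db_url(example_env)
print(db_url)
```

The resulting string can be passed to `sq.create_engine(...)` in place of the literal URL.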

In [5]:
# if_exists='replace' overwrites the table on re-runs instead of raising ValueError
temperature_data.to_sql('temperature_data', engine, if_exists='replace')

In [6]:
# if_exists='replace' overwrites the table on re-runs instead of raising ValueError
corporate_energy_consumption.to_sql('corporate_energy_consumption', engine, if_exists='replace')

In [7]:
# Average Natural gas use for business units from 2014-2022

query1 = pd.read_sql_query("select Business_Unit_Desc, AVG(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' GROUP BY Business_Unit_Desc ORDER BY Year, Natural_Gas_AVG_kWh DESC;", engine)
print (query1)

             Business_Unit_Desc  Natural_Gas_AVG_kWh Energy_Description
0                      Mobility         1.100252e+06        Natural Gas
1               Calgary Transit         1.709298e+05        Natural Gas
2                Water Services         1.297069e+05        Natural Gas
3     Waste and Recycling Srvcs         1.196976e+05        Natural Gas
4          Calgary Parking Auth         1.112403e+05        Natural Gas
5                 CPS - Bureaus         8.172051e+04        Natural Gas
6   Recreation and Social Prgms         6.375102e+04        Natural Gas
7           Facility Management         6.278802e+04        Natural Gas
8      Parks and Open Spaces-PK         3.835282e+04        Natural Gas
9    City and Regional Planning         3.012522e+04        Natural Gas
10      Calgary Fire Department         8.623556e+02        Natural Gas
11       Information Technology         4.476810e+02        Natural Gas
12              Calgary Housing         3.799169e+03        Natu
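The factor 277.778 in the query above matches the conversion from gigajoules to kilowatt-hours (1 GJ = 10^9 J, 1 kWh = 3.6 × 10^6 J), which suggests natural gas is recorded in GJ; that reading is an assumption. The sketch below reproduces the per-unit average on a few made-up rows, using in-memory SQLite as a stand-in for MySQL. The business unit names and consumption values are hypothetical.

```python
import sqlite3

GJ_TO_KWH = 277.778  # 1 GJ = 1e9 J / 3.6e6 J per kWh, rounded

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE corporate_energy_consumption (
        Business_Unit_Desc TEXT,
        Energy_Description TEXT,
        Year INTEGER,
        Month INTEGER,
        Total_Consumption REAL,
        Unit TEXT
    )
""")
# Toy rows (made up for illustration only)
toy_rows = [
    ("Unit A", "Natural Gas", 2014, 1, 10.0, "GJ"),
    ("Unit A", "Natural Gas", 2014, 2, 20.0, "GJ"),
    ("Unit B", "Natural Gas", 2014, 1, 100.0, "GJ"),
    ("Unit B", "Electricity", 2014, 1, 5000.0, "kWh"),
]
conn.executemany(
    "INSERT INTO corporate_energy_consumption VALUES (?, ?, ?, ?, ?, ?)", toy_rows
)

# Average monthly natural gas use per business unit, converted to kWh
avg_gas_kwh = conn.execute("""
    SELECT Business_Unit_Desc,
           AVG(Total_Consumption * ?) AS Natural_Gas_AVG_kWh
    FROM corporate_energy_consumption
    WHERE Energy_Description = 'Natural Gas'
    GROUP BY Business_Unit_Desc
    ORDER BY Natural_Gas_AVG_kWh DESC
""", (GJ_TO_KWH,)).fetchall()

for unit, avg_kwh in avg_gas_kwh:
    print(f"{unit}: {avg_kwh:.2f} kWh")
```

In this sketch the result is ordered by the aggregate alone. Year is not part of the grouping, so it is left out of ORDER BY, which keeps the list sorted from largest to smallest average.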

In [8]:
# Average Electricity use for business units from 2014-2022

query2 = pd.read_sql_query("select Business_Unit_Desc, AVG(Total_Consumption) AS Electricity_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Electricity' GROUP BY Business_Unit_Desc ORDER BY Electricity_AVG_kWh DESC;", engine)
print (query2)

             Business_Unit_Desc  Electricity_AVG_kWh Energy_Description
0                      Mobility          155754.2864        Electricity
1                 CPS - Bureaus           58773.3848        Electricity
2                Water Services           47890.5105        Electricity
3               Calgary Transit           37467.2354        Electricity
4    City and Regional Planning           26817.7183        Electricity
5           Facility Management           22750.8548        Electricity
6          Calgary Parking Auth           22374.4425        Electricity
7   Recreation and Social Prgms           22358.2349        Electricity
8     Waste and Recycling Srvcs           19852.9311        Electricity
9      Real Estate and Dev Serv           13727.2178        Electricity
10            Downtown Strategy            8569.6496        Electricity
11       Information Technology            3728.1325        Electricity
12        Green Line Operations            1474.4929        Elec

In [9]:
# Average Natural gas use for business units group by years

query3 = pd.read_sql_query("select Business_Unit_Desc, Year, AVG(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' GROUP BY Business_Unit_Desc, Year ORDER BY Year, Natural_Gas_AVG_kWh DESC;", engine)
print (query3)

          Business_Unit_Desc  Year  Natural_Gas_AVG_kWh Energy_Description
0                   Mobility  2014         1.229723e+06        Natural Gas
1            Calgary Transit  2014         1.677314e+05        Natural Gas
2       Calgary Parking Auth  2014         1.603250e+05        Natural Gas
3             Water Services  2014         1.436230e+05        Natural Gas
4              CPS - Bureaus  2014         9.324022e+04        Natural Gas
..                       ...   ...                  ...                ...
129   Public Spaces Delivery  2022         1.039683e+04        Natural Gas
130  Calgary Fire Department  2022         7.539689e+03        Natural Gas
131          Calgary Housing  2022         3.837743e+03        Natural Gas
132    Green Line Operations  2022         3.663525e+03        Natural Gas
133   Information Technology  2022         4.563496e+02        Natural Gas

[134 rows x 4 columns]


In [10]:
# Average Electricity use for business units group by years

query4 = pd.read_sql_query("SELECT Business_Unit_Desc, AVG(Total_Consumption) AS Electricity_AVG_kWh, Year FROM corporate_energy_consumption WHERE Energy_Description='Electricity' GROUP BY Business_Unit_Desc,Year ORDER BY Year, Electricity_AVG_kWh DESC;", engine)
print (query4)

             Business_Unit_Desc  Electricity_AVG_kWh  Year
0                      Mobility          240204.6905  2014
1                 CPS - Bureaus           60535.3706  2014
2                Water Services           50264.7991  2014
3               Calgary Transit           45773.9590  2014
4    City and Regional Planning           29254.2500  2014
..                          ...                  ...   ...
143      Information Technology            4150.0952  2022
144    Parks and Open Spaces-PK            1572.4611  2022
145      Public Spaces Delivery             810.0811  2022
146     Calgary Fire Department             391.8571  2022
147          Utilities Delivery              88.8780  2022

[148 rows x 3 columns]


In [11]:
# Yearly Natural gas use for business units 2018

query5 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' AND Year = 2018 GROUP BY Business_Unit_Desc ORDER BY Natural_Gas_AVG_kWh DESC;", engine)
print (query5)

             Business_Unit_Desc  Natural_Gas_AVG_kWh Energy_Description
0           Facility Management         1.596085e+08        Natural Gas
1                Water Services         1.508135e+08        Natural Gas
2               Calgary Transit         1.290337e+08        Natural Gas
3                 CPS - Bureaus         2.780724e+07        Natural Gas
4     Waste and Recycling Srvcs         2.673085e+07        Natural Gas
5               Calgary Housing         1.435307e+07        Natural Gas
6          Calgary Parking Auth         1.284668e+07        Natural Gas
7      Real Estate and Dev Serv         1.262445e+07        Natural Gas
8                      Mobility         1.198362e+07        Natural Gas
9      Parks and Open Spaces-PK         3.290280e+06        Natural Gas
10  Recreation and Social Prgms         2.260280e+06        Natural Gas
11   City and Regional Planning         3.997225e+05        Natural Gas
12        Green Line Operations         2.700002e+05        Natu


In [12]:
# Yearly Natural gas use for business units 2019

query6 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' AND Year = 2019 GROUP BY Business_Unit_Desc ORDER BY Natural_Gas_AVG_kWh DESC;", engine)
print (query6)

             Business_Unit_Desc  Natural_Gas_AVG_kWh Energy_Description
0           Facility Management         1.697726e+08        Natural Gas
1               Calgary Transit         1.470912e+08        Natural Gas
2                Water Services         1.395829e+08        Natural Gas
3               Calgary Housing         1.200476e+08        Natural Gas
4     Waste and Recycling Srvcs         2.880641e+07        Natural Gas
5                 CPS - Bureaus         2.665363e+07        Natural Gas
6                      Mobility         1.212890e+07        Natural Gas
7          Calgary Parking Auth         1.205168e+07        Natural Gas
8      Real Estate and Dev Serv         9.423341e+06        Natural Gas
9      Parks and Open Spaces-PK         3.476114e+06        Natural Gas
10  Recreation and Social Prgms         1.485835e+06        Natural Gas
11       Public Spaces Delivery         3.444447e+05        Natural Gas
12   City and Regional Planning         3.136114e+05        Natu

In [13]:
# Yearly Natural gas use for business units 2020

query7 = pd.read_sql_query("select Business_Unit_Desc, SUM(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' AND Year = 2020 GROUP BY Business_Unit_Desc ORDER BY Natural_Gas_AVG_kWh DESC;", engine)
print (query7)

             Business_Unit_Desc  Natural_Gas_AVG_kWh Energy_Description
0                Water Services         1.506918e+08        Natural Gas
1               Calgary Transit         1.448140e+08        Natural Gas
2           Facility Management         1.407951e+08        Natural Gas
3               Calgary Housing         1.139356e+08        Natural Gas
4                 CPS - Bureaus         2.401030e+07        Natural Gas
5     Waste and Recycling Srvcs         2.400113e+07        Natural Gas
6                      Mobility         1.411418e+07        Natural Gas
7      Real Estate and Dev Serv         1.005806e+07        Natural Gas
8          Calgary Parking Auth         1.001667e+07        Natural Gas
9      Parks and Open Spaces-PK         4.095281e+06        Natural Gas
10        Green Line Operations         1.913890e+05        Natural Gas
11       Public Spaces Delivery         1.844446e+05        Natural Gas
12  Recreation and Social Prgms         7.416673e+04        Natu

In [14]:
# Yearly Natural gas use for business units 2021

query8 = pd.read_sql_query("select Business_Unit_Desc, SUM(Total_Consumption*277.778) AS Natural_Gas_AVG_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Natural Gas' AND Year = 2021 GROUP BY Business_Unit_Desc ORDER BY Natural_Gas_AVG_kWh DESC;", engine)
print (query8)

           Business_Unit_Desc  Natural_Gas_AVG_kWh Energy_Description
0             Calgary Transit         1.474535e+08        Natural Gas
1              Water Services         1.454621e+08        Natural Gas
2         Facility Management         1.417523e+08        Natural Gas
3             Calgary Housing         1.101681e+08        Natural Gas
4               CPS - Bureaus         2.494363e+07        Natural Gas
5   Waste and Recycling Srvcs         2.212780e+07        Natural Gas
6                    Mobility         1.159529e+07        Natural Gas
7        Calgary Parking Auth         1.011779e+07        Natural Gas
8    Real Estate and Dev Serv         7.732506e+06        Natural Gas
9    Parks and Open Spaces-PK         3.790559e+06        Natural Gas
10      Green Line Operations         2.611113e+05        Natural Gas
11     Public Spaces Delivery         2.288891e+05        Natural Gas
12    Calgary Fire Department         1.000001e+04        Natural Gas
13     Information T

In [15]:
# Yearly Electricity use for business units 2018

query9 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption) AS Electricity_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Electricity' AND Year = 2018 GROUP BY Business_Unit_Desc ORDER BY Electricity_kWh DESC;", engine)
print (query9)

             Business_Unit_Desc  Electricity_kWh Energy_Description
0                Water Services      158827747.0        Electricity
1               Calgary Transit      124308093.0        Electricity
2           Facility Management       72400311.0        Electricity
3                      Mobility       57938067.0        Electricity
4                 CPS - Bureaus       18579976.0        Electricity
5     Waste and Recycling Srvcs       18025194.0        Electricity
6          Calgary Parking Auth        6415028.0        Electricity
7      Real Estate and Dev Serv        4813201.0        Electricity
8      Parks and Open Spaces-PK        4813163.0        Electricity
9   Recreation and Social Prgms         360034.0        Electricity
10   City and Regional Planning         329195.0        Electricity
11       Information Technology         143893.0        Electricity
12           Utilities Delivery         104921.0        Electricity
13            Downtown Strategy          73441.0

In [16]:
# Yearly Electricity use for business units 2019

query10 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption) AS Electricity_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Electricity' AND Year = 2019 GROUP BY Business_Unit_Desc ORDER BY Electricity_kWh DESC;", engine)
print (query10)

             Business_Unit_Desc  Electricity_kWh Energy_Description
0                Water Services      154848580.0        Electricity
1               Calgary Transit      125116102.0        Electricity
2           Facility Management       71982643.0        Electricity
3                      Mobility       55079417.0        Electricity
4                 CPS - Bureaus       18128164.0        Electricity
5     Waste and Recycling Srvcs       16424720.0        Electricity
6          Calgary Parking Auth        5973970.0        Electricity
7      Parks and Open Spaces-PK        4889756.0        Electricity
8      Real Estate and Dev Serv        3445955.0        Electricity
9   Recreation and Social Prgms         293569.0        Electricity
10   City and Regional Planning         250043.0        Electricity
11           Utilities Delivery         155117.0        Electricity
12            Downtown Strategy         144594.0        Electricity
13       Information Technology         141027.0

In [17]:
# Yearly Electricity use for business units 2020

query11 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption) AS Electricity_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Electricity' AND Year = 2020 GROUP BY Business_Unit_Desc ORDER BY Electricity_kWh DESC;", engine)
print (query11)

           Business_Unit_Desc  Electricity_kWh Energy_Description
0              Water Services      158239947.0        Electricity
1             Calgary Transit      103704245.0        Electricity
2         Facility Management       65534303.0        Electricity
3                    Mobility       52656674.0        Electricity
4               CPS - Bureaus       17152861.0        Electricity
5   Waste and Recycling Srvcs       13285986.0        Electricity
6        Calgary Parking Auth        5705186.0        Electricity
7    Parks and Open Spaces-PK        4660480.0        Electricity
8    Real Estate and Dev Serv        3505508.0        Electricity
9           Downtown Strategy         247728.0        Electricity
10     Information Technology         139263.0        Electricity
11         Utilities Delivery         117088.0        Electricity
12     Public Spaces Delivery          64034.0        Electricity
13      Green Line Operations          51230.0        Electricity
14    Calg

In [67]:
# Yearly Electricity use for business units 2021

query12 = pd.read_sql_query("SELECT Business_Unit_Desc, SUM(Total_Consumption) AS Electricity_kWh, Energy_Description FROM corporate_energy_consumption WHERE Energy_Description='Electricity' AND Year = 2021 GROUP BY Business_Unit_Desc ORDER BY Electricity_kWh DESC;", engine)
print (query12)

           Business_Unit_Desc  Electricity_kWh Energy_Description
0              Water Services      170313782.0        Electricity
1             Calgary Transit       97403645.0        Electricity
2         Facility Management       65648104.0        Electricity
3                    Mobility       53009808.0        Electricity
4               CPS - Bureaus       16971312.0        Electricity
5   Waste and Recycling Srvcs       14389518.0        Electricity
6        Calgary Parking Auth        6337019.0        Electricity
7    Parks and Open Spaces-PK        4503123.0        Electricity
8    Real Estate and Dev Serv        2266384.0        Electricity
9           Downtown Strategy         386066.0        Electricity
10     Information Technology         142791.0        Electricity
11     Public Spaces Delivery          53008.0        Electricity
12         Utilities Delivery          45538.0        Electricity
13      Green Line Operations          45252.0        Electricity
14    Calg

### The biggest consumer of natural gas was Calgary Transit in 2021, Water Services in 2020, and Facility Management in 2019 and 2018.
### The biggest consumer of electricity in 2018, 2019, 2020, and 2021 was Water Services.
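Rather than reading the top consumer off four separate outputs, the same answer can be obtained in one query. The sketch below is one possible approach, using in-memory SQLite as a stand-in for MySQL. The table name `yearly_electricity` is hypothetical; the four rows are the top two yearly electricity totals for 2018 and 2019 copied from the outputs above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE yearly_electricity (
        Business_Unit_Desc TEXT,
        Year INTEGER,
        Electricity_kWh REAL
    )
""")
# Yearly totals taken from the 2018 and 2019 electricity outputs above
conn.executemany("INSERT INTO yearly_electricity VALUES (?, ?, ?)", [
    ("Water Services", 2018, 158827747.0),
    ("Calgary Transit", 2018, 124308093.0),
    ("Water Services", 2019, 154848580.0),
    ("Calgary Transit", 2019, 125116102.0),
])

# Join each row against the per-year maximum to keep only the top consumer
rows = conn.execute("""
    SELECT y.Year, y.Business_Unit_Desc, y.Electricity_kWh
    FROM yearly_electricity AS y
    JOIN (
        SELECT Year, MAX(Electricity_kWh) AS max_kwh
        FROM yearly_electricity
        GROUP BY Year
    ) AS m
      ON y.Year = m.Year AND y.Electricity_kWh = m.max_kwh
    ORDER BY y.Year
""").fetchall()

top_by_year = {year: unit for year, unit, _ in rows}
print(top_by_year)
```

The join against a per-year maximum avoids window functions, so the same pattern should carry over to older database versions.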

## Joining Tables

In [19]:
ng_temp_2019 = pd.read_csv("NG_temp_2019.csv")
ng_temp_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'NG_temp_2019.csv'
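The `NG_temp_*.csv` files are not available here, so the cell fails. A table with the columns used by the later plotting cells (`Business_Unit_Desc`, `Month`, `NG_kWh`) plus the monthly weather values could be derived by joining the consumption and temperature tables on year and month. The sketch below shows one way to do this and is not necessarily how the original CSV files were produced. It uses in-memory SQLite as a stand-in for MySQL; the consumption rows are made up, the two temperature rows are copied from the temperature table shown earlier, and the GJ unit for natural gas is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE corporate_energy_consumption (
        Business_Unit_Desc TEXT, Energy_Description TEXT,
        Year INTEGER, Month INTEGER, Total_Consumption REAL);
    CREATE TABLE temperature_data (
        Year INTEGER, Month INTEGER,
        Monthly_Air_Temp REAL, Monthly_Heating_Degree_Days REAL);
""")
# Toy consumption rows (made up); natural gas assumed to be in GJ
conn.executemany("INSERT INTO corporate_energy_consumption VALUES (?, ?, ?, ?, ?)", [
    ("Unit A", "Natural Gas", 2018, 1, 10.0),
    ("Unit A", "Natural Gas", 2018, 1, 5.0),
    ("Unit A", "Natural Gas", 2018, 2, 12.0),
    ("Unit A", "Electricity", 2018, 1, 900.0),
])
# January and February 2018 rows from the temperature table shown earlier
conn.executemany("INSERT INTO temperature_data VALUES (?, ?, ?, ?)", [
    (2018, 1, -6.867742, 770.9),
    (2018, 2, -12.4, 851.2),
])

# Monthly natural gas use per business unit, joined to that month's weather
ng_temp_2018 = conn.execute("""
    SELECT c.Business_Unit_Desc, c.Month,
           SUM(c.Total_Consumption * 277.778) AS NG_kWh,
           t.Monthly_Air_Temp, t.Monthly_Heating_Degree_Days
    FROM corporate_energy_consumption AS c
    JOIN temperature_data AS t
      ON c.Year = t.Year AND c.Month = t.Month
    WHERE c.Energy_Description = 'Natural Gas' AND c.Year = 2018
    GROUP BY c.Business_Unit_Desc, c.Month,
             t.Monthly_Air_Temp, t.Monthly_Heating_Degree_Days
    ORDER BY c.Month
""").fetchall()

for row in ng_temp_2018:
    print(row)
```

Changing the year in the WHERE clause would give the corresponding tables for 2019 to 2021.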

In [None]:
ng_temp_2020 = pd.read_csv("NG_temp_2020.csv")
ng_temp_2020.head()

In [None]:
ng_temp_2021 = pd.read_csv("NG_temp_2021.csv")
ng_temp_2021.head()

In [None]:
ng_temp_2018.to_sql('ng_temp_2018', engine )

In [20]:
ng_temp_2019.to_sql('ng_temp_2019', engine )

NameError: name 'ng_temp_2019' is not defined

In [21]:
ng_temp_2020.to_sql('ng_temp_2020', engine )

NameError: name 'ng_temp_2020' is not defined

In [22]:
ng_temp_2021.to_sql('ng_temp_2021', engine )

NameError: name 'ng_temp_2021' is not defined

In [23]:
electricity_temp_2018 = pd.read_csv("electricity_temp_2018.csv")
electricity_temp_2018.head()

FileNotFoundError: [Errno 2] No such file or directory: 'electricity_temp_2018.csv'

In [24]:
electricity_temp_2019 = pd.read_csv("electricity_temp_2019.csv")
electricity_temp_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'electricity_temp_2019.csv'

In [25]:
electricity_temp_2020 = pd.read_csv("electricity_temp_2020.csv")
electricity_temp_2020.head()

FileNotFoundError: [Errno 2] No such file or directory: 'electricity_temp_2020.csv'

In [26]:
electricity_temp_2021 = pd.read_csv("electricity_temp_2021.csv")
electricity_temp_2021.head()

FileNotFoundError: [Errno 2] No such file or directory: 'electricity_temp_2021.csv'

In [27]:
electricity_temp_2018.to_sql('electricity_temp_2018', engine )

NameError: name 'electricity_temp_2018' is not defined

In [28]:
electricity_temp_2019.to_sql('electricity_temp_2019', engine )

NameError: name 'electricity_temp_2019' is not defined

In [29]:
electricity_temp_2020.to_sql('electricity_temp_2020', engine )

NameError: name 'electricity_temp_2020' is not defined

In [30]:
electricity_temp_2021.to_sql('electricity_temp_2021', engine )

NameError: name 'electricity_temp_2021' is not defined

In [31]:
# One grouped bar chart per year: business units on the x-axis, coloured by month
ng_by_year = {2018: ng_temp_2018, 2019: ng_temp_2019,
              2020: ng_temp_2020, 2021: ng_temp_2021}
for year, df in ng_by_year.items():
    fig = px.bar(df,
                 x="Business_Unit_Desc",
                 y="NG_kWh",
                 color="Month",
                 barmode='group',
                 hover_data=["Business_Unit_Desc", "Month", "NG_kWh"],
                 title=f"NG Consumption at Different Business Units in {year}")
    fig.show()

NameError: name 'ng_temp_2018' is not defined

In [32]:
# One grouped bar chart per year: business units on the x-axis, coloured by month
electricity_by_year = {2018: electricity_temp_2018, 2019: electricity_temp_2019,
                       2020: electricity_temp_2020, 2021: electricity_temp_2021}
for year, df in electricity_by_year.items():
    fig = px.bar(df,
                 x="Business_Unit_Desc",
                 y="Electricity_kWh",
                 color="Month",
                 barmode='group',
                 hover_data=["Business_Unit_Desc", "Month", "Electricity_kWh"],
                 title=f"Electricity Consumption at Different Business Units in {year}")
    fig.show()

NameError: name 'electricity_temp_2018' is not defined

In [33]:
# One grouped bar chart per year: months on the x-axis, coloured by business unit
electricity_by_year = {2018: electricity_temp_2018, 2019: electricity_temp_2019,
                       2020: electricity_temp_2020, 2021: electricity_temp_2021}
for year, df in electricity_by_year.items():
    fig = px.bar(df,
                 x="Month",
                 y="Electricity_kWh",
                 color="Business_Unit_Desc",
                 barmode='group',
                 hover_data=["Business_Unit_Desc", "Month", "Electricity_kWh"],
                 title=f"Electricity Consumption at Different Business Units in {year}")
    fig.show()

NameError: name 'electricity_temp_2018' is not defined

In [34]:
# One grouped bar chart per year: months on the x-axis, one bar per business unit.
ng_by_year = [
    (2018, ng_temp_2018),
    (2019, ng_temp_2019),
    (2020, ng_temp_2020),
    (2021, ng_temp_2021),
]
for plot_year, plot_df in ng_by_year:
    fig = px.bar(plot_df,
                 x="Month",
                 y="NG_kWh",
                 color="Business_Unit_Desc",
                 barmode="group",
                 hover_data=["Business_Unit_Desc", "Month", "NG_kWh"],
                 title=f"Natural Gas Consumption at Different Business Units in {plot_year}")
    fig.show()

NameError: name 'ng_temp_2018' is not defined

## Normalized Calculations

In [35]:
ng_total_2019 = pd.read_csv("ng_total_2019.csv")
ng_total_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'ng_total_2019.csv'

In [36]:
ng_total_2020 = pd.read_csv("ng_total_2020.csv")
ng_total_2020.head()

FileNotFoundError: [Errno 2] No such file or directory: 'ng_total_2020.csv'

In [37]:
ng_total_2021 = pd.read_csv("ng_total_2021.csv")
ng_total_2021.head()

FileNotFoundError: [Errno 2] No such file or directory: 'ng_total_2021.csv'

In [38]:
ng_total_2019.to_sql('ng_total_2019', engine )

NameError: name 'ng_total_2019' is not defined

In [39]:
ng_total_2020.to_sql('ng_total_2020', engine )

NameError: name 'ng_total_2020' is not defined

In [40]:
ng_total_2021.to_sql('ng_total_2021', engine )

NameError: name 'ng_total_2021' is not defined

In [41]:
elec_total_2019 = pd.read_csv("elec_total_2019.csv")
elec_total_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'elec_total_2019.csv'

In [42]:
elec_total_2020 = pd.read_csv("elec_total_2020.csv")
elec_total_2020.head()

FileNotFoundError: [Errno 2] No such file or directory: 'elec_total_2020.csv'

In [43]:
elec_total_2021 = pd.read_csv("elec_total_2021.csv")
elec_total_2021.head()

FileNotFoundError: [Errno 2] No such file or directory: 'elec_total_2021.csv'

In [44]:
elec_total_2019.to_sql('elec_total_2019', engine )

NameError: name 'elec_total_2019' is not defined

In [45]:
elec_total_2020.to_sql('elec_total_2020', engine )

NameError: name 'elec_total_2020' is not defined

In [46]:
elec_total_2021.to_sql('elec_total_2021', engine )

NameError: name 'elec_total_2021' is not defined

In [47]:
total_2019 = pd.read_csv("total_2019.csv")
total_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'total_2019.csv'

In [48]:
total_2020 = pd.read_csv("total_2020.csv")
total_2020.head()

FileNotFoundError: [Errno 2] No such file or directory: 'total_2020.csv'

In [49]:
total_2021 = pd.read_csv("total_2021.csv")
total_2021.head()

FileNotFoundError: [Errno 2] No such file or directory: 'total_2021.csv'

In [50]:
total_2019.to_sql('total_2019', engine )

NameError: name 'total_2019' is not defined

In [51]:
total_2020.to_sql('total_2020', engine )

NameError: name 'total_2020' is not defined

In [52]:
total_2021.to_sql('total_2021', engine )

NameError: name 'total_2021' is not defined

In [53]:
normalize_energy_2019 = pd.read_csv("Normalize Energy Consumption 2019.csv")
normalize_energy_2019.head()

FileNotFoundError: [Errno 2] No such file or directory: 'Normalize Energy Consumption 2019.csv'

In [54]:
normalize_energy_2020 = pd.read_csv("Normalize Energy Consumption 2020.csv")
normalize_energy_2020.head()

FileNotFoundError: [Errno 2] No such file or directory: 'Normalize Energy Consumption 2020.csv'

In [55]:
normalize_energy_2021 = pd.read_csv("Normalize Energy Consumption 2021.csv")
normalize_energy_2021.head()

FileNotFoundError: [Errno 2] No such file or directory: 'Normalize Energy Consumption 2021.csv'

In [56]:
normalize_energy_2019.to_sql('normalize_energy_2019', engine )

NameError: name 'normalize_energy_2019' is not defined

In [57]:
normalize_energy_2020.to_sql('normalize_energy_2020', engine )

NameError: name 'normalize_energy_2020' is not defined

In [58]:
normalize_energy_2021.to_sql('normalize_energy_2021', engine )

NameError: name 'normalize_energy_2021' is not defined
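
Each `pd.read_csv` call above raises `FileNotFoundError`, and the `NameError` in every following `to_sql` cell is a consequence of that failure rather than a separate problem. The following is a minimal sketch of a check to run before the loading cells, assuming the CSV files are expected in the notebook's working directory. `EXPECTED_CSVS` and `find_missing_csvs` are hypothetical names, not part of the original notebook.

```python
from pathlib import Path

# File names copied from the read_csv calls in this section.
EXPECTED_CSVS = [
    "ng_total_2019.csv", "ng_total_2020.csv", "ng_total_2021.csv",
    "elec_total_2019.csv", "elec_total_2020.csv", "elec_total_2021.csv",
    "total_2019.csv", "total_2020.csv", "total_2021.csv",
    "Normalize Energy Consumption 2019.csv",
    "Normalize Energy Consumption 2020.csv",
    "Normalize Energy Consumption 2021.csv",
]


def find_missing_csvs(names, base_dir="."):
    """Return the entries of `names` that are not files under `base_dir`."""
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_file()]


# Report missing inputs once, instead of failing cell by cell.
for name in find_missing_csvs(EXPECTED_CSVS):
    print(f"missing input file: {name}")
```

Running this first shows which files need to be restored, or which paths need correcting, before any `read_csv` or `to_sql` cell is executed.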

In [59]:
import plotly.express as px

# One line chart of normalized energy consumption per year.
normalized_by_year = [
    (2019, normalize_energy_2019),
    (2020, normalize_energy_2020),
    (2021, normalize_energy_2021),
]
for plot_year, plot_df in normalized_by_year:
    fig1 = px.line(plot_df, x="Month", y="Normalized_Energy_Consumption",
                   title=f"Normalized Energy in {plot_year}")
    fig1.show()


NameError: name 'normalize_energy_2019' is not defined

In [60]:
percentage_energy = '''SELECT ng_yearly.Year, sum(electricity_yearly.Electricity_kWh) , sum(ng_yearly.NG_kWh),
(SELECT sum(electricity_yearly.Electricity_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_Electricity, (SELECT sum(ng_yearly.NG_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_NG FROM ng_yearly JOIN electricity_yearly ON 
ng_yearly.Business_Unit_Desc = electricity_yearly.Business_Unit_Desc GROUP BY ng_yearly.Year;'''

percentage_energy = pd.read_sql_query(percentage_energy, engine)
(percentage_energy)

ProgrammingError: (mysql.connector.errors.ProgrammingError) 1146 (42S02): Table 'jannatul_naeema.ng_yearly' doesn't exist
[SQL: SELECT ng_yearly.Year, sum(electricity_yearly.Electricity_kWh) , sum(ng_yearly.NG_kWh),
(SELECT sum(electricity_yearly.Electricity_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_Electricity, (SELECT sum(ng_yearly.NG_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_NG FROM ng_yearly JOIN electricity_yearly ON 
ng_yearly.Business_Unit_Desc = electricity_yearly.Business_Unit_Desc GROUP BY ng_yearly.Year;]
(Background on this error at: https://sqlalche.me/e/14/f405)
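
Apart from the missing table, the join condition in this query deserves a check. If `ng_yearly` and `electricity_yearly` each hold one row per business unit per year, joining on `Business_Unit_Desc` alone pairs every natural gas row with the electricity rows of all years for that unit, which inflates both sums. The following is a minimal sketch of the effect, using an in-memory SQLite database as a stand-in for the MySQL engine. The `Year` column on `electricity_yearly` is an assumption, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ng_yearly (Business_Unit_Desc TEXT, Year INTEGER, NG_kWh REAL);
CREATE TABLE electricity_yearly (Business_Unit_Desc TEXT, Year INTEGER, Electricity_kWh REAL);
-- Invented sample data: one unit, two years.
INSERT INTO ng_yearly VALUES ('Unit A', 2019, 300), ('Unit A', 2020, 100);
INSERT INTO electricity_yearly VALUES ('Unit A', 2019, 100), ('Unit A', 2020, 300);
""")

# {extra} lets the same query run with and without the Year condition.
SHARE_SQL = """
SELECT n.Year,
       SUM(e.Electricity_kWh) * 100.0
           / (SUM(e.Electricity_kWh) + SUM(n.NG_kWh)) AS Percentage_Electricity,
       SUM(n.NG_kWh) * 100.0
           / (SUM(e.Electricity_kWh) + SUM(n.NG_kWh)) AS Percentage_NG
FROM ng_yearly AS n
JOIN electricity_yearly AS e
  ON n.Business_Unit_Desc = e.Business_Unit_Desc {extra}
GROUP BY n.Year
ORDER BY n.Year;
"""

# Join on business unit only, as in the cell above.
unit_only = conn.execute(SHARE_SQL.format(extra="")).fetchall()
# Join on business unit and year.
unit_and_year = conn.execute(
    SHARE_SQL.format(extra="AND n.Year = e.Year")
).fetchall()

print(unit_only)
print(unit_and_year)
```

With the sample rows, the unit-only join reports a 40% electricity share for 2019, while matching on year as well gives 25%, which is the share the raw 2019 figures imply.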

In [None]:
percentage_energy1 = '''SELECT ng_yearly.Year,(SELECT sum(electricity_yearly.Electricity_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_Electricity, (SELECT sum(ng_yearly.NG_kWh)*100/(sum(electricity_yearly.Electricity_kWh)+sum(ng_yearly.NG_kWh))) 
AS Percentage_NG FROM ng_yearly JOIN electricity_yearly ON 
ng_yearly.Business_Unit_Desc = electricity_yearly.Business_Unit_Desc GROUP BY ng_yearly.Year;'''

percentage_energy1 = pd.read_sql_query(percentage_energy1, engine)
(percentage_energy1)

In [61]:
electricity1 = pd.read_csv("electricity1.csv")
electricity1.head()

FileNotFoundError: [Errno 2] No such file or directory: 'electricity1.csv'

In [62]:
electricity1.to_sql('electricity1', engine )

NameError: name 'electricity1' is not defined

In [63]:
ng1 = pd.read_csv("NG1.csv")
ng1.head()

FileNotFoundError: [Errno 2] No such file or directory: 'NG1.csv'

In [64]:
ng1.to_sql('ng1', engine )

NameError: name 'ng1' is not defined

In [65]:
percentage_energy2019 = '''SELECT ng1.Month, SUM(ng1.NG_kWh), SUM(electricity1.Electricity_kWh), (SELECT SUM(electricity1.Electricity_kWh)*100/
(SUM(electricity1.Electricity_kWh)+ SUM(ng1.NG_kWh))) AS Percentage_Electricity,(SELECT sum(ng1.NG_kWh)*100/
(sum(electricity1.Electricity_kWh)+sum(ng1.NG_kWh))) AS Percentage_NG FROM ng1 join electricity1 on 
ng1.Business_Unit_Desc = electricity1.Business_Unit_Desc where ng1.Year=2019 GROUP BY ng1.Month ORDER BY SUM(ng1.NG_kWh) DESC;'''

percentage_energy2019 = pd.read_sql_query(percentage_energy2019, engine)
(percentage_energy2019)

ProgrammingError: (mysql.connector.errors.ProgrammingError) 1146 (42S02): Table 'jannatul_naeema.ng1' doesn't exist
[SQL: SELECT ng1.Month, SUM(ng1.NG_kWh), SUM(electricity1.Electricity_kWh), (SELECT SUM(electricity1.Electricity_kWh)*100/
(SUM(electricity1.Electricity_kWh)+ SUM(ng1.NG_kWh))) AS Percentage_Electricity,(SELECT sum(ng1.NG_kWh)*100/
(sum(electricity1.Electricity_kWh)+sum(ng1.NG_kWh))) AS Percentage_NG FROM ng1 join electricity1 on 
ng1.Business_Unit_Desc = electricity1.Business_Unit_Desc where ng1.Year=2019 GROUP BY ng1.Month ORDER BY SUM(ng1.NG_kWh) DESC;]
(Background on this error at: https://sqlalche.me/e/14/f405)

In [None]:
percentage_energy2020 = '''SELECT ng1.Month, SUM(ng1.NG_kWh), SUM(electricity1.Electricity_kWh), (SELECT SUM(electricity1.Electricity_kWh)*100/
(SUM(electricity1.Electricity_kWh)+ SUM(ng1.NG_kWh))) AS Percentage_Electricity,(SELECT sum(ng1.NG_kWh)*100/
(sum(electricity1.Electricity_kWh)+sum(ng1.NG_kWh))) AS Percentage_NG FROM ng1 join electricity1 on 
ng1.Business_Unit_Desc = electricity1.Business_Unit_Desc where ng1.Year=2020 GROUP BY ng1.Month ORDER BY SUM(ng1.NG_kWh) DESC;'''

percentage_energy2020 = pd.read_sql_query(percentage_energy2020, engine)
(percentage_energy2020)

In [66]:
percentage_energy2021 = '''SELECT ng1.Month, SUM(ng1.NG_kWh), SUM(electricity1.Electricity_kWh), (SELECT SUM(electricity1.Electricity_kWh)*100/
(SUM(electricity1.Electricity_kWh)+ SUM(ng1.NG_kWh))) AS Percentage_Electricity,(SELECT sum(ng1.NG_kWh)*100/
(sum(electricity1.Electricity_kWh)+sum(ng1.NG_kWh))) AS Percentage_NG FROM ng1 join electricity1 on 
ng1.Business_Unit_Desc = electricity1.Business_Unit_Desc where ng1.Year=2021 GROUP BY ng1.Month ORDER BY SUM(ng1.NG_kWh) DESC;'''

percentage_energy2021 = pd.read_sql_query(percentage_energy2021, engine)
(percentage_energy2021)

ProgrammingError: (mysql.connector.errors.ProgrammingError) 1146 (42S02): Table 'jannatul_naeema.ng1' doesn't exist
[SQL: SELECT ng1.Month, SUM(ng1.NG_kWh), SUM(electricity1.Electricity_kWh), (SELECT SUM(electricity1.Electricity_kWh)*100/
(SUM(electricity1.Electricity_kWh)+ SUM(ng1.NG_kWh))) AS Percentage_Electricity,(SELECT sum(ng1.NG_kWh)*100/
(sum(electricity1.Electricity_kWh)+sum(ng1.NG_kWh))) AS Percentage_NG FROM ng1 join electricity1 on 
ng1.Business_Unit_Desc = electricity1.Business_Unit_Desc where ng1.Year=2021 GROUP BY ng1.Month ORDER BY SUM(ng1.NG_kWh) DESC;]
(Background on this error at: https://sqlalche.me/e/14/f405)
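
The three monthly queries differ only in the year, so they can share one parameterized statement. The following is a minimal sketch, again using an in-memory SQLite database as a stand-in for the MySQL engine. It totals each table per month before joining, which is an assumed reading of the intent and differs from the cells above: units that appear in only one table still count, and rows are no longer multiplied by a join on `Business_Unit_Desc`. The function name `monthly_share`, the text-valued `Month` column, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# :year is bound at execution time; each subquery totals one table per month.
MONTHLY_SHARE_SQL = """
SELECT n.Month,
       n.NG_kWh,
       e.Electricity_kWh,
       e.Electricity_kWh * 100.0 / (e.Electricity_kWh + n.NG_kWh) AS Percentage_Electricity,
       n.NG_kWh * 100.0 / (e.Electricity_kWh + n.NG_kWh) AS Percentage_NG
FROM (SELECT Month, SUM(NG_kWh) AS NG_kWh
      FROM ng1 WHERE Year = :year GROUP BY Month) AS n
JOIN (SELECT Month, SUM(Electricity_kWh) AS Electricity_kWh
      FROM electricity1 WHERE Year = :year GROUP BY Month) AS e
  ON n.Month = e.Month
ORDER BY n.NG_kWh DESC;
"""


def monthly_share(conn, year):
    """Return per-month totals and percentage shares for one year."""
    return conn.execute(MONTHLY_SHARE_SQL, {"year": year}).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ng1 (Business_Unit_Desc TEXT, Year INTEGER, Month TEXT, NG_kWh REAL);
CREATE TABLE electricity1 (Business_Unit_Desc TEXT, Year INTEGER, Month TEXT, Electricity_kWh REAL);
-- Invented sample data.
INSERT INTO ng1 VALUES
    ('Unit A', 2019, 'Jan', 300), ('Unit B', 2019, 'Jan', 100),
    ('Unit A', 2019, 'Jul', 50),  ('Unit A', 2020, 'Jan', 999);
INSERT INTO electricity1 VALUES
    ('Unit A', 2019, 'Jan', 100), ('Unit A', 2019, 'Jul', 150),
    ('Unit A', 2020, 'Jan', 1);
""")

for row in monthly_share(conn, 2019):
    print(row)
```

In the notebook, the same idea would go through `pd.read_sql_query`, which accepts a `params` argument; the placeholder syntax depends on the database driver in use.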