IESO Data Pipeline Project for Price Forecasting

This project builds an end-to-end data pipeline that fetches data from various IESO reports, stores it in a Google Cloud Platform (GCP) database, feeds a dashboard, and trains a machine learning model for price forecasting.

Data Extraction

We use Python with libraries such as pandas and BeautifulSoup to scrape data from the report links. The scraped data is held in DataFrames, loaded into Google Cloud Storage buckets, and then transferred to BigQuery tables for efficient processing. The extraction process is automated with a cron job or Google Cloud Scheduler.
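As a rough illustration of the scraping step, the sketch below parses an HTML table into a pandas DataFrame with BeautifulSoup. The sample markup, the column names, and the `table_to_dataframe` helper are all hypothetical placeholders. The real reports have their own layout, and the project's actual parsing code may differ.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical report fragment with made-up values; not actual IESO data.
SAMPLE_HTML = """
<table>
  <tr><th>Hour</th><th>Fuel</th><th>Output_MW</th></tr>
  <tr><td>1</td><td>NUCLEAR</td><td>9000</td></tr>
  <tr><td>1</td><td>HYDRO</td><td>4000</td></tr>
  <tr><td>2</td><td>NUCLEAR</td><td>9100</td></tr>
</table>
"""


def table_to_dataframe(html: str) -> pd.DataFrame:
    """Parse the first <table> in `html` into a DataFrame (illustrative helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found")

    rows = table.find_all("tr")
    # The first row is assumed to hold the column headers.
    header = [cell.get_text(strip=True) for cell in rows[0].find_all("th")]
    records = [
        [cell.get_text(strip=True) for cell in row.find_all("td")]
        for row in rows[1:]
    ]
    frame = pd.DataFrame(records, columns=header)

    # Convert columns whose values are all numeric; leave text columns as-is.
    for col in frame.columns:
        converted = pd.to_numeric(frame[col], errors="coerce")
        if converted.notna().all():
            frame[col] = converted
    return frame


df = table_to_dataframe(SAMPLE_HTML)
print(df)
```

In a pipeline like the one described, a DataFrame of this shape would then be written to a Cloud Storage bucket and loaded into BigQuery. Those steps need project credentials and are left out of this sketch.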

Forecasting Using Machine Learning

We build and run several machine learning models in GCP’s BigQuery to predict future fuel/energy prices. We tested univariate and multivariate LSTM and GRU models for the time series formulation, and an ANN regressor and random forest regression for the regression formulation. The ANN regression model gave the best results for our use case.

Data Visualization Report

After modeling, we generate a data visualization report in Google Data Studio for further insights. The report includes a pie chart of the share of generation by fuel type, a stacked column chart of generation by month, and a time series of generation for each quarter of the year.
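The aggregations behind charts like these can be sketched in pandas. The column names (`timestamp`, `fuel`, `output_mwh`) and the values below are synthetic placeholders chosen for illustration. They are not taken from the project's BigQuery tables.

```python
import pandas as pd

# Synthetic placeholder rows for illustration only; not IESO data.
generation = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2023-01-01 01:00",
                "2023-01-01 01:00",
                "2023-02-01 01:00",
                "2023-02-01 01:00",
            ]
        ),
        "fuel": ["NUCLEAR", "HYDRO", "NUCLEAR", "HYDRO"],
        "output_mwh": [60.0, 40.0, 30.0, 70.0],
    }
)
generation["month"] = generation["timestamp"].dt.month
generation["quarter"] = generation["timestamp"].dt.quarter

# Pie chart input: each fuel type's share of total generation.
share_by_fuel = (
    generation.groupby("fuel")["output_mwh"].sum() / generation["output_mwh"].sum()
)

# Stacked column chart input: one row per month, one column per fuel type.
monthly_by_fuel = (
    generation.groupby(["month", "fuel"])["output_mwh"].sum().unstack("fuel")
)

# Quarterly series input: total generation per quarter.
quarterly_total = generation.groupby("quarter")["output_mwh"].sum()

print(share_by_fuel)
print(monthly_by_fuel)
print(quarterly_total)
```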

Results and Insights

Mean Absolute Error (MAE): The ANN regression model achieved an MAE in the range of 7.51 to 12.
Look Back: The LSTM/GRU models used a look back of 3, meaning each training input contains the previous 3 hours of data.
n_steps_in and n_steps_out: The LSTM/GRU models used n_steps_in of 3 and n_steps_out of 1, meaning they looked at 3 past hours and predicted 1 future hour.
nb_epochs: The LSTM/GRU models were trained for 10 epochs, that is, 10 full passes over the training dataset.
Pie Chart: Nuclear accounted for 60% of total generation.
Stacked Column Chart: Generation was highest in January and August.
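
To make the look-back and MAE entries above concrete, here is a minimal sketch of how an hourly series can be cut into windows with `n_steps_in = 3` and `n_steps_out = 1`, and how MAE is computed. The `split_sequence` helper, the sample values, and the persistence baseline are illustrative assumptions. They are not the project's models or data.

```python
import numpy as np


def split_sequence(series, n_steps_in=3, n_steps_out=1):
    """Cut a 1-D series into (input window, target window) pairs."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    last_start = len(series) - n_steps_in - n_steps_out
    for start in range(last_start + 1):
        end_in = start + n_steps_in
        X.append(series[start:end_in])                 # past n_steps_in hours
        y.append(series[end_in:end_in + n_steps_out])  # next n_steps_out hours
    return np.array(X), np.array(y)


def mean_absolute_error(y_true, y_pred):
    """MAE: the average of |actual - predicted|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Made-up hourly prices for illustration only.
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
X, y = split_sequence(prices, n_steps_in=3, n_steps_out=1)

# Persistence baseline (an assumption, not the project's model):
# predict the next hour as the last hour seen in each window.
baseline_pred = X[:, -1:]
mae = mean_absolute_error(y, baseline_pred)

print(X.shape, y.shape)  # (3, 3) (3, 1)
print(mae)               # 2.0
```

For a recurrent model, `X` is typically reshaped to `(samples, n_steps_in, n_features)` before training. With a single univariate series, `n_features` is 1.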

Manual Configuration

Instructions for accessing and configuring Google BigQuery, Google Cloud Storage, Google Cloud Functions, and Google Cloud Scheduler are provided in the following sections.

bhavyaverma1/IESO-Price-Forecasting
