From 9:30 AM to 3:30 PM, find a tabular dataset and make a predictive model with it.
I selected North Carolina because it appears to be the most average state in the United States, according to this 2016 article from Business Insider.
Since completing this Challenge, I have only made the following minor changes:
- Added a PDF version of my unmodified Presentation-Deck (so it can be viewed without download)
- Added the guidelines for this challenge (see link above)
- Added this section to this ReadMe
- Made some minor spelling corrections to this ReadMe
- Changed the Title of this document to be more descriptive
- Problem Statement
- Executive Summary
- File Directory
- Data
- Data Dictionary
- Conclusions and Recommendations
- Areas for Further Research/Study
- Sources
- Visualizations
Using data from the EIA (US Energy Information Administration), are there any patterns in electricity usage in North Carolina from 2001 through 2011, and can we use this data to build a predictive Time-Series Model?
Looking at electricity usage, can we forecast future electricity usage in North Carolina? I selected North Carolina because I was looking for a State that would be a good representation of all of the States, and in my research North Carolina seemed to be the most average.
03-Project
|
|__ code
| |__ 00_table_of_contents.ipynb
| |__ 01_eda_and_cleaning.ipynb
| |__ 02_null_model.ipynb
| |__ 03_time_series_models.ipynb
| |__ 04_auto_regression.ipynb
| |__ 05_analysis.ipynb
| |__ 06_conclusion.ipynb
|
|__ data
| |__ elec_mo_2001_2011_consumption.csv
|
|__ images
| |__ electric_time_series_null_model.png
| |__ electric_time_series_seasonal_forecast.png
| |__ electricity_auto_regression_forecast.png
| |__ electricity_test_forecast.png
| |__ electricity_train_test_forecast.png
|
|__ presentation
| |__ ChrisCaldarella_project4_presentation.pptx
| |__ ChrisCaldarella_project4_presentation.pdf
|
|__ LICENSE
|__ README.md
I found a reference to electricity usage data in the Data Is Plural structured archive, which led me to the US Energy Information Administration webpage, where I found a data set of monthly energy consumption covering several years.
Feature | Python Type | Data Type | Description |
---|---|---|---|
date | DateTime | Continuous | Date; First of the Month |
STATE | Object | Nominal | State being observed |
CONSUMPTION | float64 | Continuous | Amount of total energy consumed that month |
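
As a rough illustration of the data dictionary above, here is a minimal sketch of loading a file with these three columns into pandas and indexing it by month. The sample rows, the `NC` state code, and the `YYYY-MM-DD` date format are assumptions made for the example; they are not taken from `elec_mo_2001_2011_consumption.csv`.

```python
import io

import pandas as pd

# Placeholder rows for illustration only. The real file lives at
# data/elec_mo_2001_2011_consumption.csv; its exact layout is assumed here.
sample_csv = io.StringIO(
    "date,STATE,CONSUMPTION\n"
    "2001-01-01,NC,100.0\n"
    "2001-02-01,NC,90.0\n"
)

# Parse the date column as datetimes so it can serve as a time-series index.
df = pd.read_csv(sample_csv, parse_dates=["date"])
df = df.set_index("date").sort_index()

print(df.dtypes)
```

Using the date as the index is a common choice for time-series work, since most forecasting tools expect an ordered datetime index.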
Holt-Winters performed best of all the models, with the lowest RMSE and MAE, and its forecast also appears to fit best in the plots. SARIMAX had the highest error of the fitted models, while the Seasonal and Auto-Regression models landed in between. Every model beat the Null Model.
Model\Score | RMSE | MAE |
---|---|---|
Holt-Winters | 3631847 | 2920368 |
Seasonal | 4294624 | 3603229 |
Auto-Regression | 4414239 | 3663597 |
SARIMAX | 4713245 | 3813522 |
Null Model | 5359973 | 4164017 |
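
For readers unfamiliar with the two scores in the table above, here is a minimal sketch of how RMSE and MAE are computed and compared against a baseline. The numbers are made-up placeholders, and the null model shown (predicting the training mean for every test month) is an assumed, commonly used baseline; the notebooks may define theirs differently.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: average size of a miss, in the data's own units."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Placeholder values for illustration only (not taken from the EIA data).
train = [10.0, 12.0, 14.0, 12.0]
test = [11.0, 15.0]

# Assumed null model: predict the training mean for every test month.
baseline = sum(train) / len(train)
null_forecast = [baseline] * len(test)

print("RMSE:", rmse(test, null_forecast))
print("MAE:", mae(test, null_forecast))
```

Both scores are in the same units as CONSUMPTION, so lower is better, and a model is only useful if it scores below the baseline.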
This analysis could be extended to other states in the United States.