# Notebook for extracting data for CEP-related analysis

This notebook retrieves and transforms three data sources for the analysis:

1. EIA-861 files detailing MWh, revenue, and annual customer counts for CEPs
2. Customer migration statistics from the Public Utilities Commission
3. Historical standard offer rates by service territory (compiled manually)

In [3]:
import os
import helper_functions as hf

## 1. Extract and store EIA files

### Set locations

In [6]:
data_dir = os.path.join(os.getcwd(), 'raw_data')
process_dir = os.path.join(os.getcwd(), 'prepared_data')
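
The helper functions used later take these folders as arguments, and it is not shown whether they create them. A minimal sketch, assuming the folders may not exist yet on a fresh checkout, is to create them up front:

```python
import os

# Same locations as in the cell above.
data_dir = os.path.join(os.getcwd(), 'raw_data')
process_dir = os.path.join(os.getcwd(), 'prepared_data')

# Create both folders if they are missing; exist_ok avoids an error on reruns.
for path in (data_dir, process_dir):
    os.makedirs(path, exist_ok=True)
```

If `helper_functions` already handles this, the loop is harmless and can be dropped.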

#### Download and extract files from the ZIPs on the [EIA-861 page](https://www.eia.gov/electricity/data/eia861/)

In [147]:
hf.download_eia_861(2012, 2022, data_dir)

Extracted Sales_Ult_Cust_2012.xlsx
Extracted Sales_Ult_Cust_2013.xls
Extracted Sales_Ult_Cust_2014.xls
Extracted Sales_Ult_Cust_2015.xlsx
Extracted Sales_Ult_Cust_2016.xlsx
Extracted Sales_Ult_Cust_2017.xlsx
Extracted Sales_Ult_Cust_2018.xlsx
Extracted Sales_Ult_Cust_2019.xlsx
Extracted Sales_Ult_Cust_2020.xlsx
Extracted Sales_Ult_Cust_2021.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2017.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2021.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2020.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2016.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2013.xls
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2012.xlsx
Reading in /Users/Darren/git-clones/data-projec
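
The body of `hf.download_eia_861` is not shown in this notebook. As a rough sketch of the extraction step only, the hypothetical function `extract_sales_file` below pulls every member whose name starts with `Sales_Ult_Cust_` out of a ZIP that is already held in memory. The download itself is left out because the URL pattern of the yearly ZIPs is not given here, and the `prefix` parameter is an assumption based on the file names in the log above.

```python
import io
import os
import zipfile


def extract_sales_file(zip_bytes, dest_dir, prefix="Sales_Ult_Cust_"):
    """Extract members whose base name starts with `prefix`; return their paths."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            name = os.path.basename(member)
            if not name.startswith(prefix):
                continue
            target = os.path.join(dest_dir, name)
            # Write by base name so any folder nesting inside the ZIP is dropped.
            with archive.open(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            print(f"Extracted {name}")
            written.append(target)
    return written
```

Filtering on the prefix keeps the other workbooks in each ZIP out of `raw_data`, which matches the single file per year reported in the log.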

#### Transform and merge the yearly files, write the result to CSV, and keep it as a dataframe for inspection

In [84]:
eia_df = hf.process_and_merge_861(data_dir=data_dir, process_dir=process_dir)

Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2017.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2021.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2020.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2016.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2013.xls
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2012.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2014.xls
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2015.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_2019.xlsx
Reading in /Users/Darren/git-clones/data-projects/CEPs/etl_scripts/raw_data/Sales_Ult_Cust_20
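
The log shows the files being read in no particular year order, with a mix of `.xls` and `.xlsx` extensions. Whether `hf.process_and_merge_861` relies on the file name for the year is not shown. If a year label or a deterministic order is needed, one possible approach is to parse the year from the name, as in this sketch (`files_by_year` and `YEAR_PATTERN` are hypothetical names):

```python
import os
import re

# Matches names such as Sales_Ult_Cust_2013.xls or Sales_Ult_Cust_2017.xlsx.
YEAR_PATTERN = re.compile(r"Sales_Ult_Cust_(\d{4})\.xlsx?$")


def files_by_year(filenames):
    """Return (year, filename) pairs sorted by year, skipping non-matching names."""
    pairs = []
    for name in filenames:
        match = YEAR_PATTERN.search(os.path.basename(name))
        if match:
            pairs.append((int(match.group(1)), name))
    return sorted(pairs)


# File names taken from the log output above.
example = [
    "Sales_Ult_Cust_2017.xlsx",
    "Sales_Ult_Cust_2013.xls",
    "Sales_Ult_Cust_2012.xlsx",
]
for year, name in files_by_year(example):
    print(year, name)
```

Sorting before merging makes the row order of the combined CSV reproducible across machines, since directory listing order is not guaranteed.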

## 2. Extract and process Maine customer migration statistics

Update the file URL in the next cell based on [the migration statistics page](https://www.maine.gov/mpuc/regulated-utilities/electricity/choosing-supplier/migration-statistics) from the Maine PUC.

In [109]:
migration_xls = 'https://www.maine.gov/mpuc/sites/maine.gov.mpuc/files/inline-files/Standard%20Offer%20Migration%20Stats%20through%2008.15.23.xls'
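
Because the URL has to be updated by hand, it can help to check which reporting date the current file covers. The sketch below is only an illustration: it assumes the file name keeps the `through MM.DD.YY` pattern seen in the URL above, and `migration_file_date` is a hypothetical helper, not part of `helper_functions`.

```python
import re
from datetime import date
from urllib.parse import unquote


def migration_file_date(url):
    """Return the 'through' date encoded in the file name, or None if absent."""
    # Decode %20 and similar escapes in the last path segment.
    filename = unquote(url.rsplit("/", 1)[-1])
    match = re.search(r"through (\d{2})\.(\d{2})\.(\d{2})", filename)
    if match is None:
        return None
    month, day, year = (int(part) for part in match.groups())
    # Assumption: two-digit years belong to the 2000s.
    return date(2000 + year, month, day)


migration_xls = 'https://www.maine.gov/mpuc/sites/maine.gov.mpuc/files/inline-files/Standard%20Offer%20Migration%20Stats%20through%2008.15.23.xls'
print(migration_file_date(migration_xls))
```

A `None` result signals that the PUC changed its naming scheme and the pattern needs revisiting.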

In [150]:
migration_df = hf.process_customer_migration_files(migration_xls, process_dir)
migration_df.head()

Captured and wrote file of shape (822, 4)
