# A-01-01C: Understanding Datasets - SMUD Residential Rate Schedules

## 1. Import Libraries required

### 1.1 Basic Libraries

In [1]:
import os
import sys
import importlib
import pathlib as pl
import vaex

### 1.2 Additional Libraries used only in this script

In [2]:
import re

### 1.3 Load User-Written Libraries

In [3]:
# # 1. To generate objects that will be used to load user-written libraries
# ## Note: The numbers used below depend on the directory in which this script is located.
cwd = pl.Path.cwd()
path_project = cwd.parents[3]
path_hd = cwd.parents[2]
sys.path.append(str(path_hd))
os.chdir(path_project)
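The indices passed to `parents` in the cell above count directory levels upward from the working directory. A minimal sketch of that behavior, assuming a hypothetical directory layout (the real project layout is not shown in this notebook, and `example_cwd` is an illustrative name):

```python
from pathlib import PurePosixPath

# Hypothetical working directory; the actual project layout is an assumption.
example_cwd = PurePosixPath('/home/user/project/code/H-scripts/A/01')

# parents[0] is the immediate parent; each higher index moves one level up.
print(example_cwd.parents[0])
print(example_cwd.parents[2])
print(example_cwd.parents[3])
```

If the script moves to a deeper or shallower directory, the indices must change by the same number of levels.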

In [4]:
# # 2. To load user-written libraries
HD = importlib.import_module('H-Energy-Demand-Analysis')
FNC = importlib.import_module('F-Energy-Demand-Analysis_Common-Functions')
DD = importlib.import_module('D-Energy-Demand-Analysis_Data-Dictionary')
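The module names above contain hyphens, which are not valid in a plain `import` statement, so `importlib.import_module` is called with the name as a string. A minimal, self-contained sketch of the same pattern, using a hypothetical module file `X-Demo-Module.py` created in a temporary directory (not one of the project's libraries):

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module file with a hyphenated name, for illustration only.
    Path(tmp, 'X-Demo-Module.py').write_text("GREETING = 'hello'\n")
    # Make the directory importable, as sys.path.append(str(path_hd)) does above.
    sys.path.append(tmp)
    try:
        demo = importlib.import_module('X-Demo-Module')
    finally:
        sys.path.remove(tmp)

print(demo.GREETING)
```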

## 2. Set Path(s) and Parameter(s)

### 2.1 Set Path(s)

In [5]:
# # 1. Path(s) from which data file(s) will be loaded
FILE_TO_LOAD_BILLING = 'SMUD_Billing-Data.parquet'
PATH_TO_LOAD_BILLING = os.path.join(HD.PATH_DATA_INTERMEDIATE_SMUD_BILLING, FILE_TO_LOAD_BILLING)

FILE_TO_LOAD_RRS = 'SMUD_Residential-Rate-Schedules.parquet'
PATH_TO_LOAD_RRS = os.path.join(HD.PATH_DATA_INTERMEDIATE_SMUD_RRS, FILE_TO_LOAD_RRS)

# # 2. Path(s) at which data file(s) will be saved
# (NOT Applicable)

### 2.2 Set Parameter(s)

In [6]:
# # 0. Basic Parameters
# # 0.1. Script Number
SCRIPT_NO = 'A-01-01C'

## 3. Details of Work

In [7]:
FNC.printDt('Work begins: ' + SCRIPT_NO)

2020-09-04 17:24:06 - Work begins: A-01-01C


### 3.1. Load Dataset(s) required

In [8]:
billing_vx = vaex.open(PATH_TO_LOAD_BILLING)
rrs_vx = vaex.open(PATH_TO_LOAD_RRS)

### 3.2. Compare Lists of Rate Codes

In [9]:
n_codes_billing = len(billing_vx['rate_code'].unique())
n_codes_rrs = len(rrs_vx['rate_code'].unique())
print(n_codes_billing == n_codes_rrs)
print(n_codes_billing > n_codes_rrs)

False
True


The first comparison prints `False` and the second prints `True`, so `billing_vx` contains more distinct rate codes than `rrs_vx`.
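Comparing counts shows only that the two lists differ in size, not which codes differ. A set difference makes the mismatch explicit. A minimal sketch on toy lists, where the assignment of codes to each source is an assumption made for illustration and does not reflect the actual datasets:

```python
# Toy rate-code lists; which codes belong to which source is an assumption.
codes_billing = ['RSG', 'RSGH', 'RSE', 'RSG_E', 'RSE_L2']
codes_rrs = ['RSG', 'RSGH', 'RSE']

# Codes present in one source but absent from the other.
only_in_billing = sorted(set(codes_billing) - set(codes_rrs))
only_in_rrs = sorted(set(codes_rrs) - set(codes_billing))

print(only_in_billing)
print(only_in_rrs)
```

With the real data, the two lists would presumably come from `billing_vx['rate_code'].unique()` and `rrs_vx['rate_code'].unique()`.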

In [10]:
billing_vx['rate_code'].value_counts()

RSG        27274141
RSGH        7473512
RSE         6441273
RSG_E       3745647
RSGH_E      1750430
             ...   
RWC_L3           13
RWE_L2            8
RWG_E1            4
RSC_EL2           4
RSE_L2            3
Length: 157, dtype: int64

The billing data contain 157 distinct rate codes, many of which are NOT defined in SMUD's `Rate Code Definitions`.

In [11]:
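# Rate codes that appear in the Residential Rate Schedules
list_codes_inRRS = list(rrs_vx['rate_code'].unique())
# Flag billing observations whose rate code is in that list
select = billing_vx.rate_code.isin(list_codes_inRRS)
# Count the flagged observations and report their share of all observations
N_select = billing_vx[select].shape[0]
print('{0:,.1f}% of observations of Billing Data have rate codes that are also in Residential Rate Schedules.'.format(N_select / billing_vx.shape[0] * 100))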

80.4% of observations of Billing Data have rate codes that are also in Residential Rate Schedules.
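The share reported above is weighted by observations, not by distinct codes, so a few heavily used codes can dominate it. A minimal sketch of the distinction in plain Python, using hypothetical counts and a hypothetical set of matched codes (not the actual billing data):

```python
from collections import Counter

# Hypothetical observation counts per rate code (not the real billing data).
counts = Counter({'RSG': 700, 'RSGH': 200, 'RSG_E': 90, 'RSE_L2': 10})
# Hypothetical set of codes defined in the rate schedules.
codes_in_rrs = {'RSG', 'RSGH'}

n_total = sum(counts.values())
n_matched = sum(n for code, n in counts.items() if code in codes_in_rrs)

# Share of observations versus share of distinct codes that are matched.
share_obs = 100 * n_matched / n_total
share_codes = 100 * len(codes_in_rrs & set(counts)) / len(counts)

print('{0:.1f}% of observations, {1:.1f}% of distinct codes'.format(share_obs, share_codes))
```

In this toy case most observations are matched even though only half of the distinct codes are.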


In [12]:
FNC.printDt('Work ends: ' + SCRIPT_NO)

2020-09-04 17:24:10 - Work ends: A-01-01C
