## Creating timeseries

This notebook walks through mapping the NREL time series data provided by Xinshuo onto the Texas7k dataset provided by Texas A&M University. The focus here is **wind** and **load**. Load needs little work at this stage: its time series is already provided, and the allocation from the zonal level to the bus level happens in our modified version of rts_gmlc.py.

In [141]:
import pandas as pd
import numpy as np
import datetime

# import the files we will need (excluding the timeseries file given to us by Xinshuo)
bus = pd.read_csv("./finished/bus.csv")
branch = pd.read_csv("./finished/branch.csv")
gen = pd.read_csv("./finished/gen.csv")
wind_mappings = pd.read_csv("./NREL Stuff/Texas7k_NREL_wind_map.csv")

In [142]:
# start with wind forecast for now.... can turn this into a function that applies to all the asset types later
wind_forecast_df = pd.read_csv("./NREL Stuff/wind_day_ahead_forecast_site_2018.csv")
wind_forecast_df.shape

# we will get some days around the day that we are interested in to populate for this test run
days_before = 0
days_after = 1
day_of_interest = 191 # july 10
wind_forecast_subset = wind_forecast_df.iloc[np.maximum(0,(day_of_interest-days_before - 1))*24:np.minimum(365,(day_of_interest+days_after))*24,:]
# getting the year, month, day, and hour in order to mimic the RTS formatting
temp = wind_forecast_subset.loc[:,'Forecast_time']
dates = pd.to_datetime(temp).dt.tz_localize("UTC").dt.tz_convert('America/Denver')
# wind_forecast_subset.loc[:,'Forecast_time'] = pd.to_datetime(temp.loc[:,:]).

year = dates.dt.year
month = dates.dt.month
day = dates.dt.day
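# one block of periods 1..24 per day in the window (days before + day of interest + days after)
hours = np.tile(np.arange(1, 25), days_before + days_after + 1)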
output_df_DA = pd.DataFrame({'Year': year, 'Month':month, 'Day':day, 'Period':hours})
# this has the first 4 columns set up - what remains is to populate with the appropriate time series which correspond
# to the correct assets
output_df_DA

Unnamed: 0,Year,Month,Day,Period
4560,2018,7,10,1
4561,2018,7,10,2
4562,2018,7,10,3
4563,2018,7,10,4
4564,2018,7,10,5
4565,2018,7,10,6
4566,2018,7,10,7
4567,2018,7,10,8
4568,2018,7,10,9
4569,2018,7,10,10
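
The row window and the Period column built in the cell above can be sketched on their own as follows. This is a minimal sketch, assuming the hourly table starts at hour 0 of day 1 and has 24 rows per day. The `day_window` helper and the synthetic `hourly` timestamps are hypothetical stand-ins, not part of the NREL file or the notebook.

```python
import numpy as np
import pandas as pd

def day_window(day_of_interest, days_before, days_after, days_in_year=365):
    """Return (start_row, stop_row) for an hourly table with 24 rows per day.

    Days are 1-indexed; the window is clamped to the year boundaries.
    """
    first_day = max(1, day_of_interest - days_before)
    last_day = min(days_in_year, day_of_interest + days_after)
    return (first_day - 1) * 24, last_day * 24

# Day 191 of a non-leap year (July 10) plus one day after.
start, stop = day_window(191, days_before=0, days_after=1)

# Deriving the number of days from the slice itself keeps the Period
# column the same length as the data, even when the window is clamped.
n_days = (stop - start) // 24
periods = np.tile(np.arange(1, 25), n_days)

# Synthetic hourly timestamps for 2018 (illustrative only).
hourly = pd.Series(
    pd.Timestamp("2018-01-01") + pd.to_timedelta(np.arange(365 * 24), unit="h")
)
window = hourly.iloc[start:stop]
```

Computing `n_days` from the slice bounds rather than from `days_before` and `days_after` is a design choice: the two stay consistent when the window runs into the start or end of the year.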


In [168]:
# need to generate the forecasts for the appropriate wind assets 
wnd_nm = 'WND (Wind)'
#get the wind_assets of gen
wind_gens = gen[gen['Fuel'] == wnd_nm]
wind_mappings.head()

Unnamed: 0,Texas7k BusNum,Texas7k GenID,Texas7k SubNum,Texas7k Max MW,Texas7k Min MW,EIA-860 Plant Code,EIA-860 Plant Name,EIA-860 Operating Year,EIA-860 Nameplate Capacity (MW),NREL Wind Site,Mapping Status,Distribution Factor,NREL Capacity Proportion
0,190193,1,3131,253.0,34.1,60902,Dermott Wind,2017,253.0,Amazon Wind Farm Texas,1,1.0,330.0
1,120493,1,1261,99.8,20.75,58000,Anacacho Wind Farm LLC,2012,99.8,Anacacho Wind Farm,1,1.0,129.0
2,160281,1,2424,188.0,46.66,57927,Baffin Wind,2014,188.0,Baffin,1,1.0,264.0
3,150496,1,2197,120.0,45.51,57156,Barton Chapel Wind Farm,2009,120.0,Barton Chapel Wind Farm,1,1.0,157.0
4,220216,1,3727,196.7,65.47,59972,Bearkat,2018,196.7,Bearkat I,1,1.0,257.0
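
The mapping columns are combined into a single scaling factor per generator: Distribution Factor divided by NREL Capacity Proportion, times Texas7k Max MW. Below is a small worked sketch using the first row of the table above (Dermott Wind, mapped to Amazon Wind Farm Texas). It assumes NREL Capacity Proportion reads as the NREL site capacity in MW, which is how the values in the table appear. The `scale_factor` helper and the sample forecast values are hypothetical.

```python
def scale_factor(dist_factor, nrel_capacity, texas7k_max_mw):
    """Factor that rescales an NREL site forecast to one Texas7k generator."""
    return dist_factor / nrel_capacity * texas7k_max_mw

# Values taken from row 0 of wind_mappings.head().
factor = scale_factor(dist_factor=1.0, nrel_capacity=330.0, texas7k_max_mw=253.0)

# Made-up NREL site forecast values in MW (zero, half, and full site capacity).
sample_forecast_mw = [0.0, 165.0, 330.0]
scaled_mw = [mw * factor for mw in sample_forecast_mw]
```

Under this reading, a site forecast at full NREL capacity scales to roughly the Texas7k maximum of the generator, and the Distribution Factor splits one site's output across several generators when they share a plant code.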


In [172]:
# creates a temporary dataframe for the output; .copy() keeps output_df_DA itself unchanged
temp_output_df_DA = output_df_DA.copy()

# creates a dictionary for the number of times the plant code is used
# this is necessary because we need to make sure we're pulling the correct distribution factor, nrel capacity, and
# texas7k max capacity when scaling in situations where multiple Texas7k generators have the same plant code and
# therefore map from the same NREL wind farm
plant_codes_num_used = {code: 0 for code in np.unique(wind_gens['EIA-860 Plant Code'])}

# iterate across the rows of wind_gens
for i in range(wind_gens.shape[0]):
    # gen uid and plant code for this row
    gen_uid = wind_gens.iloc[i]['GEN UID']
    gen_code = wind_gens.iloc[i]['EIA-860 Plant Code']
    # rows of wind_mappings that agree with the plant code; there can be more than one
    matches = wind_mappings[wind_mappings['EIA-860 Plant Code'] == gen_code]
    # if more than one 7k generator maps to the same NREL site, we have to be careful:
    # the k-th generator seen with this plant code uses the k-th matching row (0-indexed)
    k = plant_codes_num_used[gen_code] if len(matches) > 1 else 0
    mapping_row = matches.iloc[k]
    nrel_name = mapping_row['NREL Wind Site']
    # from that row, pull the 7k max, NREL capacity, and distribution factor
    texas7kmax = mapping_row['Texas7k Max MW']
    nrel_capacity = mapping_row['NREL Capacity Proportion']
    dist_factor = mapping_row['Distribution Factor']
    # multiply the forecast by this factor to scale it for texas 7k
    forecast_multiplier = float(dist_factor / nrel_capacity * texas7kmax)
    # assign the scaled forecast to this generator's column
    temp_output_df_DA[gen_uid] = wind_forecast_subset[nrel_name] * forecast_multiplier
    plant_codes_num_used[gen_code] += 1
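
The pairing rule used in the loop above (the k-th generator with a given plant code takes the k-th mapping row for that code) can also be expressed without a counter dictionary. The following is a sketch of an alternative, not the notebook's method, using `groupby(...).cumcount()` and a merge. The `gens` and `mappings` frames are toy stand-ins for `wind_gens` and `wind_mappings`, the values are illustrative only, and the `occurrence` column name is hypothetical.

```python
import pandas as pd

# Toy stand-ins for wind_gens and wind_mappings (illustrative values only).
gens = pd.DataFrame({
    "GEN UID": ["g1", "g2", "g3"],
    "EIA-860 Plant Code": [100, 200, 100],
})
mappings = pd.DataFrame({
    "EIA-860 Plant Code": [100, 100, 200],
    "NREL Wind Site": ["Site A", "Site A", "Site B"],
    "Distribution Factor": [0.6, 0.4, 1.0],
})

key = "EIA-860 Plant Code"

# Number each row within its plant code: 0 for the first occurrence, 1 for the next, ...
gens = gens.assign(occurrence=gens.groupby(key).cumcount())
mappings = mappings.assign(occurrence=mappings.groupby(key).cumcount())

# Pair the k-th generator of a plant code with the k-th mapping row of that code.
# validate="one_to_one" raises if either side has duplicate (code, occurrence) keys.
paired = gens.merge(
    mappings, on=[key, "occurrence"], how="left", validate="one_to_one"
)
```

With `how="left"`, a generator that has no matching mapping row is kept with missing values instead of raising, so such cases are easy to spot with `paired["NREL Wind Site"].isna()`.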


54    Horse Hollow
55    Horse Hollow
56    Horse Hollow
Name: NREL Wind Site, dtype: object
Int64Index([54, 55, 56], dtype='int64')
54    Horse Hollow
55    Horse Hollow
56    Horse Hollow
Name: NREL Wind Site, dtype: object
Int64Index([54, 55, 56], dtype='int64')
24    Capricorn Ridge
25    Capricorn Ridge
26    Capricorn Ridge
27    Capricorn Ridge
Name: NREL Wind Site, dtype: object
Int64Index([24, 25, 26, 27], dtype='int64')
54    Horse Hollow
55    Horse Hollow
56    Horse Hollow
Name: NREL Wind Site, dtype: object
Int64Index([54, 55, 56], dtype='int64')
133    Stephens Ranch
134    Stephens Ranch
Name: NREL Wind Site, dtype: object
Int64Index([133, 134], dtype='int64')
24    Capricorn Ridge
25    Capricorn Ridge
26    Capricorn Ridge
27    Capricorn Ridge
Name: NREL Wind Site, dtype: object
Int64Index([24, 25, 26, 27], dtype='int64')
133    Stephens Ranch
134    Stephens Ranch
Name: NREL Wind Site, dtype: object
Int64Index([133, 134], dtype='int64')
24    Capricorn Ridge
25    C