Jenna Ruzekowicz (jenna.ruzekowicz@nrel.gov) and Caleb Phillips (caleb.phillips@nrel.gov)

This notebook reads in two data sets covering the same time period so that they can be compared:
1) WTK-LED data
2) Either:
    a) power output and/or wind speed data from turbine(s), or
    b) wind speed measurements from met tower(s)

Data sets are matched on location and time stamp.
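
As a minimal sketch of the time stamp matching, assuming both data sets already refer to the same location and each has a `timestamp` column: the `model_ws` column name and its values are made up for illustration, and this is not necessarily how the rest of the notebook performs the match.

```python
import pandas as pd

# A few 10-minute turbine wind speed measurements (illustrative rows)
measured = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-01-01 00:00", "2018-01-01 00:10", "2018-01-01 00:20"]),
    "measured_ws": [5.92, 6.01, 5.96],
})

# HYPOTHETICAL 5-minute model wind speeds at the same location (made-up values)
model = pd.DataFrame({
    "timestamp": pd.date_range("2018-01-01 00:00", periods=5, freq="5min"),
    "model_ws": [6.1, 6.0, 5.9, 6.2, 6.3],
})

# An inner join keeps only the time stamps present in both data sets
matched = pd.merge(measured, model, on="timestamp", how="inner")
print(matched)
```

Because the measurements are 10-minute values and the model series is 5-minute, only every second model row survives the join. Resampling or averaging the finer series first would be an alternative design choice.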

The combined, labeled dataframe is exported to a CSV file in the "01Data" folder. The file name follows the convention source_lat_lon_startdate_enddate.csv,
where source identifies the data provider, e.g. "bergey", "oneenergy"...
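
The naming convention could be built with a small helper along these lines. `build_output_filename` is a hypothetical name, the `YYYYMMDD` date format is an assumption (the convention above does not specify one), and the coordinates are placeholders rather than the real site location.

```python
import pandas as pd

def build_output_filename(source, lat, lon, start, end):
    """Hypothetical helper: returns source_lat_lon_startdate_enddate.csv."""
    # ASSUMPTION: dates are written as YYYYMMDD
    start_str = pd.Timestamp(start).strftime("%Y%m%d")
    end_str = pd.Timestamp(end).strftime("%Y%m%d")
    return f"{source}_{lat}_{lon}_{start_str}_{end_str}.csv"

# Placeholder coordinates, for illustration only
filename = build_output_filename("oneenergy", 40.5, -83.1, "2018-01-01", "2018-12-31")
print(filename)
```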

Notes:
You may need to install rex if it is not already installed:
conda install nrel-rex --channel=nrel

More about rex: https://github.com/NREL/rex
2018 5-min monthly h5:
/campaign/tap/CONUS/wtk/5min/2018/{month}/conus_2018-{month}.h5
 
2018 5-min yearly h5 slices:
/shared-projects/wtk-led/CONUS/wtk/2018/yearly_h5/conus_2018_{height}m.h5
 
2019 60-min yearly h5:
/campaign/tap/CONUS/wtk/60min/2019/conus_2019.h5
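
The `{month}` and `{height}` placeholders in the paths above can be filled in with `str.format`. This is only a sketch: the zero-padded month format and the example height are assumptions, so check them against the files actually present on disk.

```python
# Path templates copied from the list above
MONTHLY_5MIN_2018 = "/campaign/tap/CONUS/wtk/5min/2018/{month}/conus_2018-{month}.h5"
YEARLY_5MIN_2018 = "/shared-projects/wtk-led/CONUS/wtk/2018/yearly_h5/conus_2018_{height}m.h5"

# ASSUMPTION: months are written as zero-padded two-digit strings ("01".."12")
monthly_paths = [MONTHLY_5MIN_2018.format(month=f"{m:02d}") for m in range(1, 13)]

# ASSUMPTION: 100 is only an example height; available heights depend on the files on disk
yearly_path = YEARLY_5MIN_2018.format(height=100)

print(monthly_paths[0])
print(yearly_path)
```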

In [1]:
import numpy as np
import pandas as pd
import geopandas as gpd
from rex.resource_extraction import MultiYearWindX
from dw_tap.data_fetching import get_data_wtk_led_on_eagle

Step 1) Read in either power output/wind speed data for wind turbine(s) or wind speed data from met tower(s)

In [18]:
# Read in data from the W1 turbine at the Marion, OH location (oneenergy turbine), 2018
# There is one Excel file per month, named with the last day of that month
month_ends = ["20180131", "20180228", "20180331", "20180430", "20180531", "20180630",
              "20180731", "20180831", "20180930", "20181031", "20181130", "20181231"]
monthly_dfs = [
    pd.read_excel(f"../../data/marion/turbine.oneenergy.00.{day}.000000.marion.w1.xlsx", header=1, usecols="B, C, M")
    for day in month_ends
]
# Stack the monthly frames in calendar order; each file's rows keep their original index
power_output_df = pd.concat(monthly_dfs)
# Rename the source columns to the names used in the rest of the notebook
power_output_df.rename(columns={'Time':'timestamp', 'Wind Turbine Energy yield(kWh)':'measured_production', 'Avg Wind Speed(m/s)':'measured_ws'}, inplace=True)

In [19]:
print(power_output_df)

               timestamp  measured_ws  measured_production
0    2018-01-01 00:00:00         5.92                   52
1    2018-01-01 00:10:00         6.01                   58
2    2018-01-01 00:20:00         5.96                   56
3    2018-01-01 00:30:00         6.01                   58
4    2018-01-01 00:40:00         5.82                   52
...                  ...          ...                  ...
4459 2018-12-31 23:10:00        15.02                    0
4460 2018-12-31 23:20:00        12.81                    0
4461 2018-12-31 23:30:00        13.63                    0
4462 2018-12-31 23:40:00        11.48                    0
4463 2018-12-31 23:50:00        11.10                    0

[51417 rows x 3 columns]
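
Note that the last index label printed above (4463) is much smaller than the row count (51417): `pd.concat` keeps each monthly file's own 0-based index, so labels repeat across months. The toy example below illustrates this with two small hypothetical frames; whether to renumber with `ignore_index=True` depends on how the index is used later.

```python
import pandas as pd

# Two small HYPOTHETICAL "monthly" frames, each with its own 0-based index
jan = pd.DataFrame({"measured_ws": [5.92, 6.01]})
feb = pd.DataFrame({"measured_ws": [7.10, 6.85]})

# Default behaviour: index labels from each frame are kept, so they repeat
combined = pd.concat([jan, feb])
print(combined.index.tolist())

# ignore_index=True assigns a fresh 0..n-1 index to the stacked rows
renumbered = pd.concat([jan, feb], ignore_index=True)
print(renumbered.index.tolist())
```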
