## ANL LOM Analysis with One Energy Data
The naming convention is as follows: 
1) What lab does the model come from? 
2) Model type
3) Data source/site type: One Energy (OE) or Bergey (B)

In this notebook, we assess the performance of the ANL LOM at the One Energy sites (currently only Marion, OH). We use the following approaches: 
1) Using timeseries data, we use the provided industry power curve to convert ANL LOM wind speed to power and compare directly with the measured turbine power. 
2) Using timeseries data, we use an inverted power curve to convert the turbine power to wind speed and compare against the ANL LOM wind speed. 
3) Finally, we use a 12x24 sectorized wind speed "cube" to estimate the annual energy production at the site and compare that with the actual energy production. 
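
Approach 3 can be illustrated with a minimal sketch. It assumes the 12x24 cube holds the mean wind speed for each month and hour of day, leaves out the split by direction sector, and uses a made-up power curve (`curve_ws`, `curve_kw`) rather than the Goldwind curve. `build_12x24` and `annual_energy_kwh` are hypothetical helper names and are not part of `dw_tap`.

```python
import calendar

import numpy as np
import pandas as pd

# Made-up power curve for illustration only (not the Goldwind871500 curve)
curve_ws = np.array([0.0, 3.0, 6.0, 9.0, 12.0, 25.0])            # m/s
curve_kw = np.array([0.0, 0.0, 300.0, 1000.0, 1500.0, 1500.0])   # kW

def build_12x24(df, ws_column="ws"):
    """Mean wind speed per (month, hour of day); rows = months 1..12, columns = hours 0..23."""
    ts = pd.to_datetime(df["timestamp"])
    month = ts.dt.month.rename("month")
    hour = ts.dt.hour.rename("hour")
    cube = df.groupby([month, hour])[ws_column].mean().unstack("hour")
    # Cells without any data stay NaN after the reindex
    return cube.reindex(index=range(1, 13), columns=range(24))

def annual_energy_kwh(cube, year=2018):
    """Treat each cell as one hour on every day of its month and sum the energy."""
    days = np.array([calendar.monthrange(year, m)[1] for m in range(1, 13)])
    # Wind speeds outside the curve are assumed to produce no power
    kw = np.interp(cube.to_numpy(dtype=float), curve_ws, curve_kw, left=0.0, right=0.0)
    return float(np.nansum(kw * days[:, None]))

# Synthetic year of hourly data with a constant 9 m/s wind speed
demo = pd.DataFrame({"timestamp": pd.date_range("2018-01-01", periods=8760, freq="60min")})
demo["ws"] = 9.0
cube = build_12x24(demo)
aep = annual_energy_kwh(cube)
print(cube.shape, aep)
```

With real data, the resulting estimate would be compared with the summed measured production over the same period. Averaging wind speed before applying the power curve is a simplification, since the curve is nonlinear.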


In [1]:
#Needed imports for pipeline analysis
import numpy as np
import pandas as pd
import time
import h5pyd
import geopandas as gpd
from dw_tap.lom import run_lom

In [3]:
#Functions needed for Goldwind871500-specific analysis (power curve)
def estimate_power_output(df, temp, pres, ws_column="ws-adjusted"): 
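    """
    Estimate turbine power from density-corrected wind speed.

    Inputs: dataframe with a wind speed column (ws_column) and a "timestamp" column, temperature series (K) and pressure series (Pa)
    Outputs: predicted kw for each row, count of instances with wind speed above the power curve range, count of instances with no generation, and the lists of instances above and below those marks
    """
    df_copy = df.copy()
    
    # Air density from the ideal gas law; 287.05 J/(kg K) is the specific gas constant for dry air
    air_density = (pres) / (287.05 * temp)
    # Scale wind speed to the standard reference air density of 1.225 kg/m^3
    df_copy[ws_column] = (df_copy[ws_column] * ((air_density/1.225)**(1/3)))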
    kw = Goldwind871500.windspeed_to_kw(df_copy, ws_column)
    above_curve_counter = Goldwind871500.above_curve_counter
    below_curve_counter = Goldwind871500.below_curve_counter
    above_curve_list = Goldwind871500.above_curve_list
    below_curve_list = Goldwind871500.below_curve_list
    return kw, above_curve_counter, below_curve_counter, above_curve_list, below_curve_list

class Goldwind871500(object):
    
    # Load data and minimal preprocessing
    raw_data = pd.read_excel("../powercurves/Goldwind871500.xlsx")
    raw_data.rename(columns={"Wind Speed (m/s)": "ws", "Turbine Output": "kw"}, inplace=True)
    
    # Create vectors for interpolation
    interp_x = raw_data.ws
    interp_y = raw_data.kw
    
    # Counters for cases outside of the real curve
    below_curve_counter = 0
    above_curve_counter = 0
    # Keeping windspeeds that are higher than what is in the curve
    above_curve_list = []
    below_curve_list = []
    
    max_ws = max(raw_data.ws)
    
    @classmethod
    def windspeed_to_kw(cls, df, ws_column="ws-adjusted"):
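        """ Converts wind speed to kw; assumes df has a default 0..n-1 index and a "timestamp" column """
        kw = pd.Series(np.interp(df[ws_column], cls.interp_x, cls.interp_y))
        ws = df[ws_column]
        for i in range(len(kw)):
            # No generation at this wind speed: record the timestamp and wind speed
            if kw.loc[i] <= 0: 
                cls.below_curve_counter += 1
                cls.below_curve_list.append(tuple((df["timestamp"][i], ws[i])))
            # Wind speed beyond the last point of the curve: record it and set power to 0
            if ws.loc[i] > cls.max_ws:
                cls.above_curve_counter += 1
                cls.above_curve_list.append(tuple((df["timestamp"][i], ws[i])))
                kw.loc[i] = 0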
        
        return kw
    
    @classmethod
    def reset_counters(cls):
        """ Sets counters and lists back to 0 and empty """
        cls.below_curve_counter = 0
        cls.above_curve_counter = 0
        cls.above_curve_list = []
        cls.below_curve_list = []

### 1) Marion, OH - One Energy site, turbine W1
Using timeseries data, we use the provided industry power curve to convert ANL LOM wind speed to power and compare directly with the measured turbine power. The wind speed input data is from WTK-LED. 
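
As a rough illustration of this conversion, the sketch below applies the same density correction used in `estimate_power_output` and then interpolates a power curve. The curve values are made up for the example (they are not the Goldwind871500 curve), and `density_adjusted_ws` and `curve_kw_at` are hypothetical helper names.

```python
import numpy as np
import pandas as pd

R_DRY_AIR = 287.05   # J/(kg K), specific gas constant for dry air
RHO_REF = 1.225      # kg/m^3, standard reference air density

# Made-up power curve for illustration only (not the Goldwind871500 curve)
curve_ws = np.array([0.0, 3.0, 6.0, 9.0, 12.0, 25.0])            # m/s
curve_kw = np.array([0.0, 0.0, 300.0, 1000.0, 1500.0, 1500.0])   # kW

def density_adjusted_ws(ws, temp_k, pres_pa):
    """Scale wind speed by the cube root of the air density ratio."""
    rho = pres_pa / (R_DRY_AIR * temp_k)   # ideal gas law
    return ws * (rho / RHO_REF) ** (1.0 / 3.0)

def curve_kw_at(ws):
    """Linear interpolation on the curve; speeds outside the curve give 0 kW."""
    return np.interp(ws, curve_ws, curve_kw, left=0.0, right=0.0)

# Four sample wind speeds at roughly standard conditions (15 C, 1013.25 mbar)
ws = pd.Series([2.0, 6.0, 9.0, 30.0])
temp_k = pd.Series([288.15] * 4)
pres_pa = pd.Series([101325.0] * 4)

ws_adj = density_adjusted_ws(ws, temp_k, pres_pa)
kw = curve_kw_at(ws_adj)
print(pd.DataFrame({"ws": ws, "ws_adjusted": ws_adj, "kw": kw}))
```

Colder, denser air raises the adjusted wind speed, so the same measured wind speed maps to more power in winter than in summer.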

In [4]:
#Read in the coordinates and obstacles around the W1 site and run power predictions
z_turbine = 80
lat, lon = 40.591555, -83.182092
obstacle_file = "../sites/simple_marion_obstacles.geojson"
obstacles_df = gpd.read_file(obstacle_file)
obstacles_df = obstacles_df[["height", "geometry"]]
x1_turbine, y1_turbine = lat, lon
xy_turbine = [np.array([x1_turbine, y1_turbine])]

In [5]:
#Fixme: We need the new WTK-LED data because we have power production data from 2018 onwards
atmospheric_df = getData(f, lat, lon, z_turbine, "IDW", 
                         power_estimate=True,
                         inverse_monin_obukhov_length=True,
                         start_time_idx=0, end_time_idx=4380, time_stride=1)
predictions_df = \
    run_lom(atmospheric_df, obstacles_df, xy_turbine, z_turbine)

NameError: name 'atmospheric_df' is not defined

In [None]:
#Read in the 2018 power output (January through August of 2018)
month_ends = ["20180131", "20180228", "20180331", "20180430",
              "20180531", "20180630", "20180731", "20180831"]
#DataFrame.append was removed in pandas 2.0, so read the monthly files and concatenate them once
power_output_df = pd.concat(
    [pd.read_excel(f"../data/marion/turbine.oneenergy.00.{d}.000000.marion.w1.xlsx", header=1, usecols="B, C, M")
     for d in month_ends])
power_output_df.rename(columns={'Time':'timestamp', 'Wind Turbine Energy yield(kWh)':'measured_production', 'Avg Wind Speed(m/s)':'measured_ws'}, inplace=True)

## Step 2: Read in the actual generated kW production from the same time period.

In [9]:
#Fixme: Potential substitution of using lidar data instead of WTK data (needs 2018 onwards), lidar has 6-month overlap

#Reading in the Lidar Data
#Using 79m wind speed and direction (hub height is 80m for turbines in this area)
atmospheric_df = pd.read_excel("../data/lidar_marion_OH.xlsx", header=3, usecols="A,L,M,AY,AZ")
atmospheric_df.rename(columns={'Date (UTC)':'datetime', 'Air Temp. Cel.':'temp', 'Pressure (mbar)':'pres', 'Wind Direction (deg) at 79m':'wd', 'Horizontal Wind Speed (m/s) at 79m':'ws'}, inplace=True)
atmospheric_df['temp'] = atmospheric_df['temp'] + 273.15 #convert to K from C
atmospheric_df['pres'] = atmospheric_df['pres'] * 100 #convert to Pascals from mbar

#Turbine specs and location + obstacles
z_turbine = 80
lat, lon = 40.591555, -83.182092
obstacle_file = "../sites/simple_marion_obstacles.geojson"
obstacles_df = gpd.read_file(obstacle_file)
obstacles_df = obstacles_df[["height", "geometry"]]
x1_turbine, y1_turbine = lat, lon
xy_turbine = [np.array([x1_turbine, y1_turbine])]

#Run the LOM prediction for wind speeds
predictions_df = \
    run_lom(atmospheric_df, obstacles_df, xy_turbine, z_turbine).join(atmospheric_df["wd"])

temp = atmospheric_df['temp']
predictions_df = predictions_df.join(temp)

pres = atmospheric_df['pres']
predictions_df = predictions_df.join(pres)

#One issue is that the lidar data spans mid-August 2017 to mid-August 2018, while we appear to have power production data only from 2018 onwards. 

#Reading in actual power generation
month_ends = ["20180131", "20180228", "20180331", "20180430",
              "20180531", "20180630", "20180731", "20180831"]
#DataFrame.append was removed in pandas 2.0, so read the monthly files and concatenate them once
power_output_df = pd.concat(
    [pd.read_excel(f"../data/marion/turbine.oneenergy.00.{d}.000000.marion.w1.xlsx", header=1, usecols="B, C, M")
     for d in month_ends])
power_output_df.rename(columns={'Time':'timestamp', 'Wind Turbine Energy yield(kWh)':'measured_production', 'Avg Wind Speed(m/s)':'measured_ws'}, inplace=True)

pre_analysis = predictions_df.merge(power_output_df[['timestamp', 'measured_production', 'measured_ws']], on='timestamp', how='left')
pre_analysis = pre_analysis.dropna()

pre_analysis = pre_analysis.reset_index()
kw, above_curve, below_curve, above_curve_list, below_curve_list = \
    estimate_power_output(pre_analysis, pre_analysis["temp"], pre_analysis["pres"])

LOM time : 3.52  min



In [10]:
#Fixme: 
print(pre_analysis)

       index           timestamp  ws-adjusted     ws       wd    temp  \
0      19885 2018-01-01 00:00:00        2.852  2.852  302.343  259.85   
1      19886 2018-01-01 00:10:00        2.566  2.566  305.654  259.79   
2      19887 2018-01-01 00:20:00        1.967  1.967  303.185  259.73   
3      19888 2018-01-01 00:30:00        1.774  1.774  301.331  259.44   
4      19889 2018-01-01 00:40:00        2.039  2.039  296.283  259.05   
...      ...                 ...          ...    ...      ...     ...   
29375  52616 2018-08-16 09:30:00        6.802  6.802  182.113  293.15   
29376  52617 2018-08-16 09:40:00        5.719  5.719  185.084  293.22   
29377  52618 2018-08-16 09:50:00        6.585  6.585  170.531  293.35   
29378  52619 2018-08-16 10:00:00        6.719  6.719  168.171  293.36   
29379  52620 2018-08-16 10:10:00        7.121  7.121  163.370  293.36   

          pres  measured_production  measured_ws  
0      99780.0                 52.0         5.92  
1      99790.0       