# Wind Data From NOAA's RAP model
---

**NOAA**: National Oceanic and Atmospheric Administration

**RAP**: Rapid Refresh
Information can be found at the following URL:
https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/rapid-refresh-rap

Data can be retrieved using the NetCDF Subset Service (NCSS). Information on this protocol is available at: https://www.unidata.ucar.edu/software/thredds/current/tds/reference/NetcdfSubsetServiceReference.html

The Rapid Refresh (RAP) numerical weather model is run by the National Centers for Environmental Prediction (NCEP), which is part of NOAA. Multiple data sources feed the RAP model: commercial aircraft weather data, balloon data, radar data, surface observations, and satellite data. The model generates data every hour on a horizontal grid with a resolution down to 13 km.

## 1. Site Locations
The locations (longitude and latitude) of the wind farms in the Western grid are retrieved.

In [1]:
import westernintnet
grid = westernintnet.WesternIntNet()

Loading sub
Loading bus2sub
Loading bus
Loading genbus
Loading branches
Done loading


In [2]:
wind_farm = grid.genbus.groupby('type').get_group('wind')
n_target = len(wind_farm)

print("There are %d wind farms in the Western grid." % n_target)

There are 243 wind farms in the Western grid.


In [3]:
wind_farm.head(n=10)

plantID,busID,Pg,Qg,Qmax,Qmin,Vg,mBase,status,Pmax,Pmin,...,mu_Pmax,mu_Pmin,mu_Qmax,mu_Qmin,type,lat,lon,GenMWMax,GenMWMin,base_color
7,10691,59.72,21.07,21.07,-14.24,1.0063,98.92,1,59.72,59.72,...,0.0,0.0,0.0,0.0000;,wind,45.8131,-120.3475,98.900002,16.648942,#15b01a
10,10699,79.42,-14.5,21.45,-14.5,1.0019,120.64,1,79.42,79.42,...,0.0,0.0,0.0,0.0000;,wind,47.1356,-120.6872,100.699997,13.150001,#15b01a
11,10703,136.3,25.92,29.03,-19.63,1.0605,174.64,1,136.3,136.3,...,0.0,0.0,0.0,0.0000;,wind,45.8797,-120.8072,136.300004,43.062568,#15b01a
38,10746,66.35,-12.96,19.17,-12.96,1.0205,98.26,1,66.35,66.35,...,0.0,0.0,0.0,0.0000;,wind,46.9547,-120.1819,89.999998,19.930001,#15b01a
52,10768,212.51,56.83,56.83,-38.42,1.0304,301.55,1,212.51,212.51,...,0.0,0.0,0.0,0.0000;,wind,46.421111,-118.026944,266.799998,122.148347,#15b01a
68,10791,58.2,22.43,22.43,-15.16,0.996,136.19,1,58.2,58.2,...,0.0,0.0,0.0,0.0000;,wind,47.155833,-117.364444,105.299997,19.808362,#15b01a
73,10800,82.34,-14.08,20.83,-14.08,1.04,129.15,1,82.34,82.34,...,0.0,0.0,0.0,0.0000;,wind,45.781825,-120.521151,97.799999,28.855422,#15b01a
85,10821,0.0,0.0,32.21,-21.77,1.0423,208.02,0,151.2,63.85,...,0.0,0.0,0.0,0.0000;,wind,45.9215,-120.2355,151.199996,63.85113,#15b01a
132,10904,37.13,-3.49,10.65,-7.2,1.0489,64.29,1,37.13,37.13,...,0.0,0.0,0.0,0.0000;,wind,45.9192,-120.3039,50.0,24.632299,#15b01a
151,10933,50.0,-7.2,10.65,-7.2,1.0192,50.0,1,50.0,50.0,...,0.0,0.0,0.0,0.0000;,wind,45.744171,-120.783568,50.0,10.43,#15b01a


In [4]:
lon_target = wind_farm.lon.values
lat_target = wind_farm.lat.values
id_target  = wind_farm.index.values

## 2. Wind Data

In [5]:
import numpy as np
import pandas as pd
import datetime
import math

The paths to all the files we will download are built below. There is one file per hour for the year 2016. Note that 2016 is a leap year, so there are 366 days of data.

In [6]:
path = 'https://www.ncei.noaa.gov/thredds/ncss/rap130anl/'

start = datetime.datetime.strptime('2016-01-01', '%Y-%m-%d')
end = datetime.datetime.strptime('2016-12-31', '%Y-%m-%d')
step = datetime.timedelta(days=1)

files = []
while start <= end:
    ts = start.strftime('%Y%m%d')
    url = path + '2016'+ ts[4:6] + '/' + ts + '/'
    for h in range(10000,12400,100):
        files.append(url + 'rap_130_' + ts + '_' + str(h)[1:] + '_000.grb2?')
    start += step

print("There are %d files" % len(files))

There are 8784 files
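
As a cross-check on the file count, the same list can be built by stepping through the year hour by hour. This is a minimal sketch: `hourly_urls` is a hypothetical helper, not part of the notebook, and it assumes the directory layout used in the cell above also holds for whichever year is passed in.

```python
import calendar
import datetime

BASE = 'https://www.ncei.noaa.gov/thredds/ncss/rap130anl/'

def hourly_urls(year, base=BASE):
    """Build one RAP analysis URL per hour of `year` (hypothetical helper)."""
    # A leap year has 366 days, hence 366 * 24 hourly files.
    n_days = 366 if calendar.isleap(year) else 365
    start = datetime.datetime(year, 1, 1)
    urls = []
    for h in range(24 * n_days):
        stamp = start + datetime.timedelta(hours=h)
        # Same naming pattern as above: YYYYMM/YYYYMMDD/rap_130_YYYYMMDD_HH00_000.grb2?
        urls.append(base + stamp.strftime('%Y%m/%Y%m%d/rap_130_%Y%m%d_%H00_000.grb2?'))
    return urls

urls = hourly_urls(2016)
print("There are %d files" % len(urls))  # 366 * 24 = 8784
```
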


The u and v components of the wind speed at 10 m and 80 m above ground are the only variables requested in the files. We don't need the entire grid: only the portion enclosed in the bounding box defined in the cell below is retrieved. The boundaries of the box were chosen according to the northernmost, easternmost, southernmost and westernmost wind farms.

The data will be downloaded in the NetCDF (Network Common Data Form) format. The instructions given in https://www.unidata.ucar.edu/software/thredds/current/tds/reference/NetcdfSubsetServiceReference.html were very helpful for accessing these data.

In [7]:
# Variables
var = 'var=u-component_of_wind_height_above_ground' + '&' + \
      'var=v-component_of_wind_height_above_ground'

# Bounding Box
box = 'north=49&west=-122&east=-102&south=32&disableProjSubset=on&horizStride=1&addLatLon=true'

# Data Format
extension = 'accept=netCDF'
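
The bounding box and the final query string could also be derived programmatically. The sketch below is illustrative only: `bounding_box`, `ncss_query` and the one-degree `margin` are assumptions and do not reproduce how the box above was actually chosen. The query parameter names are those used in the cell above.

```python
import math

def bounding_box(lons, lats, margin=1.0):
    """Integer bounds (north, west, east, south) enclosing all points.

    `margin` (degrees) is an assumed padding so that farms near the
    edge still have grid points on every side.
    """
    north = math.ceil(max(lats) + margin)
    south = math.floor(min(lats) - margin)
    east = math.ceil(max(lons) + margin)
    west = math.floor(min(lons) - margin)
    return north, west, east, south

def ncss_query(file_url, variables, bounds, fmt='netCDF'):
    """Append variables, bounding box and output format to a URL ending in '?'."""
    north, west, east, south = bounds
    parts = ['var=%s' % v for v in variables]
    parts.append('north=%d&west=%d&east=%d&south=%d' % (north, west, east, south))
    parts.append('disableProjSubset=on&horizStride=1&addLatLon=true')
    parts.append('accept=%s' % fmt)
    return file_url + '&'.join(parts)

# Two wind farm locations taken from the table in section 1.
bounds = bounding_box([-120.3475, -120.6872], [45.8131, 47.1356])
query = ncss_query(
    'https://www.ncei.noaa.gov/thredds/ncss/rap130anl/201601/20160101/rap_130_20160101_0000_000.grb2?',
    ['u-component_of_wind_height_above_ground',
     'v-component_of_wind_height_above_ground'],
    bounds)
print(query)
```
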

For each farm, we need to find the closest grid point. To do so, we calculate the angular distance between the direction of the wind farm and the direction of every grid point. Two functions are defined in the cell below. The first one, `ll2uv`, converts the longitude and latitude of a location to the corresponding unit vector ($x$, $y$, $z$). The second one, `angular_distance`, computes the scalar product of two unit vectors and returns the subtended angle in degrees.

In [8]:
def ll2uv(lon, lat):
    cos_lat = math.cos(math.radians(lat))
    sin_lat = math.sin(math.radians(lat))
    cos_lon = math.cos(math.radians(lon))
    sin_lon = math.sin(math.radians(lon))
    
    uv = []
    uv.append(cos_lat * cos_lon)
    uv.append(cos_lat * sin_lon)
    uv.append(sin_lat)
    
    return uv


def angular_distance(uv1, uv2):    
    cos_angle = uv1[0]*uv2[0] + uv1[1]*uv2[1] + uv1[2]*uv2[2]
    if cos_angle >= 1:
        cos_angle = 1
    if cos_angle <= -1:
        cos_angle = -1
    angle = math.degrees(math.acos(cos_angle))
    
    return angle
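
The nearest-grid-point search built on these two functions can also be written in vectorized form with NumPy. The following is a sketch of an alternative, not the notebook's implementation; `ll2uv_array` and `nearest_grid_index` are hypothetical names. It relies on the fact that the smallest angle corresponds to the largest scalar product, so the `acos` call can be skipped.

```python
import numpy as np

def ll2uv_array(lon, lat):
    """Convert arrays of longitude/latitude (degrees) to unit vectors, shape (n, 3)."""
    lon = np.radians(np.asarray(lon, dtype=float))
    lat = np.radians(np.asarray(lat, dtype=float))
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def nearest_grid_index(lon_target, lat_target, lon_grid, lat_grid):
    """For each target, return the index of the closest grid point."""
    uv_target = ll2uv_array(lon_target, lat_target)   # (n_target, 3)
    uv_grid = ll2uv_array(lon_grid, lat_grid)         # (n_grid, 3)
    # Scalar products between every target and every grid point.
    cos_angle = np.clip(uv_target @ uv_grid.T, -1.0, 1.0)
    # Largest cosine means smallest angular distance.
    return np.argmax(cos_angle, axis=1)

# Toy grid of three points and two targets, for illustration only.
idx = nearest_grid_index([-120.3, -119.1], [45.8, 46.9],
                         [-121.0, -120.0, -119.0], [45.0, 46.0, 47.0])
print(idx)
```

For a full grid, this builds an `n_target` by `n_grid` matrix in memory, which is a trade-off against the explicit Python loops.
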

NREL (National Renewable Energy Laboratory) provides generic power curves to estimate wind power output. Three classes of turbines are defined, depending on the available wind speed. An offshore class has also been developed. For each class, a power curve converts wind speed at 100 m to power output. More information can be found in the following document: https://www.nrel.gov/docs/fy14osti/61714.pdf

Note that we use the **IEC class 2** power curve for all locations here.

In [9]:
PowerCurves = pd.read_csv('../IECPowerCurves.csv')

def get_power(wspd, turbine):
    # Select the speed bins that bracket the wind speed.
    speed = PowerCurves['Speed bin (m/s)']
    match = (speed <= np.ceil(wspd)) & (speed >= np.floor(wspd))
    if not match.any():
        return 0
    # Interpolate on the speed bins themselves rather than on the row index.
    return np.interp(wspd, speed[match].values, PowerCurves[turbine][match].values)
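
To make the conversion concrete, here is a small self-contained sketch of the same idea: compute the wind speed from its u and v components, then interpolate linearly on a power curve. The curve values below are made up for illustration and are not the NREL data; `power_from_curve` is a hypothetical helper, and returning zero outside the tabulated range is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical normalized power curve, for illustration only.
curve = pd.DataFrame({
    'Speed bin (m/s)': [0, 3, 6, 9, 12, 25],
    'Hypothetical class': [0.0, 0.0, 0.2, 0.6, 1.0, 1.0],
})

def power_from_curve(wspd, speeds, powers):
    """Linear interpolation of the power curve; zero outside the tabulated range."""
    return np.interp(wspd, speeds, powers, left=0.0, right=0.0)

u, v = -6.0, -4.5
wspd = np.hypot(u, v)  # magnitude of the wind vector: 7.5 m/s
p = power_from_curve(wspd,
                     curve['Speed bin (m/s)'].values,
                     curve['Hypothetical class'].values)
print(wspd, p)
```
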

The data are downloaded below and collected in a data frame. Note that some files are missing; their values are flagged with -99 and will be interpolated later in this notebook.

In [10]:
import requests
import time
import os
from netCDF4 import Dataset
from collections import OrderedDict

missing = []
target2grid = OrderedDict()
data = pd.DataFrame({'plantID':[], 'U':[], 'V':[], 'Pout':[], 'ts':[], 'tsID':[]})
dt = datetime.datetime.strptime('2016-01-01', '%Y-%m-%d')
step = datetime.timedelta(hours=1)

program_start = time.time()
    
for i, file in enumerate(files[:10]):
    now = time.time()
    if i % 100 == 0: print("%d/8784" % i)
    
    query = file + var + '&' + box + '&' + extension
    request = requests.get(query)
    
    data_tmp = pd.DataFrame({'plantID':id_target, 'ts':[dt]*n_target, 'tsID':[i+1]*n_target})
    
    if request.status_code == 200:
        with open('tmp.nc', 'wb') as f: 
            f.write(request.content)
        tmp = Dataset('tmp.nc', 'r')
        lon_grid = tmp.variables['lon'][:].flatten()
        lat_grid = tmp.variables['lat'][:].flatten()
        u_wsp = tmp.variables['u-component_of_wind_height_above_ground'][0,1,:,:].flatten()
        v_wsp = tmp.variables['v-component_of_wind_height_above_ground'][0,1,:,:].flatten()
            
        n_grid = len(lon_grid)
        if not target2grid:
            # The angular distance is calculated once, on the first file that downloads successfully. The target to grid correspondence is stored in a dictionary.
            for j in range(n_target):
                uv_target = ll2uv(lon_target[j], lat_target[j])
                distance = [angular_distance(uv_target, ll2uv(lon_grid[k],lat_grid[k])) for k in range(n_grid)]                    
                target2grid[id_target[j]] = np.argmin(distance)
         
        data_tmp['U'] = [u_wsp[target2grid[id_target[j]]] for j in range(n_target)]
        data_tmp['V'] = [v_wsp[target2grid[id_target[j]]] for j in range(n_target)]
        data_tmp['Pout'] = np.sqrt(data_tmp['U']**2 + data_tmp['V']**2)
        data_tmp['Pout'] = [get_power(val, 'IEC class 2') for val in data_tmp['Pout'].values]
        
        tmp.close()
        os.remove('tmp.nc')
    else:
        print("File %s is missing" % file)
        missing.append(file)
        
        # missing data are set to -99.
        data_tmp['U'] = [-99] * n_target
        data_tmp['V'] = [-99] * n_target         
        data_tmp['Pout'] = [-99] * n_target
        
    data = pd.concat([data, data_tmp], ignore_index=True, sort=False)
        
    dt += step
    
print("It has been {0} seconds since the loop started".format(time.time() - program_start))

0/8784
File https://www.ncei.noaa.gov/thredds/ncss/rap130anl/201601/20160101/rap_130_20160101_0600_000.grb2? is missing
It has been 57.97989296913147 seconds since the loop started


In [11]:
data['plantID'] = data['plantID'].astype(np.int32)
data['tsID'] = data['tsID'].astype(np.int32)

In [12]:
data.sort_values(by=['tsID', 'plantID'], inplace=True)
data.reset_index(inplace=True, drop=True)

In [13]:
data.head(n=300)

Unnamed: 0,plantID,U,V,Pout,ts,tsID
0,7,-6.482498,-3.105740,0.342867,2016-01-01 00:00:00,1
1,10,-5.482498,5.894260,0.482881,2016-01-01 00:00:00,1
2,11,-6.982498,-2.480740,0.378478,2016-01-01 00:00:00,1
3,38,-6.357498,-1.605740,0.258761,2016-01-01 00:00:00,1
4,52,-2.732498,-3.980740,0.092663,2016-01-01 00:00:00,1
5,68,-3.357498,-1.230740,0.026568,2016-01-01 00:00:00,1
6,73,-6.732498,-1.980740,0.315559,2016-01-01 00:00:00,1
7,85,-4.607498,-4.730740,0.264431,2016-01-01 00:00:00,1
8,132,-5.607498,-4.980740,0.392919,2016-01-01 00:00:00,1
9,151,-8.107498,1.144260,0.509956,2016-01-01 00:00:00,1
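
Rows coming from missing files carry the value -99 in `U`, `V` and `Pout`. One possible way to fill them is a linear interpolation in time for each plant, sketched below. This is an assumption about how the gaps could be handled, not the notebook's actual interpolation; `fill_missing` is a hypothetical helper, and it assumes the rows are evenly spaced in time.

```python
import numpy as np
import pandas as pd

def fill_missing(df, columns=('U', 'V', 'Pout'), flag=-99):
    """Replace flagged values by a per-plant linear interpolation in time."""
    cols = list(columns)
    # Rows must be in chronological order within each plant.
    out = df.sort_values(by=['tsID', 'plantID']).reset_index(drop=True)
    out[cols] = out[cols].replace(flag, np.nan)
    # limit_direction='both' also fills gaps at the start or end of the series.
    out[cols] = out.groupby('plantID')[cols].transform(
        lambda s: s.interpolate(limit_direction='both'))
    return out

# Toy data: two plants, three hours, the second hour is missing.
toy = pd.DataFrame({
    'plantID': [7, 10, 7, 10, 7, 10],
    'tsID':    [1, 1, 2, 2, 3, 3],
    'U':       [1.0, -2.0, -99.0, -99.0, 3.0, -4.0],
    'V':       [0.0, 1.0, -99.0, -99.0, 2.0, 3.0],
    'Pout':    [0.1, 0.3, -99.0, -99.0, 0.5, 0.7],
})
filled = fill_missing(toy)
print(filled)
```

Interpolating `Pout` directly is a simplification; recomputing it from the interpolated wind speed with the power curve is another option.
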
