# NSRDB NREL API for Solar Radiation and Weather Data

This notebook demonstrates basic usage of the National Renewable Energy Laboratory (NREL) National Solar Radiation Database (NSRDB) data. The data is provided from Amazon Web Services using the HDF Group's Highly Scalable Data Service (HSDS).

The NSRDB is a serially complete collection of hourly and half-hourly values of meteorological data and the three most common measurements of solar radiation: global horizontal, direct normal, and diffuse horizontal irradiance. The current NSRDB is modeled using multi-channel measurements from geostationary satellites. Older versions of the NSRDB were modeled using cloud and weather information primarily collected at airports. A sufficient number of locations and temporal and spatial scales were used to represent regional solar radiation climates accurately.

Using the NSRDB data, it is possible to estimate the amount of solar energy that has been historically available at a given time and location anywhere in the United States; the NSRDB is also expanding to encompass a growing list of international locations. Using the long-term NSRDB data in various models, it is possible to predict the potential future availability of solar energy in a location based on past conditions.

Typical Meteorological Year (TMY) data can be derived from the NSRDB time-series datasets. Visit the TMY page for detailed information about this data type and its uses.


## Request Rate and Weight Limits

The API restricts both the number of requests a single user can make in a 24-hour period and the maximum size of a single request. The API rate limits are:

    300 requests per day
    1 request every 2 seconds

Downloading directly via CSV has separate rate limits. Because the CSV endpoint can only return a single site for a single year, these requests are always quite small by comparison:

    2000 requests per day
    1 request per second
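The per-second limits above can be respected on the client side by pausing between calls. The following is a minimal sketch, not part of the original notebook: the `Throttle` class and its parameters are hypothetical names introduced here, and it only spaces requests out in time. It does not track the daily request quota.

```python
import time


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# 1 request per second for the CSV endpoint; use 2.0 for the general API limit.
csv_throttle = Throttle(min_interval=1.0)
```

Calling `csv_throttle.wait()` immediately before each download keeps consecutive requests at least one second apart.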

The size limit for each request is determined by the total number of attributes in the request, expressed as a weight. The maximum weight of a request is 175,000,000. The weight is calculated as:

    site-count * attribute-count * year-count * data-intervals-per-year

where:

- `site-count` is derived from the WKT value submitted and can be retrieved using the `site_count` API endpoint
- `attribute-count` is the number of attributes requested
- `year-count` is the number of years requested
- `data-intervals-per-year` is `((60/interval)*24*365)`, where `interval` is the interval requested

To get the most out of a single request, minimize each variable wherever possible. For example, by requesting half as many attributes, one can request twice as many years (or sites) of data.
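The weight formula and the trade-off between attributes and years can be checked with a few lines of arithmetic. This is a minimal sketch: the function names `request_weight` and `max_years` are introduced here for illustration, and the 1000-site request is a hypothetical example rather than something the notebook performs.

```python
MAX_REQUEST_WEIGHT = 175_000_000  # maximum weight quoted above


def request_weight(site_count, attribute_count, year_count, interval):
    """Weight of one request; `interval` is the data interval in minutes."""
    data_intervals_per_year = (60 / interval) * 24 * 365
    return site_count * attribute_count * year_count * data_intervals_per_year


def max_years(site_count, attribute_count, interval):
    """Largest whole number of years that fits under the weight limit."""
    per_year = request_weight(site_count, attribute_count, 1, interval)
    return int(MAX_REQUEST_WEIGHT // per_year)


print(request_weight(1, 7, 1, 60))  # 61320.0
print(max_years(1000, 8, 60))       # 2
print(max_years(1000, 4, 60))       # 4
```

Halving the attribute count from 8 to 4 doubles the number of years that fit in one request, from 2 to 4.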




Sources:

- https://nsrdb.nrel.gov/about/what-is-the-nsrdb.html
- https://www.nrel.gov/docs/fy19osti/74137.pdf
- https://nsrdb.nrel.gov/about/u-s-data.html
- https://www.homerenergy.com/products/pro/docs/latest/global_horizontal_irradiance_ghi.html
- https://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/solar-radiation
- https://firstgreenconsulting.wordpress.com/2012/04/26/differentiate-between-the-dni-dhi-and-ghi/

In [1]:
import pandas as pd
import sqlite3
import os

In [2]:
# auth.txt holds one value per line. The first three lines are Enphase API
# info, which is not relevant to this notebook.
with open("auth.txt") as f:
    lines = [line.rstrip("\n") for line in f]  # safe even if the last line has no newline
api_key, wkt_point, full_name, email, affiliation, reason = lines[3:9]

## Example API Query returned to Pandas

In [3]:
year = 2010
interval = 60  # must be 60 or 30
utc_flag = 'false'
lead_day_flag = 'true'

# Values from auth.txt are inserted as-is, so they must already be URL-safe
url = (
    "https://developer.nrel.gov/api/solar/nsrdb_psm3_download.csv"
    "?wkt=POINT(" + wkt_point + ")&names=" + str(year)
    + "&leap_day=" + lead_day_flag + "&interval=" + str(interval)
    + "&utc=" + utc_flag + "&full_name=" + full_name + "&email=" + email
    + "&affiliation=" + affiliation + "&mailing_list=false&reason=" + reason
    + "&api_key=" + api_key
    + "&attributes=ghi,dhi,dni,wind_speed,air_temperature,solar_zenith_angle,cloud_type"
)
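The URL above is assembled by string concatenation, so any value containing spaces or other reserved characters has to be encoded beforehand. One alternative is to let the standard library encode the query string. The sketch below assumes the values are stored unencoded and that the endpoint accepts standard percent-encoded query strings; `build_url` and every value in `example_params` are placeholders introduced for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://developer.nrel.gov/api/solar/nsrdb_psm3_download.csv"


def build_url(params):
    """Return BASE_URL with `params` percent-encoded into the query string."""
    return BASE_URL + "?" + urlencode(params)


# Placeholder values for illustration only; none of these are real credentials.
example_params = {
    "wkt": "POINT(-105.0 40.0)",
    "names": 2010,
    "leap_day": "true",
    "interval": 60,
    "utc": "false",
    "full_name": "Jane Doe",
    "email": "jane@example.com",
    "affiliation": "Example Org",
    "mailing_list": "false",
    "reason": "testing",
    "api_key": "YOUR_API_KEY",
    "attributes": "ghi,dhi,dni",
}
print(build_url(example_params))
```

Do not combine this with values that are already percent-encoded, because `urlencode` would encode the `%` signs a second time.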

In [4]:
site_count = 1
attribute_count = 7
year_count = 1
data_intervals_per_year = 60/interval*24*365.0
request_weight = site_count*attribute_count*year_count*data_intervals_per_year
max_allowed_weight = 175000000.0
print("This API query is expected to use "+str(100.0*request_weight/max_allowed_weight)+"% of the allotted limit.")

This API query is expected to use 0.03504% of the allotted limit.


In [5]:
data = pd.read_csv(url, skiprows=2)  # two header rows
data.head()

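|   | Year | Month | Day | Hour | Minute | GHI | DHI | DNI | Wind Speed | Temperature | Solar Zenith Angle | Cloud Type |
|---|------|-------|-----|------|--------|-----|-----|-----|------------|-------------|--------------------|------------|
| 0 | 2010 | 1 | 1 | 0 | 30 | 0 | 0 | 0 | 0.2 | -5 | 158.46 | 6 |
| 1 | 2010 | 1 | 1 | 1 | 30 | 0 | 0 | 0 | 0.2 | -5 | 151.19 | 7 |
| 2 | 2010 | 1 | 1 | 2 | 30 | 0 | 0 | 0 | 0.2 | -5 | 141.32 | 8 |
| 3 | 2010 | 1 | 1 | 3 | 30 | 0 | 0 | 0 | 0.2 | -5 | 130.54 | 7 |
| 4 | 2010 | 1 | 1 | 4 | 30 | 0 | 0 | 0 | 0.2 | -5 | 119.55 | 7 |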


## Productization for General Query and Database Writes

In [6]:
# auth.txt holds one value per line. The first three lines are Enphase API
# info, which is not relevant to this notebook.
with open("auth.txt") as f:
    lines = [line.rstrip("\n") for line in f]  # safe even if the last line has no newline
api_key, wkt_point, full_name, email, affiliation, reason = lines[3:9]

interval = 30  # must be 30 or 60
utc_flag = 'false'
lead_day_flag = 'true'

db_path = os.path.join("solar.db")

In [7]:
def query_api(year, interval):
    """Download one year of NSRDB data for the configured point as a DataFrame."""
    # Values from auth.txt are inserted as-is, so they must already be URL-safe
    url = (
        "https://developer.nrel.gov/api/solar/nsrdb_psm3_download.csv"
        "?wkt=POINT(" + wkt_point + ")&names=" + str(year)
        + "&leap_day=" + lead_day_flag + "&interval=" + str(interval)
        + "&utc=" + utc_flag + "&full_name=" + full_name + "&email=" + email
        + "&affiliation=" + affiliation + "&mailing_list=false&reason=" + reason
        + "&api_key=" + api_key
        + "&attributes=ghi,dhi,dni,wind_speed,air_temperature,solar_zenith_angle,cloud_type"
    )
    site_count = 1
    attribute_count = 7
    year_count = 1
    data_intervals_per_year = 60/interval*24*365.0
    request_weight = site_count*attribute_count*year_count*data_intervals_per_year
    max_allowed_weight = 175000000.0
    print("This API query is expected to use "+str(100.0*request_weight/max_allowed_weight)+"% of the alloted limit.")
    
    df = pd.read_csv(url, skiprows=2)
    print("Data successfully queried from API!")
    return df

In [8]:
def pd_to_sql(df, db_path):
    """Insert or replace every row of df in the existing `weather` table."""
    cols = "year, month, day, hour, minute, GHI_w_per_m2, DHI_w_per_m2, DNI_w_per_m2, wind_speed_m_per_s, temp_c, solar_zenith_angle_deg, cloud_type"
    src_cols = ["Year", "Month", "Day", "Hour", "Minute", "GHI", "DHI", "DNI", "Wind Speed", "Temperature", "Solar Zenith Angle", "Cloud Type"]
    # Placeholders let sqlite3 bind the values instead of pasting them into the SQL text
    q = "INSERT OR REPLACE INTO weather ("+cols+") VALUES ("+", ".join("?" * len(src_cols))+")"
    # itertuples yields plain Python scalars, which sqlite3 can bind directly
    rows = list(df[src_cols].itertuples(index=False, name=None))
    conn = sqlite3.connect(db_path)
    try:
        conn.executemany(q, rows)
        conn.commit()  # one commit for the whole batch instead of one per row
    finally:
        conn.close()
    print("Data successfully exported from pandas to sqlite database!")
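The function above writes into a table named `weather`, which this notebook never creates. In SQLite, `INSERT OR REPLACE` only replaces an existing row when the insert conflicts with a uniqueness constraint. Without one, re-running a year appends duplicates. The sketch below shows one possible schema: the column names are taken from the `INSERT` statement, while the column types, the primary key on the timestamp columns, and the helper name `create_weather_table` are assumptions made for illustration.

```python
import sqlite3

# Assumed schema: column names come from the INSERT statement above; the
# types and the primary key are guesses made for this sketch.
CREATE_WEATHER = """
CREATE TABLE IF NOT EXISTS weather (
    year INTEGER, month INTEGER, day INTEGER, hour INTEGER, minute INTEGER,
    GHI_w_per_m2 REAL, DHI_w_per_m2 REAL, DNI_w_per_m2 REAL,
    wind_speed_m_per_s REAL, temp_c REAL,
    solar_zenith_angle_deg REAL, cloud_type INTEGER,
    PRIMARY KEY (year, month, day, hour, minute)
)
"""


def create_weather_table(db_path):
    """Create the weather table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(CREATE_WEATHER)
        conn.commit()
    finally:
        conn.close()
```

With this key, writing the same timestamp twice overwrites the earlier row, so the yearly loop can be re-run safely for a single site.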

In [9]:
for year in range(2013,2019):
    print("Querying data for "+str(year)+"...")
    data = query_api(year, interval)
    pd_to_sql(data, db_path)

Querying data for 2013...
This API query is expected to use 0.07008% of the alloted limit.
Data successfully queried from API!
Data successfully exported from pandas to sqlite database!
Querying data for 2014...
This API query is expected to use 0.07008% of the alloted limit.
Data successfully queried from API!
Data successfully exported from pandas to sqlite database!
Querying data for 2015...
This API query is expected to use 0.07008% of the alloted limit.
Data successfully queried from API!
Data successfully exported from pandas to sqlite database!
Querying data for 2016...
This API query is expected to use 0.07008% of the alloted limit.
Data successfully queried from API!
Data successfully exported from pandas to sqlite database!
Querying data for 2017...
This API query is expected to use 0.07008% of the alloted limit.
Data successfully queried from API!
Data successfully exported from pandas to sqlite database!
Querying data for 2018...
This API query is expected to use 0.07008% o
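The loop above stops at 2018 because later years were not available when this notebook was run (see the note at the end of the notebook). The sketch below is a simple pre-flight check that rejects an unavailable value before any HTTP request is made. The list is copied from the error message quoted in that note and reflects availability as of 2020-01-24 only; `AVAILABLE_NAMES` and `check_names` are hypothetical names introduced here.

```python
# Values copied from the API error message quoted in the note (as of 2020-01-24).
AVAILABLE_NAMES = [str(y) for y in range(1998, 2019)] + [
    "tmy", "tmy-2016", "tmy-2017", "tdy-2017", "tgy-2017",
    "tmy-2018", "tdy-2018", "tgy-2018",
]


def check_names(names):
    """Raise ValueError if any comma-delimited entry in `names` is unavailable."""
    requested = [n.strip() for n in str(names).split(",")]
    unknown = [n for n in requested if n not in AVAILABLE_NAMES]
    if unknown:
        raise ValueError("Unavailable NSRDB names: " + ", ".join(unknown))
    return requested
```

Calling `check_names(year)` at the top of the loop body would raise for 2019 instead of sending a request that the API rejects.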

NOTE: As of 2020-01-24, the NSRDB does not have data available for 2019 or later. The API error message lists the accepted values for the `names` parameter:

"Parameter 'names' must be a comma delimited list including [1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, tmy, tmy-2016, tmy-2017, tdy-2017, tgy-2017, tmy-2018, tdy-2018, tgy-2018]"