In part one, we will work through the following steps:
1. Pull data from NSRDB using the Python API
    - If you would like to do this yourself, instructions are here: https://nsrdb.nrel.gov/api-instructions 
    - More information regarding NSRDB can be found here: https://rredc.nrel.gov/solar/old_data/nsrdb/1991-2005/tmy3/
2. Extract the data to be analyzed for this problem
    - For this first time series analysis, we will look only at the total irradiance (beam + diffuse, i.e. DNI + DHI) as a univariate series.

In [1]:
#import dependencies
import pandas as pd
import numpy as np
import sys, os

import config


In [2]:
#looking at Minneapolis St. Paul Intl. Airport 

#list of years to pull
year_list = ['2009', '2010', '2011']
lat, lon= '44.8848', '-93.2223'
api_key = config.api_key
attributes = 'dhi,dni'
leap_year = 'false'
interval = '60'
utc = 'false'
your_name = config.name
reason_for_use = 'Education'
your_affiliation = 'Self+Employed'
your_email = config.email
mailing_list = 'false'

# Download one CSV per year and save it locally
for year in year_list:
    url = ('http://developer.nrel.gov/api/solar/nsrdb_0512_download.csv'
           '?wkt=POINT({lon}%20{lat})&names={year}&leap_day={leap}&interval={interval}'
           '&utc={utc}&full_name={name}&email={email}&affiliation={affiliation}'
           '&mailing_list={mailing_list}&reason={reason}&api_key={api}&attributes={attr}'
           ).format(year=year, lat=lat, lon=lon, leap=leap_year, interval=interval, utc=utc, name=your_name, email=your_email,
                    mailing_list=mailing_list, affiliation=your_affiliation, reason=reason_for_use, api=api_key, attr=attributes)
    df = pd.read_csv(url, skiprows=2)
    # periods must be an integer: 525600 minutes in a non-leap year, divided by the interval
    df = df.set_index(pd.date_range('1/1/{yr}'.format(yr=year), freq=interval+'min', periods=525600 // int(interval)))
    df.to_csv('data\\' + year + '_nsrdb_data.csv')
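
The query string above is assembled by hand with `str.format`. As a hedged alternative, the same parameters could be passed through the standard library's `urllib.parse.urlencode`, which takes care of escaping. The helper name `build_nsrdb_url` and the placeholder credentials are hypothetical; the endpoint and parameter names are copied from the cell above. This sketch makes no network request and has not been checked against the live API.

```python
from urllib.parse import urlencode

# Endpoint copied from the cell above
BASE_URL = 'http://developer.nrel.gov/api/solar/nsrdb_0512_download.csv'

def build_nsrdb_url(year, lat, lon, api_key, name, email,
                    affiliation='Self Employed', reason='Education',
                    attributes='dhi,dni', interval='60'):
    """Hypothetical helper: return the download URL for one year at one point."""
    params = {
        'wkt': 'POINT({lon} {lat})'.format(lon=lon, lat=lat),
        'names': year,
        'leap_day': 'false',
        'interval': interval,
        'utc': 'false',
        'full_name': name,
        'email': email,
        'affiliation': affiliation,
        'mailing_list': 'false',
        'reason': reason,
        'api_key': api_key,
        'attributes': attributes,
    }
    # urlencode percent-escapes reserved characters and writes spaces as '+'
    return BASE_URL + '?' + urlencode(params)

# Placeholder credentials, for illustration only
url = build_nsrdb_url('2010', '44.8848', '-93.2223',
                      api_key='YOUR_API_KEY', name='Jane Doe', email='jane@example.com')
print(url)
```

Keeping the parameters in a dictionary also makes it easier to see at a glance which values change from year to year (only `names`).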

In [3]:
#load one year of the saved data and preview it
df = pd.read_csv('data\\2010_nsrdb_data.csv')
print(df.head())

            Unnamed: 0  Year  Month  Day  Hour  Minute  DHI  DNI
0  2010-01-01 00:00:00  2010      1    1     0      30    0    0
1  2010-01-01 01:00:00  2010      1    1     1      30    0    0
2  2010-01-01 02:00:00  2010      1    1     2      30    0    0
3  2010-01-01 03:00:00  2010      1    1     3      30    0    0
4  2010-01-01 04:00:00  2010      1    1     4      30    0    0
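
The `Unnamed: 0` column in the output above is the timestamp index that `to_csv` wrote out and `read_csv` then read back as an ordinary column. A minimal sketch of one way to restore it, assuming a file with the same layout, is to pass `index_col=0` and `parse_dates=True`. The inline CSV text below is only a stand-in for the saved file.

```python
import io
import pandas as pd

# Stand-in for the saved 2010 file, using rows printed above
csv_text = """,Year,Month,Day,Hour,Minute,DHI,DNI
2010-01-01 00:00:00,2010,1,1,0,30,0,0
2010-01-01 01:00:00,2010,1,1,1,30,0,0
2010-01-01 02:00:00,2010,1,1,2,30,0,0
"""

# index_col=0 uses the first column as the index;
# parse_dates=True asks pandas to parse that index as datetimes
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(sample.index)
print(sample.columns.tolist())
```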


In [4]:
#combine all years, then add the DHI and DNI columns together
frames = []
for year in year_list:
    new_df = pd.read_csv('data\\' + year + '_nsrdb_data.csv')
    new_df = new_df.drop('Minute', axis=1)
    frames.append(new_df)
#DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat(frames)
df['Total Irradiance'] = df['DHI'].values + df['DNI'].values
df = df.drop(['DNI', 'DHI'], axis=1)
df

            Unnamed: 0  Year  Month  Day  Hour  Total Irradiance
0  2009-01-01 00:00:00  2009      1    1     0                 0
1  2009-01-01 01:00:00  2009      1    1     1                 0
2  2009-01-01 02:00:00  2009      1    1     2                 0
3  2009-01-01 03:00:00  2009      1    1     3                 0
4  2009-01-01 04:00:00  2009      1    1     4                 0
5  2009-01-01 05:00:00  2009      1    1     5                 0
6  2009-01-01 06:00:00  2009      1    1     6                 0
7  2009-01-01 07:00:00  2009      1    1     7                 0
8  2009-01-01 08:00:00  2009      1    1     8                94
9  2009-01-01 09:00:00  2009      1    1     9               295
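
For the time series work that follows, it may help to hold the total as a `pandas.Series` indexed by timestamp rather than as one column among several. The sketch below does this on a small hand-made frame that mimics the columns shown above; the names `combined`, `timestamps`, and `series` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Small stand-in for the combined frame above (values copied from the output)
combined = pd.DataFrame({
    'Unnamed: 0': ['2009-01-01 07:00:00', '2009-01-01 08:00:00', '2009-01-01 09:00:00'],
    'Year': [2009, 2009, 2009],
    'Month': [1, 1, 1],
    'Day': [1, 1, 1],
    'Hour': [7, 8, 9],
    'Total Irradiance': [0, 94, 295],
})

# Parse the timestamp strings and use them as the index of a univariate series
timestamps = pd.DatetimeIndex(pd.to_datetime(combined['Unnamed: 0']), name='timestamp')
series = pd.Series(combined['Total Irradiance'].values,
                   index=timestamps, name='Total Irradiance')

# Basic checks before any analysis: ordered timestamps, no duplicates
print(series.index.is_monotonic_increasing, series.index.is_unique)
print(series)
```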


In the next section, we'll analyze the temporal structure of the data, including its seasonality from day to day and from year to year. Hope to see you there!