# Introduction

This notebook compares results from a reimplementation of the Reno-Hansen clear sky detection method with the implementation available in PVLib.  The method is reimplemented here so that the calculated properties are easier to extract for further analysis and processing.

# Setup

## Imports, config, etc

In [1]:
import numpy as np
import pandas as pd
import datetime
import pvlib
import model_comparison

import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
%matplotlib notebook

import os, sys
lib_path = os.path.abspath(os.path.join('..', 'rdtools'))
sys.path.append(lib_path)
import filtering

import warnings
warnings.simplefilter('ignore')


## Load data

Weather and performance data are recorded at 1-minute intervals.  The data come from the Sandia National Laboratories Regional Test Center in Albuquerque, NM, and span roughly 2016 through April 2017.  The data were scraped from PVDAQ (http://bit.ly/2mKrOwG).  The meteorological data and the performance data are contained in two separate files (technically two separate systems: a PV system and a meteorological station).

SRRL data is available at https://www.nrel.gov/midc/srrl_bms/.  This data is used because it was already downloaded for a different notebook.  Irradiance is also measured at 1-minute intervals.

In [2]:
def load_snl():
    filename = os.path.expanduser('~/data_sets/snl_raw_data/1429_1405/raw_1405_weather_for_1429.csv')
    cols = ['Global_Wm2', 'Date-Time']
    data = pd.read_csv(filename, parse_dates=['Date-Time'], usecols=cols, index_col=['Date-Time'])
    data.index = data.index.tz_localize('Etc/GMT+7')
    data = data.reindex(pd.date_range(start=data.index[0], end=data.index[-1], freq='1min')).fillna(0)
    data = pd.Series(data['Global_Wm2'], index=data.index)
    data[data < 50] = 0 
    return data

In [3]:
def load_srrl():
    srrl_file = os.path.expanduser('~/data_sets/srrl/20140101.csv')
    srrl_data = pd.read_csv(srrl_file)
    srrl_data.index = pd.to_datetime(srrl_data['DATE (MM/DD/YYYY)'] + ' ' + srrl_data['MST'])
    srrl_data.index = srrl_data.index.tz_localize('Etc/GMT+7')
    srrl_data = srrl_data[~srrl_data.index.duplicated(keep='first')]
    srrl_data.drop(['DATE (MM/DD/YYYY)', 'MST'], inplace=True, axis=1)
    srrl_data = pd.Series(srrl_data['Global 40-South LI-200 [W/m^2]'], index=srrl_data.index)
    srrl_data[srrl_data < 50] = 0
    srrl_data2 = pd.Series(0, index=pd.date_range(start=srrl_data.index.date[0], 
                                                  end=srrl_data.index.date[-1] + pd.Timedelta('1D'), freq='1min'))
    srrl_data2.index = srrl_data2.index.tz_localize('Etc/GMT+7')
    srrl_data2[srrl_data.index] = srrl_data
    srrl_data = srrl_data2.copy()
    return srrl_data

In [4]:
snl_data = load_snl()
srrl_data = load_srrl()

## PVLib

In [5]:
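def make_pvlib_sys(tilt, elevation, azimuth, lat, lon):
    # Combine array orientation and site location into one pvlib system object.
    # The local name avoids shadowing the imported `sys` module.
    sys_no_loc = pvlib.pvsystem.PVSystem(surface_tilt=tilt, surface_azimuth=azimuth)
    sys_loc = pvlib.location.Location(lat, lon, altitude=elevation)
    localized_sys = pvlib.pvsystem.LocalizedPVSystem(pvsystem=sys_no_loc, location=sys_loc)
    return localized_sys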

In [6]:
snl_params = {'tilt': 35, 'elevation': 1658, 'azimuth': 180, 
              'lat': 35.0549, 'lon': -106.5433}
rtc = make_pvlib_sys(**snl_params)

In [7]:
srrl_params = {'tilt': 40, 'elevation': 1828.8, 'azimuth': 180, 
               'lat': 39.742, 'lon': -105.18}
srrl = make_pvlib_sys(**srrl_params)

## Analysis functions

In [8]:
def pvlib_compare_plot(sample, mf, pvlib, title=''):
    size = 15
    fig, axes = plt.subplots(ncols=1, nrows=2, figsize=(8, 5))

    ax = axes[0]
    _ = ax.plot(sample.index, sample)
    _ = ax.scatter(sample.index[mf & ~pvlib], sample[mf & ~pvlib], 
                   facecolor='none', edgecolor='green', label='reimplementation', s=size)
    _ = ax.scatter(sample.index[pvlib & ~mf], sample[pvlib & ~mf], 
                   facecolor='none', edgecolor='red', label='pvlib', s=size)
    _ = ax.scatter(sample.index[pvlib & mf], sample[pvlib & mf], 
                   facecolor='none', edgecolor='orange', label='both', s=size)
    _ = ax.legend()
    _ = ax.set_ylabel('GHI / W/m^2')
    ax.set_title(title, fontsize='large')
    
    tmp_df = pd.DataFrame()
    tmp_df['reimplementation'] = mf
    tmp_df['pvlib'] = pvlib
    tmp_df = tmp_df.resample('D').mean()
    tmp_df.index = tmp_df.index.date
    tmp_df.plot(kind='bar', ax=axes[1])
    _ = axes[1].set_ylabel('Fraction of day clear')
    
    fig.tight_layout()


In [9]:
def stat_cs_compare_plot(sample, stat_cs, pvlib, title=''):
    size = 15
    fig, axes = plt.subplots(ncols=1, nrows=2, figsize=(8, 5))

    ax = axes[0]
    _ = ax.plot(sample.index, sample)
    _ = ax.scatter(sample.index[stat_cs & ~pvlib], sample[stat_cs & ~pvlib], 
                   facecolor='none', edgecolor='green', label='Stat. CS', s=size)
    _ = ax.scatter(sample.index[pvlib & ~stat_cs], sample[pvlib & ~stat_cs], 
                   facecolor='none', edgecolor='red', label='PVLib CS', s=size)
    _ = ax.scatter(sample.index[pvlib & stat_cs], sample[pvlib & stat_cs], 
                   facecolor='none', edgecolor='orange', label='both', s=size)
    _ = ax.legend()
    _ = ax.set_ylabel('GHI / W/m^2')
    ax.set_title(title, fontsize='large')
    
    tmp_df = pd.DataFrame()
    tmp_df['Stat. CS'] = stat_cs
    tmp_df['PVLib CS'] = pvlib
    tmp_df = tmp_df.resample('D').mean()
    tmp_df.index = tmp_df.index.date
    tmp_df.plot(kind='bar', ax=axes[1])
    _ = axes[1].set_ylabel('Fraction of day clear')
    
    fig.tight_layout()

# Investigation

## Sandia RTC

In [10]:
sample = snl_data[(snl_data.index >= '2016-07-01') & (snl_data.index < '2016-07-15')]

### Generate model clear sky irradiance and detect clear skies in sample using PVLib functionality.

In [11]:
clear_skies = rtc.get_clearsky(sample.index)
clear_skies = pd.Series(clear_skies['ghi'], index=sample.index)
pvlib_clear, components, alpha = \
    pvlib.clearsky.detect_clearsky(sample, clear_skies, 
                                   sample.index, 10, return_components=True)

### Detect clear skies using reimplementation of RH method

In [12]:
mc = model_comparison.ModelCompareDetect(sample, clear_skies)

In [13]:
compare_clear = mc.reno_hansen_detection()

In [14]:
pvlib_compare_plot(sample, compare_clear, pvlib_clear, title='')

The reimplementation of the Reno-Hansen method agrees well with the PVLib implementation.  The reimplementation labels fewer points as clear than PVLib does, even though the same tolerances are used.  This is expected behavior: in the PVLib implementation, every point inside a window that passes all tests is given the 'clear' label, whereas in this implementation only the midpoint of each clear window is labeled as clear.
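The difference between the two labeling rules can be illustrated with a toy example.  This is only a sketch: `label_windows` is a hypothetical helper, and the window indexing convention (a window is identified by its first sample) is an assumption, not the convention used by PVLib or by `ModelCompareDetect`.

```python
import numpy as np

def label_windows(window_passes, window_length, mode="all"):
    """Turn per-window pass/fail flags into per-sample clear labels.

    window_passes[i] is True when the window covering samples
    i .. i + window_length - 1 passed every test (assumed convention).
    """
    n_samples = len(window_passes) + window_length - 1
    clear = np.zeros(n_samples, dtype=bool)
    for start in np.flatnonzero(window_passes):
        if mode == "all":
            # label every sample in the passing window
            clear[start:start + window_length] = True
        else:
            # label only the midpoint of the passing window
            clear[start + window_length // 2] = True
    return clear

# Toy case: 8 samples, 3-sample windows, windows starting at 1 and 2 pass.
window_passes = np.array([False, True, True, False, False, False])
all_labels = label_windows(window_passes, 3, mode="all")
mid_labels = label_windows(window_passes, 3, mode="midpoint")
print(all_labels.sum(), mid_labels.sum())
```

Every midpoint label is also a whole-window label, so the midpoint rule can only produce the same number of clear points or fewer.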

### Using a statistical clear sky model in place of PVLib model

In [15]:
mc = model_comparison.ModelCompareDetect(sample)
pvlib_clear_stat_cs, components, alpha = \
    pvlib.clearsky.detect_clearsky(sample, mc.model, 
                                   sample.index, 10, return_components=True)

In [16]:
stat_cs_compare_plot(sample, pvlib_clear_stat_cs, pvlib_clear)

The statistical clear sky model gives very similar results to the PVLib physical model.

## Solar Radiation Research Lab BMS

In [17]:
sample = srrl_data[(srrl_data.index >= '2014-01-01') & (srrl_data.index < '2014-01-15')]

### Generate model clear sky irradiance and detect clear skies in sample using PVLib functionality.

In [18]:
# Use the SRRL system defined above for the clear sky model at this site.
clear_skies = srrl.get_clearsky(sample.index)
clear_skies = pd.Series(clear_skies['ghi'], index=sample.index)
pvlib_clear, components, alpha = \
    pvlib.clearsky.detect_clearsky(sample, clear_skies, 
                                   sample.index, 10, return_components=True)

### Detect clear skies using reimplementation of RH method

In [19]:
mc = model_comparison.ModelCompareDetect(sample, clear_skies)

In [20]:
compare_clear = mc.reno_hansen_detection()

In [21]:
pvlib_compare_plot(sample, compare_clear, pvlib_clear, title='')

Although the plot looks poor, this is the desired result.  The reimplementation of the RH method and the original method from PVLib behave very similarly, with the reimplementation being more conservative in assigning clear labels.  The reimplementation will make it easier to analyze correlations among the derived properties and metrics used for classification, as well as additional metrics.
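As a rough illustration of what such derived properties might look like, the sketch below computes per-window statistics that loosely follow the five criteria described by Reno and Hansen (mean, maximum, line length, slope variability, and slope difference).  The function `window_metrics`, its column names, the centered rolling windows, and the synthetic data are all assumptions for illustration; they are not the internals of `ModelCompareDetect` or of `pvlib.clearsky.detect_clearsky`.

```python
import numpy as np
import pandas as pd

def window_metrics(meas, model, window=10):
    """Per-window statistics comparing measured and modeled GHI.

    Assumes both series share a regular 1-minute index.  Slopes are
    simple first differences, a simplification of the original method.
    """
    roll = dict(window=window, center=True)
    meas_slope = meas.diff()
    model_slope = model.diff()

    out = pd.DataFrame(index=meas.index)
    out["mean_diff"] = meas.rolling(**roll).mean() - model.rolling(**roll).mean()
    out["max_diff"] = meas.rolling(**roll).max() - model.rolling(**roll).max()
    # line length: sum of sqrt(dGHI^2 + dt^2) with dt = 1 minute
    meas_len = np.sqrt(meas_slope ** 2 + 1.0).rolling(**roll).sum()
    model_len = np.sqrt(model_slope ** 2 + 1.0).rolling(**roll).sum()
    out["line_length_diff"] = meas_len - model_len
    # slope variability normalized by the window-mean irradiance
    out["slope_std"] = meas_slope.rolling(**roll).std() / meas.rolling(**roll).mean()
    out["max_slope_diff"] = (meas_slope - model_slope).abs().rolling(**roll).max()
    return out

# Synthetic data: a smooth clear sky curve with one artificial cloud dip.
idx = pd.date_range("2016-07-01 06:00", periods=120, freq="1min")
model = pd.Series(1000 * np.sin(np.linspace(0.1, np.pi - 0.1, 120)), index=idx)
meas = model.copy()
meas.iloc[60:70] *= 0.5

metrics = window_metrics(meas, model)
correlations = metrics.dropna().corr()  # pairwise correlation of the metrics
```

With the metrics held in a single DataFrame, correlations between them (or against the clear/cloudy labels) are one call away, which is the kind of analysis the reimplementation is meant to simplify.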

### Detect clear skies when model from PVLib fails

The PVLib clear sky model is clearly 'misaligned' with the measurements at the SRRL BMS location.  In the case where modeled clear skies and the measurements are too far off to reconcile, the `ModelCompareDetect` object can generate a clear sky model based on a limited sample of days.

In [22]:
mc = model_comparison.ModelCompareDetect(sample)
pvlib_clear_stat_cs, components, alpha = \
    pvlib.clearsky.detect_clearsky(sample, mc.model, 
                                   sample.index, 10, return_components=True)

In [23]:
stat_cs_compare_plot(sample, pvlib_clear_stat_cs, pvlib_clear, title='')

The 'statistical clear sky model' proves useful when the model from PVLib fails.  It does require the user to choose an appropriate sample size (too large a sample will show seasonal effects; too small a sample will be noisy), and it makes assumptions that may not hold.  The biggest assumption is that, over the chosen period, the measurements contain enough clear periods for the chosen method to yield a clear sky curve.

# Conclusion

In the two example cases above, the reimplementation of the Reno-Hansen clear sky detection algorithm produces results similar to those of the original implementation.  Again, the lower number of clear sky periods is expected: the original method sets all periods within a clear window to clear, while this implementation sets only the midpoint of a clear window to clear.