## Generating solar scenarios using PGscen ##

In this notebook we will use PGscen to create a thousand scenarios for photovoltaic generator output across the Texas 7k system. The output generated here can also be created using the ```pgscen-solar``` command line tool installed as part of PGscen.

We begin by locating the folder that holds the input datasets and where the output scenarios will be saved. We then choose the date for which scenarios will be generated. Scenarios start at midnight local time, but our data is normalized to UTC, so the scenario start time is 6am UTC (local time is six hours behind UTC on this date).

In [None]:
from pathlib import Path

cur_path = Path("day-ahead_solar.ipynb").parent.resolve()
data_dir = Path(cur_path, '..', "data").resolve()
print("The data folder we will use is: {}".format(data_dir))

import pandas as pd

scenario_count = 1000
start_date = '2018-03-02'
scen_start_time = pd.to_datetime(' '.join([start_date, "06:00:00"]), utc=True)
print("Scenarios will start at: {}".format(scen_start_time))
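
As a quick check of the time convention described above, the following sketch converts midnight local time on the scenario day to UTC. The `America/Chicago` zone label is our assumption for the local time of the Texas system; it does not appear in the PGscen inputs used here.

```python
import pandas as pd

# Midnight local time on the scenario day
# (the America/Chicago zone label is an assumption for illustration)
local_midnight = pd.Timestamp("2018-03-02 00:00:00", tz="America/Chicago")

# Convert to UTC, the convention used for the input datasets
utc_start = local_midnight.tz_convert("UTC")
print(utc_start)  # 2018-03-02 06:00:00+00:00
```

This matches the `scen_start_time` value built in the cell above.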

The next step is to load the input datasets: the actual output for each photovoltaic generator, the day-ahead forecasted output, and the generator characteristics. The first two are then split according to whether they come from before the start of the day we want to generate scenarios for (```hist```) or from the scenario day onward (```futures```).

In [None]:
from pgscen.command_line import (
    load_solar_data, split_actuals_hist_future, split_forecasts_hist_future)

(solar_site_actual_df, solar_site_forecast_df,
     solar_meta_df) = load_solar_data(data_dir)

(solar_site_actual_hists,
     solar_site_actual_futures) = split_actuals_hist_future(
            solar_site_actual_df, scen_start_time)

(solar_site_forecast_hists,
     solar_site_forecast_futures) = split_forecasts_hist_future(
            solar_site_forecast_df, scen_start_time)

print("SOLAR ACTUALS")
print(solar_site_actual_df.iloc[:5, [3, 101]])
print("")
print("SOLAR FORECASTS")
print(solar_site_forecast_df.iloc[:5, [0, 1, 5, 103]])
print("")

import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import matplotlib.dates as mdates
%matplotlib inline
plt.rcParams['figure.figsize'] = [19, 11]

fig, (hist_ax, future_ax) = plt.subplots(figsize=(13, 6), nrows=1, ncols=2)
title_args = dict(weight='semibold', size=19)
actual_clr, fcst_clr = "#430093", "#D9C800"
plt_asset = 'Angelo Solar'

hist_ax.set_title("History", **title_args)
hist_ax.plot(solar_site_actual_hists[plt_asset][-250:], c=actual_clr)
hist_ax.plot(solar_site_forecast_hists['Forecast_time'][-250:],
             solar_site_forecast_hists[plt_asset][-250:],
             c=fcst_clr)
hist_ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))

future_ax.set_title("Future", **title_args)
future_ax.plot(solar_site_actual_futures[plt_asset][:250], c=actual_clr)
future_ax.plot(solar_site_forecast_futures['Forecast_time'][:250],
               solar_site_forecast_futures[plt_asset][:250], c=fcst_clr)
future_ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))

lgnd_ptchs = [Patch(color=actual_clr, alpha=0.53, label="Actuals"),
              Patch(color=fcst_clr, alpha=0.53, label="Forecasts")]

fig.legend(handles=lgnd_ptchs, frameon=False, fontsize=17, ncol=2, loc=8,
           bbox_to_anchor=(0.5, -0.04), handletextpad=0.7)
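
To illustrate the idea behind the history/future split, here is a minimal sketch on toy data. The helper `split_hist_future` and the `Site A` column are hypothetical names introduced for this example; this is not the PGscen implementation, which may differ in detail (for instance, forecasts carry their timestamps in a `Forecast_time` column rather than in the index).

```python
import pandas as pd


def split_hist_future(df, start_time):
    """Hypothetical helper: split a time-indexed frame at start_time."""
    is_hist = df.index < start_time

    # rows strictly before the start time are history; the rest are future
    return df[is_hist], df[~is_hist]


# Toy hourly data covering two days (values are made up)
times = pd.date_range("2018-03-01 00:00", periods=48,
                      freq=pd.Timedelta(hours=1), tz="UTC")
toy_actuals = pd.DataFrame({"Site A": range(48)}, index=times)
start = pd.to_datetime("2018-03-02 06:00:00", utc=True)

toy_hist, toy_future = split_hist_future(toy_actuals, start)
print(len(toy_hist), len(toy_future))  # 30 18
```

Note that the first row of the future frame falls exactly on the scenario start time, so the scenario day itself belongs to the future portion.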

We are now ready to create a scenario engine and instruct it to fit a set of models describing the historical generator output data. Note that this step may take around a minute to complete.

In [None]:
from pgscen.engine import SolarGeminiEngine

se = SolarGeminiEngine(solar_site_actual_hists, solar_site_forecast_hists,
                       scen_start_time, solar_meta_df)
se.fit_solar_model()

The fitted models include one for generator behavior during the daytime, as well as a series of conditional models describing dawn and dusk times, which must be handled separately. Each of these models consists of a covariance matrix across all pairs of generators and another covariance matrix across all pairs of timesteps included in the model.

In [None]:
import seaborn as sns
from scipy.spatial import distance
from scipy.cluster.hierarchy import linkage, dendrogram

cov_cmap = sns.diverging_palette(3, 237, s=81, l=43, sep=3, as_cmap=True)

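def get_clustermat(cov_mat):
    """Reorder a covariance matrix so that similar rows sit next to each other."""
    # hierarchical clustering on the rows; the dendrogram leaves give the order
    clust_order = dendrogram(linkage(distance.pdist(cov_mat,
                                                    metric='euclidean'),
                                     method='centroid'),
                             no_plot=True)['leaves']

    return cov_mat.iloc[clust_order, clust_order]
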
fig, axarr = plt.subplots(figsize=(15, 21),
                          nrows=len(se.gemini_dict), ncols=2)

day_model = se.gemini_dict['day']['gemini_model']
sns.heatmap(get_clustermat(day_model.asset_cov),
            ax=axarr[0, 0], cmap=cov_cmap, vmin=-1, vmax=1, square=True)
sns.heatmap(get_clustermat(day_model.horizon_cov),
            ax=axarr[0, 1], cmap=cov_cmap, vmin=-1, vmax=1, square=True)

for i in range(se.cond_count):
    cond_model = se.gemini_dict['cond', i]['gemini_model']
    
    sns.heatmap(get_clustermat(cond_model.asset_cov), ax=axarr[i + 1, 0],
                cmap=cov_cmap, vmin=-1, vmax=1, square=True)
    sns.heatmap(get_clustermat(cond_model.horizon_cov), ax=axarr[i + 1, 1],
                cmap=cov_cmap, vmin=-1, vmax=1, square=True)

axarr[0, 0].set_title("Generator Covariance\n", **title_args)
axarr[0, 1].set_title("Timestep Covariance\n", **title_args)
plt.tight_layout(w_pad=-2, h_pad=1.7)
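
To give some intuition for how a generator covariance and a timestep covariance can jointly describe every (generator, timestep) pair, here is a small sketch. It assumes the two matrices combine as a Kronecker product, which is a common choice for this kind of two-matrix model but is our assumption here rather than something taken from the PGscen code. All numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy covariance matrices (illustrative values, not fitted PGscen output)
asset_cov = np.array([[1.0, 0.6],
                      [0.6, 1.0]])             # 2 generators
horizon_cov = np.array([[1.0, 0.5, 0.2],
                        [0.5, 1.0, 0.5],
                        [0.2, 0.5, 1.0]])      # 3 timesteps

# Assumption: the joint covariance is the Kronecker product of the two,
# so entry (i*3 + s, j*3 + t) equals asset_cov[i, j] * horizon_cov[s, t]
joint_cov = np.kron(asset_cov, horizon_cov)    # shape (6, 6)

# Draw one joint sample and view it as a generator-by-timestep grid
sample = rng.multivariate_normal(np.zeros(6), joint_cov).reshape(2, 3)
print(joint_cov.shape, sample.shape)  # (6, 6) (2, 3)
```

Under this assumption, two small matrices are enough to describe the covariance between any generator at any timestep and any other generator at any other timestep.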

We are now ready to use these fitted models to generate scenarios. This is done by producing deviations from the forecasted data for the given day using distributions whose parameters were determined during the fitting step. Scenario generation can also take roughly a minute of runtime.

In [None]:
se.create_solar_scenario(scenario_count, solar_site_forecast_futures)
print(se.scenarios['solar'].iloc[:, [0, 301, 777, 1001]])
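
The idea of building scenarios as deviations from a forecast can be sketched on toy data as follows. This is a deliberately simplified illustration that draws Gaussian deviations directly; the forecast values, the capacity, and the covariance are all made up, and the actual PGscen procedure may transform the data differently before and after sampling.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy hourly forecast for one generator in MW (made-up values)
forecast = np.array([0.0, 12.0, 35.0, 50.0, 38.0, 10.0, 0.0])
capacity = 60.0  # hypothetical nameplate capacity in MW

# Toy covariance across timesteps: nearby hours are more strongly correlated
hours = np.arange(forecast.size)
dev_cov = 16.0 * 0.7 ** np.abs(hours[:, None] - hours[None, :])

# Draw correlated deviations and add them to the forecast
n_scen = 1000
deviations = rng.multivariate_normal(np.zeros(forecast.size), dev_cov,
                                     size=n_scen)

# Keep the output physically plausible: between zero and capacity
scenarios = np.clip(forecast + deviations, 0.0, capacity)

# Toy simplification: no output in hours where the forecast is zero
scenarios[:, forecast == 0.0] = 0.0
print(scenarios.shape)  # (1000, 7)
```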

In [None]:
plt_asset = se.asset_list[17]

for i in range(scenario_count):
    plt.plot(se.scenarios['solar'].iloc[i][plt_asset],
             c='black', alpha=0.11, lw=0.2)

plt_fcst = se.forecasts['solar'][plt_asset]
plt.plot(plt_fcst, c=fcst_clr, alpha=0.47, lw=7.1)
plt.plot(solar_site_actual_futures.loc[plt_fcst.index, plt_asset],
         c=actual_clr, alpha=0.47, lw=7.1)

lgnd_ptchs = [Patch(color='black', alpha=0.23, label="Scenarios"),
              Patch(color=fcst_clr, alpha=0.81, label="Forecast"),
              Patch(color=actual_clr, alpha=0.81, label="Actual")]

plt.legend(handles=lgnd_ptchs, frameon=False, fontsize=17, ncol=3, loc=8,
           bbox_to_anchor=(0.5, -0.13), handletextpad=0.7)
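
Besides plotting every scenario, it can be useful to summarize the spread with quantile bands. The sketch below uses a toy stand-in for `se.scenarios['solar']`, assuming the layout implied by the indexing in the cell above (one row per scenario, columns keyed by asset and then by time). The site names and values are made up.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Toy stand-in for the scenario table: rows are scenarios,
# columns are (asset, time) pairs
times = pd.date_range("2018-03-02 06:00", periods=4,
                      freq=pd.Timedelta(hours=1), tz="UTC")
cols = pd.MultiIndex.from_product([["Site A", "Site B"], times])
toy_scens = pd.DataFrame(rng.uniform(0, 50, size=(1000, len(cols))),
                         columns=cols)

# 5%, 50% and 95% quantiles across scenarios for one asset;
# the result has one row per quantile and one column per timestep
bands = toy_scens["Site A"].quantile([0.05, 0.5, 0.95])
print(bands.shape)  # (3, 4)
```

The rows of `bands` could then be drawn with `plt.fill_between` to show an envelope around the forecast instead of a thousand individual lines.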

Our final step is to save the generated scenarios to file. We include the actual and forecasted generator outputs in the saved data to facilitate downstream analyses.

In [None]:
se.write_to_csv(data_dir, {'solar': solar_site_actual_futures},
                write_forecasts=True)