# Post-processing and visualization

If you have run many HAWC2 simulations, perhaps on a cluster or using a batch script, it is useful to visualize their statistics. The workflow has two steps: first post-process the results, then visualize them.

## Post-processing

Once the result files have been generated, you can use lacbox to calculate statistics from all of the simulations and save them to an HDF5 file. The post-processing can be run with or without the 10-minute damage-equivalent loads (DELs). Running it without DELs still gives the base statistics such as mean, standard deviation, max, and min. For more documentation on the post-processing options, please see the API tab to the left.

Because post-processing is a heavy computational load, we will not demo it in detail in this notebook. However, here is some code to show you what it would look like.

```
# Assumes process_statistics has already been imported from lacbox (see the API tab for its location)
res_dir = './res/'  # folder with HDF5 HAWC2 files
save_path = 'stats.hdf5'  # where to save the stats file
calc_del = True  # set to False to skip the damage-equivalent loads
stats_df = process_statistics(res_dir, save_path, calc_del=calc_del)
```
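To give a feel for what the base statistics represent, here is a minimal sketch that computes them for a single synthetic signal. It does not use lacbox; the signal, the sampling rate, and the variable names are assumptions for illustration, and the actual post-processing may differ in details such as how the standard deviation is normalized.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one channel of a 10-minute time series (not real HAWC2 output)
t = np.arange(0, 600, 0.1)  # time in seconds, sampled at 10 Hz
signal = 5.0 + np.sin(2 * np.pi * t / 60.0)  # offset of 5 plus a 60-second oscillation

# Base statistics of the kind stored for each channel in each file
base_stats = pd.Series({
    'mean': signal.mean(),
    'std': signal.std(),
    'min': signal.min(),
    'max': signal.max(),
    '50%': np.percentile(signal, 50),
})
print(base_stats)
```

Repeating this for every channel of every result file is what makes the full post-processing expensive.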

## Visualization

Below is some code demonstrating how you can use pandas to visualize the results in the HDF5 file.

### Load the HDF5 file and examine its structure

In [None]:
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from lacbox.io import ReadHAWC2
from lacbox.test import test_data_path

res_name = Path(test_data_path) / 'dtu_10mw_steady_stats.hdf5'
stats_df = pd.read_hdf(res_name, 'stats_df')

Let's first familiarize ourselves with the structure of the stats file.

In [None]:
print(stats_df.shape)
stats_df.columns

There are 4998 rows, each corresponding to a single channel in a single file. The columns are:
 * `path`: the full path to the HAWC2 time-marching result file
 * `filename`: the name of the HAWC2 file
 * `subfolder`: the name of the subfolder, if the htc file was in one
 * `ichan`: the channel index, matching pdap (i.e., starting from 1)
 * `mean`, `max`, etc.: the value of the corresponding statistic
 * `X%`: the value of the corresponding percentile
 * `delX`: the 10-minute damage-equivalent fatigue load, calculated with the indicated value for the Wöhler exponent
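
To make the one-row-per-channel-per-file layout concrete, here is a small sketch using a toy dataframe. The filenames and values are made up for illustration and are not taken from the dataset; only the column names follow the list above.

```python
import pandas as pd

# Toy stand-in for stats_df: 2 files x 2 channels, one row per (file, channel).
# Filenames and values are hypothetical.
toy_df = pd.DataFrame({
    'filename': ['wsp_05.hdf5', 'wsp_05.hdf5', 'wsp_10.hdf5', 'wsp_10.hdf5'],
    'subfolder': ['tilt'] * 4,
    'ichan': [15, 19, 15, 19],
    'mean': [5.0, 1200.0, 10.0, 3400.0],
})

# Select all rows that belong to a single channel
print(toy_df[toy_df.ichan == 19])

# Reshape to one row per file and one column per channel
wide = toy_df.pivot(index='filename', columns='ichan', values='mean')
print(wide)
```

The long format is convenient for filtering on `ichan`, whereas the pivoted form can be easier to read when comparing a few channels side by side.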

Note that there is **NO** metadata for the channels saved. If you want to know which channel is which or what its units are, you will need to load a single time series into pdap to determine the indices of the channels that you are interested in.

This dataset is the DTU 10 MW operating with steady wind, no tower shadow, and no shear. The dataset includes two sets of simulations: with normal tilt and with tilt set to 0. Each case was saved in a subfolder, and we can examine their names.

In [None]:
stats_df.subfolder.unique()

### Isolate data of interest and make plots

Let's make a plot of some data versus mean wind speed for the two cases to investigate the effects of tilt on mean loads.

Start by defining a dictionary that maps some human-friendly names to the channel indices for this dataset.

In [None]:
chans = dict(wsp=15, omega=10, pitch=4, power=111, gentrq=81,  # wsp, rotor speed, pitch, power, generator torque
             tbfa=19, tbss=20, ttpt=22, ttrl=23,  # tower-base fore-aft and side-side, tower-top pitch and roll
             fbrm=37, ebrm=38)  # flapwise and edgewise blade-root moment

Now let's isolate the two cases with and without tilt.

In [None]:
tilt_df = stats_df[stats_df.subfolder == 'tilt']
notilt_df = stats_df[stats_df.subfolder == 'notilt']

And finally, let's make a quick plot of our channels of interest.

In [None]:
plot_chans = ['omega', 'pitch', 'power', 'gentrq', 'tbfa', 'tbss', 'ttpt', 'ttrl', 'fbrm', 'ebrm']
fig, axs = plt.subplots(5, 2, num=1, figsize=(12, 14), clear=True)
for i, chan in enumerate(plot_chans):
    ax = axs.flatten()[i]
    # get the mean value for that channel
    tilt_mean = tilt_df.loc[tilt_df.ichan == chans[chan], 'mean']
    notilt_mean = notilt_df.loc[notilt_df.ichan == chans[chan], 'mean']
    # plot versus mean wsp
    ax.plot(tilt_df.loc[tilt_df.ichan == chans['wsp'], 'mean'], tilt_mean, 'o', label='With tilt')
    ax.plot(notilt_df.loc[notilt_df.ichan == chans['wsp'], 'mean'], notilt_mean, 'x', label='Without tilt')
    # prettify
    ax.grid()
    ax.set(title=chan)

axs[0, 0].legend()

fig.tight_layout()

We can see that tilt does not significantly impact the steady-state operational values for the turbine or the tower-base fore-aft moment. However, it does have an impact on the tower-base side-side moment and a very significant impact on the tower-top pitch moment. This makes sense, as tilt creates an imbalanced load on the rotor in the lateral and vertical directions, and these extra loads increase in magnitude with wind speed.
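
The plotting loop pairs the wind-speed rows with the channel rows by their order in the dataframe. If you are unsure that the rows are ordered consistently, one option is to match them explicitly by filename. The helper below, `mean_vs_wsp`, is a hypothetical sketch (not part of lacbox) and assumes each filename appears only once per channel in the dataframe you pass in; the toy values are made up.

```python
import pandas as pd

def mean_vs_wsp(df, ichan, ichan_wsp):
    """Return mean wind speed and mean channel value, matched by filename."""
    wsp = df.loc[df.ichan == ichan_wsp, ['filename', 'mean']]
    val = df.loc[df.ichan == ichan, ['filename', 'mean']]
    # Join the two selections on filename so each pair comes from the same simulation
    merged = wsp.merge(val, on='filename', suffixes=('_wsp', '_chan'))
    return merged.sort_values('mean_wsp')

# Toy data with hypothetical values; rows are deliberately out of order
toy_df = pd.DataFrame({
    'filename': ['b.hdf5', 'a.hdf5', 'a.hdf5', 'b.hdf5'],
    'ichan': [15, 15, 19, 19],
    'mean': [10.0, 5.0, 1200.0, 3400.0],
})
pairs = mean_vs_wsp(toy_df, ichan=19, ichan_wsp=15)
print(pairs)
```

The returned columns `mean_wsp` and `mean_chan` can be passed directly to `ax.plot`.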

Let's identify which simulation has the largest mean tower-base fore-aft moment in each case.

In [None]:
chan = 'tbfa'

idxmax = tilt_df.loc[tilt_df.ichan == chans[chan], 'mean'].idxmax()
print('Max-TBFA simulation with tilt:', tilt_df.loc[idxmax, 'filename'])
print('  Max load:', tilt_df.loc[idxmax, 'mean'], 'kNm')

idxmax = notilt_df.loc[notilt_df.ichan == chans[chan], 'mean'].idxmax()
print('Max-TBFA simulation without tilt:', notilt_df.loc[idxmax, 'filename'])
print('  Max load:', notilt_df.loc[idxmax, 'mean'], 'kNm')
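
If you have many cases, the same lookup can be done for all subfolders at once with a pandas groupby. This is a sketch on a toy dataframe with hypothetical filenames and values, not on the actual dataset.

```python
import pandas as pd

# Toy stand-in for stats_df with a single channel and two subfolders (made-up values)
toy_df = pd.DataFrame({
    'filename': ['t_05.hdf5', 't_11.hdf5', 'n_05.hdf5', 'n_11.hdf5'],
    'subfolder': ['tilt', 'tilt', 'notilt', 'notilt'],
    'ichan': [19, 19, 19, 19],
    'mean': [1000.0, 3000.0, 1100.0, 3200.0],
})

chan_df = toy_df[toy_df.ichan == 19]
# idxmax returns the index label of the largest mean within each subfolder
idx = chan_df.groupby('subfolder')['mean'].idxmax()
summary = chan_df.loc[idx, ['subfolder', 'filename', 'mean']]
print(summary)
```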

As expected, both have peaks at 11 m/s, which is near the rated wind speed.