## This notebook presents the rogue wave classification results obtained with different ML architectures in different classification and ocean scenarios.

<div style="text-align: center;">
  <img src="ocean_waves.jpg" width="600">
</div>

[Source](https://oceanographicmagazine.com/news/rogue-wave-off-canada/)

- **Extreme waves are waves that are significantly larger than the preceding and subsequent waves. Although these events have a low probability of occurrence, their impact can be devastating, causing serious damage to ships, offshore structures, and the people on board them.**
- **These extreme waves are also called *freak* or *rogue* waves and can be loosely defined as waves that are significantly higher than the surrounding waves.**
- **A more precise definition relates a local wave height measure, either the wave height $H$ (from trough to crest) or the crest height $\eta_c$, to the significant wave height $H_s$. Here, the significant wave height, defined as four times the standard deviation of the sea surface elevation, provides a measure of the average wave height. If the wave height $H$ exceeds the significant wave height by a factor of 2.0 (or alternatively 2.2), the corresponding wave is a rogue wave. An alternative definition requires the crest height $\eta_c$ to exceed the significant wave height $H_s$ by a factor of 1.25.**
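The two criteria above can be written down directly. The following is a minimal sketch, not the code used to produce the results in this notebook; the function names are hypothetical, and the elevation record is assumed to be a plain list of surface elevation samples.

```python
import statistics


def significant_wave_height(elevation):
    """H_s, taken as four times the standard deviation of the surface elevation."""
    return 4.0 * statistics.pstdev(elevation)


def is_rogue_by_height(wave_height, h_s, factor=2.0):
    """Wave height criterion: H > factor * H_s (factor 2.0, or alternatively 2.2)."""
    return wave_height > factor * h_s


def is_rogue_by_crest(crest_height, h_s, factor=1.25):
    """Crest height criterion: eta_c > factor * H_s."""
    return crest_height > factor * h_s


# Toy elevation record (illustrative values only, not buoy data).
elevation = [1.0, -1.0, 1.0, -1.0]
h_s = significant_wave_height(elevation)
print(h_s, is_rogue_by_height(8.5, h_s), is_rogue_by_crest(5.1, h_s))
```

The threshold is kept as a parameter so that the stricter factor of 2.2 can be tried without changing the function.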

**The rogue wave forecasting task is set up as follows.**

- **Given a window of time series data extracted from a buoy, the purpose of the task is to predict whether there will be a rogue wave within some fixed time horizon. The training data is prepared such that there are equal proportions of wave data windows leading to a rogue wave in the horizon and those that do not lead up to a rogue wave in the horizon.**
- **The training input is thus each such data window, while the output is determined by the presence or absence of a rogue wave at the end of the fixed forecasting horizon.**
- **Experiments have been carried out to observe how the length of the training data window and the forecast horizon used in training affect the rogue wave forecasting accuracy of the trained neural network models.**
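The data preparation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the notebook's actual pipeline: window and horizon lengths are given in samples (minutes multiplied by the buoy sampling rate), a window is labeled positive when a rogue wave falls exactly at the end of the horizon, and the classes are balanced by randomly downsampling the larger class. All function and parameter names are hypothetical.

```python
import random


def window_before(series, target_idx, window_len, horizon_len):
    """Return the window that ends horizon_len samples before target_idx, or None."""
    end = target_idx - horizon_len
    start = end - window_len
    if start < 0 or target_idx >= len(series):
        return None
    return series[start:end]


def build_balanced_dataset(series, rogue_indices, window_len, horizon_len, seed=0):
    """Build (window, label) pairs with equal numbers of rogue and non-rogue windows."""
    rogue_set = set(rogue_indices)
    positives, negatives = [], []
    for target_idx in range(window_len + horizon_len, len(series)):
        window = window_before(series, target_idx, window_len, horizon_len)
        if target_idx in rogue_set:
            positives.append(window)   # a rogue wave occurs at the end of the horizon
        else:
            negatives.append(window)   # no rogue wave at the end of the horizon
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    examples = [(w, 1) for w in rng.sample(positives, n)]
    examples += [(w, 0) for w in rng.sample(negatives, n)]
    rng.shuffle(examples)
    return examples


# Toy series in which the sample value equals its index (illustrative only).
series = list(range(20))
dataset = build_balanced_dataset(series, rogue_indices=[10, 15], window_len=4, horizon_len=2)
print(len(dataset), sum(label for _, label in dataset))
```

Fixing the random seed keeps the downsampled negative set reproducible between runs.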

**An overview of the data window used and the subsequent rogue wave to be forecast is given in the illustration here. The forecast horizon $t_{horizon}$ is varied over 3, 5, and 10 minutes and the length of the training window $t_{window}$ is varied between 15 and 20 minutes to investigate their respective effects on the rogue wave forecasting accuracy.**

<div style="text-align: center;">
  <img src="Slide3.JPG" width="600">
</div>

### Initially, experiments were carried out using all the different ML architectures to forecast rogue waves over a horizon $t_{advance}$ (the forecast horizon, $t_{horizon}$ above) ranging from 0 to 10 minutes, using a data window $t_{training}$ (the training window, $t_{window}$ above) of 15 minutes to train the ML models.

<table>
  <tr>
    <td><img src = "Confusion_matrix_RWs_H_g_2_tadv_5min_0.5_scce_.png" width="400"> </td>
    <td><img src = "svm_RWs_H_g_2_tadv_5min_rw_smallWindow_0.5.jpg" width="400"> </td>
    <td><img src = "dt_RWs_H_g_2_tadv_5min_rw_smallWindow_0.5.jpg" width="400"> </td>
  </tr>
</table>

- **The confusion matrices for the rogue wave classification over $t_{advance}$=5 minutes are displayed here for the different ML architectures.**
- **The number of training examples is around 252,000 and the number of test examples is 68,000 for this dataset.**
- ***From left to right, the confusion matrices are for LSTM, SVM and DT rogue wave classification on the test data respectively.***
- ***The best classification accuracy is obtained by the model trained using the LSTM architecture (67%).***
- ***The SVM results (64%) are similar to the LSTM results; however, the DT classifier does not perform well (51%), pointing to its failure to generalize to the test data.***
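For reference, the accuracy quoted for each confusion matrix, and the rogue wave $F_{1}$ score reported later in this notebook, can be derived from the four cells of a binary confusion matrix. The sketch below assumes the rogue wave class is the positive class; the function name is hypothetical and the counts are made up for illustration, not taken from the figures above.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Accuracy, precision, recall and F1 for the positive (rogue wave) class."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall, written in terms of counts.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration only.
print(metrics_from_confusion(tn=50, fp=10, fn=20, tp=20))
```

On a balanced test set, accuracy and $F_{1}$ tend to move together, but the $F_{1}$ score is the more informative of the two when missed rogue waves matter more than false alarms.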

<table>
  <tr>
    <td><img src = "Best_accuracies.png" width="600"> </td>
    <td><img src = "Best_f1_scores.png" width="600"> </td>
  </tr>
</table>

- **The best classification accuracies are observed when $rw_{prop}$=0.5.**
- **The best classification accuracies observed for the different model architectures over the range of forecasting horizons are displayed here.**
- ***It is observed that the LSTM model performs the best over the range of the forecasting horizons.***
- ***The SVM classifier closely follows the LSTM model.***
- ***However, the DT classifier does not perform well as the forecasting horizon increases.***
- ***For all the model architectures, the classification accuracies diminish as the forecasting horizon increases.***

### Following this, experiments were carried out using all the different ML architectures to forecast rogue waves over a horizon $t_{advance}$ ranging from 0 to 10 minutes, with the data window $t_{training}$ used to train the ML models increased to 20 minutes.

<div style="text-align: center;">
  <img src="Effect_of_training_window.png" width="400">
</div>

- **The classification accuracies observed for the different model architectures for a forecast horizon $t_{advance}$ of 5 minutes and two different durations of the training window, $t_{training}$ = {15, 20} minutes, are displayed here.**
- ***It is observed that the LSTM model performs the best for both the training window sizes.***
- ***The SVM classifier closely follows the LSTM model.***
- ***However, the accuracy of the DT classifier remains the same for both the training window sizes, signifying that the model might be too simple to capture the complexities of the data used here.***
- ***For the LSTM and SVM models, the test classification accuracy increases by 4% and 6%, respectively, as $t_{training}$ increases from 15 to 20 minutes.***

## To understand the applicability of this classification process in diverse and, in particular, localized scenarios, rogue wave classification experiments were carried out using data obtained from buoys located within some miles of each other.

### In the first case, a group of three buoys was chosen, as shown [here](all_water_buoys_with_distance_triangular_area.html). For this, the models were trained using data from $d_{1}$ and $d_{2}$ and tested on classifying rogue waves in data obtained from $s_{1}$. $d_{1}$ and $d_{2}$ are buoys located in deep water while $s_{1}$ is a shallow water buoy.

<table>
  <tr>
    <td><img src = "buoySystems_triangular_area/3min/Confusion_matrix_RWs_H_g_2_tadv_3min_0.5_scce_.png" width="400"> </td>
    <td><img src = "buoySystems_triangular_area/3min/confusion_matrices_svm/RWs_H_g_2_tadv_3min_triangular_area_rw_0.5.jpg" width="400"> </td>
    <td><img src = "buoySystems_triangular_area/3min/confusion_matrices_dt/RWs_H_g_2_tadv_3min_triangular_area_rw_0.5.jpg" width="400"> </td>
  </tr>
</table>

- **The confusion matrices for the rogue wave classification over $t_{advance}$=3 minutes are displayed here for the different ML architectures.**
- **The number of training examples is around 9900 and the number of test examples is 7600 for this dataset.**
- ***From left to right, the confusion matrices are for LSTM, SVM and DT rogue wave classification on the test data respectively.***
- ***The best classification accuracy is obtained by the model trained using the SVM architecture (64%).***
- ***The SVM results are much better than those of the LSTM model (55%) and the DT classifier (51%).***
- ***This shows that, compared to neural networks, the SVM model is better suited to cases where there is a dearth of training examples. It can generalize better to unseen data in diverse locations. The DT, as observed previously, fails to generalize well to unseen data.***
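The train/test protocol used in this localized scenario, training on some buoys and testing on a buoy the model has never seen, can be sketched as a split keyed on the buoy identifier. This is a minimal sketch with a hypothetical function name; the buoy labels follow the text, and each example is assumed to be a `(buoy_id, window, label)` tuple.

```python
def split_by_buoy(examples, train_buoys, test_buoys):
    """Split (buoy_id, window, label) tuples so no buoy appears in both sets."""
    train_buoys, test_buoys = set(train_buoys), set(test_buoys)
    overlap = train_buoys & test_buoys
    if overlap:
        raise ValueError(f"buoys used for both training and testing: {sorted(overlap)}")
    train = [(window, label) for buoy, window, label in examples if buoy in train_buoys]
    test = [(window, label) for buoy, window, label in examples if buoy in test_buoys]
    return train, test


# Toy examples (illustrative windows and labels only).
examples = [
    ("d1", [0.1, 0.2], 0),
    ("d2", [0.3, 0.9], 1),
    ("s1", [0.2, 0.1], 0),
    ("d1", [0.5, 1.2], 1),
]
train, test = split_by_buoy(examples, train_buoys=["d1", "d2"], test_buoys=["s1"])
print(len(train), len(test))
```

Rejecting overlapping buoy lists guards against accidentally evaluating on data from a buoy that also contributed training windows.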

### In the second case, a group of five buoys was chosen, as shown [here](all_water_buoys_with_distance_localized.html). For this, the models were trained using data from $d_{1}$ to $d_{4}$ and tested on classifying rogue waves in data obtained from $d_{5}$. $d_{5}$ is located closer to the shore while the other buoys are at more off-shore locations.

<table>
  <tr>
    <td><img src = "Results_github/accuracies_intermediate buoys.png" width="500"> </td>
    <td><img src = "Results_github/f1_scores_intermediate buoys.png" width="500"> </td>
  </tr>
</table>

- **The classification accuracies and the rogue wave $F_{1}$ scores observed for the different model architectures for a training window $t_{training}$ = 15 minutes over the range of the forecasting horizons $t_{advance}$ are displayed here.**
- **The number of training examples is around 8094 and the number of test examples is 340 for this dataset.**
- ***It is observed that the SVM model performs the best over the range of forecasting horizons, as in the previous case.***
- ***However, the accuracy of the DT classifier remains the same for both the training window sizes, signifying that the model might be too simple to capture the complexities of the data used here.***
- ***For the LSTM and SVM models, both the test classification accuracy and the $F_{1}$ score increase as $t_{training}$ increases.***
- ***This further reiterates that the SVM model is better suited to cases with fewer training examples than data-intensive neural networks. It can generalize better to unseen data in diverse locations. The DT, as observed previously, fails to generalize well to unseen data.***