## This code reads the unrandomized training/validation/test datasets, as classified by the trained CNN, and plots the distributions of the CNN predictions (0 = ice; 1 = water)

In [None]:
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np

Load the dataframes in which the CNN outputs of the classified samples are stored

In [None]:
dfice = pd.read_pickle(r'../../data/AllIceDF.pkl')
dfwater = pd.read_pickle(r'../../data/AllWaterDF.pkl')

Group the dataframes by bins of incidence angles

In [None]:
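# Bin edges at every degree from 19 to 50, giving 31 one-degree bins
bins = np.arange(19, 51)

# Group the samples of each class by incidence-angle bin
wat_ang = dfwater.groupby(pd.cut(dfwater['Angle'], bins=bins))
ice_ang = dfice.groupby(pd.cut(dfice['Angle'], bins=bins))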
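
As a quick check of how this binning behaves, the sketch below applies the same `pd.cut` call to a small made-up dataframe. The angle and prediction values are illustrative and not taken from the study; only the column names match the notebook. `pd.cut` intervals are right-closed by default, so an angle of exactly 19 falls outside the first bin `(19, 20]`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names match the notebook.
toy = pd.DataFrame({
    'Angle': [19.0, 19.5, 20.0, 20.3, 49.9, 50.0],
    'cnn_prediction': [0.10, 0.20, 0.30, 0.90, 0.80, 0.70],
})

bins = np.arange(19, 51)  # bin edges 19, 20, ..., 50 -> 31 one-degree bins
angle_bin = pd.cut(toy['Angle'], bins=bins)

# Samples per bin; observed=False keeps empty bins so there is one row per bin.
counts = toy.groupby(angle_bin, observed=False)['cnn_prediction'].count()

n_bins = len(counts)
n_unbinned = int(angle_bin.isna().sum())  # the angle of exactly 19.0 is not binned
print(n_bins, n_unbinned)
```

If samples at exactly 19 degrees should be kept, `pd.cut` accepts `include_lowest=True`, which closes the first interval on the left.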

## Prediction box plot
Note: the prediction for ice is the complement of the prediction for water, since the output is binary.

$Prediction_{CNN} = P_{water}$, where $P_{water}$ is the probability that the sample belongs to the water class.

$P_{ice} = 1 - P_{water} = 1 - Prediction_{CNN}$

The closer the CNN prediction ($Prediction_{CNN}$) is to 1, the more probable it is that the sample is water; the closer it is to 0, the more probable it is that the sample is ice. If the sample is in fact water and the prediction is 1, the prediction is perfect. The same goes for ice: if the sample is ice and the prediction is 0, the prediction is perfect.

For this study, a prediction for an ice sample is considered accurate if $Prediction_{CNN}\leq0.5$, and a prediction for a water sample is considered accurate if $Prediction_{CNN}\geq0.5$.
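
The decision rule above can be sketched as follows. The prediction arrays are made-up values and `class_accuracy` is a hypothetical helper that is not part of the original notebook; the threshold of 0.5 and the inclusive comparisons follow the definitions above.

```python
import numpy as np

def class_accuracy(predictions, true_class, threshold=0.5):
    """Fraction of samples of one true class that are classified correctly.

    predictions: CNN outputs, interpreted as P_water.
    true_class: 'ice' or 'water'.
    """
    predictions = np.asarray(predictions, dtype=float)
    if true_class == 'ice':
        correct = predictions <= threshold
    elif true_class == 'water':
        correct = predictions >= threshold
    else:
        raise ValueError("true_class must be 'ice' or 'water'")
    return float(correct.mean())

# Illustrative values only, not results from the study.
ice_predictions = np.array([0.02, 0.10, 0.45, 0.70])
water_predictions = np.array([0.95, 0.80, 0.30, 0.60, 0.99])

p_ice = 1 - ice_predictions  # probability of the ice class for each ice sample
ice_acc = class_accuracy(ice_predictions, 'ice')        # 3 of 4 correct
water_acc = class_accuracy(water_predictions, 'water')  # 4 of 5 correct
print(ice_acc, water_acc)
```

Because both comparisons are inclusive, a prediction of exactly 0.5 counts as correct for either class under this rule.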

## Figure: Distribution of CNN predictions for each degree of incidence angle

In [None]:
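fig, ax5 = plt.subplots(figsize=(15,10))
# .unique() yields one array of distinct prediction values per angle bin.
# Ice is plotted as 1 - Prediction_CNN (i.e. P_ice); whiskers mark the 5th and 95th percentiles.
bp1 = ax5.boxplot(1-ice_ang['cnn_prediction'].unique(), whis = [5, 95], sym = '', 
            labels = bins[0:-1], positions = np.arange(0.8, len(bins)-0.2, 1), 
            widths = 0.3, patch_artist = True)
# Water is plotted as Prediction_CNN (i.e. P_water), offset to the right of the ice box.
bp2 = ax5.boxplot(wat_ang['cnn_prediction'].unique(), whis = [5, 95], sym = '',
            labels = bins[0:-1], positions = np.arange(1.2, len(bins), 1),
            widths = 0.3, patch_artist = True)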

# Colour the ice boxes green and the water boxes blue
for box in bp1['boxes']:
    box.set(facecolor = 'green')

for box in bp2['boxes']:
    box.set(facecolor = 'blue')
ax5.set_xlim(0,31)
ax5.set_xticks(np.arange(0, len(bins)+2,2))
ax5.set_xticklabels(np.arange(18, 51,2).tolist(), fontweight = 'bold', fontsize = 22)
ax5.set_ylim(0,1)
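# Fix the tick positions explicitly so the labels below always line up with them
ax5.set_yticks(np.arange(0, 1.2, 0.2))
ax5.set_yticklabels(np.around(np.arange(0,1.2,0.2),decimals=2).tolist(), fontweight = 'bold', fontsize = 22)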
ax5.set_ylabel(r'Model Prediction', fontsize = 26, fontweight = 'bold')
ax5.set_xlabel(r'Incidence Angle ($^\circ$)', fontsize = 26, fontweight = 'bold')
ax5.legend([bp1["boxes"][0], bp2["boxes"][0]], ['Ice', 'Water'], loc='lower left', fontsize=26)

# fig.savefig('../../data/figures/Figure6.png')

<center><img src="../../data/figures/Figure6.png" height="500px"></center>

<center>Figure 6. Distribution of the predictions of the 4-band input model (Table 4) per incidence angle. The orange line is the median, the boxes span the 1st to 3rd quartiles, and the whiskers mark the 5th and 95th percentiles.</center>
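
The statistics drawn in the box plot can also be computed numerically. The sketch below uses randomly generated predictions (not the study's data) and a hypothetical helper, `box_stats`, to obtain for each incidence-angle bin the median, the 1st and 3rd quartiles, and the 5th and 95th percentiles that correspond to `whis = [5, 95]`.

```python
import numpy as np
import pandas as pd

def box_stats(df, bins, column='cnn_prediction'):
    """Per-bin percentiles matching a box plot drawn with whis=[5, 95]."""
    grouped = df.groupby(pd.cut(df['Angle'], bins=bins), observed=False)[column]
    # One row per (bin, quantile) pair, reshaped to one column per quantile.
    stats = grouped.quantile([0.05, 0.25, 0.50, 0.75, 0.95]).unstack()
    stats.columns = ['p05', 'q1', 'median', 'q3', 'p95']
    return stats

# Synthetic data for illustration only; column names follow the notebook.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'Angle': rng.uniform(19.01, 50, size=5000),
    'cnn_prediction': rng.beta(8, 2, size=5000),
})

stats = box_stats(toy, np.arange(19, 51))
print(stats.head())
```

Matplotlib and pandas may use slightly different interpolation conventions for percentiles, so the values can differ marginally from the plotted boxes for small bins.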