This notebook makes machine-learning-based predictions of whether supercell storm mode is favored. The predictions are based on nine input variables, and three different machine learning models are available. Full details will be included in a forthcoming journal article.

In [1]:
#imports
import pickle
import numpy as np
from tensorflow import keras
from util import adjust_probs

In [2]:
#load models
with open('models/GBT.sav', 'rb') as f:
    gbt = pickle.load(f)
with open('models/SVM.sav', 'rb') as f:
    svm = pickle.load(f)
ann = keras.models.load_model('models/ANN')

In [3]:
#define input variable values
MUCAPE = 3000.0 #Description:Most Unstable Parcel CAPE Units: J/kg
MUCIN = -50.0 #Description:Most Unstable Parcel CIN Units: J/kg
MULCL = 1500.0 #Description:Most Unstable Parcel LCL Units: m
LLCAPE = 200.0 #Description:Most Unstable Parcel CAPE in the 3km above the LFC Units: J/kg
sfc1BWD = 7.5 #Description:0-1km Bulk Wind Difference Units: m/s
EBWD = 15.0 #Description:Effective Bulk Wind Difference Units: m/s
ESRH = 100.0 #Description:Effective Storm Relative Helicity Units: m2/s2
ELSRW = 25.0 #Description:Storm Relative Wind at the Equilibrium Level Units: m/s
ESRW = 15.0 #Description:Storm Relative Wind in the Effective Inflow Layer Units: m/s

input_variables = np.asarray([MUCAPE, MUCIN, MULCL, LLCAPE, sfc1BWD, EBWD, ESRH, ELSRW, ESRW]).reshape(1,-1)
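
Because the models receive a bare array, the order of the nine values matters. The following is a minimal sketch of one way to guard against ordering mistakes. The names `FEATURE_ORDER` and `build_input_row` are hypothetical helpers, not part of this notebook or of `util`; the order itself is copied from the `np.asarray` call above.

```python
# Hypothetical helper (an assumption, not part of the notebook): builds the
# 1 x 9 input row from named values so the feature order cannot be mixed up.
FEATURE_ORDER = ["MUCAPE", "MUCIN", "MULCL", "LLCAPE", "sfc1BWD",
                 "EBWD", "ESRH", "ELSRW", "ESRW"]

def build_input_row(values):
    """Return [[v1, ..., v9]] in FEATURE_ORDER; raise on missing or extra names."""
    missing = [name for name in FEATURE_ORDER if name not in values]
    extra = [name for name in values if name not in FEATURE_ORDER]
    if missing or extra:
        raise ValueError(f"missing: {missing}, unexpected: {extra}")
    return [[float(values[name]) for name in FEATURE_ORDER]]

# Same example values as in the cell above, given in arbitrary order.
sounding = {
    "ESRW": 15.0, "MUCAPE": 3000.0, "MUCIN": -50.0, "MULCL": 1500.0,
    "LLCAPE": 200.0, "sfc1BWD": 7.5, "EBWD": 15.0, "ESRH": 100.0,
    "ELSRW": 25.0,
}
row = build_input_row(sounding)
```

Passing `row` to `np.asarray` should give an array of shape `(1, 9)`, the same layout as `input_variables`.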

In [4]:
#scale inputs for ann and svm
with open('models/scaler.sav', 'rb') as f:
    scaler = pickle.load(f)
input_variables_scaled = scaler.transform(input_variables)

In [5]:
#make predictions
gbt_prediction = gbt.predict_proba(input_variables)[:,-1][0]
svm_prediction = svm.predict_proba(input_variables_scaled)[:,-1][0]
ann_prediction = ann.predict(input_variables_scaled)[0][0]
avg_prediction = (gbt_prediction+svm_prediction+ann_prediction)/3.0 #model average

In [6]:
#display predictions
#conditional probability of supercell
print('Gradient Boosted Tree Ensemble Prediction:',round(100*gbt_prediction,1),'%')
print('Support Vector Machine Prediction:',round(100*svm_prediction,1),'%')
print('Artificial Neural Network Prediction:',round(100*ann_prediction,1),'%')
print('Average Prediction of the Three Machine Learning Models:',round(100*avg_prediction,1),'%')

Gradient Boosted Tree Ensemble Prediction: 98.8 %
Support Vector Machine Prediction: 85.1 %
Artificial Neural Network Prediction: 84.4 %
Average Prediction of the Three Machine Learning Models: 89.4 %


<b> Adjusting Probabilities: </b>
The machine learning models were trained on a dataset containing a higher ratio of supercells than would be expected climatologically. The machine learning probabilities can be adjusted to account for this, as described in our paper, using the code shown below.
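
The source of `adjust_probs` is not shown in this notebook, so the following is only a sketch of a common prior-shift correction based on Bayes' rule. It assumes the third argument is the supercell fraction the model effectively saw in training. The name `prior_shift_adjust` and its parameter names are hypothetical, and the actual `util.adjust_probs` may differ.

```python
def prior_shift_adjust(prob, target_prevalence, training_prevalence):
    """Rescale a probability from the training class ratio to a target ratio.

    Assumption: this is a generic prior-shift correction, not necessarily
    what util.adjust_probs implements.
    """
    # Reweight the positive (supercell) class by the ratio of prevalences.
    pos = prob * target_prevalence / training_prevalence
    # Reweight the negative class by the ratio of its prevalences.
    neg = (1.0 - prob) * (1.0 - target_prevalence) / (1.0 - training_prevalence)
    # Renormalize so the two classes sum to one.
    return pos / (pos + neg)

# Example: raw probability 0.851, climatological prevalence 0.25,
# assumed training prevalence 0.50.
example = prior_shift_adjust(0.851, 0.25, 0.50)
```

When the target and training prevalences are equal, the probability is returned unchanged; lowering the target prevalence below the training prevalence pulls the probability down.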

In [7]:
#estimated prevalence of supercells climatologically. In our study we assume 25% or 0.25
sup_climo = 0.25

#adjust probabilities
gbt_prediction_adjusted = adjust_probs(gbt_prediction, sup_climo, 0.77)
svm_prediction_adjusted = adjust_probs(svm_prediction, sup_climo, 0.50)
ann_prediction_adjusted = adjust_probs(ann_prediction, sup_climo, 0.77)
avg_prediction_adjusted = (gbt_prediction_adjusted + svm_prediction_adjusted + ann_prediction_adjusted)/3.0 #model average; rounding is done only at display time

In [8]:
#display predictions
#conditional probability of supercell
print('Gradient Boosted Tree Ensemble Prediction:',round(100*gbt_prediction_adjusted,1),'%')
print('Support Vector Machine Prediction:',round(100*svm_prediction_adjusted,1),'%')
print('Artificial Neural Network Prediction:',round(100*ann_prediction_adjusted,1),'%')
print('Average Prediction of the Three Machine Learning Models:',round(100*avg_prediction_adjusted,1),'%')

Gradient Boosted Tree Ensemble Prediction: 89.4 %
Support Vector Machine Prediction: 65.6 %
Artificial Neural Network Prediction: 34.9 %
Average Prediction of the Three Machine Learning Models: 63.3 %
