# Evaluation of the best model for comparison with a similar study
File: vyhodnotenie.ipynb

Study programme: Business Informatics (Hospodárska informatika)

Author: Bc. Veronika Motúzová

Master's thesis: Prediction of geomagnetic storms using deep learning

Thesis supervisor: doc. Ing. Peter Butka, PhD.

Consultants: Ing. Viera Maslej Krešňáková, PhD., RNDr. Šimon Mackovjak, PhD.

### Installing libraries

In [None]:
!pip install pyarrow
!pip install keras
!pip install --upgrade tensorflow
!pip install --upgrade tensorflow-gpu

### Checking server capacity

In [2]:
!nvidia-smi

Mon Mar 13 21:23:33 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.86.01    Driver Version: 515.86.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Quadro RTX 4000     Off  | 00000000:8B:00.0 Off |                  N/A |
| 30%   41C    P0    39W / 125W |    912MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### Importing libraries

In [18]:
import pandas as pd
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
import seaborn as sns

from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential

from keras.layers import Dense, Activation, Dropout, Input, Conv1D, LSTM, MaxPooling1D, Flatten, TimeDistributed, Bidirectional
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.models import Model


from scipy.special import expit, logit

from sklearn.metrics import confusion_matrix, classification_report, matthews_corrcoef

### Loading data, removing NA values, selecting the predicted attribute

In [22]:
test = pd.read_csv('test_omni.csv')
features = ['time1',
           'DST',
            'F10_INDEX',
            'BZ_GSM',
            'DST+6']
test = test[features]
test['time1']=pd.to_datetime(test['time1'])
predicted_label = 'DST+6'
predicators = ['DST',
            'F10_INDEX',
            'BZ_GSM']
y_col='DST+6'
y_test = test[y_col].values.copy()
X_test = test[predicators].values.copy()

In [23]:
# set batch, n_input, n_features

n_input = 6  # number of past timesteps (rows) in each input window used to forecast the next sample
n_features = X_test.shape[1]  # number of predictor columns (features) used to predict y
b_size = 256  # number of time-series samples in each batch

test_generator = TimeseriesGenerator(X_test, y_test, length=n_input, batch_size=b_size)
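
The windowing performed by `TimeseriesGenerator` can be illustrated with plain NumPy. The sketch below assumes the generator's default stride and sampling rate of 1, in which case each window `X[i-length:i]` is paired with the target `y[i]`; this is why the true labels are later sliced as `y_test[n_input:]`. The helper `make_windows` and the `*_demo` arrays are hypothetical and are not part of the notebook.

```python
import numpy as np

def make_windows(X, y, length):
    """Pair each window X[i-length:i] with target y[i] for i = length .. len(X)-1."""
    windows = np.stack([X[i - length:i] for i in range(length, len(X))])
    targets = y[length:]
    return windows, targets

# Toy data: 10 timesteps, 2 features (illustrative only)
X_demo = np.arange(20, dtype=float).reshape(10, 2)
y_demo = np.arange(10, dtype=float)

windows, targets = make_windows(X_demo, y_demo, length=6)
print(windows.shape)  # (4, 6, 2): samples, timesteps, features
print(targets)        # [6. 7. 8. 9.]
```

With `length=6`, the first six targets have no complete window behind them, so the number of samples is `len(X) - length`.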

In [24]:
X_test

array([[ 1.00000000e+01,  1.77699997e+02,  1.79999995e+00],
       [ 1.30000000e+01,  1.77699997e+02,  2.29999995e+00],
       [ 1.60000000e+01,  1.78399994e+02,  1.60000002e+00],
       ...,
       [-2.50000000e+01,  1.34100006e+02, -3.90000010e+00],
       [-2.40000000e+01,  1.34100006e+02, -1.00000001e-01],
       [-1.90000000e+01,  1.25300003e+02, -1.50000000e+00]])

In [25]:
# load best model
model = keras.models.load_model('6_6_pridane_atr.hdf5')

In [26]:
# prediction
y_pred = model.predict(test_generator)



In [11]:
intervals1 = [-1000,-180,-40,1000] # shifted thresholds

In [10]:
intervals2 = [-1000,-250,-50,1000] # thresholds from the study

In [12]:
y_pred2 = np.digitize(y_pred, intervals1)

In [13]:
y_test2 = np.digitize(y_test[n_input:], intervals2)
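
As a small illustration of how `np.digitize` turns DST values into class labels, the sketch below applies both threshold lists to a handful of made-up values (`dst_demo` is hypothetical, not taken from the test set). With the default `right=False`, a value `x` receives label `i` when `bins[i-1] <= x < bins[i]`.

```python
import numpy as np

bins_shifted = [-1000, -180, -40, 1000]  # same values as intervals1
bins_study = [-1000, -250, -50, 1000]    # same values as intervals2

# Made-up DST values, chosen to fall on and around the thresholds
dst_demo = np.array([-300.0, -250.0, -200.0, -100.0, -50.0, 10.0])

classes_study = np.digitize(dst_demo, bins_study)
classes_shifted = np.digitize(dst_demo, bins_shifted)

print(classes_study)    # [1 2 2 2 3 3]
print(classes_shifted)  # [1 1 1 2 2 3]
```

A value of -200, for example, falls into class 2 under the study's thresholds but into class 1 under the shifted ones. Values outside the range [-1000, 1000) would receive label 0 or 4 and would show up as extra rows or columns in the confusion matrix.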

In [14]:
y_pred2

array([[3],
       [3],
       [3],
       ...,
       [3],
       [3],
       [3]])

In [15]:
cm = confusion_matrix(y_test2, y_pred2)
print("Confusion matrix: \n" + str(cm))

Confusion matrix: 
[[    11      0      0]
 [    17   3358      5]
 [     0   1856 150006]]


In [16]:
class_1_tpr = cm[0, 0] / np.sum(cm[0])
class_1_fpr = (np.sum(cm[:, 0]) - cm[0, 0]) / (np.sum(cm) - np.sum(cm[0]))

class_2_tpr = cm[1, 1] / np.sum(cm[1])
class_2_fpr = (np.sum(cm[:, 1]) - cm[1, 1]) / (np.sum(cm) - np.sum(cm[1]))

class_3_tpr = cm[2, 2] / np.sum(cm[2])
class_3_fpr = (np.sum(cm[:, 2]) - cm[2, 2]) / (np.sum(cm) - np.sum(cm[2]))

# Print the results
print("Class 1 TPR:", class_1_tpr)
print("Class 1 FPR:", class_1_fpr)
print("Class 2 TPR:", class_2_tpr)
print("Class 2 FPR:", class_2_fpr)
print("Class 3 TPR:", class_3_tpr)
print("Class 3 FPR:", class_3_fpr)

Class 1 TPR: 1.0
Class 1 FPR: 0.00010950644799732032
Class 2 TPR: 0.993491124260355
Class 2 FPR: 0.012220737063204125
Class 3 TPR: 0.9877783777376828
Class 3 FPR: 0.001474491300501327
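
The per-class rates above follow a one-vs-rest scheme and can be computed for any number of classes at once. The following is a minimal sketch of that generalization; the function name `per_class_rates` is hypothetical, and `cm_demo` simply repeats the confusion matrix printed earlier.

```python
import numpy as np

def per_class_rates(cm):
    """Return one-vs-rest (TPR, FPR) arrays; rows of cm are true classes, columns predicted."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)               # correctly predicted samples per class
    fn = cm.sum(axis=1) - tp       # samples of the class predicted as another class
    fp = cm.sum(axis=0) - tp       # samples of other classes predicted as this class
    tn = cm.sum() - tp - fn - fp   # everything else
    return tp / (tp + fn), fp / (fp + tn)

cm_demo = np.array([[11, 0, 0],
                    [17, 3358, 5],
                    [0, 1856, 150006]])

tpr, fpr = per_class_rates(cm_demo)
for label, t, f in zip((1, 2, 3), tpr, fpr):
    print(f"Class {label} TPR: {t:.6f}  FPR: {f:.6f}")
```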


In [17]:
print(classification_report(y_test2, y_pred2))

              precision    recall  f1-score   support

           1       0.39      1.00      0.56        11
           2       0.64      0.99      0.78      3380
           3       1.00      0.99      0.99    151862

    accuracy                           0.99    155253
   macro avg       0.68      0.99      0.78    155253
weighted avg       0.99      0.99      0.99    155253
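
The figures in the report can be traced back to the confusion matrix. The sketch below recomputes precision, recall, F1 and accuracy directly from the matrix printed earlier (`cm_report` is just a copy of those counts); it is meant as a cross-check, not as a replacement for `classification_report`.

```python
import numpy as np

# Copy of the confusion matrix printed earlier (rows: true class, columns: predicted class)
cm_report = np.array([[11, 0, 0],
                      [17, 3358, 5],
                      [0, 1856, 150006]], dtype=float)

tp = np.diag(cm_report)
precision = tp / cm_report.sum(axis=0)  # share of each predicted class that is correct
recall = tp / cm_report.sum(axis=1)     # share of each true class that is recovered
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm_report.sum()

for label, p, r, f in zip((1, 2, 3), precision, recall, f1):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The low precision of class 1 comes from the 17 class 2 samples predicted as class 1, which outnumber the 11 true class 1 samples.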

