<h1>Application prototype</h1>

This notebook demonstrates a simple prototype of the application. It loads the data and a saved model, and then uses them to predict a series of values.

<h2>Loading the data</h2>

First, all the required data is loaded with the pandas library. The production data and the weather forecast data are loaded separately:

In [1]:

import pandas as pd
pd.set_option('display.max_rows', None)

# Load the production data
dataframe_production_all = pd.read_csv('Data/SEDrava1_Filtered.csv')
dataframe_production_all['power_timestamp'] = pd.to_datetime(dataframe_production_all['power_timestamp'])
dataframe_production_all.head(100)


Unnamed: 0,power_timestamp,qyt
0,2023-08-10 19:00:00,0.0
1,2024-02-03 09:00:00,1.419
2,2023-07-04 23:45:00,0.0
3,2023-10-19 07:30:00,0.0
4,2024-01-19 00:30:00,0.0
5,2023-12-28 19:45:00,0.0
6,2023-07-02 22:45:00,0.0
7,2023-08-06 12:15:00,0.231
8,2023-10-23 15:15:00,0.0
9,2023-10-17 06:30:00,0.0


Next, for demonstration purposes, we select a few rows on which the predictions will be based.

In [2]:

# Select a subset of the production data (both bounds inclusive)
dataframe_production_small = dataframe_production_all[
    (dataframe_production_all['power_timestamp'] >= pd.to_datetime('2023-08-10 05:00:00'))
    & (dataframe_production_all['power_timestamp'] <= pd.to_datetime('2023-08-10 06:00:00'))
]
dataframe_production_small


Unnamed: 0,power_timestamp,qyt
3674,2023-08-10 05:00:00,0.2805
4285,2023-08-10 06:00:00,0.7095
12826,2023-08-10 05:30:00,0.4125
13726,2023-08-10 05:15:00,0.264
23683,2023-08-10 05:45:00,0.5115
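
The same inclusive time window can also be expressed with `Series.between`, which may be easier to scan than two chained comparisons. The sketch below is only an illustration: `select_window` is a hypothetical helper name, and the small frame uses made-up values rather than rows from the dataset.

```python
import pandas as pd

def select_window(frame, start, end, column="power_timestamp"):
    """Return the rows whose timestamp lies in [start, end], sorted by time."""
    mask = frame[column].between(pd.Timestamp(start), pd.Timestamp(end))
    return frame[mask].sort_values(by=column)

# Illustrative data only; not taken from SEDrava1_Filtered.csv
toy = pd.DataFrame({
    "power_timestamp": pd.to_datetime([
        "2023-08-10 06:15:00",
        "2023-08-10 05:00:00",
        "2023-08-10 04:45:00",
        "2023-08-10 06:00:00",
    ]),
    "qyt": [0.7, 0.3, 0.1, 0.6],
})

window = select_window(toy, "2023-08-10 05:00:00", "2023-08-10 06:00:00")
print(window)
```

Sorting inside the helper is a design choice: the rows in the source file are not in chronological order, so a sorted window is easier to inspect.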


We also load the weather forecast for the selected period. For each forecast time `vt`, we keep only the forecast with the most recent TOF (the `tof` column).

In [3]:

# Load the weather forecast data
dataframe_weather_all = pd.read_csv('Data/SEDrava1_VrijemePrognoza_Filtered.csv')
dataframe_weather_all['tof'] = pd.to_datetime(dataframe_weather_all['tof'])
dataframe_weather_all['vt'] = pd.to_datetime(dataframe_weather_all['vt'])
dataframe_weather_all.head(100)


Unnamed: 0,tof,vt,barometer,outtemp,windspeed,winddir,rain,radiation,cloud_cover
0,2023-01-01 00:00:00,2023-01-01 01:00:00,1028.8,8.0,3.6,200,0.0,0,0.0
1,2023-01-01 00:00:00,2023-01-01 02:00:00,1028.7,7.0,3.6,202,0.0,0,0.0
2,2023-01-01 00:00:00,2023-01-01 03:00:00,1029.2,6.4,3.8,203,0.0,0,0.0
3,2023-01-01 00:00:00,2023-01-01 04:00:00,1028.9,6.0,4.0,208,0.0,0,0.0
4,2023-01-01 00:00:00,2023-01-01 05:00:00,1029.5,6.0,3.3,208,0.0,0,0.0
5,2023-01-01 00:00:00,2023-01-01 06:00:00,1029.5,6.0,3.3,201,0.0,0,0.5
6,2023-01-01 00:00:00,2023-01-01 07:00:00,1030.0,6.7,3.3,185,0.0,51,0.2
7,2023-01-01 00:00:00,2023-01-01 08:00:00,1030.2,8.1,3.9,178,0.0,182,0.0
8,2023-01-01 00:00:00,2023-01-01 09:00:00,1030.2,10.3,3.7,183,0.0,290,0.0
9,2023-01-01 00:00:00,2023-01-01 10:00:00,1029.7,12.1,4.7,180,0.0,334,27.1


In [4]:

# Select a subset of the weather forecast data
selected_dates = (
    (dataframe_weather_all['vt'] == '2023-08-10 06:00:00')
    | (dataframe_weather_all['vt'] == '2023-08-10 07:00:00')
    | (dataframe_weather_all['vt'] == '2023-08-10 08:00:00')
)
filtered_df = dataframe_weather_all[selected_dates]
idx = filtered_df.groupby(['vt'])['tof'].transform('max') == filtered_df['tof']
dataframe_weather_small = filtered_df[idx]
dataframe_weather_small


Unnamed: 0,tof,vt,barometer,outtemp,windspeed,winddir,rain,radiation,cloud_cover
63653,2023-08-10 00:00:00,2023-08-10 06:00:00,1019.9,19.2,1.6,356,0.0,363,8.3
63720,2023-08-10 06:00:00,2023-08-10 07:00:00,1020.0,22.1,3.5,16,0.0,542,0.8
63721,2023-08-10 06:00:00,2023-08-10 08:00:00,1020.8,23.5,2.9,9,0.0,696,8.1
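
To make the "most recent TOF" selection easier to follow, here is a minimal sketch of the same `groupby(...).transform('max')` idiom on a tiny made-up frame. The column names follow the weather table above, but the rows are illustrative assumptions, including the invented `radiation` value of 500.

```python
import pandas as pd

# Illustrative forecasts: two issue times (tof) cover the same forecast time (vt)
forecasts = pd.DataFrame({
    "tof": pd.to_datetime([
        "2023-08-10 00:00:00",
        "2023-08-10 00:00:00",
        "2023-08-10 06:00:00",
    ]),
    "vt": pd.to_datetime([
        "2023-08-10 06:00:00",
        "2023-08-10 07:00:00",
        "2023-08-10 07:00:00",
    ]),
    "radiation": [363, 500, 542],
})

# For every vt, broadcast the newest tof back onto each row of that group
newest_tof = forecasts.groupby("vt")["tof"].transform("max")

# Keep only the rows that were issued at that newest tof
latest = forecasts[forecasts["tof"] == newest_tof].reset_index(drop=True)
print(latest)
```

Because `transform` returns a result aligned with the original rows, the comparison yields a boolean mask that can be used directly for filtering, which is why no merge is needed.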


<h2>Loading the model</h2>

Next we need to load the model. It can be loaded with the pickle library, the same way it was saved:

In [5]:

import pickle

# Load the model
with open('test_model.pkl', 'rb') as f:
    model = pickle.load(f)
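
The notebook does not show how the model file was written. As a rough sketch, assuming it was saved with `pickle.dump`, a full save/load round trip could look like the following. The dictionary `toy_model` and the file name `toy_model.pkl` are hypothetical stand-ins, not the actual trained model.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model: plain linear coefficients
toy_model = {"weights": [0.5, 0.25], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "toy_model.pkl"

    # Saving: the counterpart of the loading cell above
    with open(model_path, "wb") as f:
        pickle.dump(toy_model, f)

    # Loading: same pattern as in the notebook
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(restored == toy_model)
```

Note that `pickle.load` can execute arbitrary code while unpickling, so it should only be used on files from a trusted source.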
    

<h2>Code for predicting a series of values</h2>

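Finally, we write the code that predicts a series of values. At each step it dynamically builds the input row for the model from the weather forecast and the most recent production values, which after the first step include the model's own earlier predictions.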

In [6]:

import numpy as np

# This function predicts a series of values ahead in time.
# It dynamically builds the rows used for each prediction from the weather forecasts and from the predictions made in previous steps.
# Input arguments:
#    current_datetime: pandas datetime at which the prediction starts
#    end_datetime: pandas datetime up to which values are predicted (inclusive)
#    dataframe_production: pandas dataframe with production data. It must contain all values from current_datetime back to a fixed number of points before it, depending on the model (here: the values at current_datetime and 15, 30, 45 and 60 minutes earlier)
#    dataframe_weather: pandas dataframe with weather forecast data. It must contain data for all full hours from current_datetime to end_datetime
#    model: machine learning model
#    verbose: if True, the intermediate state is printed at every step
def predict_series(current_datetime, end_datetime, dataframe_production, dataframe_weather, model, verbose=False):
    
    if(verbose): print("Starting prediction")
    if(verbose): print("The production table at the start is: ")
    if(verbose): print(dataframe_production)    
    
    
    
    while(current_datetime < end_datetime):
        
        if(verbose): print("")    
        if(verbose): print("#####")
        if(verbose): print("New step")
        if(verbose): print("#####") 
        if(verbose): print("")
            
        if(verbose): print("The current datetime is: " + str(current_datetime))
        
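        # The forecast is hourly, so use the row of the full hour that current_datetime falls into ('h' replaces the deprecated 'H' alias)
        weather_current_row = dataframe_weather.loc[(dataframe_weather['vt'] == current_datetime.floor('h'))].reset_index(drop=True)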
        production_current_row = dataframe_production.loc[(dataframe_production['power_timestamp'] == current_datetime)].reset_index(drop=True)
        production_15min_row = dataframe_production.loc[(dataframe_production['power_timestamp'] == current_datetime - pd.Timedelta(minutes=15))].reset_index(drop=True)
        production_30min_row = dataframe_production.loc[(dataframe_production['power_timestamp'] == current_datetime - pd.Timedelta(minutes=30))].reset_index(drop=True)
        production_45min_row = dataframe_production.loc[(dataframe_production['power_timestamp'] == current_datetime - pd.Timedelta(minutes=45))].reset_index(drop=True)
        production_60min_row = dataframe_production.loc[(dataframe_production['power_timestamp'] == current_datetime - pd.Timedelta(minutes=60))].reset_index(drop=True)

        prediction_row = [
            # Weather columns 2-8: barometer, outtemp, windspeed, winddir, rain, radiation, cloud_cover
            weather_current_row.iloc[0, 2], weather_current_row.iloc[0, 3], weather_current_row.iloc[0, 4],
            weather_current_row.iloc[0, 5], weather_current_row.iloc[0, 6], weather_current_row.iloc[0, 7],
            weather_current_row.iloc[0, 8],
            # Production (qyt) now and 15, 30, 45 and 60 minutes earlier
            production_current_row.iloc[0, 1], production_15min_row.iloc[0, 1], production_30min_row.iloc[0, 1],
            production_45min_row.iloc[0, 1], production_60min_row.iloc[0, 1],
        ]

        if(verbose): print("The row used for prediction is: " + str(prediction_row))
        if(verbose): print(prediction_row)
        
        predicted_num = model.predict(np.array(prediction_row).reshape(1, -1))[0]
        
        if(verbose): print("The predicted value is: " + str(predicted_num))
        
        current_datetime = current_datetime + pd.Timedelta(minutes=15)
        
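        # The column name must be exactly 'qyt' (no leading space); otherwise concat creates a separate column and the next step reads NaN
        new_row = pd.DataFrame({'power_timestamp': [current_datetime], 'qyt': [predicted_num]})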
        dataframe_production = pd.concat([dataframe_production,new_row], ignore_index = True)
        
        if(verbose): print("The production table is now: ")
        if(verbose): print(dataframe_production)   
        
    if(verbose): print("")      
    if(verbose): print("!!Prediction finished!!")
    if(verbose): print("") 
 
    return dataframe_production.sort_values(by='power_timestamp')
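
The function above assumes that every row it looks up exists; if a timestamp is missing, the `.iloc[0, ...]` access fails with an `IndexError`. A possible pre-check, sketched below, lists what is missing before the loop starts. The helper name `missing_inputs` is hypothetical, and the five lags (0 to 60 minutes) and the 15-minute step are taken from the loop above. The production values are the five rows selected earlier, while the weather frame is reduced to its `vt` column for illustration.

```python
import pandas as pd

def missing_inputs(start, end, dataframe_production, dataframe_weather):
    """List the timestamps that the forecasting loop would fail to find."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)

    # Lagged production values needed by the first step: t, t-15, ..., t-60 min
    needed_production = [start - pd.Timedelta(minutes=15 * k) for k in range(5)]
    have_production = set(dataframe_production["power_timestamp"])
    missing_production = [t for t in needed_production if t not in have_production]

    # Each step at start, start+15 min, ..., end-15 min reads the forecast
    # row of the full hour it falls into
    steps = pd.date_range(start, end - pd.Timedelta(minutes=15), freq="15min")
    needed_weather = sorted(set(steps.floor("h")))
    have_weather = set(dataframe_weather["vt"])
    missing_weather = [t for t in needed_weather if t not in have_weather]

    return missing_production, missing_weather

production = pd.DataFrame({
    "power_timestamp": pd.date_range("2023-08-10 05:00", "2023-08-10 06:00", freq="15min"),
    "qyt": [0.2805, 0.264, 0.4125, 0.5115, 0.7095],
})
weather = pd.DataFrame({"vt": pd.to_datetime(["2023-08-10 06:00", "2023-08-10 07:00"])})

complete = missing_inputs("2023-08-10 06:00", "2023-08-10 08:00", production, weather)
incomplete = missing_inputs("2023-08-10 06:00", "2023-08-10 09:00", production, weather)
print(complete)
print(incomplete)
```

Only the first step needs production values from the input table; later steps read the rows that the loop appends itself, so the check covers just the initial lags.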

In [7]:

# Demonstration of the code
dataframe_predicted = predict_series(
    pd.to_datetime('2023-08-10 06:00:00'),
    pd.to_datetime('2023-08-10 08:00:00'),
    dataframe_production_small,
    dataframe_weather_small,
    model,
    verbose=True,
)

Starting prediction
The production table at the start is: 
          power_timestamp     qyt
3674  2023-08-10 05:00:00  0.2805
4285  2023-08-10 06:00:00  0.7095
12826 2023-08-10 05:30:00  0.4125
13726 2023-08-10 05:15:00  0.2640
23683 2023-08-10 05:45:00  0.5115

#####
New step
#####

The current datetime is: 2023-08-10 06:00:00
The row used for prediction is: [1019.9, 19.2, 1.6, 356, 0.0, 363, 8.3, 0.7095, 0.5115, 0.4125, 0.264, 0.2805]
[1019.9, 19.2, 1.6, 356, 0.0, 363, 8.3, 0.7095, 0.5115, 0.4125, 0.264, 0.2805]
The predicted value is: 0.7343068471921314
The production table is now: 
      power_timestamp       qyt
0 2023-08-10 05:00:00  0.280500
1 2023-08-10 06:00:00  0.709500
2 2023-08-10 05:30:00  0.412500
3 2023-08-10 05:15:00  0.264000
4 2023-08-10 05:45:00  0.511500
5 2023-08-10 06:15:00  0.734307

#####
New step
#####

The current datetime is: 2023-08-10 06:15:00
The row used for prediction is: [1019.9, 19.2, 1.6, 356, 0.0, 363, 8.3, 0.7343068471921314, 0.7095, 0.5115, 



In [8]:

# Visual comparison with the actual values
dataframe_production_small = dataframe_production_all[
    (dataframe_production_all['power_timestamp'] >= pd.to_datetime('2023-08-10 05:00:00'))
    & (dataframe_production_all['power_timestamp'] <= pd.to_datetime('2023-08-10 08:00:00'))
]
dataframe_production_small = dataframe_production_small.sort_values(by='power_timestamp')

print("These are the actual values:")
print(dataframe_production_small)
print()
print("These are the predicted values (from 2023-08-10 06:00:00):")
print(dataframe_predicted)


These are the actual values:
          power_timestamp     qyt
3674  2023-08-10 05:00:00  0.2805
13726 2023-08-10 05:15:00  0.2640
12826 2023-08-10 05:30:00  0.4125
23683 2023-08-10 05:45:00  0.5115
4285  2023-08-10 06:00:00  0.7095
14176 2023-08-10 06:15:00  0.7095
25721 2023-08-10 06:30:00  0.5115
8172  2023-08-10 06:45:00  0.6600
22380 2023-08-10 07:00:00  0.8910
12385 2023-08-10 07:15:00  0.5940
16520 2023-08-10 07:30:00  0.7095
6619  2023-08-10 07:45:00  0.7590
8147  2023-08-10 08:00:00  1.0065

These are the predicted values (from 2023-08-10 06:00:00):
       power_timestamp       qyt
0  2023-08-10 05:00:00  0.280500
3  2023-08-10 05:15:00  0.264000
2  2023-08-10 05:30:00  0.412500
4  2023-08-10 05:45:00  0.511500
1  2023-08-10 06:00:00  0.709500
5  2023-08-10 06:15:00  0.734307
6  2023-08-10 06:30:00  0.775254
7  2023-08-10 06:45:00  0.804756
8  2023-08-10 07:00:00  0.823971
9  2023-08-10 07:15:00  0.838194
10 2023-08-10 07:30:00  0.850748
11 2023-08-10 07:45:00  0.859174
12 2023-08

In this case, the longer-horizon prediction shows a more uniform increase and lacks the fluctuations of the actual values, although the general trend and the order of magnitude are captured. This should be addressed with more complex models that rely more on the weather forecast data and less on past values.
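
The visual impression can be backed by a few numbers. The sketch below is one possible way to do that, not part of the original prototype: it computes the mean absolute error and the root mean squared error, and compares the spread of step-to-step changes as a rough indicator of fluctuation. The values are copied from the two printed tables for 06:15 to 07:45; the 08:00 prediction is cut off in the printed output and is therefore left out.

```python
import pandas as pd

timestamps = pd.date_range("2023-08-10 06:15", "2023-08-10 07:45", freq="15min")

# Values copied from the printed tables of actual and predicted production
actual = pd.Series(
    [0.7095, 0.5115, 0.6600, 0.8910, 0.5940, 0.7095, 0.7590], index=timestamps
)
predicted = pd.Series(
    [0.734307, 0.775254, 0.804756, 0.823971, 0.838194, 0.850748, 0.859174],
    index=timestamps,
)

errors = predicted - actual
mae = errors.abs().mean()            # mean absolute error
rmse = (errors ** 2).mean() ** 0.5   # root mean squared error

# Standard deviation of the 15-minute changes in each series
actual_step_std = actual.diff().std()
predicted_step_std = predicted.diff().std()

print(f"MAE:  {mae:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"Std of 15-min changes, actual:    {actual_step_std:.4f}")
print(f"Std of 15-min changes, predicted: {predicted_step_std:.4f}")
```

A much smaller spread of changes in the predicted series than in the actual one is consistent with the observation that the predictions grow smoothly while the measurements fluctuate.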