# Modelling with the method of least squares

In the first approach, we model the given outputs using the method of least squares. Both linear and nonlinear models, with varying degrees of nonlinearity, are tested.
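
As a minimal sketch of the idea, the snippet below fits a linear and a quadratic model to the same data by least squares and compares the residuals. The data are synthetic and the helper name `fit_polynomial` is hypothetical; neither comes from the experiment analysed in this notebook.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration only):
# y = 1 + 2*x - 0.5*x^2
x = np.linspace(-1.0, 1.0, 21)
y = 1.0 + 2.0 * x - 0.5 * x**2


def fit_polynomial(x, y, degree):
    """Least-squares fit of y ~ c_0 + c_1*x + ... + c_degree*x^degree."""
    # Design matrix with columns x^0, x^1, ..., x^degree.
    A = np.vander(x, degree + 1, increasing=True)
    coeffs, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
    residual = np.linalg.norm(A @ coeffs - y)
    return coeffs, residual


coeffs_lin, res_lin = fit_polynomial(x, y, degree=1)
coeffs_quad, res_quad = fit_polynomial(x, y, degree=2)

print("linear model:    coeffs =", np.round(coeffs_lin, 3), "residual =", res_lin)
print("quadratic model: coeffs =", np.round(coeffs_quad, 3), "residual =", res_quad)
```

The quadratic model is still linear in its coefficients, which is why the same least-squares solver handles both cases; only the columns of the design matrix change.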

### Import the required libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

print("pandas version: {}".format(pd.__version__))
print("numpy version: {}".format(np.__version__))
print("matplotlib version: {}".format(mpl.__version__))

pandas version: 1.0.1
numpy version: 1.18.1
matplotlib version: 3.2.0


### Convert the date to seconds since the start of the experiment

In [2]:
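def changeDateToSeconds(df):
    # Elapsed time relative to the first sample. total_seconds() also counts
    # whole days, whereas .seconds would wrap around after 24 hours.
    first = df["date"].iloc[0]
    df["date"] = df["date"].apply(lambda timestamp: int((timestamp - first).total_seconds()))
    return df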
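
When converting time differences to elapsed seconds, it is worth knowing that `.seconds` holds only the seconds component of the difference, while `.total_seconds()` returns the full duration. The sketch below shows this with the standard-library `timedelta` and made-up timestamps; pandas `Timedelta` objects expose the same two members.

```python
from datetime import datetime, timedelta

start = datetime(2020, 1, 1, 12, 0, 0)         # made-up experiment start
later = start + timedelta(days=1, seconds=30)  # one day and 30 s later

delta = later - start
print(delta.seconds)          # seconds component only, the whole day is not counted
print(delta.total_seconds())  # full elapsed time in seconds
```

For recordings shorter than 24 hours both give the same number, so the difference only shows up in longer experiments.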

### Load the data

In [3]:
def readDataFromExcel(path, sheet):
    df = pd.read_excel(path, sheet_name=sheet)
    df["date"] = pd.to_datetime(df["date"])
    df = changeDateToSeconds(df)
    return df

### Load the training and verification sets

In [4]:
df_learn = readDataFromExcel("./data/K-1_MI.xlsx", "d2")
df_verif = readDataFromExcel("./data/K-1_MI.xlsx", "d6")

### Training set

In [5]:
df_learn.head()

Unnamed: 0,date,FP05,LT1,LT2,LT3,LT4,TMA,TMB,TMC,TMD,...,PTWS,TW02,TW01,FW03,TW04,TW03,FW04,TTWT,PTWT,PPW
0,0,1068.9575,21.4107,11.6199,11.714,18.8643,43.8947,76.1302,75.8843,74.866,...,19.3311,539.8597,301.2129,9.4894,539.5955,290.892,14.2998,177.7786,10.3202,3.5039
1,10,1068.9575,21.4107,11.6199,11.714,18.8643,43.8947,76.1302,75.8843,74.866,...,19.3311,539.8597,301.2129,9.4894,539.5955,290.892,14.2998,177.7786,10.3202,3.5039
2,20,1068.9575,21.4107,11.6199,11.714,18.8643,43.8947,76.1302,75.8843,74.866,...,19.3147,539.8597,301.2129,9.4894,539.5955,292.1826,14.2998,177.7786,10.3202,3.4937
3,30,1068.9575,21.4107,11.6199,11.714,18.8643,43.8947,76.1302,75.8843,74.866,...,19.3298,539.8597,302.4265,9.4894,539.5955,292.1826,14.2998,177.7786,10.3202,3.4937
4,40,1068.9575,21.4107,11.6199,11.714,18.8643,43.8947,76.1302,75.8843,74.866,...,19.3298,539.8597,302.4265,9.4894,539.5955,292.1826,14.2998,177.7786,10.3202,3.4937


### Verification set

In [6]:
df_verif.head()

Unnamed: 0,date,FP05,LT1,LT2,LT3,LT4,TMA,TMB,TMC,TMD,...,PTWS,TW02,TW01,FW03,TW04,TW03,FW04,TTWT,PTWT,PPW
0,0,1055.1635,21.5507,12.5797,11.714,20.9103,40.9401,72.2017,72.6573,69.6785,...,19.1031,540.7268,293.7237,10.8584,540.5005,277.3242,16.5933,176.8009,10.1627,3.4616
1,2,1055.1635,21.5507,12.5797,11.714,20.9103,40.9401,72.2017,72.6573,69.6785,...,19.1031,540.7268,293.7237,10.8584,540.5005,277.3242,16.5933,176.8009,10.1627,3.4616
2,4,1055.1635,21.5507,12.5797,11.714,20.9103,40.9401,72.2017,72.6573,69.6785,...,19.1031,540.7268,293.7237,10.8584,540.5005,277.3242,16.5933,176.8009,10.1627,3.4616
3,6,1055.1635,21.5507,12.5797,11.714,20.9103,40.9401,72.2017,72.6573,69.6785,...,19.1031,540.7268,293.7237,10.8584,540.5005,277.3242,16.5933,176.8009,10.1627,3.4616
4,8,1055.1635,21.5507,12.5797,11.714,20.9103,40.9401,72.2017,72.6573,69.6785,...,19.0894,540.7268,293.7237,10.8584,540.5005,277.3242,16.5933,176.8009,10.1627,3.4616


### Least squares - linear, static model

In [7]:
u_learn = df_learn.drop(["LT01", "DP", "date"], axis=1).to_numpy()
y_learn = df_learn[["LT01", "DP"]].to_numpy()

u_verif = df_verif.drop(["LT01", "DP", "date"], axis=1).to_numpy()
y_verif = df_verif[["LT01", "DP"]].to_numpy()


### Verification of the linear, static model


In [8]:
reg = LinearRegression().fit(u_learn, y_learn)
y_model_learn = reg.predict(u_learn)
y_model_verif = reg.predict(u_verif)

print("Score learn: {}".format(r2_score(y_learn, y_model_learn)))
print("Score verif: {}".format(r2_score(y_verif, y_model_verif)))

Score learn: 0.68981181987722
Score verif: -6.578488912345968e+18
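
To help read scores like these, the sketch below computes the coefficient of determination by hand on toy numbers (not the experiment data). The helper name `r2` is hypothetical. It illustrates that R² equals 1 for a perfect fit, 0 for a model that always predicts the mean, and can be arbitrarily negative when predictions are far from the true values.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # spread around the mean
    return 1.0 - ss_res / ss_tot


y_true = np.array([1.0, 2.0, 3.0, 4.0])  # toy values, for illustration only

r2_perfect = r2(y_true, y_true)
r2_mean = r2(y_true, np.full_like(y_true, y_true.mean()))
r2_bad = r2(y_true, y_true + 100.0)

print(r2_perfect, r2_mean, r2_bad)
```

Because SS_tot sits in the denominator, a very large negative score can come either from predictions that are far off or from true values that barely vary in the evaluated set. Which of these applies to the verification score above is not established by this notebook.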


### Nonlinear, dynamic models

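The function `createModelMatrix` builds the matrix A used to solve the least-squares problem. Each row of the matrix has the form:

$$
\left[\; y_0^{1}[k] \;\dots\; y_0^{D}[k] \quad y_0^{1}[k-1] \;\dots\; y_0^{D}[k-1] \quad \dots \quad y_0^{1}[k-N] \;\dots\; y_0^{D}[k-N] \quad y_1^{1}[k] \;\dots\; y_1^{D}[k] \quad y_1^{1}[k-1] \;\dots\; y_1^{D}[k-1] \quad \dots \quad y_1^{1}[k-N] \;\dots\; y_1^{D}[k-N] \quad \dots \;\right]
$$

Here D is the highest power (`exponent`), and each variable contributes `order` consecutive samples, so N = `order` - 1. In the code, the samples within each variable's block are stored oldest first; the column order does not affect the least-squares fit.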

In [93]:
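def createModelMatrix(exponent, order, inputs):
    # inputs:   2-D array of shape (samples, variables)
    # order:    number of consecutive samples used per variable
    # exponent: highest power applied to each sample
    samples = inputs.shape[0]
    modelVariables = inputs.shape[1]
    widthCoefficient = order*exponent   # number of columns per variable
    heightAbsoluteTerm = order-1        # rows lost because of the time shift

    A = np.zeros([samples - heightAbsoluteTerm, modelVariables*widthCoefficient])

    for i in range(modelVariables):     # variable index
        for j in range(order):          # time-shift offset
            for k in range(exponent):   # power k+1
                colIndex = i*widthCoefficient + j*exponent + k
                A[:, colIndex] = np.power(inputs[j:samples-heightAbsoluteTerm+j, i], k+1)

    return A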

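
To make the column layout concrete, here is a small sketch that builds the same kind of matrix for a tiny, made-up input. The function `lagged_power_matrix` is a hypothetical, compact equivalent written for this illustration; it follows the same column indexing as `createModelMatrix` (variable, then time shift, then power).

```python
import numpy as np


def lagged_power_matrix(inputs, order, exponent):
    """Hypothetical compact equivalent of createModelMatrix, for illustration."""
    samples, n_vars = inputs.shape
    rows = samples - (order - 1)           # rows that have all shifted samples
    columns = []
    for i in range(n_vars):                # variable
        for j in range(order):             # time-shift offset
            window = inputs[j:j + rows, i]
            for p in range(1, exponent + 1):   # powers 1..exponent
                columns.append(window ** p)
    return np.column_stack(columns)


# Made-up input: 4 samples of 2 variables.
inputs = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0],
                   [4.0, 40.0]])

A = lagged_power_matrix(inputs, order=2, exponent=2)
print(A.shape)  # (samples - order + 1, variables * order * exponent)
print(A)
```

With `order=2` and `exponent=2`, each row holds, for every variable, two consecutive samples and their squares, so the number of rows shrinks by `order - 1` and the number of columns is `variables * order * exponent`.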