## Machine Learning

### What is machine learning?

In machine learning, we want to detect relationships within a dataset

Once these relationships are captured, they can be used for various tasks, e.g. classification, regression, image recognition, generating new content (text, images, audio, ...), ...

---

Example:

Weather forecast

Use ML to predict the temperature from several parameters

Parameters: date, time, cloud cover, precipitation

Output: temperature

Guess: 21.06.2024, 13:30, 10% cloud cover, 0% precipitation -> temperature? A person might guess 25° or 27°; reality: 30°

With machine learning, historical data (a dataset) can now be processed to produce a model

Model: a program that receives the parameters as inputs and returns an output

This model can be executed to give us a prediction

With 4 parameters such a program can still be written by hand; with 20 parameters this is hardly feasible anymore
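To make the idea of "a model is a program that takes the parameters as inputs and returns an output" concrete, here is a minimal sketch. The weather values are synthetic and generated from a made-up linear rule (an assumption for illustration only, not real measurements), and ordinary least squares stands in for the learning step; real weather would need a far richer model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "historical" data (hypothetical, not real measurements):
# columns = hour of day, cloud cover in %, precipitation in %
X = rng.uniform(low=[0, 0, 0], high=[24, 100, 100], size=(200, 3))

# Made-up rule that is used only to generate example temperatures
true_weights = np.array([0.5, -0.05, -0.1])
true_offset = 15.0
y = X @ true_weights + true_offset

# "Learning": find the weights that best reproduce the historical data
X_with_bias = np.hstack([X, np.ones((len(X), 1))])
learned, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)


def model(hour, clouds, precipitation):
    """The learned model: parameters in, predicted temperature out."""
    return float(np.array([hour, clouds, precipitation, 1.0]) @ learned)


forecast = model(13.5, 10, 0)
print(forecast)
```

The weights are never written by hand here; they are recovered from the data, which is the part that stops being practical to do manually as the number of parameters grows.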

### Dataset

The MAGIC.csv dataset contains 10 numeric features per event (fLength, fWidth, ...) and a class column that states whether the event is a gamma signal (g) or hadron background (h)

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [4]:
data = pd.read_csv("Data/MAGIC.csv")

In [5]:
data

Unnamed: 0,fLength,fWidth,fSize,fConc,fConc1,fAsym,fM3Long,fM3Trans,fAlpha,fDist,class
0,28.7967,16.0021,2.6449,0.3918,0.1982,27.7004,22.0110,-8.2027,40.0920,81.8828,g
1,31.6036,11.7235,2.5185,0.5303,0.3773,26.2722,23.8238,-9.9574,6.3609,205.2610,g
2,162.0520,136.0310,4.0612,0.0374,0.0187,116.7410,-64.8580,-45.2160,76.9600,256.7880,g
3,23.8172,9.5728,2.3385,0.6147,0.3922,27.2107,-6.4633,-7.1513,10.4490,116.7370,g
4,75.1362,30.9205,3.1611,0.3168,0.1832,-5.5277,28.5525,21.8393,4.6480,356.4620,g
...,...,...,...,...,...,...,...,...,...,...,...
19015,21.3846,10.9170,2.6161,0.5857,0.3934,15.2618,11.5245,2.8766,2.4229,106.8258,h
19016,28.9452,6.7020,2.2672,0.5351,0.2784,37.0816,13.1853,-2.9632,86.7975,247.4560,h
19017,75.4455,47.5305,3.4483,0.1417,0.0549,-9.3561,41.0562,-9.4662,30.2987,256.5166,h
19018,120.5135,76.9018,3.9939,0.0944,0.0683,5.8043,-93.5224,-63.8389,84.6874,408.3166,h


### Problems with the dataset

1. All data must be numeric -> encoding
2. The data must be scaled -> features with large value ranges should not dominate the others
3. Balance the classes -> the training data should contain roughly the same number of rows for each value of the target column
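As a sketch of the first point (encoding), the class labels can also be converted with an explicit mapping. The toy DataFrame below uses hypothetical values; only the `class` column with the labels `g` and `h` mirrors the dataset.

```python
import pandas as pd

# Toy frame with hypothetical values; only the "class" column matters here
toy = pd.DataFrame({"fLength": [28.8, 31.6, 21.4], "class": ["g", "g", "h"]})

# Explicit mapping: a label missing from the dictionary would become NaN
# instead of silently turning into 0
toy["class"] = toy["class"].map({"g": 1, "h": 0})
print(toy)
```

Compared with a boolean comparison, the dictionary makes the meaning of 0 and 1 visible at the place where the encoding happens.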

In [6]:
data["class"] == "g"  # Find all rows with class "g"

(data["class"] == "g").astype(int)  # Convert the boolean mask to integers (0 or 1)

data["class"] = (data["class"] == "g").astype(int)  # g -> 1, h -> 0

In [7]:
data

Unnamed: 0,fLength,fWidth,fSize,fConc,fConc1,fAsym,fM3Long,fM3Trans,fAlpha,fDist,class
0,28.7967,16.0021,2.6449,0.3918,0.1982,27.7004,22.0110,-8.2027,40.0920,81.8828,1
1,31.6036,11.7235,2.5185,0.5303,0.3773,26.2722,23.8238,-9.9574,6.3609,205.2610,1
2,162.0520,136.0310,4.0612,0.0374,0.0187,116.7410,-64.8580,-45.2160,76.9600,256.7880,1
3,23.8172,9.5728,2.3385,0.6147,0.3922,27.2107,-6.4633,-7.1513,10.4490,116.7370,1
4,75.1362,30.9205,3.1611,0.3168,0.1832,-5.5277,28.5525,21.8393,4.6480,356.4620,1
...,...,...,...,...,...,...,...,...,...,...,...
19015,21.3846,10.9170,2.6161,0.5857,0.3934,15.2618,11.5245,2.8766,2.4229,106.8258,0
19016,28.9452,6.7020,2.2672,0.5351,0.2784,37.0816,13.1853,-2.9632,86.7975,247.4560,0
19017,75.4455,47.5305,3.4483,0.1417,0.0549,-9.3561,41.0562,-9.4662,30.2987,256.5166,0
19018,120.5135,76.9018,3.9939,0.0944,0.0683,5.8043,-93.5224,-63.8389,84.6874,408.3166,0


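### Splitting the data

The data must be split into a training set and a test set

The training set is then processed further (scaled and balanced)

The test set is not balanced; in this notebook it is also left unscaled, although its features should be transformed with the scaler fitted on the training set so that the models see inputs in the same range as during training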

In [8]:
data.sample()  # One random row

Unnamed: 0,fLength,fWidth,fSize,fConc,fConc1,fAsym,fM3Long,fM3Trans,fAlpha,fDist,class
10162,27.9835,25.1852,2.8182,0.2964,0.1588,-7.6944,-18.7768,12.6197,31.247,188.174,1


In [9]:
random = data.sample(frac=1)  # All rows in random order

In [10]:
trainingAmount = int(len(random) * 0.8)
testAmount = len(random) - trainingAmount

In [11]:
trainingAmount

15216

In [12]:
testAmount

3804

In [13]:
x = [random.iloc[:trainingAmount], random.iloc[trainingAmount:]]  # Two results: training set, test set (avoids the deprecation warning that np.split raises for DataFrames)


In [14]:
training = x[0]  # Take the training set from x
test = x[1]  # Take the test set from x

test_left = test[test.columns[:-1]]  # Features: all columns except the last
test_right = test[test.columns[-1]]  # Labels: the last column (class)
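As an alternative to shuffling and slicing by hand, scikit-learn offers `train_test_split`. The sketch below uses a synthetic toy frame (hypothetical columns `f1`, `f2`) and the `stratify` option, which keeps the class ratio roughly the same in both parts; this is a possible variant, not what the notebook does.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic toy data: 35 rows of class 0 and 65 rows of class 1
toy = pd.DataFrame({
    "f1": rng.normal(size=100),
    "f2": rng.normal(size=100),
    "class": np.repeat([0, 1], [35, 65]),
})
features = toy[toy.columns[:-1]]
labels = toy[toy.columns[-1]]

# 80% training, 20% test; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=0
)
print(len(X_train), len(X_test))
```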

### Scaling the data

Scaling brings all features onto a comparable range: StandardScaler standardizes each column to mean 0 and standard deviation 1, so that features with large values do not dominate the others

In [15]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# IMPORTANT: the last column (class) must not be scaled, since it has to stay 0 or 1
scaledTraining = scaler.fit_transform(training[training.columns[0:-1]])

# Append the last column again
training = np.hstack((scaledTraining, training[training.columns[-1]].values.reshape(-1, 1)))

# training[training.columns[-1]].values.reshape(-1, 1) -> take the last column, take its underlying NumPy array and convert it from a 1D array to a 2D array

In [16]:
training = pd.DataFrame(training)
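The scaler above is fitted on the training features only. A minimal sketch of how the same fitted scaler could then be applied to the test features, so that both sets end up on the same scale; the arrays are synthetic stand-ins (the names `train_features` and `test_features` are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic feature matrices standing in for training and test features
train_features = rng.normal(loc=100.0, scale=20.0, size=(80, 3))
test_features = rng.normal(loc=100.0, scale=20.0, size=(20, 3))

scaler = StandardScaler()
scaled_train = scaler.fit_transform(train_features)  # learns mean/std from the training data
scaled_test = scaler.transform(test_features)        # reuses that mean/std, no refit

print(scaled_train.mean(axis=0).round(3), scaled_test.mean(axis=0).round(3))
```

Calling `transform` instead of `fit_transform` on the test data matters: refitting would leak information from the test set into preprocessing and put it on a slightly different scale.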

### Balancing the classes

If the classes in the dataset are imbalanced, the model can develop a tendency toward one class during training (bias)

In [19]:
len(training.groupby(10).get_group(0))

5407

In [20]:
len(training.groupby(10).get_group(1))

9809

In [21]:
from imblearn.over_sampling import RandomOverSampler  # Balances the classes by randomly duplicating rows of the minority class

In [22]:
overSampler = RandomOverSampler()

left, right = overSampler.fit_resample(training, training[training.columns[-1]])  # Two parameters: the training set, the column by which to balance

training = left
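To illustrate what random oversampling does, here is a sketch with plain pandas on a toy frame (hypothetical values). It is meant to show the idea only and is not imblearn's implementation: rows of the smaller class are drawn again at random, with replacement, until both classes have the same size.

```python
import pandas as pd

# Toy data: 3 rows of class 0, 7 rows of class 1
toy = pd.DataFrame({"feature": range(10), "label": [0] * 3 + [1] * 7})

target = toy["label"].value_counts().max()  # size of the largest class

parts = []
for label, group in toy.groupby("label"):
    missing = target - len(group)
    if missing > 0:
        # Draw existing rows again (with replacement) to fill the gap
        extra = group.sample(n=missing, replace=True, random_state=0)
        group = pd.concat([group, extra])
    parts.append(group)

balanced = pd.concat(parts, ignore_index=True)
print(balanced["label"].value_counts())
```

No new feature values are invented; the minority rows simply appear more than once.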

In [23]:
left[left[10] == 0]

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
1,-0.258141,-0.229809,-0.240882,-0.806815,-0.862250,0.189680,0.418933,0.358619,-0.002271,-1.334774,0.0
3,-0.599643,-0.580268,-0.307196,0.257353,0.201392,0.535086,0.042297,-0.569699,-0.117685,-1.604745,0.0
7,1.584516,1.050781,1.023309,-1.259442,-1.459701,-3.121142,1.868097,0.833247,0.808790,-0.200388,0.0
14,-0.714932,-0.826880,-0.742788,1.968678,2.706157,0.131090,0.169931,0.252805,-0.045515,-0.695099,0.0
17,-0.683940,-0.744734,-0.872660,1.204801,1.776488,0.220786,-0.424532,-0.488890,1.717899,-0.421321,0.0
...,...,...,...,...,...,...,...,...,...,...,...
19613,-0.703488,-0.642972,-0.221391,0.986159,1.656998,0.240374,0.077442,-0.394546,1.783033,-0.739372,0.0
19614,-0.413062,-0.165867,-0.334314,-0.173904,-0.021294,-0.182339,0.211693,-0.215672,-0.845441,-1.519116,0.0
19615,-0.607975,-0.727062,-0.658890,1.670032,1.554707,0.337684,0.097013,0.728737,1.084333,-0.981098,0.0
19616,-0.649657,-0.440898,-0.380077,0.154881,-0.073797,0.642613,-0.001470,0.346531,2.063434,0.949516,0.0


In [25]:
left[left[10] == 1]

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,-0.697052,-0.245607,-0.548085,0.513805,0.740908,0.314712,0.036070,0.636431,-0.574239,-0.646422,1.0
2,-0.945279,-0.650836,-1.374566,2.287600,2.659990,0.294598,-0.068405,0.486014,1.577605,-1.136646,1.0
4,-0.633877,-0.484393,-0.371814,0.820671,0.637712,0.398852,0.231908,0.453577,0.134998,-0.929782,1.0
5,0.218610,0.425600,0.598099,-0.929013,-0.958205,-0.600690,0.812484,-0.942757,-0.698471,1.279826,1.0
6,-0.384806,0.076678,0.772251,-0.845173,-0.911133,0.947906,0.261093,0.672330,-0.858773,-0.164204,1.0
...,...,...,...,...,...,...,...,...,...,...,...
15210,0.369882,0.400200,0.972674,-1.098885,-1.037865,-1.693099,-0.898388,-0.618367,-0.840449,1.346188,1.0
15211,-0.942588,-0.625240,-1.464396,2.214171,1.739374,-0.168228,-0.253436,-0.584405,0.521997,-0.274798,1.0
15212,-0.006941,-0.007977,0.197676,-1.092309,-1.064116,0.833205,0.262432,-0.549551,-0.600044,-0.582590,1.0
15213,-0.416346,-0.560875,-0.324780,0.035971,-0.163415,0.978288,-0.551183,0.339684,-0.867335,0.232816,1.0


In [26]:
training

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,-0.697052,-0.245607,-0.548085,0.513805,0.740908,0.314712,0.036070,0.636431,-0.574239,-0.646422,1.0
1,-0.258141,-0.229809,-0.240882,-0.806815,-0.862250,0.189680,0.418933,0.358619,-0.002271,-1.334774,0.0
2,-0.945279,-0.650836,-1.374566,2.287600,2.659990,0.294598,-0.068405,0.486014,1.577605,-1.136646,1.0
3,-0.599643,-0.580268,-0.307196,0.257353,0.201392,0.535086,0.042297,-0.569699,-0.117685,-1.604745,0.0
4,-0.633877,-0.484393,-0.371814,0.820671,0.637712,0.398852,0.231908,0.453577,0.134998,-0.929782,1.0
...,...,...,...,...,...,...,...,...,...,...,...
19613,-0.703488,-0.642972,-0.221391,0.986159,1.656998,0.240374,0.077442,-0.394546,1.783033,-0.739372,0.0
19614,-0.413062,-0.165867,-0.334314,-0.173904,-0.021294,-0.182339,0.211693,-0.215672,-0.845441,-1.519116,0.0
19615,-0.607975,-0.727062,-0.658890,1.670032,1.554707,0.337684,0.097013,0.728737,1.084333,-0.981098,0.0
19616,-0.649657,-0.440898,-0.380077,0.154881,-0.073797,0.642613,-0.001470,0.346531,2.063434,0.949516,0.0


In [28]:
train_left = training[training.columns[:-1]]
train_right = training[training.columns[-1]]

### Different ML models

kNN - k-nearest neighbors
- Collection of already classified data points
- A new data point is placed among them
- Then the k nearest neighbors of the new point are checked for how many belong to each class
- The class that occurs most often among the neighbors is assigned to the new data point
- IMPORTANT: with two classes, k should be odd so that the vote cannot end in a tie
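The steps above can be sketched in a few lines of NumPy. This is a simplified illustration with Euclidean distance and hypothetical toy points, not scikit-learn's implementation:

```python
import numpy as np


def knn_predict(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbors."""
    distances = np.linalg.norm(train_points - new_point, axis=1)
    nearest = np.argsort(distances)[:k]         # indices of the k closest points
    votes = np.bincount(train_labels[nearest])  # count votes per class
    return int(np.argmax(votes))


# Two small clusters of already classified points (toy data)
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

near_origin = knn_predict(points, labels, np.array([0.5, 0.5]))
near_cluster = knn_predict(points, labels, np.array([5.5, 5.5]))
print(near_origin, near_cluster)
```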

In [29]:
from sklearn.neighbors import KNeighborsClassifier

In [30]:
knn = KNeighborsClassifier(7)  # Number of neighbors = 7

In [31]:
knn_model = knn.fit(train_left, train_right)

In [32]:
prediction = knn_model.predict(test_left)



In [33]:
(prediction == test_right).value_counts()

class
False    2523
True     1281
Name: count, dtype: int64

In [34]:
# Define a function to test a model
def evaluate(prediction):
    correct = (prediction == test_right).sum()  # Count matches directly; value_counts() sorts by frequency, not by True/False
    wrong = len(prediction) - correct
    print(f"Correct predictions: {correct / len(prediction) * 100}%")
    print(f"Wrong predictions: {wrong / len(prediction) * 100}%")

In [39]:
def compare(prediction):
    # Test features, actual labels and predictions side by side
    gesamt = np.hstack((test_left, test_right.values.reshape(-1, 1), prediction.reshape(-1, 1)))
    df = pd.DataFrame(gesamt)
    df.rename(columns={10: "Actual", 11: "Prediction"}, inplace=True)
    return df

In [40]:
evaluate(prediction)

Correct predictions: 33.67507886435332%
Wrong predictions: 66.32492113564669%
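Because `value_counts()` orders its result by frequency, reading the counts by position is fragile. As a hedged alternative, `sklearn.metrics` provides ready-made functions; the arrays below are hypothetical toy labels, not results from this notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels (hypothetical): 1 = class g, 0 = class h
actual = np.array([1, 1, 1, 0, 0, 1])
predicted = np.array([0, 0, 1, 0, 1, 1])

accuracy = accuracy_score(actual, predicted)  # share of matching entries
matrix = confusion_matrix(actual, predicted)  # rows: actual class, columns: predicted class

print(accuracy)
print(matrix)
```

The confusion matrix also shows which class the errors fall on, which a single percentage hides.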


In [41]:
compare(prediction)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,Actual,Prediction
0,41.1210,20.2472,2.7889,0.2569,0.1439,9.0304,-22.6301,-11.6170,6.0005,233.6900,1.0,0.0
1,35.8686,19.3033,2.7585,0.3557,0.1874,-57.9170,19.7179,-8.6044,1.0842,283.0970,1.0,0.0
2,18.8916,12.3608,2.5112,0.5300,0.3097,-2.3514,6.5856,-9.2091,9.8140,186.4910,1.0,0.0
3,22.6754,11.6713,2.3414,0.5057,0.2893,28.0107,10.7875,11.4814,9.3290,166.1210,1.0,0.0
4,55.4333,16.1018,2.9043,0.3863,0.2020,1.3011,35.9742,7.6825,40.1342,250.7299,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
3799,29.4859,12.9874,2.4433,0.3640,0.2180,26.0576,17.3940,-4.4317,6.0350,131.6680,1.0,0.0
3800,25.9264,13.0316,2.5752,0.5372,0.2832,-1.6516,23.9634,6.3471,19.2670,147.7910,1.0,0.0
3801,25.0243,18.4051,2.5453,0.4672,0.2949,2.8213,-21.7414,-15.2108,29.5560,141.1230,1.0,0.0
3802,21.3719,12.7783,2.4273,0.4561,0.2299,29.3936,-5.7645,8.0277,10.6360,138.6130,1.0,0.0


Naive Bayes
- Uses probabilities to classify points
- For each class, a probability is computed from the individual features (which are assumed to be independent); the class with the higher probability is assigned
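A small sketch of this on synthetic data (two hypothetical Gaussian clusters): `predict_proba` returns one probability per class, and `predict` picks the class with the larger one.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Synthetic clusters: class 0 around (0, 0), class 1 around (4, 4)
class0 = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
class1 = rng.normal(loc=4.0, scale=1.0, size=(50, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 50 + [1] * 50)

nb = GaussianNB().fit(X, y)

new_points = np.array([[0.0, 0.0], [4.0, 4.0]])
probabilities = nb.predict_proba(new_points)  # one column per class, rows sum to 1
predicted = nb.predict(new_points)            # class with the highest probability

print(probabilities.round(4))
print(predicted)
```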

In [42]:
from sklearn.naive_bayes import GaussianNB

In [43]:
nb = GaussianNB()

In [44]:
nb_model = nb.fit(train_left, train_right)

In [45]:
prediction = nb_model.predict(test_left)



In [46]:
evaluate(prediction)

Correct predictions: 33.67507886435332%
Wrong predictions: 66.32492113564669%


In [47]:
compare(prediction)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,Actual,Prediction
0,41.1210,20.2472,2.7889,0.2569,0.1439,9.0304,-22.6301,-11.6170,6.0005,233.6900,1.0,0.0
1,35.8686,19.3033,2.7585,0.3557,0.1874,-57.9170,19.7179,-8.6044,1.0842,283.0970,1.0,0.0
2,18.8916,12.3608,2.5112,0.5300,0.3097,-2.3514,6.5856,-9.2091,9.8140,186.4910,1.0,0.0
3,22.6754,11.6713,2.3414,0.5057,0.2893,28.0107,10.7875,11.4814,9.3290,166.1210,1.0,0.0
4,55.4333,16.1018,2.9043,0.3863,0.2020,1.3011,35.9742,7.6825,40.1342,250.7299,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
3799,29.4859,12.9874,2.4433,0.3640,0.2180,26.0576,17.3940,-4.4317,6.0350,131.6680,1.0,0.0
3800,25.9264,13.0316,2.5752,0.5372,0.2832,-1.6516,23.9634,6.3471,19.2670,147.7910,1.0,0.0
3801,25.0243,18.4051,2.5453,0.4672,0.2949,2.8213,-21.7414,-15.2108,29.5560,141.1230,1.0,0.0
3802,21.3719,12.7783,2.4273,0.4561,0.2299,29.3936,-5.7645,8.0277,10.6360,138.6130,1.0,0.0


Logistic regression
- The features of a data point are combined into a single value, which a sigmoid function maps to a probability between 0 and 1
- If this probability lies above the threshold (0.5 by default), the point gets class 1, otherwise class 0
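The decision rule can be sketched by hand. The weights and bias below are hypothetical values chosen for illustration; in practice they are learned by `fit`.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical, hand-picked parameters (normally learned from data)
weights = np.array([2.0, -1.0])
bias = 0.5


def predict_class(x, threshold=0.5):
    probability = sigmoid(x @ weights + bias)
    return int(probability >= threshold), float(probability)


label_a, prob_a = predict_class(np.array([1.0, 0.5]))   # z = 2.0 - 0.5 + 0.5 = 2.0
label_b, prob_b = predict_class(np.array([-1.0, 1.0]))  # z = -2.0 - 1.0 + 0.5 = -2.5
print(label_a, prob_a, label_b, prob_b)
```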

In [48]:
from sklearn.linear_model import LogisticRegression

In [49]:
lr = LogisticRegression()

In [50]:
lr_model = lr.fit(train_left, train_right)

In [51]:
prediction = lr_model.predict(test_left)



In [52]:
evaluate(prediction)

Correct predictions: 33.67507886435332%
Wrong predictions: 66.32492113564669%


In [53]:
compare(prediction)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,Actual,Prediction
0,41.1210,20.2472,2.7889,0.2569,0.1439,9.0304,-22.6301,-11.6170,6.0005,233.6900,1.0,0.0
1,35.8686,19.3033,2.7585,0.3557,0.1874,-57.9170,19.7179,-8.6044,1.0842,283.0970,1.0,0.0
2,18.8916,12.3608,2.5112,0.5300,0.3097,-2.3514,6.5856,-9.2091,9.8140,186.4910,1.0,0.0
3,22.6754,11.6713,2.3414,0.5057,0.2893,28.0107,10.7875,11.4814,9.3290,166.1210,1.0,0.0
4,55.4333,16.1018,2.9043,0.3863,0.2020,1.3011,35.9742,7.6825,40.1342,250.7299,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
3799,29.4859,12.9874,2.4433,0.3640,0.2180,26.0576,17.3940,-4.4317,6.0350,131.6680,1.0,0.0
3800,25.9264,13.0316,2.5752,0.5372,0.2832,-1.6516,23.9634,6.3471,19.2670,147.7910,1.0,0.0
3801,25.0243,18.4051,2.5453,0.4672,0.2949,2.8213,-21.7414,-15.2108,29.5560,141.1230,1.0,0.0
3802,21.3719,12.7783,2.4273,0.4561,0.2299,29.3936,-5.7645,8.0277,10.6360,138.6130,1.0,0.0


Support Vector Machines
- Places a separating boundary (hyperplane) in the feature space
- On both sides of this hyperplane there is a margin that extends to the nearest data points
- The nearest points of each class, which the margin touches, are the support vectors
- The hyperplane is chosen so that this margin is as wide as possible; new points are classified by the side of the hyperplane they fall on
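A sketch on a tiny, linearly separable toy set (hypothetical points). It uses `kernel="linear"` so that the hyperplane is a straight line; note that `SVC()` without arguments uses an RBF kernel instead. The large `C` is an assumption chosen to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated toy clusters
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel="linear", C=1000.0).fit(X, y)

support_vectors = svm.support_vectors_      # the points that define the margin
signed_scores = svm.decision_function(X)    # sign = side of the hyperplane
predicted = svm.predict(X)

print(support_vectors)
print(signed_scores.round(3))
```

Points far from the boundary get scores with a large absolute value, while the support vectors sit at a score of roughly -1 or +1.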

In [54]:
from sklearn.svm import SVC

In [55]:
svc = SVC()

In [56]:
svc_model = svc.fit(train_left, train_right)

In [57]:
prediction = svc_model.predict(test_left)



In [58]:
evaluate(prediction)

Correct predictions: 33.67507886435332%
Wrong predictions: 66.32492113564669%


In [59]:
compare(prediction)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,Actual,Prediction
0,41.1210,20.2472,2.7889,0.2569,0.1439,9.0304,-22.6301,-11.6170,6.0005,233.6900,1.0,0.0
1,35.8686,19.3033,2.7585,0.3557,0.1874,-57.9170,19.7179,-8.6044,1.0842,283.0970,1.0,0.0
2,18.8916,12.3608,2.5112,0.5300,0.3097,-2.3514,6.5856,-9.2091,9.8140,186.4910,1.0,0.0
3,22.6754,11.6713,2.3414,0.5057,0.2893,28.0107,10.7875,11.4814,9.3290,166.1210,1.0,0.0
4,55.4333,16.1018,2.9043,0.3863,0.2020,1.3011,35.9742,7.6825,40.1342,250.7299,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...
3799,29.4859,12.9874,2.4433,0.3640,0.2180,26.0576,17.3940,-4.4317,6.0350,131.6680,1.0,0.0
3800,25.9264,13.0316,2.5752,0.5372,0.2832,-1.6516,23.9634,6.3471,19.2670,147.7910,1.0,0.0
3801,25.0243,18.4051,2.5453,0.4672,0.2949,2.8213,-21.7414,-15.2108,29.5560,141.1230,1.0,0.0
3802,21.3719,12.7783,2.4273,0.4561,0.2299,29.3936,-5.7645,8.0277,10.6360,138.6130,1.0,0.0


### Neural network

We can now build our own model

This model consists of nodes that are connected to each other in layers

Each node (neuron) consists of:
- The inputs, each of which is multiplied by a weight
- The summation step, which adds up all weighted inputs (plus a bias)
- The activation function, a mathematical function that returns exactly one value

These neurons are arranged in layers and connected to each other

Each layer is computed completely, and then each value is passed on to the next layer

After the whole network has been run once, the result is compared with the expected labels, which yields a metric called loss; the weights are then adjusted based on it

Training tries to reduce the loss as far as possible; the lower it is, the closer the predictions are to the labels
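The forward pass and the loss can be sketched in NumPy. The weights here are random and untrained (an assumption for illustration), and the helper names `dense` and `binary_crossentropy` are hypothetical; the sketch only shows how inputs flow through the layers and how the loss is computed, not how Keras trains the network.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dense(inputs, weights, bias, activation):
    """One layer: multiply inputs by weights, sum, add bias, apply activation."""
    return activation(inputs @ weights + bias)


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average loss for 0/1 labels; small when predictions match the labels."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 10))  # 4 rows with 10 features each

# Random, untrained weights: 10 inputs -> 32 hidden neurons -> 1 output
hidden = dense(x, rng.normal(scale=0.1, size=(10, 32)), np.zeros(32), relu)
output = dense(hidden, rng.normal(scale=0.1, size=(32, 1)), np.zeros(1), sigmoid).reshape(-1)

loss = binary_crossentropy(np.array([1, 0, 1, 0]), output)
print(output, loss)
```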

In [60]:
import tensorflow as tf

In [61]:
# Build the model with the Sequential API (Keras)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),  # The first layer, the input layer
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")  # The last layer, the output layer
])

In [62]:
model.compile(loss="binary_crossentropy", metrics=["accuracy"])

In [63]:
model.fit(train_left, train_right,
          verbose=1,
          epochs=50)  # verbose: show progress output, epochs: number of passes over the training data

Epoch 1/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 4s 4ms/step - accuracy: 0.7558 - loss: 0.4861
Epoch 2/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8311 - loss: 0.3772
Epoch 3/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8439 - loss: 0.3524
Epoch 4/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8473 - loss: 0.3482
Epoch 5/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8491 - loss: 0.3437
Epoch 6/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8508 - loss: 0.3394
Epoch 7/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8517 - loss: 0.3356
Epoch 8/50
614/614 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8575 - loss: 0.3266
Epoch 9/50
614/614 ━━━━━━━━

<keras.src.callbacks.history.History at 0x19d820c8d40>

In [64]:
prediction = model.predict(test_left).reshape(-1)  # Sigmoid outputs: probabilities between 0 and 1, not 0/1 labels

119/119 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step


In [65]:
evaluate(prediction)

Correct predictions: 31.17770767613039%
Wrong predictions: 68.82229232386962%
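The sigmoid output layer returns probabilities rather than class labels, so an equality check against the 0/1 labels only matches values that are exactly 0.0 or 1.0. A minimal sketch of turning probabilities into labels first, using hypothetical values and an assumed threshold of 0.5:

```python
import numpy as np

# Hypothetical model outputs and true labels
probabilities = np.array([7.4e-05, 0.93, 0.51, 0.0, 0.49])
actual = np.array([0, 1, 1, 0, 1])

# Threshold at 0.5: above -> class 1, otherwise class 0
predicted_labels = (probabilities > 0.5).astype(int)

accuracy = float((predicted_labels == actual).mean())
exact_matches = float((probabilities == actual).mean())  # what a raw equality check would count

print(predicted_labels, accuracy, exact_matches)
```

In this toy example the thresholded labels are right in four of five cases, while the raw equality check counts only the single value that happens to be exactly 0.0.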


In [67]:
compare(prediction)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,Actual,Prediction
0,41.1210,20.2472,2.7889,0.2569,0.1439,9.0304,-22.6301,-11.6170,6.0005,233.6900,1.0,7.390746e-05
1,35.8686,19.3033,2.7585,0.3557,0.1874,-57.9170,19.7179,-8.6044,1.0842,283.0970,1.0,2.173728e-30
2,18.8916,12.3608,2.5112,0.5300,0.3097,-2.3514,6.5856,-9.2091,9.8140,186.4910,1.0,7.516126e-17
3,22.6754,11.6713,2.3414,0.5057,0.2893,28.0107,10.7875,11.4814,9.3290,166.1210,1.0,1.052964e-07
4,55.4333,16.1018,2.9043,0.3863,0.2020,1.3011,35.9742,7.6825,40.1342,250.7299,0.0,0.000000e+00
...,...,...,...,...,...,...,...,...,...,...,...,...
3799,29.4859,12.9874,2.4433,0.3640,0.2180,26.0576,17.3940,-4.4317,6.0350,131.6680,1.0,2.938194e-19
3800,25.9264,13.0316,2.5752,0.5372,0.2832,-1.6516,23.9634,6.3471,19.2670,147.7910,1.0,0.000000e+00
3801,25.0243,18.4051,2.5453,0.4672,0.2949,2.8213,-21.7414,-15.2108,29.5560,141.1230,1.0,0.000000e+00
3802,21.3719,12.7783,2.4273,0.4561,0.2299,29.3936,-5.7645,8.0277,10.6360,138.6130,1.0,1.043742e-07


In [68]:
model.save("MAGIC.keras")

In [69]:
model = tf.keras.models.load_model("MAGIC.keras")

In [70]:
prediction = model.predict(test_left).reshape(-1)

119/119 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step
