**Loading the data**

In [None]:
import pandas as pd

df = pd.read_csv("trainingsdaten_ohne_vibration.csv", encoding = "ISO-8859-1")
df_test = pd.read_csv("testdaten_mit_vibration.csv", encoding = "ISO-8859-1")

**Inspecting the training data**

In [None]:
df.head()

**Inspecting the test data**

In [None]:
df_test.head()

**Dropping the *Time* column**

In [None]:
del df["Time"]
del df_test["Time"]

**Descriptive statistics (training data)**

In [None]:
df.describe()

**Descriptive statistics (test data)**

In [None]:
df_test.describe()

**Visualizing the data**

In [None]:
import matplotlib.pyplot as plt

df.plot(subplots=True, layout=(3,2), legend=False)
plt.show()

In [None]:
df_test.plot(subplots=True, layout=(3,2), legend=False)
plt.show()

**Data preprocessing**

In [None]:
# Two columns carry no information -> drop them
df = df.drop(columns=['NumberOverload', 'NumberUnderloads'])
df_test = df_test.drop(columns=['NumberOverload', 'NumberUnderloads'])
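One way to check that a column carries no information is to count its distinct values: a column with a single distinct value is constant. The sketch below assumes a small made-up frame (the values are illustrative, not taken from the data set) and a hypothetical helper `constant_columns`.

```python
import pandas as pd

def constant_columns(frame: pd.DataFrame) -> list:
    """Return the names of columns that hold at most one distinct value."""
    # dropna=False counts NaN as a value, so an all-NaN column is also reported
    counts = frame.nunique(dropna=False)
    return counts[counts <= 1].index.tolist()

# Illustrative data only; the real values come from the CSV files
demo = pd.DataFrame({
    "NumberOverload": [0, 0, 0, 0],
    "NumberUnderloads": [0, 0, 0, 0],
    "AmplitudeMean": [0.12, 0.15, 0.11, 0.14],
})

to_drop = constant_columns(demo)
demo_clean = demo.drop(columns=to_drop)
```

Applied to `df`, such a check would be expected to list the two columns dropped above, provided they really are constant in the training data.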

**Defining training and test data**

In [None]:
X_train = df
X_test = df_test[['AmplitudeBandWidth', 'AmplitudeMean', 'StabilizationTime']].copy()
y_test = df_test[['VibrationMotorOn']].copy()

**Training the model**

In [None]:
from sklearn.ensemble import IsolationForest

# Fit the model
clf = IsolationForest()
clf.fit(X_train)
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)

# Model predictions for the test data (1 = inlier, -1 = anomaly)
new_column = pd.DataFrame({"Anomaly": y_pred_test}, index=X_test.index)

# Recode the predictions to 1/0: 1 (inlier) -> 0, -1 (anomaly) -> 1
new_column["Anomaly"] = new_column["Anomaly"].map({1: 0, -1: 1})
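The recoding is needed because `IsolationForest.predict` returns 1 for inliers and -1 for outliers, whereas the metrics further down treat 1 as the positive class (an anomaly, here a running vibration motor). A minimal sketch of the same recoding with NumPy, using made-up predictions:

```python
import numpy as np

# Illustrative raw output in the IsolationForest convention: 1 = inlier, -1 = outlier
raw_pred = np.array([1, -1, 1, 1, -1])

# Recode so that 1 marks an anomaly and 0 a normal observation
anomaly = np.where(raw_pred == -1, 1, 0)
```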

**Printing the metrics**

In [None]:
from sklearn import metrics

print("Accuracy: ", metrics.accuracy_score(y_test, new_column))  
print("Precision: ", metrics.precision_score(y_test, new_column))  
print("Recall: ", metrics.recall_score(y_test, new_column))  
print("F1: ", metrics.f1_score(y_test, new_column))  
print("Confusion Matrix:")
print(metrics.confusion_matrix(y_test, new_column))  
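To make the relationship between the printed metrics and the confusion matrix explicit, the following sketch computes them by hand from the four cell counts. The helper `binary_metrics` and the label vectors are hypothetical and only illustrate the formulas. For binary labels, `metrics.confusion_matrix` arranges the counts as `[[TN, FP], [FN, TP]]`.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from binary labels (1 = anomaly)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # anomaly, detected
    fn = int(np.sum(y_true & ~y_pred))   # anomaly, missed
    fp = int(np.sum(~y_true & y_pred))   # normal, flagged as anomaly
    tn = int(np.sum(~y_true & ~y_pred))  # normal, not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "confusion": [[tn, fp], [fn, tp]]}

# Illustrative labels only
result = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```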

**Merging the data for visualization**

In [None]:
complete_data = X_test.copy()
complete_data['VibrationMotorOn'] = y_test['VibrationMotorOn']
complete_data['Anomaly'] = new_column['Anomaly'].values

**Adding colors for the visualization**

In [None]:
import numpy as np

# Assign one color per confusion-matrix cell
label_true = complete_data["VibrationMotorOn"].astype(bool)
label_pred = complete_data["Anomaly"].astype(bool)
complete_data["color"] = np.select(
    [label_true & label_pred, label_true & ~label_pred, ~label_true & ~label_pred],
    ["blue", "green", "red"],  # true positive, false negative, true negative
    default="yellow",          # false positive
)

**Visualizing the confusion matrix**

In [None]:
from matplotlib import pyplot as plt

fig, ax = plt.subplots()
a = complete_data.copy()

features = ['AmplitudeBandWidth', 'AmplitudeMean', 'StabilizationTime']

# (color, legend label, alpha) per confusion-matrix cell; alpha=None is fully opaque
groups = [('blue', 'True positive', 0.4), ('green', 'False negative', None),
          ('red', 'True negative', 0.6), ('yellow', 'False positive', None)]

# Plot the row-wise mean of the three features against the row index
for color, label, alpha in groups:
    subset = a.loc[a.color == color]
    ax.scatter(subset.index, subset[features].mean(axis=1), alpha=alpha, color=color, label=label)

plt.legend()
plt.show()