
# Midterm Assignment in Neural Networks

The goal of this assignment is to compare the performance of the **K Nearest Neighbors** (k=1 and k=3) and **Nearest Centroid** algorithms.

The comparison is carried out on the problem of predicting the results of **Premier League football matches**, using match data from the 2020-21 season through the 2024-25 season.

Performance is evaluated with metrics such as **accuracy, precision** and **recall**.
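
To make the three metrics concrete, here is a minimal sketch on toy labels. The labels `H`, `D`, `A` (home win, draw, away win) and the toy predictions are assumptions made only for this illustration; the actual encoding of `result` in the dataset is not shown in this notebook. The sketch uses `average='weighted'`, the same setting as the evaluation cells in this notebook, and shows that support-weighted recall is always equal to accuracy.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical toy labels: H = home win, D = draw, A = away win
y_true = np.array(["H", "H", "H", "D", "D", "A", "A", "A", "A", "H"])
y_pred = np.array(["H", "H", "A", "H", "D", "A", "A", "H", "A", "H"])

acc = accuracy_score(y_true, y_pred)
# 'weighted' averages the per-class scores, weighting each class by its support
prec = precision_score(y_true, y_pred, average="weighted", zero_division=0)
rec = recall_score(y_true, y_pred, average="weighted", zero_division=0)

print(f"accuracy={acc:.3f}, weighted precision={prec:.3f}, weighted recall={rec:.3f}")
```

Weighted recall sums the correctly predicted samples of every class and divides by the total number of samples, which is the definition of accuracy. This is why the accuracy and recall values reported in this notebook coincide for every classifier.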

# Libraries/frameworks used

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Loading the data, encoding to numeric values, and defining features and target

In [None]:
df = pd.read_csv('transformed_matches.csv')

# Encode categorical variables (team names) into numerical values
df['home_team'] = df['home_team'].astype('category').cat.codes
df['away_team'] = df['away_team'].astype('category').cat.codes

X = df.drop(columns=['date', 'home_gf', 'away_gf', 'result'])
y = df['result']

print(X.shape, y.shape)

(1900, 10) (1900,)
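
As an illustration of what `astype('category').cat.codes` does, the sketch below applies it to a tiny hand-made column. The frame `toy` and the team names in it are hypothetical stand-ins, not rows from `transformed_matches.csv`.

```python
import pandas as pd

# Hypothetical mini-frame; the real column comes from transformed_matches.csv
toy = pd.DataFrame({"home_team": ["Chelsea", "Arsenal", "Liverpool", "Arsenal"]})

cat = toy["home_team"].astype("category")
codes = cat.cat.codes  # integer position of each value in the sorted category list

print(dict(enumerate(cat.cat.categories)))  # mapping: code -> team name
print(codes.tolist())
```

The codes follow the alphabetical order of the category names, so the numeric gap between two teams carries no football meaning, yet distance-based classifiers will treat it as a real distance. Note also that encoding `home_team` and `away_team` separately only yields matching codes when both columns contain exactly the same set of teams.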


# Splitting into train and test sets

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

(1520, 10) (380, 10) (1520,) (380,)
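
The following sketch reproduces the split sizes on stand-in arrays and checks that a fixed `random_state` makes the split repeatable. `X_demo` and `y_demo` are placeholder data invented for this example; only the row count (1900) and the `train_test_split` arguments are taken from the cell above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same number of rows as the dataset (values are arbitrary)
X_demo = np.arange(1900 * 2).reshape(1900, 2)
y_demo = np.arange(1900) % 3

first = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
second = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

X_tr, X_te, y_tr, y_te = first
print(X_tr.shape, X_te.shape)

# Same seed, same inputs: the two splits should be identical
same = all(np.array_equal(p, q) for p, q in zip(first, second))
print(same)
```

`train_test_split` shuffles the rows by default, so the test set is a random sample across all seasons rather than the most recent matches.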


# Training and evaluating the K Nearest Neighbors classifier with K=1

In [6]:
knn1 = KNeighborsClassifier(n_neighbors=1)
knn1.fit(X_train, y_train)
y_pred1 = knn1.predict(X_test)
accuracy_knn1 = accuracy_score(y_test, y_pred1)
precision_knn1 = precision_score(y_test, y_pred1, average='weighted', zero_division=0)
recall_knn1 = recall_score(y_test, y_pred1, average='weighted', zero_division=0)
print(f'KNN (k=1) - Accuracy: {accuracy_knn1}, Precision: {precision_knn1}, Recall: {recall_knn1}')

KNN (k=1) - Accuracy: 0.4052631578947368, Precision: 0.42344655707362366, Recall: 0.4052631578947368
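
For intuition, the 1-nearest-neighbor rule can be sketched by hand in NumPy. The random data and the helper `predict_1nn` are hypothetical and exist only for this illustration; the sketch assumes the default Euclidean metric of `KNeighborsClassifier` and compares the hand-written rule against scikit-learn on the same toy data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Random toy data (not the match dataset)
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(50, 4))
y_tr = rng.integers(0, 3, size=50)
X_te = rng.normal(size=(10, 4))

def predict_1nn(X_train, y_train, X_query):
    """Hypothetical helper: label each query with its closest training point."""
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[np.argmin(dists, axis=1)]

manual = predict_1nn(X_tr, y_tr, X_te)
sk = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr).predict(X_te)
print(np.array_equal(manual, sk))
```

Because every prediction copies the label of a single training match, k=1 is sensitive to noisy or atypical training rows.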


# Training and evaluating the K Nearest Neighbors classifier with K=3

In [7]:
knn3 = KNeighborsClassifier(n_neighbors=3)
knn3.fit(X_train, y_train)
y_pred3 = knn3.predict(X_test)
accuracy_knn3 = accuracy_score(y_test, y_pred3)
precision_knn3 = precision_score(y_test, y_pred3, average='weighted', zero_division=0)
recall_knn3 = recall_score(y_test, y_pred3, average='weighted', zero_division=0)
print(f'KNN (k=3) - Accuracy: {accuracy_knn3}, Precision: {precision_knn3}, Recall: {recall_knn3}')

KNN (k=3) - Accuracy: 0.4868421052631579, Precision: 0.4627306522177508, Recall: 0.4868421052631579
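
With k=3 the prediction becomes a majority vote among the three closest training points. The sketch below makes the vote explicit using `kneighbors`. It runs on random toy data with only two classes, an assumption chosen so that three votes can never tie; with three match outcomes a 1-1-1 tie is possible and has to be broken somehow.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Random toy data with two classes (not the match dataset)
rng = np.random.default_rng(1)
X_tr = rng.normal(size=(40, 3))
y_tr = rng.integers(0, 2, size=40)
X_te = rng.normal(size=(8, 3))

knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)

# Indices of the 3 nearest training rows for each query, shape (8, 3)
_, idx = knn.kneighbors(X_te)
votes = y_tr[idx]  # labels of those neighbors

# Class 1 wins when at least 2 of the 3 neighbors carry label 1
manual = (votes.sum(axis=1) >= 2).astype(int)
print(np.array_equal(manual, knn.predict(X_te)))
```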


# Training and evaluating the Nearest Centroid classifier

In [8]:
nc = NearestCentroid()
nc.fit(X_train, y_train)
y_pred_nc = nc.predict(X_test)
accuracy_nc = accuracy_score(y_test, y_pred_nc)
precision_nc = precision_score(y_test, y_pred_nc, average='weighted', zero_division=0)
recall_nc = recall_score(y_test, y_pred_nc, average='weighted', zero_division=0)
print(f'Nearest Centroid - Accuracy: {accuracy_nc}, Precision: {precision_nc}, Recall: {recall_nc}')

Nearest Centroid - Accuracy: 0.4473684210526316, Precision: 0.4307975416887402, Recall: 0.4473684210526316
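
The nearest centroid rule can also be sketched by hand: compute the mean feature vector of each class, then assign every test point to the class whose mean is closest. The random data below is a stand-in for the match features, and the sketch assumes the default settings of `NearestCentroid` (Euclidean distance, no shrinkage).

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

# Random toy data (not the match dataset)
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(60, 4))
y_tr = rng.integers(0, 3, size=60)
X_te = rng.normal(size=(10, 4))

# One centroid per class: the mean of that class's training rows
classes = np.unique(y_tr)
centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])

# Distance from every query to every centroid, shape (n_query, n_classes)
dists = np.linalg.norm(X_te[:, None, :] - centroids[None, :, :], axis=2)
manual = classes[np.argmin(dists, axis=1)]

nc = NearestCentroid().fit(X_tr, y_tr)
print(np.allclose(centroids, nc.centroids_))
print(np.array_equal(manual, nc.predict(X_te)))
```

Unlike KNN, this classifier summarizes each class with a single point, so it stores far less and is less affected by individual outlier matches, at the cost of a much coarser decision boundary.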
