# Anomaly detection for Cellular Networks

Anomalies are rare events or observations that deviate significantly from what is considered normal. Detecting anomalies is crucial in various fields to identify potential security breaches, fraudulent activities, or abnormal health conditions.

Isolation Forest is a machine learning algorithm designed specifically for anomaly detection. It builds an ensemble of random trees, each of which recursively partitions the data by choosing a random feature and a random split value. Because anomalies are few and different from the bulk of the data, they tend to be separated in fewer splits, so a short average path length across the trees indicates an anomaly. The approach is efficient and scales well to large, high-dimensional datasets.
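
The idea described above can be sketched with scikit-learn's `IsolationForest`. This is a minimal illustration on synthetic data, not part of the original notebook: the two-dimensional toy dataset, the `contamination` value, and the variable names are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic toy data (assumption): a dense "normal" cluster plus a few distant points
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
outliers = rng.uniform(low=5.0, high=10.0, size=(10, 2))
X_demo = np.vstack([normal, outliers])

# contamination is the expected share of anomalies; 0.02 is an illustrative guess
iso = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)

# fit_predict returns +1 for inliers and -1 for points flagged as anomalies
labels = iso.fit_predict(X_demo)

# score_samples returns lower values for points that are easier to isolate
scores = iso.score_samples(X_demo)

n_flagged = int((labels == -1).sum())
print(f'Flagged {n_flagged} of {len(X_demo)} points as anomalies')
```

Unlike the classifiers used later in this notebook, this model is fitted without labels, so it could be applied even when the `Unusual` column is not available.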

The notebook below treats the task as supervised binary classification on the labeled `Unusual` column and compares three classifiers: k-nearest neighbors, a support vector machine, and a random forest.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

In [2]:
data = pd.read_csv(r"C:\Users\hp\Downloads\ML-MATT-CompetitionQT2021_train.csv", delimiter=';')


In [3]:
X = data.drop(columns=['Unusual'])
y = data['Unusual']

# Convert categorical features to dummy variables if needed
X = pd.get_dummies(X, drop_first=True)


In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [5]:
knn = KNeighborsClassifier()
svm = SVC()
rf = RandomForestClassifier()

# Collect the classifiers so each one can be trained and evaluated in the same loop
classifiers = {'KNN': knn, 'SVM': svm, 'Random Forest': rf}

In [6]:
for name, clf in classifiers.items():
    # Train the classifier
    clf.fit(X_train, y_train)
    
    # Predict on the test set
    y_pred = clf.predict(X_test)
    
    # Calculate evaluation metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    
    # Print the evaluation metrics
    print(f'{name} Classifier:')
    print(f'Accuracy: {accuracy:.4f}')
    print(f'Precision: {precision:.4f}')
    print(f'Recall: {recall:.4f}')
    print(f'F1 Score: {f1:.4f}')
    print('')

KNN Classifier:
Accuracy: 0.7507
Precision: 0.5940
Recall: 0.3182
F1 Score: 0.4144

  _warn_prf(average, modifier, msg_start, len(result))

SVM Classifier:
Accuracy: 0.7228
Precision: 0.0000
Recall: 0.0000
F1 Score: 0.0000

Random Forest Classifier:
Accuracy: 0.9271
Precision: 0.9564
Recall: 0.7722
F1 Score: 0.8545
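
The SVM row above reports zero precision and recall, which means it predicted no positive samples on the test set. That is also the situation in which scikit-learn emits the `_warn_prf` warning. One possible cause is that `SVC` and `KNeighborsClassifier` are distance-based and sensitive to feature scale. The following is a hedged sketch of one way to test that hypothesis: wrap each model in a pipeline with `StandardScaler`, stratify the split, and pass `zero_division=0` so that an empty set of positive predictions is reported as 0 without a warning. It runs on synthetic data from `make_classification` (an assumption standing in for the real dataset), so its numbers say nothing about the results above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the real data (assumption for illustration only)
X_syn, y_syn = make_classification(
    n_samples=1000, n_features=10, weights=[0.75, 0.25], random_state=42
)

# stratify keeps the class ratio the same in the train and test splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, stratify=y_syn, random_state=42
)

# Each pipeline standardizes the features before fitting the distance-based model
scaled_models = {
    'KNN (scaled)': make_pipeline(StandardScaler(), KNeighborsClassifier()),
    'SVM (scaled)': make_pipeline(StandardScaler(), SVC()),
}

results = {}
for name, model in scaled_models.items():
    model.fit(X_tr, y_tr)
    y_hat = model.predict(X_te)
    # zero_division=0 returns 0 without a warning when no positives are predicted
    results[name] = {
        'precision': precision_score(y_te, y_hat, zero_division=0),
        'recall': recall_score(y_te, y_hat, zero_division=0),
        'f1': f1_score(y_te, y_hat, zero_division=0),
    }
    print(name, results[name])
```

To try this on the notebook's data, the same pipelines could be fitted on `X_train` and `y_train` in place of the synthetic arrays.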

