Title: Classification Model Performance Metrics

Accuracy, Precision, Recall, F1-Score:

Task 1: Evaluate a binary classifier for spam detection using accuracy, precision, recall and F1-score.

In [1]:
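# Train a logistic regression on synthetic binary data and report its metrics.
# make_classification generates artificial samples; here they stand in for a
# real spam dataset (class 1 is read as "spam", class 0 as "not spam").
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

# 1000 samples, 10 features, two classes; hold out 30% for testing
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit on the training split and predict on the held-out split
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class precision, recall and F1, plus overall accuracy
print("Binary Classification Report (Spam Detection):\n")
print(classification_report(y_test, y_pred))
# Rows are true labels, columns are predicted labels
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))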


Binary Classification Report (Spam Detection):

              precision    recall  f1-score   support

           0       0.80      0.87      0.84       135
           1       0.89      0.82      0.86       165

    accuracy                           0.85       300
   macro avg       0.85      0.85      0.85       300
weighted avg       0.85      0.85      0.85       300

Confusion Matrix:
 [[118  17]
 [ 29 136]]
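
The figures in the report can be recovered by hand from the confusion matrix above. The following is a minimal sketch, assuming scikit-learn's layout (rows are true labels, columns are predicted labels) and treating class 1 as the positive "spam" class. The variable names are illustrative and not part of the original cell.

```python
# Counts copied from the confusion matrix printed above.
# Layout assumption: rows = true class, columns = predicted class.
tn, fp = 118, 17   # true class 0: predicted 0, predicted 1
fn, tp = 29, 136   # true class 1: predicted 0, predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # of everything flagged as class 1, how much really was
recall = tp / (tp + fn)                      # of all true class 1 samples, how many were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of precision and recall

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Rounded to two decimals, these values agree with the class 1 row and the accuracy row of the report.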



Task 2: Compare the per-class performance of a multi-class classifier on animal recognition.

In [2]:

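# Train a decision tree on a three-class dataset and report per-class metrics.
# Note: the Iris dataset (flower species) is used here as a stand-in for an
# animal dataset, so the class names in the report are iris species.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix

data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# No random_state is set on the tree, so results may vary slightly between runs
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("\nMulti-Class Classification Report (Animal Recognition):\n")
# target_names replaces the integer labels 0, 1, 2 with readable class names
print(classification_report(y_test, y_pred, target_names=data.target_names))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))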



Multi-Class Classification Report (Animal Recognition):

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        13
   virginica       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45

Confusion Matrix:
 [[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]
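
For more than two classes, each class's precision and recall come from treating that class as "positive" and all others as "negative" (one-vs-rest). The sketch below works directly on a square confusion matrix given as nested lists, assuming rows are true labels and columns are predicted labels. `per_class_metrics` is a hypothetical helper written for illustration, and returning 0.0 when a class has no predictions or no samples is a choice made here, not a rule taken from the notebook.

```python
def per_class_metrics(cm):
    """Return a (precision, recall) pair per class from a square confusion matrix.

    Assumes rows are true labels and columns are predicted labels.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                                  # diagonal entry: correct predictions for class k
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: everything predicted as k
        actual_k = sum(cm[k])                          # row sum: everything that truly is k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        results.append((precision, recall))
    return results


# Confusion matrix copied from the output above
cm = [[19, 0, 0], [0, 13, 0], [0, 0, 13]]
names = ["setosa", "versicolor", "virginica"]
for name, (p, r) in zip(names, per_class_metrics(cm)):
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Because every off-diagonal entry is zero, each class comes out at 1.00 for both metrics, matching the report. With only 45 test samples, a perfect score is best read as a result for this particular split.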


Task 3: Analyze classifier performance for predicting disease outbreaks.

In [3]:
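# Train a random forest on imbalanced synthetic data and report its metrics.
# The synthetic samples stand in for outbreak data (class 1 is read as "outbreak").
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# weights=[0.8, 0.2] makes class 1 the minority class (roughly 20% of samples)
X, y = make_classification(n_samples=1000, n_features=8, n_classes=2, weights=[0.8, 0.2], random_state=42)
# stratify=y keeps the class ratio the same in the training and test splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, stratify=y)

# No random_state is set on the forest, so the numbers may differ between runs
model = RandomForestClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("\nBinary Classification Report (Disease Outbreak Prediction):\n")
print(classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))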



Binary Classification Report (Disease Outbreak Prediction):

              precision    recall  f1-score   support

           0       0.93      0.98      0.96       199
           1       0.92      0.71      0.80        51

    accuracy                           0.93       250
   macro avg       0.93      0.85      0.88       250
weighted avg       0.93      0.93      0.92       250

Confusion Matrix:
 [[196   3]
 [ 15  36]]
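
On imbalanced data the macro and weighted averages diverge, which is why the report shows a macro avg recall of 0.85 next to a weighted avg recall of 0.93. The sketch below reproduces both from the confusion matrix above, assuming rows are true labels. The macro average gives each class equal weight, while the weighted average weights each class by its support. Variable names are illustrative.

```python
# Counts copied from the confusion matrix above (rows = true class).
cm = [[196, 3], [15, 36]]

support = [sum(row) for row in cm]                            # samples per true class
recall = [cm[k][k] / support[k] for k in range(len(cm))]      # per-class recall

# Macro: plain mean over classes, so the minority class counts as much as the majority
macro_recall = sum(recall) / len(recall)
# Weighted: mean over classes weighted by support, so the majority class dominates
weighted_recall = sum(r * s for r, s in zip(recall, support)) / sum(support)

print(f"recall per class: {recall[0]:.2f}, {recall[1]:.2f}")
print(f"macro avg recall: {macro_recall:.2f}")
print(f"weighted avg recall: {weighted_recall:.2f}")
```

The weighted recall equals the overall accuracy here, since both reduce to correct predictions divided by total samples. The class 1 recall of 0.71 means 15 of the 51 true class 1 samples were missed, which the 0.93 accuracy alone does not reveal.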
