Exercise 1 : Analyzing Confusion Matrix
Instructions
Imagine you have a dataset for a binary classification problem, such as email spam detection, where emails are classified as either ‘Spam’ or ‘Not Spam’. You are provided with the confusion matrix results of a classifier.
- Define in your own words what True Positives, True Negatives, False Positives, and False Negatives mean in the context of this email spam detection problem.
- Given a confusion matrix with specific values for TP, TN, FP, FN, calculate the Accuracy, Precision, Recall, and F1-Score.
- Discuss how the classifier’s performance would change with a higher number of False Positives compared to False Negatives, and vice versa.



In [1]:
true_positives_definition = "Emails correctly classified as Spam"
true_negatives_definition = "Emails correctly classified as Not Spam"
false_positives_definition = "Non-Spam emails incorrectly classified as Spam"
false_negatives_definition = "Spam emails incorrectly classified as Not Spam"

# Example Confusion Matrix Values
TP, TN, FP, FN = 80, 50, 10, 5  # Example values

# Calculating Evaluation Metrics
accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1_score = 2 * (precision * recall) / (precision + recall)

print(f"Accuracy: {accuracy}, Precision: {precision}, Recall: {recall}, F1-Score: {f1_score}")


Accuracy: 0.896551724137931, Precision: 0.8888888888888888, Recall: 0.9411764705882353, F1-Score: 0.9142857142857143
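
The cell above does not address the third question, how the metrics shift when errors are mostly False Positives versus mostly False Negatives. A minimal sketch comparing the two cases follows, assuming the same TP and TN as the example. The helper `metrics` and the two error splits (15 errors each) are hypothetical values chosen for illustration, not part of the original exercise.

```python
def metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) for one confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical splits: the same 15 total errors, distributed differently.
fp_heavy = metrics(tp=80, tn=50, fp=13, fn=2)
fn_heavy = metrics(tp=80, tn=50, fp=2, fn=13)

for name, (acc, prec, rec, f1) in [("FP-heavy", fp_heavy), ("FN-heavy", fn_heavy)]:
    print(f"{name}: accuracy={acc:.3f}, precision={prec:.3f}, recall={rec:.3f}, f1={f1:.3f}")
```

With these numbers accuracy and F1 come out the same in both cases; only precision and recall trade places. In spam filtering, the FP-heavy case means more legitimate mail lost to the spam folder, while the FN-heavy case means more spam reaching the inbox.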



Exercise 2 : Evaluating Trade-offs in Metrics
Instructions
Consider a medical diagnosis application where a model predicts whether patients have a certain disease.
- Explain why high recall is more important than high precision in this medical diagnosis context.
- Describe a scenario where precision becomes more important than recall.
- Discuss the potential consequences of focusing solely on improving accuracy in imbalanced datasets.



In [5]:
# Why high Recall matters more than Precision in medical diagnosis
recall_importance_medical = """
High recall is crucial in medical diagnosis as it reduces the number of False Negatives, which are critical in this context. Missing a true case of the disease could be life-threatening.
"""

# Scenario where Precision is more important than Recall
precision_important_scenario = """
In email spam detection, precision matters more: a False Positive sends a legitimate email to the spam folder, where the user may never see it, whereas a False Negative only lets a spam message reach the inbox.
"""

# Consequences of focusing solely on Accuracy
accuracy_consequences = """
Focusing only on accuracy in imbalanced datasets can be misleading as the model might simply predict the majority class most of the time, ignoring the minority class which is often of greater interest.
"""

Exercise 3 : Understanding Cross-Validation and Learning Curves
Instructions
You are working on a project with a large dataset that involves predicting housing prices based on various features.
- Explain the difference between K-Fold Cross-Validation and Stratified K-Fold Cross-Validation. Which one would you choose for this task and why?
- Describe what learning curves are and how they can help in understanding the performance of your model.
- Discuss the implications of underfitting and overfitting as observed from learning curves, and how you might address these issues.



In [3]:
# Difference between K-Fold and Stratified K-Fold Cross-Validation
k_fold_vs_stratified = """
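K-Fold divides the data into 'K' roughly equal parts without considering the distribution of classes, whereas Stratified K-Fold divides the data so that each fold keeps the same percentage of samples for each class. Housing price prediction is a regression task with a continuous target, so there are no classes to stratify on; plain K-Fold (with shuffling) is the natural choice here.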
"""

# Learning Curves Explanation
learning_curves_explanation = """
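Learning curves plot the model's performance on the training set and the validation set as the amount of training data (or the number of training iterations) grows. A persistent gap between a high training score and a lower validation score points to overfitting, which more data, regularization, or a simpler model can reduce. Two curves that converge at a poor score point to underfitting, which calls for a more expressive model or better features.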
"""

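The difference between the two splitting schemes can be seen on a small example. The sketch below is a hand-rolled illustration, not scikit-learn's implementation (scikit-learn provides `KFold` and `StratifiedKFold` in `sklearn.model_selection`). The function names and the toy labels are assumptions made for this example. Classification labels are used because stratification needs discrete classes.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def stratified_k_fold_indices(labels, k):
    """Deal each class's indices round-robin so every fold keeps the class mix."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical toy labels: 16 of class 0 followed by 4 of class 1 (sorted on purpose).
labels = [0] * 16 + [1] * 4

plain = [Counter(labels[i] for i in fold) for fold in k_fold_indices(len(labels), 4)]
stratified = [Counter(labels[i] for i in fold) for fold in stratified_k_fold_indices(labels, 4)]

print("Plain K-Fold class counts:     ", [dict(c) for c in plain])
print("Stratified K-Fold class counts:", [dict(c) for c in stratified])
```

On this sorted toy data, the plain split puts all four minority samples in the last fold, while the stratified split gives every fold four samples of class 0 and one of class 1.
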
Exercise 4 : Impact of Class Imbalance on Model Evaluation
Instructions
Imagine you are working on a dataset for detecting a rare disease where only 2% of the instances are positive cases (have the disease).
- Explain why using accuracy as an evaluation metric might be misleading in this scenario.
- Discuss the importance of precision and recall in the context of this imbalanced dataset.
- Propose strategies you could use to more effectively evaluate and improve the model’s performance in this scenario, considering the imbalance in the dataset.





In [4]:
# Misleading nature of Accuracy in Imbalanced Datasets
accuracy_misleading = """
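In datasets with a significant class imbalance, accuracy can be misleading because it can be high even if the model only ever predicts the majority class. With only 2% positive cases, a model that always predicts "no disease" reaches 98% accuracy while detecting no patients at all.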
"""

# Strategies for Evaluating and Improving Model Performance in Imbalanced Datasets
imbalance_strategies = """
Use metrics like Precision, Recall, and F1-Score. Consider using techniques like SMOTE for oversampling the minority class or adjusting class weights in the model.
"""

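The accuracy problem can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical dataset of 1000 patients with the 2% positive rate from the exercise, and a baseline "model" that always predicts the majority class. The variable names are illustrative.

```python
# Hypothetical dataset: 1000 patients, 2% positive (rate taken from the exercise).
n_total = 1000
n_positive = 20
y_true = [1] * n_positive + [0] * (n_total - n_positive)

# Baseline that ignores its input and always predicts the majority class (0).
y_pred = [0] * n_total

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / n_total
recall = tp / (tp + fn) if (tp + fn) else 0.0
# Precision is undefined when nothing is predicted positive; report None.
precision = tp / (tp + fp) if (tp + fp) else None

print(f"Accuracy: {accuracy}, Recall: {recall}, Precision: {precision}")
```

The baseline scores 98% accuracy yet has zero recall, which is why recall and precision on the positive class are the more informative metrics here.
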
Exercise 5 : Role of Threshold Tuning in Classification Models
Instructions
You are evaluating a binary classification model that predicts whether a bank’s client will default on a loan. The model outputs a probability score between 0 and 1.
- Describe how changing the threshold for classifying a positive case (default) from 0.5 to 0.7 might affect the model’s precision and recall.
- Discuss the potential consequences of setting the threshold too high or too low in the context of loan default prediction.
- Explain how ROC (Receiver Operating Characteristic) curves and AUC (Area Under the Curve) can assist in finding the optimal threshold.



In [6]:
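# Effect of raising the classification threshold
threshold_impact = """
Raising the threshold from 0.5 to 0.7 makes the model more conservative about predicting a default: precision usually rises (fewer False Positives), while recall can only stay the same or fall (more False Negatives).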
"""

# ROC and AUC in Threshold Determination
roc_auc_explanation = """
ROC curves plot the True Positive Rate against the False Positive Rate at various threshold settings. AUC provides an aggregate measure of performance across all possible thresholds. They help in finding the optimal balance between sensitivity and specificity.
"""
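
The threshold trade-off can be traced on a small example. The sketch below assumes ten hypothetical clients with made-up scores and outcomes, and evaluates a few candidate thresholds. It also computes Youden's J (TPR minus FPR), one common heuristic for picking a point on the ROC curve. It is shown here as an illustration, not as the only or the recommended criterion, since real lending decisions would weigh the costs of each error type.

```python
def confusion_at(threshold, scores, labels):
    """Count TP, FP, TN, FN when score >= threshold is predicted as default."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Hypothetical scores and true outcomes (1 = default) for ten clients.
scores = [0.95, 0.85, 0.75, 0.65, 0.60, 0.55, 0.45, 0.35, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

results = {}
for threshold in (0.3, 0.5, 0.7, 0.9):
    tp, fp, tn, fn = confusion_at(threshold, scores, labels)
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)  # True Positive Rate, identical to recall
    fpr = fp / (fp + tn)  # False Positive Rate
    results[threshold] = {"precision": precision, "recall": tpr, "fpr": fpr, "youden_j": tpr - fpr}
    print(f"threshold={threshold}: precision={precision:.3f}, recall={tpr:.3f}, fpr={fpr:.3f}")

# Pick the candidate threshold with the largest TPR - FPR.
best_threshold = max(results, key=lambda t: results[t]["youden_j"])
print(f"Threshold with highest Youden's J: {best_threshold}")
```

On this toy data, moving from 0.5 to 0.7 removes the single False Positive but misses two of the five actual defaults, so precision rises while recall falls.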