## Implementing ML Model Monitoring Pipelines

### Model Performance Drift:
**Description**: Set up a monitoring pipeline to track key performance metrics (e.g., accuracy, precision) of an ML model over time using a monitoring tool or dashboard.

In [1]:
# write your code from here
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from datetime import datetime
import os

def monitor_model_performance(predictions_df, metric_log_file='metrics_log.csv'):
    """
    Compute performance metrics for model predictions and log them over time.

    Parameters:
        predictions_df (pd.DataFrame): Must contain 'y_true' and 'y_pred' columns.
        metric_log_file (str): Path to CSV log file storing past metrics.

    Returns:
        dict: Calculated metrics for the current batch.
    """
    if not {'y_true', 'y_pred'}.issubset(predictions_df.columns):
        raise ValueError("Input dataframe must contain 'y_true' and 'y_pred' columns.")

    # Compute metrics
    metrics = {
        'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'accuracy': accuracy_score(predictions_df['y_true'], predictions_df['y_pred']),
        'precision': precision_score(predictions_df['y_true'], predictions_df['y_pred'], zero_division=0),
        'recall': recall_score(predictions_df['y_true'], predictions_df['y_pred'], zero_division=0),
        'f1_score': f1_score(predictions_df['y_true'], predictions_df['y_pred'], zero_division=0)
    }

    # Append metrics to a CSV log
    metrics_df = pd.DataFrame([metrics])
    if os.path.exists(metric_log_file):
        metrics_df.to_csv(metric_log_file, mode='a', header=False, index=False)
    else:
        metrics_df.to_csv(metric_log_file, index=False)

    return metrics

# Example usage
if __name__ == "__main__":
    # Simulated batch results
    data = {
        'y_true': [1, 0, 1, 1, 0, 1, 0],
        'y_pred': [1, 0, 1, 0, 0, 1, 1]
    }
    df = pd.DataFrame(data)
    result = monitor_model_performance(df)
    print("Logged metrics:", result)


Logged metrics: {'timestamp': '2025-05-22 09:03:42', 'accuracy': 0.7142857142857143, 'precision': 0.75, 'recall': 0.75, 'f1_score': 0.75}


### Feature Distribution Drift:
**Description**: Monitor the distribution of your input features in deployed models to detect any significant shifts from training data distributions.

In [2]:
# write your code from here
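One possible way to approach this task is the Population Stability Index (PSI), which bins a feature using quantiles of the training sample and compares how much of the production sample falls into each bin. The sketch below is an illustration under stated assumptions, not a prescribed solution: the function names `population_stability_index` and `detect_feature_drift`, the simulated `age` and `income` features, and the 0.2 alert threshold (a commonly quoted rule of thumb) are all hypothetical choices. It uses only the standard library so it runs without extra dependencies.

```python
import math
import random
from bisect import bisect_right
from statistics import quantiles


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """Return the PSI between a reference sample and a current sample."""
    # Interior cut points (n_bins - 1 of them) taken from the reference data.
    edges = quantiles(reference, n=n_bins)

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


def detect_feature_drift(reference, current, threshold=0.2):
    """Compare each shared feature; inputs map feature name -> list of values."""
    report = {}
    for name in reference:
        if name not in current:
            continue
        psi = population_stability_index(reference[name], current[name])
        # The threshold is an assumed rule of thumb; tune it per feature.
        report[name] = {"psi": psi, "drifted": psi > threshold}
    return report


# Simulated data (assumption): 'income' is shifted in production, 'age' is not.
rng = random.Random(42)
train = {
    "age": [rng.gauss(40, 10) for _ in range(5000)],
    "income": [rng.gauss(50_000, 8_000) for _ in range(5000)],
}
live = {
    "age": [rng.gauss(40, 10) for _ in range(5000)],
    "income": [rng.gauss(58_000, 8_000) for _ in range(5000)],
}

for feature, result in detect_feature_drift(train, live).items():
    print(f"{feature}: PSI={result['psi']:.3f}, drifted={result['drifted']}")
```

To use this with pandas DataFrames, each column can be passed as a list, for example `{col: df[col].tolist() for col in df.columns}`. PSI as written suits numeric features; categorical features would need per-category fractions instead of quantile bins.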

### Anomaly Detection in Predictions:
**Description**: Implement an anomaly detection mechanism to flag unusual model predictions. Simulate anomalies by altering input data.

In [3]:
# write your code from here
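A minimal sketch of one way to do this: learn the typical range of predictions from a trusted reference period, then flag new predictions whose modified z-score (based on the median and the median absolute deviation) exceeds a cutoff. Everything named here is an assumption for illustration: `toy_model` is a hypothetical stand-in for the deployed model, `fit_baseline` and `flag_anomalies` are made-up helper names, and the 3.5 cutoff is a conventional default for modified z-scores rather than a tuned value. Anomalies are simulated by adding a large offset to one input feature.

```python
import random
from statistics import median


def toy_model(features):
    """Hypothetical stand-in for a deployed regression model."""
    x1, x2 = features
    return 2.0 * x1 + 0.5 * x2 + 1.0


def fit_baseline(predictions):
    """Return (median, MAD) of predictions from a trusted reference period."""
    med = median(predictions)
    mad = median(abs(p - med) for p in predictions)
    return med, mad


def flag_anomalies(predictions, baseline, threshold=3.5):
    """Return one boolean per prediction: True if it looks anomalous."""
    med, mad = baseline
    if mad == 0:
        raise ValueError("Reference predictions have zero spread; MAD is 0.")
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the reference predictions are roughly normal.
    return [abs(0.6745 * (p - med) / mad) > threshold for p in predictions]


# Reference period: predictions on inputs that resemble the training data.
rng = random.Random(0)
reference_inputs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
baseline = fit_baseline([toy_model(x) for x in reference_inputs])

# Live batch: alter the first feature of a few rows to simulate anomalies.
live_inputs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
altered_rows = {3, 11, 17}
live_inputs = [
    (x1 + 10.0, x2) if i in altered_rows else (x1, x2)
    for i, (x1, x2) in enumerate(live_inputs)
]

flags = flag_anomalies([toy_model(x) for x in live_inputs], baseline)
flagged_rows = [i for i, is_anomaly in enumerate(flags) if is_anomaly]
print("Altered rows:", sorted(altered_rows))
print("Flagged rows:", flagged_rows)
```

The median and MAD are used instead of the mean and standard deviation so that a few extreme reference predictions do not widen the baseline. For a classifier, the same check could be applied to predicted probabilities rather than regression outputs.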