## I. High-Priority Model Performance Monitoring

This code uses the **Boto3** library to create an **Amazon CloudWatch Alarm** that monitors the performance of a live machine learning model. Specifically, it tracks the **F2 Score**—a critical metric for models where missing a "positive" result is dangerous or costly.

It ensures that the model remains effective at identifying the specific events it was trained to find, such as fraudulent transactions or security threats.

---

### 1. The Core Metric: Understanding the F2 Score

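The **F2 Score** is a variant of the more common F1 score. Both combine **Precision** (how many flagged events were real) and **Recall** (how many real events were flagged) into a single number. F1 weights the two equally, while F2 is the F-beta score with beta = 2, which treats **Recall** as twice as important as Precision.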

* **Why it matters:** In fields like cybersecurity or medical diagnostics, a "False Negative" (missing a real threat) is much worse than a "False Positive" (a false alarm).
* **The Goal:** An F2 score of **1.0** is perfect. This script raises an alert when the model drifts into a state where it starts missing too many important events.
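
To make the recall weighting concrete, here is a minimal sketch of the general F-beta formula evaluated at beta = 2. The function name `f_beta` and the sample precision and recall values are illustrative assumptions; they are not part of Boto3 or SageMaker.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score: a weighted harmonic mean of precision and recall.

    With beta = 2 this is the F2 score, where recall counts more than precision.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was predicted correctly
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# The same two numbers, swapped between precision and recall.
high_recall = f_beta(precision=0.5, recall=1.0)
high_precision = f_beta(precision=1.0, recall=0.5)
print(round(high_recall, 3), round(high_precision, 3))
```

The high-recall model scores about 0.833 and clears the 0.8 threshold used in this notebook, while the high-precision model scores about 0.556 and would trip the alarm.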

### 2. Guarding the "Quality" Threshold

The code establishes a performance floor for your SageMaker model.

* **The Limit (0.8):** The script sets a threshold of **0.8**. If the hourly average F2 score falls below this level, the alarm treats it as a significant drop in model quality.
* **Namespace:** It looks specifically in the `aws/sagemaker/Endpoints/model-quality` CloudWatch namespace. This distinguishes the alarm from others that might monitor hardware issues (like CPU) or data issues (like formatting). CloudWatch does not check that a namespace or metric exists when an alarm is created, so confirm the name against the metrics your monitoring schedule actually publishes.

### 3. Continuous Hourly Surveillance

Machine learning models can "decay" over time as the real world changes. This script handles that by:

* **Hourly Checks:** It evaluates the average F2 score every **60 minutes** (`Period=3600`).
* **Averaging:** It uses the **Average** of the datapoints reported in that hour, so one odd datapoint among several will not trip the alarm by itself. If the monitoring job publishes only one F2 value per hour, the average is simply that value.
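
The hourly check can be sketched in plain Python. This is a simplified model of how one period is judged under `Statistic='Average'` and `ComparisonOperator='LessThanThreshold'`, not CloudWatch's implementation; the function name `period_breaches` and the sample values are assumptions made for illustration.

```python
from statistics import mean


def period_breaches(datapoints, threshold=0.8):
    """Judge one period: True if the average is strictly below the threshold.

    Returns None when the period has no datapoints, which CloudWatch would
    treat as missing data rather than as a breach by default.
    """
    if not datapoints:
        return None
    return mean(datapoints) < threshold


one_dip = period_breaches([0.9, 0.9, 0.7])           # average is about 0.83
sustained_drop = period_breaches([0.75, 0.7, 0.78])  # average is about 0.74
single_value = period_breaches([0.79])               # average equals the one value
print(one_dip, sustained_drop, single_value)
```

With several datapoints in the hour, a single dip is absorbed by the average. With only one datapoint per hour, that one value decides the outcome.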

### 4. Automated Incident Response

The script ensures that the engineering team is the first to know if the model fails.

* **The Alert (SNS):** If the score drops below 0.8 and the alarm enters the `ALARM` state, a notification is published through the **Simple Notification Service (SNS)**. This can be routed to an email address, an SMS number, or a pager.
* **The Recovery Signal:** When the model is fixed or the score rises back above 0.8, a "Clear" notification is sent, allowing the team to stand down.
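
If the SNS topic is routed to an AWS Lambda function, the alert and the "Clear" signal arrive as the same kind of message with a different state. The sketch below assumes such a Lambda subscriber and reads the `AlarmName`, `NewStateValue`, and `NewStateReason` fields of the CloudWatch alarm notification; the function name `summarize_alarm_event` and the `ALERT`/`CLEAR` labels are hypothetical.

```python
import json


def summarize_alarm_event(event):
    """Return one readable line per SNS record in a Lambda event."""
    lines = []
    for record in event.get("Records", []):
        # The SNS message body is a JSON string written by CloudWatch.
        message = json.loads(record["Sns"]["Message"])
        state = message["NewStateValue"]  # "ALARM", "OK" or "INSUFFICIENT_DATA"
        label = {"ALARM": "ALERT", "OK": "CLEAR"}.get(state, "INFO")
        lines.append(
            f"{label}: {message['AlarmName']} is {state} - {message['NewStateReason']}"
        )
    return lines
```

A handler could forward these lines to a chat channel or ticketing system, so the team sees both the incident and its recovery in one place.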

### 5. Precision Identification

The alarm is laser-focused on a specific **SageMaker Endpoint**. By using the `EndpointName` as a dimension, the script ensures that you aren't getting general alerts; you are getting a specific diagnosis of exactly which model is underperforming.


In [None]:
# I. High-Priority Model Performance Monitoring
import boto3

cloudwatch = boto3.client('cloudwatch')
endpoint_name = "ueba-endpoint2026216-v2"
alarm_name = f"{endpoint_name}-F2-Low"
sns_topic_arn = "arn:aws:sns:us-east-1:805801076223:YourSNSTopic"  # <-- Replace with your SNS topic ARN

cloudwatch.put_metric_alarm(
    AlarmName=alarm_name,
    AlarmDescription=f'Alarm when F2 score drops below 0.8 for endpoint {endpoint_name}',
    ActionsEnabled=True,
    OKActions=[sns_topic_arn],
    AlarmActions=[sns_topic_arn],
    MetricName='f2',
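    # Confirm the namespace and dimension names against the metrics your monitoring
    # schedule publishes; an alarm on a metric that does not exist never gets data.
    Namespace='aws/sagemaker/Endpoints/model-quality',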
    Statistic='Average',
    Dimensions=[
        {'Name': 'EndpointName', 'Value': endpoint_name}
    ],
    Period=3600,          # 1 hour (monitoring runs hourly)
    EvaluationPeriods=1,
    Threshold=0.8,
    ComparisonOperator='LessThanThreshold'
)
print(f"✅ Alarm '{alarm_name}' created.")

✅ Alarm 'ueba-endpoint2026216-v2-F2-Low' created.


## II. Automated Data Quality and Feature Drift Monitoring

This script acts as a "digital health monitor" for the information being fed into your AI model. It focuses on **Feature Drift**, which is a critical signal that your model might be starting to fail because the real world has changed since the model was trained.

---

### 1. The Core Objective: Detecting "Stale" Data

AI models are like students who studied for an exam using a specific textbook. **Feature Drift** happens when the "exam" (real-world data) starts using questions or topics that weren't in the "textbook" (training data).

For example, if a model was trained to predict house prices based on interest rates of **3%**, but interest rates suddenly climb to **8%**, the "interest rate" feature has drifted. This script detects that gap before the model starts making wild, inaccurate guesses.
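
One simple way to put a number on that gap is to compare the distribution of a feature in training with its distribution in production. The sketch below computes the two-sample Kolmogorov-Smirnov statistic in plain Python. It is an illustration of the idea only and is not necessarily the calculation SageMaker performs; the function name `ks_statistic` and the sample interest rates are assumptions.

```python
from bisect import bisect_right


def ks_statistic(baseline, live):
    """Largest gap between the two empirical CDFs (0.0 = identical, 1.0 = no overlap)."""
    if not baseline or not live:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(baseline), sorted(live)
    gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # share of baseline values <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of live values <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


training_rates = [2.9, 3.0, 3.1, 3.0, 3.2]  # interest rates seen in training (%)
live_rates = [7.8, 8.0, 8.1, 7.9, 8.2]      # interest rates arriving in production (%)
drifted = ks_statistic(training_rates, live_rates)
unchanged = ks_statistic(training_rates, training_rates)
print(drifted, unchanged)
```

Because the live rates sit entirely above the training rates, the statistic reaches its maximum of 1.0, while comparing the training sample with itself gives 0.0.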

### 2. Monitoring the "Ingredients" of AI

Rather than looking at the model's final answer, this code inspects the "ingredients"—the individual data points (features) used to make a prediction.

* **Targeting the Endpoint:** It identifies the specific live model (the "Endpoint") to watch.
* **The Specific Metric:** It tracks a value called `feature_baseline_drift`. This is a mathematical score that calculates how different today's data looks compared to the data used during training.
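* **Sensitivity (Maximum Statistic):** The alarm uses the **Maximum** statistic, so it reacts to the largest drift value reported for this metric during the hour rather than an average that could hide a spike. A CloudWatch statistic aggregates the datapoints of one metric only. If drift is published as a separate metric per feature (like "user age" or "transaction amount"), each feature needs its own alarm.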
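
A CloudWatch alarm watches one metric, and `Statistic='Maximum'` picks the largest datapoint of that one metric within the period. If your monitoring schedule publishes drift as one metric per feature, a separate alarm per feature is one way to cover them all. The sketch below only builds the keyword arguments. The `feature_baseline_drift_<feature>` naming pattern, the feature names, and the helper `per_feature_drift_alarms` are assumptions to check against the metrics you actually see in CloudWatch.

```python
def per_feature_drift_alarms(endpoint_name, features, sns_topic_arn, threshold=1.0):
    """Build one put_metric_alarm argument set per feature (no AWS calls made here)."""
    alarms = []
    for feature in features:
        alarms.append({
            "AlarmName": f"{endpoint_name}-FeatureDrift-{feature}",
            # Assumed per-feature naming pattern; verify in the CloudWatch console.
            "MetricName": f"feature_baseline_drift_{feature}",
            # Namespace and dimension copied from this notebook's alarm.
            "Namespace": "aws/sagemaker/Endpoints/data-quality",
            "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
            "Statistic": "Maximum",
            "Period": 3600,
            "EvaluationPeriods": 1,
            "Threshold": threshold,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "ActionsEnabled": True,
            "AlarmActions": [sns_topic_arn],
            "OKActions": [sns_topic_arn],
        })
    return alarms


# Usage (requires AWS credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   for kwargs in per_feature_drift_alarms(endpoint_name, ["user_age"], sns_topic_arn):
#       cloudwatch.put_metric_alarm(**kwargs)
```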

### 3. The "Line in the Sand" (The Threshold)

The system establishes a clear boundary for what is considered "acceptable" data.

* **The Threshold (1.0):** This is the limit. The script treats a drift score of **1.0** or more as a significant violation of the baseline established during the model's setup. Treat the value as a starting point and tune it against the drift scores your own baseline produces.
* **Evaluation Frequency:** The system checks the data in **1-hour blocks**. If the largest drift value in a block reaches the limit, the alarm is raised when that block is evaluated.

### 4. Automated Notification and Recovery

Monitoring is only useful if someone is notified. This script connects the alarm to a notification service (SNS).

* **The Alert:** As soon as the drift exceeds the limit, a message is sent to a specific "Topic" (which can forward it to an email, a Slack channel, or a pager).
* **The "All Clear":** Importantly, the script also sends a notification when the data returns to normal (**OKActions**). This tells the team that the issue has been resolved and the model's data is healthy again.


In [None]:
# II. Automated Data Quality and Feature Drift Monitoring

import boto3

cloudwatch = boto3.client('cloudwatch')
endpoint_name = "ueba-endpoint2026216-v2"
alarm_name = f"{endpoint_name}-FeatureDrift"
sns_topic_arn = "arn:aws:sns:us-east-1:805801076223:YourSNSTopic"  # <-- Replace

cloudwatch.put_metric_alarm(
    AlarmName=alarm_name,
    AlarmDescription=f'Alarm when feature drift is detected for endpoint {endpoint_name}',
    ActionsEnabled=True,
    OKActions=[sns_topic_arn],
    AlarmActions=[sns_topic_arn],
    MetricName='feature_baseline_drift',  # may be published per feature; check your metrics
    Namespace='aws/sagemaker/Endpoints/data-quality',
    Statistic='Maximum',
    Dimensions=[
        {'Name': 'EndpointName', 'Value': endpoint_name}
    ],
    Period=3600,          # 1 hour (matching monitoring frequency)
    EvaluationPeriods=1,
    Threshold=1.0,
    ComparisonOperator='GreaterThanOrEqualToThreshold'
)
print(f"✅ Alarm '{alarm_name}' created.")

✅ Alarm 'ueba-endpoint2026216-v2-FeatureDrift' created.


## III. Automated Fairness and Bias Monitoring for AI Models

This script establishes a "safety corridor" for an Artificial Intelligence model to ensure it treats different groups of people fairly. It specifically monitors **Disparate Impact (DI)**, a standard metric used to detect bias in automated decision-making.

---

### 1. The Core Objective: Protecting Fairness

The goal of this code is to detect when the model develops a "bias" against a particular group (such as gender, age, or ethnicity) over time. Disparate Impact is the ratio of the favorable-outcome rate for one group to the rate for another, so a value of **1.0** means both groups receive favorable outcomes at the same rate. If the value shifts too far in either direction, it indicates the model is favoring one group over another.
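
As a minimal sketch of what the metric measures, Disparate Impact can be computed as the favorable-outcome rate of one group divided by the rate of everyone else. The function name `disparate_impact`, the group labels, and the sample predictions are illustrative assumptions and do not come from SageMaker.

```python
def disparate_impact(predictions, groups, facet):
    """Favorable-outcome rate of `facet` divided by the rate of all other groups.

    `predictions` holds 1 for a favorable outcome and 0 otherwise.
    """
    facet_preds = [p for p, g in zip(predictions, groups) if g == facet]
    other_preds = [p for p, g in zip(predictions, groups) if g != facet]
    if not facet_preds or not other_preds:
        raise ValueError("both groups need at least one prediction")
    facet_rate = sum(facet_preds) / len(facet_preds)
    other_rate = sum(other_preds) / len(other_preds)
    if other_rate == 0:
        raise ZeroDivisionError("reference group has no favorable outcomes")
    return facet_rate / other_rate


groups = ["A"] * 5 + ["B"] * 5
predictions = [1, 1, 0, 0, 0,   # group A: 2 of 5 favorable
               1, 1, 1, 1, 0]   # group B: 4 of 5 favorable
di = disparate_impact(predictions, groups, facet="A")
print(di)
```

Here group A receives favorable outcomes at half the rate of group B, so the ratio is 0.5, well below 1.0.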

### 2. The "Safety Corridor" Strategy

Unlike a simple alarm that only watches for a single problem, this script sets up two separate "guards" to catch bias in both directions:

* **The Lower Boundary (0.8):** If the fairness metric drops below 0.8, it suggests a group is receiving significantly fewer favorable outcomes than others.
* **The Upper Boundary (1.25):** If the metric rises above 1.25, it suggests the opposite—one group is being disproportionately favored compared to the rest.

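By setting these two limits, the script keeps the model inside a widely used "safe zone." The lower limit of 0.8 follows the "four-fifths rule" used in US employment-selection guidelines, and 1.25 is its reciprocal (1 / 0.8), so the corridor treats both directions of bias the same way.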
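
The corridor logic of the two alarms can be sketched as a single function. It mirrors the strict comparisons used in the script (`LessThanThreshold` at 0.8 and `GreaterThanThreshold` at 1.25), so a value exactly on a boundary does not count as a breach. The function name `di_corridor_status` and its return labels are hypothetical.

```python
def di_corridor_status(di, lower=0.8, upper=1.25):
    """Classify a Disparate Impact value against the two alarm thresholds."""
    if di < lower:
        return "DI-Low"   # the monitored group receives too few favorable outcomes
    if di > upper:
        return "DI-High"  # the monitored group is disproportionately favored
    return "OK"


for value in (0.5, 0.8, 1.0, 1.25, 1.3):
    print(value, di_corridor_status(value))
```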

### 3. Continuous Vigilance and Verification

The system doesn't just check once; it provides ongoing surveillance of the live model environment.

* **Hourly Checks:** The alarms evaluate the average DI reported in each one-hour period. Averaging smooths multiple datapoints within the hour, but it does not by itself make a small sample statistically significant.
* **Instant Reaction:** If the model's behavior falls outside the 0.8–1.25 range for even a single hour, the system immediately flags it as an issue.
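
Reacting to a single hour is the most sensitive setting. `put_metric_alarm` also accepts a `DatapointsToAlarm` parameter alongside `EvaluationPeriods`, which lets an alarm wait for "M out of N" breaching periods instead. The sketch below is a simplified simulation of that rule for the 0.8 to 1.25 corridor. It combines both bounds in one check and ignores missing-data handling; the function name `m_of_n_alarm` and the sample values are assumptions.

```python
def m_of_n_alarm(hourly_values, lower=0.8, upper=1.25,
                 datapoints_to_alarm=2, evaluation_periods=3):
    """True if at least M of the last N hourly values fall outside the corridor."""
    window = hourly_values[-evaluation_periods:]
    breaching = sum(1 for v in window if v < lower or v > upper)
    return breaching >= datapoints_to_alarm


one_bad_hour = m_of_n_alarm([1.0, 0.7, 1.0])   # 1 of 3 hours outside the corridor
two_bad_hours = m_of_n_alarm([0.7, 1.0, 1.3])  # 2 of 3 hours outside the corridor
as_in_script = m_of_n_alarm([1.0, 0.7], datapoints_to_alarm=1, evaluation_periods=1)
print(one_bad_hour, two_bad_hours, as_in_script)
```

The last call reproduces the script's behavior with `EvaluationPeriods=1`, where one bad hour is enough. The trade-off is between faster detection and fewer alerts caused by a single noisy hour.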

### 4. Incident Response and Recovery

The script automates the communication process so that human engineers can intervene as soon as a bias is detected.

* **The Alert System:** If the model becomes biased, an automated notification is sent to a specific team or dashboard (via the SNS notification service).
* **The Recovery Signal:** If the model is retuned or the data stabilizes, the system sends a second notification confirming that the model is back within the fair "safe zone."

### 5. Targeting the Model Source

The script is precision-targeted to a specific project. It identifies the live model by its endpoint name and reads only from the `model-bias` metric namespace. This ensures that the fairness monitoring is tied directly to the specific AI service being used for decision-making.


In [None]:
# III. Automated Fairness and Bias Monitoring for AI Models

import boto3

cloudwatch = boto3.client('cloudwatch')
endpoint_name = "ueba-endpoint2026216-v2"
alarm_name_low = f"{endpoint_name}-DI-Low"
alarm_name_high = f"{endpoint_name}-DI-High"
sns_topic_arn = "arn:aws:sns:us-east-1:805801076223:YourSNSTopic"  # <-- Replace

# Alarm for DI below 0.8
cloudwatch.put_metric_alarm(
    AlarmName=alarm_name_low,
    AlarmDescription=f'Alarm when Disparate Impact drops below 0.8 for {endpoint_name}',
    ActionsEnabled=True,
    OKActions=[sns_topic_arn],
    AlarmActions=[sns_topic_arn],
    MetricName='DI',          # or 'disparate_impact' – check your metrics
    Namespace='aws/sagemaker/Endpoints/model-bias',
    Statistic='Average',
    Dimensions=[
        {'Name': 'EndpointName', 'Value': endpoint_name}
        # You may also need to add dimensions for facet, label, etc.
    ],
    Period=3600,
    EvaluationPeriods=1,
    Threshold=0.8,
    ComparisonOperator='LessThanThreshold'
)

# Alarm for DI above 1.25
cloudwatch.put_metric_alarm(
    AlarmName=alarm_name_high,
    AlarmDescription=f'Alarm when Disparate Impact exceeds 1.25 for {endpoint_name}',
    ActionsEnabled=True,
    OKActions=[sns_topic_arn],
    AlarmActions=[sns_topic_arn],
    MetricName='DI',
    Namespace='aws/sagemaker/Endpoints/model-bias',
    Statistic='Average',
    Dimensions=[
        {'Name': 'EndpointName', 'Value': endpoint_name}
    ],
    Period=3600,
    EvaluationPeriods=1,
    Threshold=1.25,
    ComparisonOperator='GreaterThanThreshold'
)

print(f"✅ Bias alarms created: {alarm_name_low}, {alarm_name_high}")

✅ Bias alarms created: ueba-endpoint2026216-v2-DI-Low, ueba-endpoint2026216-v2-DI-High


## IV. Automated Monitoring for Machine Learning Model Reliability - Explainability Monitor

This script sets up a "digital security guard" for an Artificial Intelligence model. Specifically, it focuses on **Model Explainability**, which ensures that the reasons behind a model's decisions (like why it flagged a specific user as a security risk) remain consistent and logical over time.

---

### 1. The Core Objective: Detecting "Drift"

In machine learning, "drift" happens when a model starts behaving differently than it did during its initial training. This code specifically monitors **Explainability Drift**.

If the model suddenly starts prioritizing different data features to reach its conclusions—for example, if a fraud detection model suddenly stops looking at "transaction location" and starts looking only at "account age"—this is a red flag. This script is designed to catch that shift automatically.
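
One simple way to quantify such a shift is to compare how the total attribution is split across features before and after. The sketch below normalizes absolute attribution values into shares and measures how much of the total moved between features. It is an illustration only and is not the score SageMaker computes; the function name `attribution_shift` and the sample attribution values are assumptions.

```python
def attribution_shift(baseline, live):
    """Share of total attribution that moved between features (0.0 to 1.0)."""
    features = set(baseline) | set(live)

    def shares(attributions):
        total = sum(abs(v) for v in attributions.values())
        if total == 0:
            raise ValueError("attributions must not all be zero")
        return {f: abs(attributions.get(f, 0.0)) / total for f in features}

    p, q = shares(baseline), shares(live)
    # Half the L1 distance between the two share vectors.
    return 0.5 * sum(abs(p[f] - q[f]) for f in features)


baseline_attr = {"transaction_location": 0.6, "amount": 0.3, "account_age": 0.1}
live_attr = {"transaction_location": 0.05, "amount": 0.15, "account_age": 0.8}
shift = attribution_shift(baseline_attr, live_attr)
print(round(shift, 2))
```

In this example about 70% of the model's attribution has moved, mostly from "transaction location" to "account age", which matches the red flag described above.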

### 2. Setting the "Line in the Sand" (The Threshold)

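The system is configured to watch a metric that measures this shift and to compare it against a threshold. In the script, both the metric name (`shap_drift`) and the threshold (`1.0`) are placeholders to replace with the values from your own explainability monitor.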

* **Monitoring Frequency:** The system checks the health of the model every hour.
* **Sensitivity:** It is set to be highly responsive. If even a single one-hour window shows that the model’s internal logic has shifted beyond the allowed limit, the alarm is triggered immediately.

### 3. The Notification Pipeline (SNS)

Monitoring is useless if no one knows there is a problem. The script connects the alarm to a notification service.

* **When things go wrong:** If the "drift" exceeds the limit, an alert is broadcast to a specific communication channel (like an email list or a DevOps chat).
* **When things are fixed:** The system also sends a "Clear" signal once the model's logic returns to its normal, expected baseline.

### 4. Target Identification

The script isn't just watching everything; it is laser-focused on a specific **SageMaker Endpoint** (the live environment where the AI lives). It identifies the model by its endpoint name and reads from the specific metric namespace where the explainability data is published, ensuring that the alarm only reacts to the relevant machine learning project.


In [None]:
# IV. Automated Monitoring for Machine Learning Model Reliability - Explainability Monitor

import boto3

cloudwatch = boto3.client('cloudwatch')
endpoint_name = "ueba-endpoint2026216-v2"
alarm_name = f"{endpoint_name}-ExplainabilityDrift"
sns_topic_arn = "arn:aws:sns:us-east-1:805801076223:YourSNSTopic"  # <-- Replace

cloudwatch.put_metric_alarm(
    AlarmName=alarm_name,
    AlarmDescription=f'Alarm when explainability drift is detected for {endpoint_name}',
    ActionsEnabled=True,
    OKActions=[sns_topic_arn],
    AlarmActions=[sns_topic_arn],
    MetricName='shap_drift',          # Replace with actual metric name
    Namespace='aws/sagemaker/Endpoints/model-explainability',
    Statistic='Average',
    Dimensions=[
        {'Name': 'EndpointName', 'Value': endpoint_name}
    ],
    Period=3600,
    EvaluationPeriods=1,
    Threshold=1.0,                     # Adjust based on your baseline
    ComparisonOperator='GreaterThanOrEqualToThreshold'
)
print(f"✅ Alarm '{alarm_name}' created.")

✅ Alarm 'ueba-endpoint2026216-v2-ExplainabilityDrift' created.
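
Every alarm in this notebook is named with the endpoint name as a prefix, so their states can be checked together after creation. The sketch below uses the CloudWatch `describe_alarms` call with `AlarmNamePrefix`. The helper name `alarm_states` is hypothetical, and pagination is ignored on the assumption that only a handful of alarms match.

```python
def alarm_states(cloudwatch, endpoint_name):
    """Map each alarm name starting with the endpoint name to its current state.

    `cloudwatch` is a CloudWatch client such as boto3.client("cloudwatch").
    Only the first page of results is read.
    """
    response = cloudwatch.describe_alarms(AlarmNamePrefix=endpoint_name)
    return {a["AlarmName"]: a["StateValue"] for a in response["MetricAlarms"]}


# Usage (requires AWS credentials):
#   import boto3
#   states = alarm_states(boto3.client("cloudwatch"), "ueba-endpoint2026216-v2")
#   for name, state in sorted(states.items()):
#       print(name, state)
```

An alarm that stays in `INSUFFICIENT_DATA` long after the monitoring schedule has run is a hint that its namespace, metric name, or dimensions do not match any published metric.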
