### Detect Data Drift in ML Models
**Objective**: Monitor and detect changes in data distributions that impact ML model performance.

**Task**: Feature Correlation Drift

**Steps**:
1. Compute the correlation matrix of features in your training dataset.
2. Compute the correlation matrix of the same features in your production data.
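3. Compare the two matrices, for example by taking the element-wise absolute difference, and flag feature pairs whose correlation changed by more than a chosen threshold. Repeat the comparison over time to catch gradual deviations.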
4. Investigate any significant changes in correlation as they may indicate issues in the data collection process or model assumptions.
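
The steps above can be sketched as one reusable helper. This is a minimal illustration, not a definitive implementation: the function name `correlation_drift`, its `threshold` default of 0.2, and the choice to report only the upper triangle of the matrix are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def correlation_drift(train_df, prod_df, threshold=0.2):
    """Return feature pairs whose Pearson correlation changed by more than `threshold`."""
    # Steps 1-2: correlation matrices over the columns both frames share.
    features = [c for c in train_df.columns if c in prod_df.columns]
    train_corr = train_df[features].corr()
    prod_corr = prod_df[features].corr()

    # Step 3: element-wise absolute change in correlation.
    diff = (train_corr - prod_corr).abs()

    # Keep the strict upper triangle so each pair appears once and the
    # diagonal (a feature compared with itself) is ignored.
    upper = np.triu(np.ones(diff.shape, dtype=bool), k=1)
    pairs = diff.where(upper).stack()

    # Step 4: the pairs to investigate, largest change first.
    return pairs[pairs > threshold].sort_values(ascending=False)


# Hypothetical data: "b" rises with "a" in training but falls in production.
train = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 10]})
prod = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 8, 6, 4, 2]})
print(correlation_drift(train, prod))
```

The result is a Series indexed by feature pair, so an empty result means no pair crossed the threshold.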

In [None]:
# write your code from here
import pandas as pd
import numpy as np
train_data = {
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [5, 4, 3, 2, 1],
    'feature3': [2, 3, 4, 5, 6]
}
train_df = pd.DataFrame(train_data)
train_corr_matrix = train_df.corr()
print("Training Correlation Matrix:")
print(train_corr_matrix)
prod_data = {
    'feature1': [1, 2, 3, 4, 6],  # Last value changed from 5 to 6
    'feature2': [5, 4, 3, 2, 0],  # Last value changed from 1 to 0 (still equals 6 - feature1)
    'feature3': [2, 3, 4, 5, 7]   # Last value changed from 6 to 7 (still equals feature1 + 1)
}
prod_df = pd.DataFrame(prod_data)
prod_corr_matrix = prod_df.corr()
print("\nProduction Correlation Matrix:")
print(prod_corr_matrix)
corr_diff = (train_corr_matrix - prod_corr_matrix).abs()
print("\nCorrelation Matrix Difference:")
print(corr_diff)
threshold = 0.2
# Boolean-mask indexing keeps the full matrix shape (NaN where the mask is False),
# so `.empty` would never be True. Stack into a Series and filter instead.
diff_pairs = corr_diff.stack()
significant_changes = diff_pairs[diff_pairs > threshold]
if not significant_changes.empty:
    print("\nSignificant Changes in Correlation:")
    print(significant_changes)
else:
    print("\nNo significant changes in correlation detected.")



Training Correlation Matrix:
          feature1  feature2  feature3
feature1       1.0      -1.0       1.0
feature2      -1.0       1.0      -1.0
feature3       1.0      -1.0       1.0

Production Correlation Matrix:
          feature1  feature2  feature3
feature1       1.0      -1.0       1.0
feature2      -1.0       1.0      -1.0
feature3       1.0      -1.0       1.0

Correlation Matrix Difference:
          feature1  feature2  feature3
feature1       0.0       0.0       0.0
feature2       0.0       0.0       0.0
feature3       0.0       0.0       0.0

No significant changes in correlation detected.
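
In the sample data above, every production feature is still an exact linear function of feature1, so all correlations stay at ±1 and the difference matrix is zero. The sketch below swaps in a hypothetical production sample (`drifted_df`, invented for illustration) in which feature2 no longer falls as feature1 rises, to show what a flagged pair might look like. The 0.2 threshold is reused from the code above.

```python
import pandas as pd

train_df = pd.DataFrame({
    "feature1": [1, 2, 3, 4, 5],
    "feature2": [5, 4, 3, 2, 1],
    "feature3": [2, 3, 4, 5, 6],
})

# Hypothetical production sample: feature2 has lost its inverse
# relationship with feature1, while feature3 still tracks feature1.
drifted_df = pd.DataFrame({
    "feature1": [1, 2, 3, 4, 5],
    "feature2": [1, 5, 2, 4, 3],
    "feature3": [2, 3, 4, 5, 6],
})

# Absolute change in each pairwise correlation.
corr_diff = (train_df.corr() - drifted_df.corr()).abs()

# Flatten to (row feature, column feature) pairs and keep those above the threshold.
flagged = corr_diff.stack()
flagged = flagged[flagged > 0.2]
print(flagged)
```

Because the matrix is symmetric, each flagged pair appears twice here, once in each order. Only pairs involving feature2 are flagged, since the feature1/feature3 relationship is unchanged.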
