### Machine Learning for Data Quality Prediction
**Description**: Train a classifier to predict whether a dataset has a quality issue, using summary metrics such as the missing-value ratio and duplicate count as features.

**Steps**:
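1. Create a mock dataset with four feature columns and a binary label (`quality_issue`: 0 = good, 1 = issue).
2. Split the data into training and test sets and train a random forest classifier.
3. Evaluate the model with a confusion matrix, a classification report, and accuracy.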

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

# 1. Create a mock dataset
np.random.seed(42)
data = {
    'missing_values_ratio': np.random.rand(100),         # Fraction of missing values
    'duplicates_count': np.random.randint(0, 10, 100),   # Count of duplicate rows
    'outlier_score': np.random.rand(100),                # Simulated outlier score
    'inconsistent_types': np.random.randint(0, 5, 100),  # Count of type mismatches
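    # Note: labels are drawn independently of the features above,
    # so there is no real signal for the model to learn.
    'quality_issue': np.random.choice([0, 1], 100, p=[0.7, 0.3])  # 0: good, 1: issue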
}

df = pd.DataFrame(data)

# Features and label
X = df.drop('quality_issue', axis=1)
y = df['quality_issue']

# 2. Train a machine learning model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# 3. Evaluate the model
y_pred = clf.predict(X_test)

print("=== Confusion Matrix ===")
print(confusion_matrix(y_test, y_pred))

print("\n=== Classification Report ===")
print(classification_report(y_test, y_pred))

print(f"=== Accuracy: {accuracy_score(y_test, y_pred):.2f} ===")


=== Confusion Matrix ===
[[9 4]
 [7 0]]

=== Classification Report ===
              precision    recall  f1-score   support

           0       0.56      0.69      0.62        13
           1       0.00      0.00      0.00         7

    accuracy                           0.45        20
   macro avg       0.28      0.35      0.31        20
weighted avg       0.37      0.45      0.40        20

=== Accuracy: 0.45 ===
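
The 0.45 accuracy is what one would expect here rather than a modelling bug: the mock labels are sampled independently of the features, so the classifier has nothing to learn and none of the 7 true issues in the test set are caught. The sketch below is one possible variation, not part of the original exercise. It derives the label from the features through a made-up weighted rule, stratifies the split so both classes appear in the test set, and compares the forest against a majority-class baseline. The sample size `n`, the weights in `risk`, the noise level, and the use of `DummyClassifier` are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500  # hypothetical sample size, larger than the 100 rows used above

# Same four feature columns as the mock dataset above
df = pd.DataFrame({
    "missing_values_ratio": rng.random(n),
    "duplicates_count": rng.integers(0, 10, n),
    "outlier_score": rng.random(n),
    "inconsistent_types": rng.integers(0, 5, n),
})

# Assumed labelling rule (illustrative only): a weighted mix of the metrics
# plus a little noise; the riskiest 30% of rows are marked as issues.
risk = (
    0.5 * df["missing_values_ratio"]
    + 0.3 * df["outlier_score"]
    + 0.02 * df["duplicates_count"]
    + 0.05 * df["inconsistent_types"]
    + rng.normal(0, 0.05, n)
)
df["quality_issue"] = (risk > risk.quantile(0.7)).astype(int)

X = df.drop("quality_issue", axis=1)
y = df["quality_issue"]

# stratify=y keeps the 70/30 class balance in both train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Baseline that always predicts the most frequent class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
model_acc = accuracy_score(y_test, clf.predict(X_test))

print(f"Baseline accuracy: {baseline_acc:.2f}")
print(f"Random forest accuracy: {model_acc:.2f}")
```

Comparing against the majority-class baseline makes it easier to tell whether the model has learned anything: on the original random labels, always predicting class 0 would have scored 13/20 = 0.65, which already beats the forest's 0.45.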
