### Task 1: Automated Data Profiling

**Steps**:
1. Using Pandas-Profiling
    - Generate a profile report for an existing CSV file.
    - Customize the profile report to include correlations.
    - Profile a specific subset of columns.
2. Using Great Expectations
    - Create a basic expectation suite for your data.
    - Validate data against an expectation suite.
    - Add multiple expectations to a suite.

In [2]:
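import pandas as pd
import great_expectations as ge

# pandas-profiling was renamed to ydata-profiling; fall back to the old name if needed.
try:
    from ydata_profiling import ProfileReport
except ImportError:
    from pandas_profiling import ProfileReport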

# --- Step 1: Load CSV File with Error Handling ---
def load_data(file_path):
    try:
        df = pd.read_csv(file_path)
        print("✅ Data loaded successfully.")
        return df
    except FileNotFoundError:
        print("❌ Error: File not found.")
    except pd.errors.ParserError:
        print("❌ Error: Parsing error. Check CSV formatting.")
    except Exception as e:
        print(f"❌ Unexpected error: {e}")
    return None

# --- Step 2: Generate Pandas Profiling Report ---
def generate_profile(df, columns=None, report_file="profile_report.html"):
    try:
        if columns:
            df = df[columns]
        # Each correlation method is configured with a {"calculate": bool} mapping.
        profile = ProfileReport(df, title="Data Profile Report", correlations={"pearson": {"calculate": True}})
        profile.to_file(report_file)
        print(f"📊 Profile report saved to {report_file}")
    except Exception as e:
        print(f"❌ Failed to generate profile: {e}")

# --- Step 3: Validate with Great Expectations ---
def validate_with_expectations(df):
    try:
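        # Legacy API: ge.from_pandas was removed in newer Great Expectations
        # releases, so this requires an older version that still provides it.
        df_ge = ge.from_pandas(df)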

        # Add expectations
        df_ge.expect_column_to_exist("age")
        df_ge.expect_column_values_to_not_be_null("age")
        df_ge.expect_column_values_to_be_between("age", min_value=0, max_value=100)

        # Run validation
        results = df_ge.validate()
        print("✅ Validation Results:")
        print(results)

    except Exception as e:
        print(f"❌ Error in Great Expectations validation: {e}")

# --- MAIN ---
if __name__ == "__main__":
    filepath = "data.csv"  # Replace with your actual CSV path
    columns_to_profile = ["age", "salary"]  # Optional: Specify a subset of columns

    df = load_data(filepath)
    if df is not None:
        generate_profile(df, columns=columns_to_profile)
        validate_with_expectations(df)



### Task 2: Real-time Monitoring of Data Quality

**Steps**:
1. Setting up Alerts for Quality Drops
    - Use the logging library to set up a basic alert on failed expectations.
    - Implement alerts using email notifications.
    - Use a dashboard such as Grafana for visual alerts.
        - Note: This example assumes integration with a monitoring system.
        - Alert setup involves creating a data source and an alert rule in Grafana.

In [None]:
# Write your code from here

### Task 3: Using AI for Data Quality Monitoring
**Steps**:
1. Basic AI Models for Monitoring
    - Train a simple anomaly detection model using Isolation Forest.
    - Use simple, custom function-based AI logic for outlier detection.
    - Create a monitoring function that uses a pre-trained machine learning model.
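
The Isolation Forest and monitoring-function steps above might look like the following. This is a sketch under stated assumptions: it uses scikit-learn's `IsolationForest`, trains on synthetic data generated only for illustration, and the `monitor_batch` name, the `contamination` value, and the `max_anomaly_rate` threshold are hypothetical choices rather than recommended settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic, illustrative training data: 200 "normal" ages centred on 40.
rng = np.random.default_rng(42)
train = rng.normal(loc=40, scale=5, size=(200, 1))

# Train the anomaly detector; contamination is an assumed share of outliers.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(train)


def monitor_batch(model, values, max_anomaly_rate=0.05):
    """Score a new batch with an already-fitted model and flag quality drops."""
    X = np.asarray(values, dtype=float).reshape(-1, 1)
    labels = model.predict(X)  # 1 = inlier, -1 = anomaly
    anomaly_rate = float(np.mean(labels == -1))
    return {
        "anomaly_rate": anomaly_rate,
        "anomalies": X[labels == -1].ravel().tolist(),
        "alert": anomaly_rate > max_anomaly_rate,
    }


# A new batch with one implausible age mixed in.
report = monitor_batch(model, [38, 41, 44, 39, 500])
print(report)
```

Because `monitor_batch` only calls `predict`, it works with any fitted model passed in, including one trained earlier and loaded from disk.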

In [None]:
# Write your code from here
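
For the custom-function step in Task 3, a rule-based check needs no trained model. The sketch below assumes the interquartile range (IQR) rule as the outlier logic, which is one common choice and not the only one; the function name `iqr_outliers`, the multiplier `k`, and the sample data are illustrative.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


ages = [23, 25, 27, 29, 31, 33, 35, 37, 39, 240]  # illustrative data
outliers = iqr_outliers(ages)
print(outliers)
```

A larger `k` makes the rule more tolerant, so fewer values are reported as outliers.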