### Task 1: Detecting Missing Values during Data Ingestion
**Description**: You have a CSV file with missing values in some columns. Write a Python script to detect and report missing values during the ingestion process.

**Steps**:
1. Load data
2. Check for missing values
3. Report missing values

In [2]:
import pandas as pd

def load_data(file_path):
    """Load CSV data with error handling."""
    try:
        df = pd.read_csv(file_path)
        print("✅ Data loaded successfully.")
        return df
    except FileNotFoundError:
        print(f"❌ Error: File not found at path: {file_path}")
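    except (pd.errors.EmptyDataError, pd.errors.ParserError) as e:
        # An empty file raises EmptyDataError; malformed rows raise ParserError.
        print(f"❌ Error: Could not parse CSV file: {e}")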
    except Exception as e:
        print(f"❌ Unexpected error: {e}")
    return None

def detect_missing_values(df):
    """Detect and report missing values."""
    missing = df.isnull().sum()
    missing = missing[missing > 0]
    if missing.empty:
        print("✅ No missing values detected.")
    else:
        print("⚠️ Missing values found:")
        print(missing)

def validate_data_types(df, expected_types):
    """Validate data types against expected schema."""
    for col, expected_type in expected_types.items():
        if col not in df.columns:
            print(f"❌ Missing column: {col}")
        elif not pd.api.types.is_dtype_equal(df[col].dtype, expected_type):
            print(f"⚠️ Column '{col}' has type {df[col].dtype}, expected {expected_type}")
        else:
            print(f"✅ Column '{col}' type is valid.")

def remove_duplicates(df):
    """Remove duplicate rows and report count."""
    initial_count = len(df)
    df = df.drop_duplicates()
    removed = initial_count - len(df)
    print(f"✅ Removed {removed} duplicate rows.")
    return df

# === Main Execution ===
if __name__ == "__main__":
    file_path = "data.csv"  # Update this to the path of your CSV file
    expected_types = {
        "id": "int64",
        "name": "object",
        "age": "float64"
    }

    df = load_data(file_path)
    if df is not None:
        detect_missing_values(df)
        validate_data_types(df, expected_types)
        df = remove_duplicates(df)


❌ Error: File not found at path: data.csv
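
Because `data.csv` was not found in the run above, the missing-value check never executed. The sketch below shows the same idea on a small in-memory CSV so it runs without any file on disk. The sample rows and the helper name `summarize_missing` are assumptions made for illustration and are not part of the original exercise; the percentage column is an optional extra on top of the counts that `detect_missing_values` reports.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for data.csv.
# Empty fields are read by pandas as NaN (missing).
SAMPLE_CSV = """id,name,age
1,Alice,34
2,,29
3,Carol,
4,Dave,41
"""


def summarize_missing(df):
    """Return a per-column table of missing-value counts and percentages."""
    counts = df.isnull().sum()
    report = pd.DataFrame({
        "missing_count": counts,
        "missing_pct": (counts / len(df) * 100).round(2),
    })
    # Keep only the columns that actually contain missing values.
    return report[report["missing_count"] > 0]


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = summarize_missing(df)
print(report)
```

Returning the report as a DataFrame, rather than only printing it, lets a later ingestion step decide what to do with the result, for example rejecting a file whose missing percentage exceeds some threshold.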


### Task 2: Validate Data Types during Extraction
**Description**: You have a JSON file that should have specific data types for each field. Write a script to validate if the data types match the expected schema.

**Steps**:
1. Define expected schema
2. Validate data types

In [3]:
# Write your code here
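
One possible way to approach the two steps above is sketched here using only the standard-library `json` module. The schema, the sample records, and the helper names `type_name` and `validate_record` are all hypothetical; a real file would be read with `json.load` instead of the inline string, and the expected types would come from your own data contract.

```python
import json

# Step 1: define the expected schema (hypothetical field names and types).
# A tuple means "any of these types is acceptable".
EXPECTED_SCHEMA = {
    "id": int,
    "name": str,
    "age": (int, float),
}

# Hypothetical sample input standing in for the contents of a JSON file.
SAMPLE_JSON = """
[
  {"id": 1, "name": "Alice", "age": 34.5},
  {"id": "2", "name": "Bob", "age": 29},
  {"id": 3, "name": null}
]
"""


def type_name(expected):
    """Format an expected type (or tuple of types) for error messages."""
    types = expected if isinstance(expected, tuple) else (expected,)
    return " or ".join(t.__name__ for t in types)


def validate_record(record, schema):
    """Return a list of type errors for one JSON object (empty if valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field '{field}'")
            continue
        value = record[field]
        types = expected if isinstance(expected, tuple) else (expected,)
        # bool is a subclass of int in Python, so reject True/False
        # unless the schema explicitly allows bool.
        if isinstance(value, bool) and bool not in types:
            valid = False
        else:
            valid = isinstance(value, types)
        if not valid:
            errors.append(
                f"field '{field}' has type {type(value).__name__}, "
                f"expected {type_name(expected)}"
            )
    return errors


# Step 2: validate every record and keep only the ones with problems.
records = json.loads(SAMPLE_JSON)
problems = {}
for index, record in enumerate(records):
    errors = validate_record(record, EXPECTED_SCHEMA)
    if errors:
        problems[index] = errors

for index, errors in problems.items():
    for error in errors:
        print(f"Record {index}: {error}")
```

Checking the parsed Python objects directly, instead of loading them into a DataFrame first, avoids pandas silently coercing a mixed column to `object` or `float64` before the check runs.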

### Task 3: Remove Duplicate Records in Data
**Description**: You have a dataset with duplicate entries. Write a Python script to find and remove duplicate records using Pandas.

**Steps**:
1. Find duplicate records
2. Remove duplicates
3. Report results

In [4]:
# Write your code here
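
The three steps above could be sketched as follows. The DataFrame contents are made up for illustration, and treating two rows as duplicates only when every column matches is an assumption; pass `subset=[...]` to `duplicated` and `drop_duplicates` if only certain key columns should be compared.

```python
import pandas as pd

# Hypothetical dataset: rows 1 and 3 are exact copies of rows 0 and 2.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3],
    "name": ["Alice", "Alice", "Bob", "Bob", "Carol"],
})

# Step 1: find duplicate records.
# keep=False flags every member of a duplicate group, not just the repeats.
duplicate_rows = df[df.duplicated(keep=False)]

# Step 2: remove duplicates, keeping the first occurrence of each row.
deduped = df.drop_duplicates(keep="first").reset_index(drop=True)

# Step 3: report results.
removed = len(df) - len(deduped)
print(f"Rows before: {len(df)}")
print(f"Rows involved in duplication: {len(duplicate_rows)}")
print(f"Rows removed: {removed}")
print(f"Rows after: {len(deduped)}")
```

Inspecting `duplicate_rows` before dropping anything is a cheap safeguard: it shows exactly which records will be collapsed, which helps catch cases where rows look identical only because a distinguishing column was lost upstream.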