### Task 1: Measure Data Accuracy using a Trusted Source

**Description**: You have two datasets of product prices: `company_prices.csv` and
`trusted_prices.csv`. Check whether the prices in `company_prices.csv` match the prices in
`trusted_prices.csv`. Assume both files have "product_id" and "price" columns.

In [1]:
# Write your code from here

import pandas as pd
import os

def measure_price_accuracy(company_file, trusted_file):
    """
    Compares product prices between company data and trusted source data.

    Parameters:
        company_file (str): Path to the company's product prices CSV.
        trusted_file (str): Path to the trusted product prices CSV.

    Returns:
        dict: Accuracy metrics and mismatched records.
    """
    # Check if files exist
    if not os.path.exists(company_file):
        raise FileNotFoundError(f"Company file not found: {company_file}")
    if not os.path.exists(trusted_file):
        raise FileNotFoundError(f"Trusted file not found: {trusted_file}")

    # Read both datasets
    try:
        company_df = pd.read_csv(company_file)
        trusted_df = pd.read_csv(trusted_file)
    except Exception as e:
        raise ValueError(f"Error reading files: {e}") from e

    # Validate required columns
    required_columns = ['product_id', 'price']
    for df, name in [(company_df, "company"), (trusted_df, "trusted")]:
        missing = [col for col in required_columns if col not in df.columns]
        if missing:
            raise ValueError(f"Missing columns in {name} data: {missing}")

    # Merge datasets on product_id
    merged = pd.merge(company_df, trusted_df, on='product_id', suffixes=('_company', '_trusted'))

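    # Compare prices (only products present in both files are checked)
    merged['match'] = merged['price_company'] == merged['price_trusted']
    total_checked = len(merged)
    total_matches = int(merged['match'].sum())
    # Guard against division by zero when the files share no product_id
    accuracy_rate = (total_matches / total_checked) * 100 if total_checked else 0.0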

    # Get mismatched records
    mismatches = merged[~merged['match']]

    return {
        'total_checked': total_checked,
        'total_matches': total_matches,
        'accuracy_rate': accuracy_rate,
        'mismatches': mismatches
    }

# Example usage:
if __name__ == "__main__":
    try:
        results = measure_price_accuracy('company_prices.csv', 'trusted_prices.csv')
        print(f"Total Products Checked: {results['total_checked']}")
        print(f"Matches: {results['total_matches']}")
        print(f"Accuracy Rate: {results['accuracy_rate']:.2f}%")
        if not results['mismatches'].empty:
            print("\nMismatched Records:")
            print(results['mismatches'][['product_id', 'price_company', 'price_trusted']])
    except Exception as e:
        print(f"Error: {e}")


Error: Company file not found: company_prices.csv


### Task 2: Detect Incorrect Values

**Description**: In `company_prices.csv`, detect any negative price values, which are invalid for prices.

In [2]:
# Write your code from here
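
One possible approach to this task, offered as a minimal sketch rather than a reference solution: filter the rows whose price is below zero. The helper name `find_negative_prices` and the in-memory sample data are assumptions made for illustration; only the "price" column name comes from the task description.

```python
import pandas as pd

def find_negative_prices(df, price_column="price"):
    """Return the rows of df whose price is below zero."""
    # Coerce to numeric so stray text values become NaN instead of raising
    prices = pd.to_numeric(df[price_column], errors="coerce")
    return df[prices < 0]

# Illustrative in-memory data standing in for company_prices.csv
sample = pd.DataFrame({
    "product_id": [101, 102, 103, 104],
    "price": [19.99, -5.00, 0.00, -0.01],
})

negative_rows = find_negative_prices(sample)
print(f"Negative prices found: {len(negative_rows)}")
print(negative_rows)
```

To run it against the real file, pass `pd.read_csv("company_prices.csv")` in place of `sample`. A price of exactly zero is not flagged here; whether zero counts as invalid is a separate decision.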

### Task 3: Check Missing Data Rates

**Description**: Calculate the percentage of missing values in `customer_data.csv`.

In [3]:
# Write your code from here
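
A minimal sketch of one way to compute missing-data rates, both per column and for the dataset as a whole. The helper name `missing_value_rates`, the `customer_id` column, and the sample rows are hypothetical; the task does not specify the actual layout of `customer_data.csv`.

```python
import pandas as pd

def missing_value_rates(df):
    """Return (per-column missing %, overall missing %) for df."""
    # isna() marks missing cells; the column mean of booleans is the missing share
    per_column = df.isna().mean() * 100
    total_cells = df.size
    # Guard against an empty DataFrame to avoid dividing by zero
    overall = float(df.isna().sum().sum() / total_cells * 100) if total_cells else 0.0
    return per_column, overall

# Illustrative in-memory data standing in for customer_data.csv
sample = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "email": ["a@example.com", None, "c@example.com", None],
    "phone number": ["555-0100", "555-0101", None, None],
})

per_column, overall = missing_value_rates(sample)
print(per_column.round(2))
print(f"Overall missing rate: {overall:.2f}%")
```

With the real file, `pd.read_csv("customer_data.csv")` already converts empty fields to `NaN` by default, so the same function applies. Fields that contain only spaces are not treated as missing unless they are cleaned first.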

### Task 4: Handling Partially Available Records

**Description**: In `customer_data.csv`, identify records with missing "email" or "phone number" and decide whether to drop or fill them.

In [4]:
# Write your code from here
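
The sketch below shows one possible policy, not the only reasonable one: records missing both contact fields are dropped, since the customer cannot be reached, while records missing just one field are kept and the gap is filled with the placeholder "unknown". The helper name `handle_partial_contacts`, the placeholder value, the `customer_id` column, and the sample rows are all assumptions; the column names "email" and "phone number" follow the task description and may differ in the real file.

```python
import pandas as pd

def handle_partial_contacts(df, email_col="email", phone_col="phone number"):
    """Return (partial records, cleaned DataFrame) under a drop-or-fill policy."""
    missing_email = df[email_col].isna()
    missing_phone = df[phone_col].isna()

    # Records with at least one contact field missing
    partial = df[missing_email | missing_phone]

    # Drop records with no contact information at all
    cleaned = df[~(missing_email & missing_phone)].copy()
    # Fill the remaining single gaps with an explicit placeholder
    cleaned[[email_col, phone_col]] = cleaned[[email_col, phone_col]].fillna("unknown")
    return partial, cleaned

# Illustrative in-memory data standing in for customer_data.csv
sample = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "email": ["a@example.com", None, "c@example.com", None],
    "phone number": ["555-0100", "555-0101", None, None],
})

partial, cleaned = handle_partial_contacts(sample)
print(f"Records with missing email or phone: {len(partial)}")
print(cleaned)
```

Filling with a placeholder keeps the row count stable for downstream joins, at the cost of a value that must be excluded from any contact campaign. If partially reachable customers are not useful, `df.dropna(subset=["email", "phone number"])` drops them all instead.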