# Measuring Data Accuracy

**Activity Overview**: Assess data accuracy by comparing a dataset against a trusted source and flagging values that are incorrect or do not match.

## Title: Product Pricing

**Task**: Compare a dataset of product prices with the latest official price list.

**Steps**:
1. Obtain the latest product price list from the official company website.
2. Compare the dataset's product prices against the official list, matching records by `product_id`.
3. Identify any discrepancies and mark them for correction.

```python
import pandas as pd

# Load the dataset under review and the official reference list
company_prices = pd.read_csv("company_prices.csv")
official_prices = pd.read_csv("official_prices.csv")

# Merge on product_id to compare prices side by side.
# A left join keeps every product in the dataset, even if it has no official price.
merged = pd.merge(
    company_prices,
    official_prices,
    on="product_id",
    how="left",
    suffixes=("_company", "_official"),
)

# Flag rows where the prices differ or the official price is missing
discrepancies = merged[
    (merged["price_company"] != merged["price_official"])
    | (merged["price_official"].isnull())
]

print(f"Found {len(discrepancies)} discrepancies in product pricing:\n")
print(discrepancies)

# Optional: save the discrepancies so they can be corrected
discrepancies.to_csv("price_discrepancies.csv", index=False)
```


```text
Found 3 discrepancies in product pricing:

   product_id  price_company  price_official
1         102          29.99           27.99
3         104          10.00           12.00
4         105          25.00             NaN
```
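
The output above mixes two different problems: prices that disagree (102, 104) and a product with no official price (105). A possible refinement, sketched below, labels each row with the kind of discrepancy and reports a single accuracy rate. The helper `classify_prices`, the `status` labels, the half-cent tolerance `tol`, and the in-memory sample data are all assumptions made for illustration; only the rows for products 102, 104, and 105 are taken from the output above, and the rest are made up. It also uses an outer join, which differs from the left join in the original cell, so that products present only in the official list show up too.

```python
import numpy as np
import pandas as pd


def classify_prices(company, official, tol=0.005):
    """Label each product: match, price_mismatch, missing_official, or missing_company.

    Hypothetical helper; assumes both frames have `product_id` and `price` columns.
    """
    merged = company.merge(
        official,
        on="product_id",
        how="outer",  # keep products that appear in only one of the two lists
        suffixes=("_company", "_official"),
    )
    missing_official = merged["price_official"].isna()
    missing_company = merged["price_company"].isna()
    # Compare with an absolute tolerance so float rounding alone is not flagged.
    # np.isclose returns False whenever either side is NaN.
    close = np.isclose(
        merged["price_company"], merged["price_official"], rtol=0, atol=tol
    )

    merged["status"] = "price_mismatch"
    merged.loc[close, "status"] = "match"
    merged.loc[missing_company, "status"] = "missing_company"
    merged.loc[missing_official, "status"] = "missing_official"
    return merged


# Illustrative data: rows 102, 104, 105 mirror the output above; 101, 103, 106 are invented.
company_prices = pd.DataFrame(
    {"product_id": [101, 102, 103, 104, 105],
     "price": [19.99, 29.99, 5.50, 10.00, 25.00]}
)
official_prices = pd.DataFrame(
    {"product_id": [101, 102, 103, 104, 106],
     "price": [19.99, 27.99, 5.50, 12.00, 8.00]}
)

report = classify_prices(company_prices, official_prices)
print(report)

# Accuracy rate: share of the dataset's own records whose price matches the official one.
in_dataset = report["price_company"].notna()
accuracy = (report.loc[in_dataset, "status"] == "match").mean()
print(f"Accuracy: {accuracy:.0%}")
```

Separating `missing_official` from `price_mismatch` matters for step 3: a mismatch can usually be corrected from the official list, while a missing official price needs a follow-up with whoever maintains that list. Whether missing entries should count against the accuracy rate is a judgment call; here they do.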
