# Measuring Data Accuracy

**Activity Overview**: Assess the accuracy of a dataset by comparing it against a trusted source and flagging values that are incorrect or do not match.

## Title: Product Pricing

**Task**: Compare a dataset of product prices with the latest official price list.

**Steps**:
1. Obtain the latest product price list from the official company website.
2. Compare the dataset's product prices against the verified list.
3. Identify any discrepancies and mark them for correction.
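
Step 2 presumes that both lists cover the same products. The following is a minimal sketch of a coverage check that could run before any prices are compared. The sample data is hypothetical, not the activity's dataset: `P006` appears only in the dataset and `P005` only in the official list. The names `dataset`, `official`, `coverage`, and `unmatched` are illustrative. An outer merge with pandas' `indicator=True` option is one way to surface such rows.

```python
import pandas as pd

# Hypothetical sample data for illustration only (not the activity's dataset)
dataset = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P006'],
    'price': [10.99, 23.50, 4.25]
})
official = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P005'],
    'price': [10.99, 22.99, 9.99]
})

# An outer merge keeps every product_id from both sides;
# indicator=True adds a '_merge' column recording where each row came from
coverage = pd.merge(
    dataset, official, on='product_id', how='outer',
    suffixes=('_dataset', '_official'), indicator=True
)

# Rows present on only one side cannot be price-checked and need manual review
unmatched = coverage[coverage['_merge'] != 'both']
unmatched_ids = sorted(unmatched['product_id'])
print(unmatched[['product_id', '_merge']])
```

Products flagged here would otherwise be dropped silently by an inner merge, so they never show up as discrepancies.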

import pandas as pd

# Example company dataset (possibly outdated or error-prone)
company_prices = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P003', 'P004', 'P005'],
    'price': [10.99, 23.50, 15.00, 12.75, 9.99]
})

# Trusted official price list (latest verified prices)
trusted_prices = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P003', 'P004', 'P005'],
    'price': [10.99, 22.99, 15.00, 13.00, 9.99]
})

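# Merge datasets on product_id to compare prices.
# The default inner join keeps only products present in both lists;
# validate='one_to_one' raises an error if either list has duplicate product_ids.
comparison = pd.merge(company_prices, trusted_prices, on='product_id',
                      suffixes=('_company', '_trusted'), validate='one_to_one')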

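# Identify discrepancies where prices differ by at least one cent.
# A half-cent tolerance is used because exact equality on floats is unreliable.
comparison['price_mismatch'] = (comparison['price_company'] - comparison['price_trusted']).abs() > 0.005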

# Extract mismatches for correction
mismatches = comparison[comparison['price_mismatch']]

print("Products with price discrepancies:")
print(mismatches[['product_id', 'price_company', 'price_trusted']])

# Mark for correction: mismatched rows could either be flagged or updated.
# Here, a corrected copy is built by replacing mismatched prices with trusted prices.
corrections = mismatches.set_index('product_id')['price_trusted']
company_prices_corrected = company_prices.copy()
needs_fix = company_prices_corrected['product_id'].isin(corrections.index)
company_prices_corrected.loc[needs_fix, 'price'] = (
    company_prices_corrected.loc[needs_fix, 'product_id'].map(corrections)
)

print("\nCompany prices after correction:")
print(company_prices_corrected)


Products with price discrepancies:
  product_id  price_company  price_trusted
1       P002          23.50          22.99
3       P004          12.75          13.00

Company prices after correction:
  product_id  price
0       P001  10.99
1       P002  22.99
2       P003  15.00
3       P004  13.00
4       P005   9.99
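
The discrepancy list can also be condensed into a single accuracy figure. The following is a minimal sketch, reusing the same sample values as the activity. The names `is_accurate` and `accuracy_rate` are illustrative, and the half-cent tolerance is an assumption chosen to avoid floating-point rounding noise.

```python
import pandas as pd

# Same sample values as the activity above
company_prices = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P003', 'P004', 'P005'],
    'price': [10.99, 23.50, 15.00, 12.75, 9.99]
})
trusted_prices = pd.DataFrame({
    'product_id': ['P001', 'P002', 'P003', 'P004', 'P005'],
    'price': [10.99, 22.99, 15.00, 13.00, 9.99]
})

comparison = pd.merge(company_prices, trusted_prices, on='product_id',
                      suffixes=('_company', '_trusted'))

# A price counts as accurate when it agrees with the trusted price to the cent
# (assumed tolerance: half a cent)
is_accurate = (comparison['price_company'] - comparison['price_trusted']).abs() < 0.005

# The mean of a boolean Series is the fraction of True values
accuracy_rate = is_accurate.mean()
print(f"Accurate prices: {is_accurate.sum()} of {len(comparison)} ({accuracy_rate:.0%})")
# prints: Accurate prices: 3 of 5 (60%)
```

Tracking this rate before and after correction gives a simple way to report how much the dataset improved.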
