## Ensuring Consistency in Multi-source Data Integration

**Description**: Validate the integration of two datasets, `products_A.csv` and `products_B.csv`, and ensure that the product "category" information is consistent between them.

In [1]:
# Compare product categories across two sources and report mismatches
import pandas as pd

# Simulated dataset from source A (stands in for products_A.csv)
products_A = pd.DataFrame({
    'product_id': [101, 102, 103, 104],
    'product_name': ['Widget', 'Gadget', 'Doodad', 'Thingamajig'],
    'category': ['Tools', 'Electronics', 'Tools', 'Gadgets']
})

# Simulated dataset from source B (stands in for products_B.csv)
products_B = pd.DataFrame({
    'product_id': [101, 102, 103, 104],
    'product_name': ['Widget', 'Gadget', 'Doodad', 'Thingamajig'],
    'category': ['Tools', 'Electronics', 'Hardware', 'Gadgets']  # Notice category mismatch for product_id 103
})

# Merge datasets on product_id (default inner join: only IDs present in both sources are kept)
merged = pd.merge(products_A, products_B, on='product_id', suffixes=('_A', '_B'))

# Check for category consistency
merged['category_match'] = merged['category_A'] == merged['category_B']

# Find inconsistent rows
inconsistent = merged[~merged['category_match']]

print("Merged Data with Category Consistency Check:")
print(merged)

print("\nInconsistent Category Entries:")
print(inconsistent[['product_id', 'category_A', 'category_B']])


Merged Data with Category Consistency Check:
   product_id product_name_A   category_A product_name_B   category_B  \
0         101         Widget        Tools         Widget        Tools   
1         102         Gadget  Electronics         Gadget  Electronics   
2         103         Doodad        Tools         Doodad     Hardware   
3         104    Thingamajig      Gadgets    Thingamajig      Gadgets   

   category_match  
0            True  
1            True  
2           False  
3            True  

Inconsistent Category Entries:
   product_id category_A category_B
2         103      Tools   Hardware
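
The check above uses an inner join, so a product that appears in only one source drops out of the comparison, and categories that differ only in case or surrounding whitespace are reported as mismatches. The sketch below is one possible extension, not part of the original exercise. The function name `compare_categories`, the `status` column, and the sample rows (including `product_id` 105) are hypothetical and chosen for illustration. It relies on the `how='outer'`, `indicator=True`, and `validate='one_to_one'` options of `pd.merge`.

```python
import pandas as pd

def compare_categories(a: pd.DataFrame, b: pd.DataFrame) -> pd.DataFrame:
    """Return one row per product_id with a status for the category comparison."""
    merged = pd.merge(
        a[['product_id', 'category']],
        b[['product_id', 'category']],
        on='product_id',
        how='outer',            # keep IDs found in only one source
        suffixes=('_A', '_B'),
        indicator=True,         # adds a '_merge' column: both / left_only / right_only
        validate='one_to_one',  # raises MergeError if product_id is duplicated
    )

    # Compare categories after trimming whitespace and lowercasing (an assumed rule)
    norm_a = merged['category_A'].str.strip().str.lower()
    norm_b = merged['category_B'].str.strip().str.lower()

    merged['status'] = 'mismatch'
    merged.loc[norm_a == norm_b, 'status'] = 'match'
    merged.loc[merged['_merge'] == 'left_only', 'status'] = 'missing_in_B'
    merged.loc[merged['_merge'] == 'right_only', 'status'] = 'missing_in_A'
    return merged.drop(columns='_merge')

# Hypothetical sample data: 104 exists only in B, 105 exists only in A
source_a = pd.DataFrame({
    'product_id': [101, 102, 103, 105],
    'category': ['Tools', ' electronics', 'Tools', 'Toys'],
})
source_b = pd.DataFrame({
    'product_id': [101, 102, 103, 104],
    'category': ['Tools', 'Electronics', 'Hardware', 'Gadgets'],
})

report = compare_categories(source_a, source_b)
print(report[report['status'] != 'match'])
```

Whether case and whitespace differences should count as matches depends on how the downstream system treats category labels, so the normalization step may need to be adjusted or removed.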
