## Find Conflicting Values Across Datasets

**Description**: You have two datasets, `crm_customers.csv` and `erp_customers.csv`. Find the customers whose `email` values differ between the two.

In [2]:
import pandas as pd

# Step 0: Create sample data for crm_customers.csv and erp_customers.csv

# Sample data for CRM customers
crm_data = {
    "customer_id": [1, 2, 3, 4],
    "email": ["alice@example.com", "bob@example.com", "carol@example.com", "dave@example.com"]
}
crm_df = pd.DataFrame(crm_data)

# Sample data for ERP customers (with one conflicting email for customer_id=3)
erp_data = {
    "customer_id": [1, 2, 3, 4],
    "email": ["alice@example.com", "bob@example.com", "caroline@example.com", "dave@example.com"]
}
erp_df = pd.DataFrame(erp_data)

# Step 1: Inner-merge the datasets on customer_id (keeps only IDs present in both)
merged_df = crm_df.merge(erp_df, on="customer_id", suffixes=("_crm", "_erp"))

# Step 2: Keep the rows where the two email columns differ
conflicts = merged_df[merged_df["email_crm"] != merged_df["email_erp"]]

# Step 3: Display conflicts
print("Conflicting Email Records:")
print(conflicts[["customer_id", "email_crm", "email_erp"]])

print(f"\nTotal conflicting email entries found: {len(conflicts)}")

Conflicting Email Records:
   customer_id          email_crm             email_erp
2            3  carol@example.com  caroline@example.com

Total conflicting email entries found: 1
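
The comparison above relies on two things that real exports may not guarantee: every customer appears in both files, and equal emails are written identically (same case, no stray whitespace). The following is a minimal sketch of one way to relax both assumptions, not part of the original notebook. The helper name `find_email_conflicts`, its parameters, and the sample rows are hypothetical and chosen only for illustration. It lowercases and strips emails before comparing, and uses an outer merge with `indicator=True` so that customers missing from one side are reported separately instead of being silently dropped.

```python
import pandas as pd


def find_email_conflicts(left, right, key="customer_id", col="email"):
    """Hypothetical helper: return (conflicts, unmatched) for two customer tables."""

    def normalize(df):
        # Work on a copy so the caller's DataFrame is not modified
        out = df[[key, col]].copy()
        out[col] = out[col].str.strip().str.lower()
        return out

    # Outer merge keeps IDs found in only one table; "_merge" records the origin
    merged = normalize(left).merge(
        normalize(right),
        on=key,
        how="outer",
        suffixes=("_crm", "_erp"),
        indicator=True,
    )

    both = merged[merged["_merge"] == "both"]
    a, b = both[f"{col}_crm"], both[f"{col}_erp"]

    # NaN != NaN evaluates to True, so rows where both emails are missing
    # are explicitly excluded from the conflict set
    differ = (a != b) & ~(a.isna() & b.isna())

    conflicts = both.loc[differ, [key, f"{col}_crm", f"{col}_erp"]]
    unmatched = merged.loc[merged["_merge"] != "both", [key, "_merge"]]
    return conflicts, unmatched


# Illustrative data (assumed): customer 2 differs only in case and whitespace,
# customer 4 exists only in ERP, customer 5 exists only in CRM
crm_df = pd.DataFrame({
    "customer_id": [1, 2, 3, 5],
    "email": ["alice@example.com", " Bob@Example.com ", "carol@example.com", "erin@example.com"],
})
erp_df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "email": ["alice@example.com", "bob@example.com", "caroline@example.com", "dave@example.com"],
})

conflicts, unmatched = find_email_conflicts(crm_df, erp_df)
print(conflicts)
print(unmatched)
```

With this sample data, customer 3 is the only conflict: customer 2 matches once the emails are normalized, and customers 4 and 5 show up in `unmatched` rather than disappearing from the result. Whether case-insensitive comparison is appropriate depends on how the source systems treat email addresses, so treat the normalization step as a design choice rather than a rule.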
