Task 2: Lookalike Model
# Build a Lookalike Model that takes a user's information as input and recommends 3 similar customers

In [None]:
# Merge transactions with customer and product data
merged_data = pd.merge(Transactions, customer, on='CustomerID', how='inner')
merged_data = pd.merge(merged_data, Products, on='ProductID', how='inner')
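
An inner join silently drops any transaction whose `CustomerID` or `ProductID` has no match, so it can be worth checking for unmatched rows before relying on `merged_data`. The sketch below uses small made-up tables (`toy_transactions` and `toy_customers` are hypothetical stand-ins, not the notebook's data) with the `indicator` and `validate` options of `pd.merge`.

```python
import pandas as pd

# Hypothetical toy stand-ins for the notebook's Transactions and customer tables
toy_transactions = pd.DataFrame({
    'TransactionID': ['T1', 'T2', 'T3'],
    'CustomerID': ['C0001', 'C0001', 'C0099'],  # C0099 has no customer record
    'TotalValue': [10.0, 20.0, 5.0],
})
toy_customers = pd.DataFrame({
    'CustomerID': ['C0001', 'C0002'],
    'Region': ['Asia', 'Europe'],
})

# A left join with indicator=True flags the rows an inner join would drop
checked = pd.merge(
    toy_transactions, toy_customers,
    on='CustomerID', how='left',
    validate='many_to_one',  # raises MergeError if CustomerID repeats in toy_customers
    indicator=True,
)
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['TransactionID', 'CustomerID']])
```

If `unmatched` is empty on the real tables, the inner join loses nothing.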

**Feature Engineering:** Create features based on customer profiles and transactions.

In [None]:
# Per-customer features: total spend, total quantity, most purchased category, region
customer_features = merged_data.groupby('CustomerID').agg({
    'TotalValue': 'sum',
    'Quantity': 'sum',
    'Category': lambda x: x.mode()[0],  # Most purchased category
    'Region': 'first'  # Region from customer profile
}).reset_index()


Normalize Features: Scale the numerical features to the [0, 1] range so that neither TotalValue nor Quantity dominates the similarity calculation because of its units.

In [None]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
customer_features[['TotalValue', 'Quantity']] = scaler.fit_transform(
    customer_features[['TotalValue', 'Quantity']]
)
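
For reference, `MinMaxScaler` with its default settings maps each column through `(x - min) / (max - min)`. The following sketch, on a made-up three-row table (`toy` is hypothetical data), compares the scaler's output with that formula applied by hand.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy values, chosen only to make the arithmetic easy to follow
toy = pd.DataFrame({
    'TotalValue': [100.0, 300.0, 500.0],
    'Quantity': [2.0, 4.0, 10.0],
})

# Scaler output (a NumPy array with one column per feature)
scaled = MinMaxScaler().fit_transform(toy)

# Same transformation written out column by column
manual = (toy - toy.min()) / (toy.max() - toy.min())

print(np.allclose(scaled, manual.to_numpy()))
```

Each column's minimum maps to 0 and its maximum to 1, so a single extreme customer compresses everyone else toward 0.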

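Build the Similarity Model
Calculate Similarity: Use cosine similarity to find similar customers. The matrix below is built from the two scaled numerical features only (TotalValue and Quantity); Category and Region are not part of the comparison in this version.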

In [None]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute similarity matrix
feature_matrix = customer_features[['TotalValue', 'Quantity']]
similarity_matrix = cosine_similarity(feature_matrix)

# Create a DataFrame for easy interpretation
similarity_df = pd.DataFrame(similarity_matrix, index=customer_features['CustomerID'], columns=customer_features['CustomerID'])
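
One property of cosine similarity is worth keeping in mind with only two features: it compares the direction of the feature vectors, not their length. The sketch below uses three made-up customers (`toy_features` is hypothetical) and contrasts cosine similarity with Euclidean distance, which is shown only for comparison and is not what the notebook uses.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances

# Hypothetical scaled (TotalValue, Quantity) pairs
toy_features = np.array([
    [0.1, 0.1],  # small spender, balanced value and quantity
    [0.9, 0.9],  # large spender with the same proportions
    [0.9, 0.1],  # high value, low quantity
])

sim = cosine_similarity(toy_features)
dist = euclidean_distances(toy_features)

# Customers 0 and 1 point in the same direction, so cosine treats them as identical
print(sim[0, 1], sim[0, 2])
# Euclidean distance instead places customer 0 closer to customer 2 than to customer 1
print(dist[0, 1], dist[0, 2])
```

Whether a small and a large spender with the same value-to-quantity ratio should count as lookalikes is a modelling choice; the example only makes the behavior visible.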


Find Top 3 Lookalikes: For each customer (C0001–C0020), find the top 3 similar customers.



In [None]:
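lookalike_results = {}

# groupby sorts by CustomerID, so the first 20 rows are C0001 to C0020 when all of them have transactions
for customer_id in customer_features['CustomerID'][:20]:
    # Drop the customer's own entry explicitly: with tied scores, self is not guaranteed to sort first
    top3 = similarity_df[customer_id].drop(labels=customer_id).nlargest(3)
    # Store plain floats so the CSV does not contain NumPy scalar reprs
    lookalike_results[customer_id] = [(cid, round(float(score), 4)) for cid, score in top3.items()]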

# Convert results to a DataFrame
lookalike_df = pd.DataFrame({
    'cust_id': list(lookalike_results.keys()),
    'lookalikes': [str(v) for v in lookalike_results.values()]
})

# Save to CSV
lookalike_df.to_csv("FirstName_LastName_Lookalike.csv", index=False)
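
The top-3 lookup can also be wrapped in a small helper. The sketch below is one possible shape for it: `top_n_lookalikes` and the `toy_similarity` matrix are hypothetical names and values introduced for illustration. The toy matrix contains a tie at 1.0, the case where slicing a sorted column with `[1:4]` cannot be relied on to skip the customer itself.

```python
import pandas as pd

def top_n_lookalikes(similarity_df, customer_id, n=3):
    """Return the n most similar customers to customer_id, excluding itself."""
    scores = similarity_df[customer_id].drop(labels=customer_id)
    top = scores.nlargest(n)
    return [(cid, round(float(score), 4)) for cid, score in top.items()]

# Hypothetical symmetric similarity matrix; C0001 and C0002 tie at 1.0
ids = ['C0001', 'C0002', 'C0003', 'C0004', 'C0005']
toy_similarity = pd.DataFrame(
    [[1.0, 1.0, 0.8, 0.3, 0.6],
     [1.0, 1.0, 0.7, 0.2, 0.5],
     [0.8, 0.7, 1.0, 0.4, 0.9],
     [0.3, 0.2, 0.4, 1.0, 0.1],
     [0.6, 0.5, 0.9, 0.1, 1.0]],
    index=ids, columns=ids,
)

result = top_n_lookalikes(toy_similarity, 'C0002')
print(result)
```

Dropping the customer by label keeps the result correct no matter how ties are ordered.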


In [None]:

# Aggregate transaction data by CustomerID
Transaction_summary = Transactions.groupby('CustomerID').agg({
    'TotalValue': 'sum',
    'Quantity': 'sum',
    'TransactionID': 'count'
}).rename(columns={'TransactionID': 'TransactionCount'}).reset_index()

# Merge with customer data
customer_data = pd.merge(customer, Transaction_summary, on='CustomerID', how='inner')

In [None]:
# One-hot encode 'Region'
customer_data = pd.get_dummies(customer_data, columns=['Region'], drop_first=True)

# Normalize numerical features
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
features = ['TotalValue', 'Quantity', 'TransactionCount']
customer_data[features] = scaler.fit_transform(customer_data[features])
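
The cells above stop after scaling. A possible next step, sketched here on a made-up four-customer table rather than the real `customer_data`, is to build the feature matrix from the scaled numerical columns plus the one-hot `Region_` columns and compute cosine similarity on it. The names `toy_customer_data`, `feature_cols` and `best_match` are hypothetical.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data with the same columns the cells above produce
toy_customer_data = pd.DataFrame({
    'CustomerID': ['C0001', 'C0002', 'C0003', 'C0004'],
    'Region': ['Asia', 'Europe', 'Asia', 'Europe'],
    'TotalValue': [1200.0, 300.0, 1100.0, 350.0],
    'Quantity': [12.0, 3.0, 10.0, 4.0],
    'TransactionCount': [5.0, 2.0, 5.0, 1.0],
})

# Same preprocessing as above: one-hot encode Region, then min-max scale
toy_customer_data = pd.get_dummies(toy_customer_data, columns=['Region'], drop_first=True)
numeric = ['TotalValue', 'Quantity', 'TransactionCount']
toy_customer_data[numeric] = MinMaxScaler().fit_transform(toy_customer_data[numeric])

# Use the scaled numeric columns plus every one-hot Region column as features
region_cols = [c for c in toy_customer_data.columns if c.startswith('Region_')]
feature_cols = numeric + region_cols
matrix = toy_customer_data[feature_cols].astype(float)

toy_similarity = pd.DataFrame(
    cosine_similarity(matrix),
    index=toy_customer_data['CustomerID'],
    columns=toy_customer_data['CustomerID'],
)

# Most similar customer to C0001, excluding C0001 itself
best_match = toy_similarity['C0001'].drop('C0001').idxmax()
print(best_match)
```

Selecting columns by the `Region_` prefix avoids hard-coding region names, which depend on the values present in the data.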
