# ***ReelCode : Project-A-Thons***

  * Build the Future with AI Agents & Automation

## **Theme :-  E-commerce – AI Cart Recovery Agent**

#### *Prototype built by Deepak Kaura*

## **🛒✨ Problem Statement :-**

*Cart abandonment remains one of the biggest challenges in e-commerce, with an average abandonment rate close to 70% — translating to billions in lost revenue every year. Yet this challenge is also an untapped opportunity: by leveraging advanced technology, behavioral insights, and smart engagement strategies, businesses can recover a significant share of this lost revenue.*

---

#### **What is Cart abandonment?**

Cart abandonment happens when a customer adds items to their online shopping cart but leaves the site without completing the purchase. It is a common issue in e-commerce: only a portion of carts end up as completed sales.
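
As a minimal sketch of this definition, the abandonment rate can be computed as the share of created carts that were not completed. The toy rows below are hypothetical; only the column names `add_to_cart` and `abandoned_cart` come from the dataset used later in this notebook.

```python
import pandas as pd

# Hypothetical toy sessions (not rows from the real dataset)
sessions = pd.DataFrame({
    "add_to_cart":    [1, 1, 0, 1, 0, 1],
    "abandoned_cart": [1, 0, 0, 1, 0, 0],
})

# Only sessions in which something was added to the cart can be abandoned
carts = sessions[sessions["add_to_cart"] == 1]

# Mean of a 0/1 flag is the share of abandoned carts
abandonment_rate = carts["abandoned_cart"].mean()
print(f"Abandonment rate among created carts: {abandonment_rate:.0%}")
```

Restricting the denominator to sessions with a cart is a design choice; dividing by all sessions instead gives a lower figure that mixes browsing-only visits into the rate.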


#### **Why Does Cart Abandonment matter?**

Have you ever gone online shopping and added items to your cart, only to leave without buying them? It’s a familiar scenario that many of us, myself included, are guilty of. Understanding why users abandon carts is crucial for businesses. By gaining this insight, they can enhance customer experience, simplify the checkout process, and ultimately boost sales conversions. After all, high cart abandonment rates represent missed revenue opportunities.




---

## 📚 **Data Dictionary**

| Column Name            | Description                                                                             |
| ---------------------- | --------------------------------------------------------------------------------------- |
| **user\_id**           | Unique identifier for each user (e.g., UUID or alphanumeric ID).                        |
| **product\_id**        | Unique identifier for the product viewed or purchased.                                  |
| **category**           | Product category (e.g., Electronics, Fashion, Groceries).                               |
| **price**              | Price of the product in currency units.                                                 |
| **discount\_applied**  | Discount applied to the product, usually as a percentage or absolute value.             |
| **payment\_method**    | Payment method used (e.g., Credit Card, Debit Card, PayPal, COD).                       |
| **purchase\_date**     | Date of purchase or interaction, typically as a timestamp or string.                    |
| **pages\_visited**     | Number of product pages or site pages visited during the session.                       |
| **time\_spent**        | Total time spent on the website or app (in seconds or minutes).                         |
| **rating**             | Customer rating given to the product, usually on a fixed scale (e.g., 1–5 stars).       |
| **review\_text**       | Textual review left by the customer for the product.                                    |
| **sentiment\_score**   | Sentiment score derived from the review text (e.g., using NLP; range could be -1 to 1). |
| **age**                | Age of the customer in years.                                                           |
| **gender**             | Gender of the customer (e.g., Male, Female, Non-binary).                                |
| **income\_level**      | Customer’s income level, usually encoded numerically or by bracket.                     |
| **location**           | Customer’s location (e.g., city, state, or country).                                    |
| **purchase\_decision** | Binary flag indicating whether a purchase was made (1) or not (0).                      |
| **add\_to\_cart**      | Binary flag indicating whether the product was added to the cart (1) or not (0).        |
| **abandoned\_cart**    | Binary flag indicating whether the cart was abandoned (1) or completed (0).             |

---


## **Loading and Reading the Dataset**

In [None]:
import pandas as pd
import numpy as np

df_cart = pd.read_csv('/content/consumer_behavior_dataset.csv')
df_cart.head()

| | user_id | product_id | category | price | discount_applied | payment_method | purchase_date | pages_visited | time_spent | add_to_cart | abandoned_cart | rating | review_text | sentiment_score | age | gender | income_level | location | purchase_decision |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | b93e568c-81fc-4db8-8509-2940e261b2f2 | 6badd48c-5349-444b-9f92-8a27c11ce05f | Clothing | 389.23 | 46.99 | Credit Card | 2025-01-31 04:07:28 | 6 | 955 | 0 | 0 | 4 | Start similar morning police quality various m... | 0.16 | 51 | Other | Low | Jeremyview | 0 |
| 1 | 68c55d68-1074-48af-80ae-b5dad0915b8d | fe635f30-f9bb-4ca9-8a0d-3d6c67e2ae23 | Clothing | 344.81 | 9.11 | COD | 2025-01-21 06:07:28 | 8 | 790 | 1 | 0 | 3 | Candidate level take evening almost push social. | 0.03 | 54 | Male | High | South Amy | 1 |
| 2 | a5adbe49-8208-459a-b72d-1b2960b38ade | e9dc8006-25eb-4aec-9c11-a2b33037b01b | Electronics | 180.59 | 19.37 | COD | 2025-02-05 00:30:15 | 3 | 336 | 0 | 0 | 1 | Way nearly value Republican part foot degree i... | 0.1 | 49 | Female | Low | West Lisaside | 0 |
| 3 | d0f1b4e1-d647-46d9-b88a-fea1beb8239f | 8c7073db-ee68-415b-9686-ae6f068c9d2d | Grocery | 415.26 | 25.18 | Debit Card | 2025-01-08 16:03:46 | 4 | 624 | 1 | 0 | 4 | Drive eight upon do work share fear soldier no... | -0.5 | 20 | Male | Low | Lake Deanport | 1 |
| 4 | 2d530715-fb4f-40c5-95bb-02367d3c8c2c | 864100c8-2502-4717-ba77-4261eae803f8 | Electronics | 352.82 | 46.07 | PayPal | 2025-02-27 19:53:59 | 4 | 579 | 0 | 0 | 5 | Drug late look state concern personal go second. | -0.1 | 21 | Other | Medium | Gloriaborough | 0 |


### **Checking Categorical features**

In [None]:
df_cart.purchase_decision.value_counts()

| purchase_decision | count |
| --- | --- |
| 0 | 3766 |
| 1 | 1234 |


In [None]:
df_cart.rating.value_counts()

| rating | count |
| --- | --- |
| 1 | 1040 |
| 2 | 1004 |
| 4 | 990 |
| 3 | 986 |
| 5 | 980 |


In [None]:
df_cart.payment_method.value_counts()

| payment_method | count |
| --- | --- |
| PayPal | 1290 |
| COD | 1257 |
| Debit Card | 1227 |
| Credit Card | 1226 |


In [None]:
df_cart.category.value_counts()

| category | count |
| --- | --- |
| Electronics | 1032 |
| Furniture | 1031 |
| Clothing | 995 |
| Grocery | 983 |
| Books | 959 |


In [None]:
df_cart.abandoned_cart.value_counts()

| abandoned_cart | count |
| --- | --- |
| 0 | 3761 |
| 1 | 1239 |


In [None]:
df_cart.gender.value_counts()

| gender | count |
| --- | --- |
| Other | 1709 |
| Male | 1661 |
| Female | 1630 |


In [None]:
df_cart.income_level.value_counts()

| income_level | count |
| --- | --- |
| High | 1726 |
| Medium | 1665 |
| Low | 1609 |


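### **Converting User ID and Product ID into a readable form**

In [None]:
# Keep only the digits of each UUID and truncate to the first five.
# Note: the shortened IDs are easier to read but are no longer guaranteed to be unique.
df_cart['user_id'] = df_cart['user_id'].str.replace(r'[^0-9]', '', regex=True).str[:5]

df_cart['product_id'] = df_cart['product_id'].str.replace(r'[^0-9]', '', regex=True).str[:5]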
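
Shortening a UUID to its first five digits can map different users to the same ID. The sketch below checks for such collisions and shows `pd.factorize` as one possible alternative that keeps IDs short and unique. The second UUID-like string is hypothetical and was constructed so that its first five digits collide with the first one.

```python
import pandas as pd

ids = pd.Series([
    "b93e568c-81fc-4db8-8509-2940e261b2f2",
    "b9-3e5-68d0-aaaa-bbbb-cccc-000000000000",  # hypothetical, built to collide
    "68c55d68-1074-48af-80ae-b5dad0915b8d",
])

# Same transformation as in the notebook: digits only, first five characters
short = ids.str.replace(r"[^0-9]", "", regex=True).str[:5]
print("unique originals:", ids.nunique(), "| unique shortened:", short.nunique())

# Alternative sketch: one integer code per distinct original ID
codes, uniques = pd.factorize(ids)
print(list(codes))
```

If the shortened IDs are later used to join predictions back to customers, a uniqueness check like `short.nunique() == ids.nunique()` is a cheap safeguard.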

### **Creating Clusters for Customer Segmentation -**

In [None]:
df_cart['income_level'] = df_cart['income_level'].map({
    'Low': 1,
    'Medium': 2,
    'High': 3
})

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Select features for clustering
features = df_cart[['age',
    'income_level',
    'pages_visited',
    'time_spent',
    'discount_applied',
    'price']]

# -----------------------------
# Scale features — very important for K-Means
# -----------------------------
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)
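
To illustrate why scaling matters for K-Means, the sketch below standardizes two hypothetical columns on very different scales (values are made up, standing in for `age` and `time_spent`). After `StandardScaler`, each column has mean 0 and unit standard deviation, so neither dominates the Euclidean distance.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values: column 0 in years, column 1 in seconds
toy = np.array([[20.0, 300.0], [40.0, 900.0], [60.0, 600.0]])

scaled = StandardScaler().fit_transform(toy)

print("column means:", scaled.mean(axis=0).round(6))
print("column stds: ", scaled.std(axis=0).round(6))
```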

In [None]:
# -----------------------------
# Find optimal k with Elbow Method
# -----------------------------
inertia = []
k_range = range(2, 10)

for k in k_range:
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(features_scaled)
    inertia.append(kmeans.inertia_)  # Distortion / SSE

import plotly.graph_objects as go

fig = go.Figure()

fig.add_trace(
    go.Scatter(
        x=list(k_range),
        y=inertia,
        mode='lines+markers',
        marker=dict(color='royalblue', size=8),
        line=dict(width=2),
        name='Inertia'
    )
)

fig.update_layout(
    title='Elbow Method For Optimal k',
    xaxis_title='Number of clusters (k)',
    yaxis_title='Inertia (SSE)',
    xaxis=dict(tickmode='linear'),
    template='plotly_white',
    width=800,
    height=500
)

fig.show()
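
The elbow can be hard to read by eye, so a silhouette score is sometimes used as a second opinion. The sketch below is not part of the original analysis: it runs on synthetic blobs (a stand-in for `features_scaled`) and picks the k with the highest average silhouette.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: three well-separated blobs (assumption, not the real features)
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 0], [0, 10]],
    cluster_std=0.5,
    random_state=42,
)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_demo)
    scores[k] = silhouette_score(X_demo, labels)

# k with the highest average silhouette
best_k = max(scores, key=scores.get)
print({k: round(v, 3) for k, v in scores.items()}, "best k:", best_k)
```

On real data the two criteria do not always agree; when they differ, the choice of k is a judgment call informed by how interpretable the resulting segments are.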


In [None]:
# Suppose k = 5 is chosen from the elbow plot
optimal_k = 5

kmeans_final = KMeans(n_clusters=optimal_k, random_state=42)
df_cart['Customer_Segment_Type'] = kmeans_final.fit_predict(features_scaled)

print(df_cart.groupby('Customer_Segment_Type')[['price', 'discount_applied', 'pages_visited', 'time_spent']].mean())

                            price  discount_applied  pages_visited  time_spent
Customer_Segment_Type                                                         
0                      134.943293         24.993049      10.713415  772.873984
1                      292.915386         23.576254      12.365110  921.391802
2                      157.710884         22.927636       9.906283  330.143770
3                      384.118859         24.938620       9.351660  405.609959
4                      276.934793         29.097190      10.176692  625.209586


* Note -

  * ✅ price: Clear separation. Two clusters average low prices (about 135–160) and three average higher prices (about 275–385), which is useful for value segmentation.

  * ⚠️ discount_applied: Little variation (about 23–29). Cluster 4 is slightly higher, which hints at some discount sensitivity, but the signal is weaker than for price.

  * pages_visited & time_spent: Some signal, but it reflects engagement rather than spending power, so these features are better suited to behavioral profiling.
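
Cluster profiles like the ones above can also be read directly from the fitted model by mapping the K-Means centers back to original units. The sketch below uses hypothetical toy data with two columns standing in for `price` and `time_spent`; `scaler_demo` and `km` are illustrative names, not objects from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: columns stand in for price and time_spent
raw = np.array([[100.0, 300.0], [120.0, 320.0], [380.0, 900.0], [400.0, 950.0]])

scaler_demo = StandardScaler()
scaled_demo = scaler_demo.fit_transform(raw)

km = KMeans(n_clusters=2, n_init=10, random_state=42).fit(scaled_demo)

# Centers live in scaled space; inverse_transform returns them in original units
centers_original_units = scaler_demo.inverse_transform(km.cluster_centers_)
print(np.sort(centers_original_units, axis=0))
```

Because standardization is an affine transform, these centers coincide with the per-cluster means that `groupby(...).mean()` reports for the clustering features.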

In [None]:
cluster_avg = df_cart.groupby("Customer_Segment_Type")["price"].mean().sort_values()
ordered_clusters = cluster_avg.index.tolist()

In [None]:
# Make a mapping
segment_names = ["Window Shopper", "Low Value Quick Buyer", "Regular Buyer", "Engaged Premium Buyer", "VIP Fast Buyer"]
cluster_to_label = {cluster: segment_names[i] for i, cluster in enumerate(ordered_clusters)}

# Apply mapping
df_cart["Customer_Segment_Label"] = df_cart["Customer_Segment_Type"].map(cluster_to_label)

print(df_cart[["user_id","Customer_Segment_Type", "Customer_Segment_Label"]].head())
print('--' * 22)
print(df_cart[["Customer_Segment_Type", "Customer_Segment_Label"]].value_counts())

  user_id  Customer_Segment_Type Customer_Segment_Label
0   93568                      4          Regular Buyer
1   68556                      1  Engaged Premium Buyer
2   54982                      4          Regular Buyer
3   01416                      3         VIP Fast Buyer
4   25307                      3         VIP Fast Buyer
--------------------------------------------
Customer_Segment_Type  Customer_Segment_Label
4                      Regular Buyer             1064
1                      Engaged Premium Buyer     1049
0                      Window Shopper             984
3                      VIP Fast Buyer             964
2                      Low Value Quick Buyer      939
Name: count, dtype: int64


#### **Checking columns**

In [None]:
print(df_cart.columns)

Index(['user_id', 'product_id', 'category', 'price', 'discount_applied',
       'payment_method', 'purchase_date', 'pages_visited', 'time_spent',
       'add_to_cart', 'abandoned_cart', 'rating', 'review_text',
       'sentiment_score', 'age', 'gender', 'income_level', 'location',
       'purchase_decision', 'Customer_Segment_Type', 'Customer_Segment_Label'],
      dtype='object')


### **Feature Engineering: Creating New Features -**

In [None]:
df_cart['add_to_cart_ratio'] = df_cart['add_to_cart'] / df_cart['pages_visited'].replace(0, 1)

df_cart['discount_ratio'] = df_cart['discount_applied'] / df_cart['price'].replace(0, 1)
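
The two ratios above guard against division by zero by replacing a zero denominator with 1. The sketch below reproduces that on hypothetical toy rows and shows one alternative, leaving the ratio undefined (`NaN`) when the denominator is zero, which avoids treating a zero price as a price of 1.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, including zero denominators
toy = pd.DataFrame({
    "add_to_cart":      [1, 0, 1],
    "pages_visited":    [4, 0, 8],
    "discount_applied": [10.0, 5.0, 20.0],
    "price":            [100.0, 50.0, 0.0],
})

# Same idea as the notebook: a zero denominator is replaced by 1
toy["add_to_cart_ratio"] = toy["add_to_cart"] / toy["pages_visited"].replace(0, 1)

# Alternative: mark the ratio as missing when the denominator is zero
toy["discount_ratio_nan"] = toy["discount_applied"] / toy["price"].replace(0, np.nan)

print(toy[["add_to_cart_ratio", "discount_ratio_nan"]])
```

Which variant is preferable depends on the model: many scikit-learn estimators reject `NaN`, so the missing values would need to be imputed or the rows dropped first.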


### **Checking each column's data type**

In [None]:
df_cart.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   user_id                 5000 non-null   object 
 1   product_id              5000 non-null   object 
 2   category                5000 non-null   object 
 3   price                   5000 non-null   float64
 4   discount_applied        5000 non-null   float64
 5   payment_method          5000 non-null   object 
 6   purchase_date           5000 non-null   object 
 7   pages_visited           5000 non-null   int64  
 8   time_spent              5000 non-null   int64  
 9   rating                  5000 non-null   int64  
 10  review_text             5000 non-null   object 
 11  sentiment_score         5000 non-null   float64
 12  age                     5000 non-null   int64  
 13  gender                  5000 non-null   object 
 14  income_level            5000 non-null   

In [None]:
df_cart = df_cart[['user_id', 'product_id', 'category', 'price', 'discount_applied', 'discount_ratio',
       'payment_method', 'purchase_date', 'pages_visited', 'time_spent','rating', 'review_text',
       'sentiment_score','age', 'gender', 'income_level', 'location',
       'purchase_decision','add_to_cart','add_to_cart_ratio','abandoned_cart','Customer_Segment_Type', 'Customer_Segment_Label']]

In [None]:
def show_cat_mapping(df, col):
    cat = df[col].astype('category')
    for i, val in enumerate(cat.cat.categories):
        print(f"{i}: {val}")

show_cat_mapping(df_cart, 'payment_method')
show_cat_mapping(df_cart, 'gender')


0: COD
1: Credit Card
2: Debit Card
3: PayPal
0: Female
1: Male
2: Other


## **Customer Behaviour Analysis :-**

In [None]:
import plotly.express as px

# 📌 Group by your new segments
profile_df = df_cart.groupby("Customer_Segment_Label").agg({
    "price": "mean",
    "time_spent": "mean",
    "pages_visited": "mean",
    "discount_applied": "mean"
}).reset_index()

# ✅ Melt to long format for grouped bars
profile_long = profile_df.melt(id_vars="Customer_Segment_Label",
                               value_vars=["price", "time_spent", "pages_visited", "discount_applied"],
                               var_name="Metric", value_name="Value")

# 📊 Plot
fig = px.bar(
    profile_long,
    x="Customer_Segment_Label",
    y="Value",
    color="Metric",
    barmode="group",
    title="Customer Behavior by Segment",
    text_auto=".2s"
)

fig.update_layout(
    width=900,
    height=500,
    xaxis_title="Customer Segment",
    yaxis_title="Average Value",
    legend_title="Metric",
    bargap=0.3
)

fig.show()


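#### **Insights from the plot above:**

* **Engaged Premium Buyers** spend the most time on site (about 921 seconds) and visit the most pages (about 12.4), with the second-highest average price (about 293). They browse longer and buy higher-priced items, and may expect deals or top service.

* **VIP Fast Buyers** have the highest average price (about 384) but low browsing time (about 406 seconds) and the fewest pages visited. They are efficient and decisive.

* **Window Shoppers** have the lowest average price (about 135) but relatively high pages visited (about 10.7) and time spent (about 773 seconds). They spend time browsing but don't always buy.

* **Low Value Quick Buyers** have low average prices (about 158), the shortest sessions (about 330 seconds), and the lowest average discount (about 22.9). They appear to favor speed over deals.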

In [None]:
abandon_rate = (
    df_cart.groupby("Customer_Segment_Label")["abandoned_cart"]
    .mean()
    .reset_index()
    .rename(columns={"abandoned_cart": "Abandonment_Rate"})
)


fig = px.bar(
    abandon_rate,
    x="Customer_Segment_Label",
    y="Abandonment_Rate",
    color="Customer_Segment_Label",  # color by segment
    color_discrete_sequence=px.colors.qualitative.Set2,  # pick a nice color palette
    text_auto=".2f",
    title="Cart Abandonment Rate by Customer Segment"
)

fig.update_layout(
    width=800,  # width in pixels
    height=500,  # height in pixels
    bargap=0.3,  # space between bars
    xaxis_title="Customer Segment",
    yaxis_title="Abandonment Rate",
    legend_title="Segment"
)

fig.show()


#### **Insights from the plot above:**

* **Engaged Premium Buyer:** This segment has the highest cart abandonment rate at 0.27 (27%). This is a surprising finding, as "Engaged Premium Buyers" are expected to be highly motivated to purchase. This high rate could indicate issues with the checkout process, unexpected shipping costs, or a lack of specific product availability that frustrates this valuable customer group.

* **Low Value Quick Buyer**, **Regular Buyer**, **VIP Fast Buyer**, and **Window Shopper:** All these segments have a similar cart abandonment rate of 0.24 (24%). The consistency across these segments suggests that a general issue might be affecting a wide range of customers.
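
A gap of 27% versus 24% may or may not be larger than sampling noise. One way to check is a chi-square test of independence, sketched below. The counts are approximations reconstructed from the rounded rates and the segment sizes printed earlier (1,049 Engaged Premium Buyers, 1,239 abandoned carts overall), so the result is illustrative rather than exact.

```python
from scipy.stats import chi2_contingency

# Approximate counts (assumption): about 27% of 1,049 Engaged Premium Buyers abandoned
premium_abandoned = 283
premium_total = 1049
all_abandoned = 1239
all_total = 5000

other_abandoned = all_abandoned - premium_abandoned
other_total = all_total - premium_total

table = [
    [premium_abandoned, premium_total - premium_abandoned],  # Engaged Premium Buyer
    [other_abandoned, other_total - other_abandoned],        # all other segments
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p_value:.3f}, dof={dof}")
```

Running the same test on the exact counts from `pd.crosstab(df_cart["Customer_Segment_Label"], df_cart["abandoned_cart"])` would give the definitive answer for this dataset.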

In [None]:
purchase_vs_abandon = (
    df_cart.groupby("Customer_Segment_Label")[["purchase_decision", "abandoned_cart"]]
    .mean()
    .reset_index()
)

fig = px.scatter(
    purchase_vs_abandon,
    x="purchase_decision",
    y="abandoned_cart",
    color="Customer_Segment_Label",
    size="purchase_decision",
    text="Customer_Segment_Label",
    color_discrete_sequence=px.colors.qualitative.Prism,  # any palette you like
    title="Purchase vs Abandonment by Segment"
)

fig.update_layout(
    width=900,
    height=600,
    xaxis_title="Purchase Decision Rate",
    yaxis_title="Abandonment Rate",
    legend_title="Customer Segment",
)
fig.show()


#### **Insights from the plot above:**

* **Engaged Premium Buyer:** This segment has a low purchase decision rate (just over 23%) and the highest abandonment rate (approximately 27%). This is a critical finding: these highly engaged customers are the least likely to convert their filled carts into purchases. This could be due to a poor checkout experience, hidden fees, complex forms, or other friction points late in the buying process.

* **Window Shopper & VIP Fast Buyer:** These segments have the highest purchase decision rates (around 26%) and the lowest abandonment rates (around 24%). The gap to the other segments is only a few percentage points, but it suggests that their path from consideration to purchase is somewhat smoother.

* **Regular Buyer:** This segment falls in the middle, with a purchase decision rate and abandonment rate both around 25%.

In [None]:
import plotly.express as px

# Your data prep stays the same
discount_usage = (
    df_cart.groupby("Customer_Segment_Label")["discount_applied"]
    .mean()
    .reset_index()
    .rename(columns={"discount_applied": "Avg_Discount_Applied"})
)

# 👉 Improved bar chart with size & color control
fig = px.bar(
    discount_usage,
    x="Customer_Segment_Label",
    y="Avg_Discount_Applied",
    color="Customer_Segment_Label",  # ✅ color by segment
    color_discrete_sequence=px.colors.qualitative.Set2,  # ✅ pick a nice palette
    text_auto=".2f",
    title="Average Discount Used by Customer Segment"
)

# ✅ Control figure size & axes
fig.update_layout(
    width=800,   # Width in pixels
    height=500,  # Height in pixels
    bargap=0.3,  # Gap between bars
    xaxis_title="Customer Segment",
    yaxis_title="Average Discount Applied (%)",
    legend_title="Customer Segment"
)

fig.show()


#### **Insights from the plot above:**

* **Regular Buyer:** This segment uses the highest average discount at 29.10%. This suggests that "Regular Buyers" are highly sensitive to price and are likely to make purchases when discounts or promotions are available. This could be a strategy for the business to maintain loyalty and encourage repeat purchases from this core customer group.

* **VIP Fast Buyer & Window Shopper:** These two segments use similar average discounts, at 24.94% and 24.99% respectively. They are moderately influenced by discounts.

* **Engaged Premium Buyer:** This segment uses a lower average discount of 23.58%. This indicates that their purchasing decisions are less driven by price promotions compared to "Regular Buyers." Their loyalty and engagement likely stem from other factors, such as product quality, brand experience, or convenience.

* **Low Value Quick Buyer:** This segment uses the lowest average discount at 22.93%. This is a notable finding, as one might expect "low value" buyers to be price-sensitive. The low discount usage could suggest that they are primarily focused on speed and convenience, and are less likely to seek out or wait for promotions.

In [None]:
import plotly.express as px

# Your grouped data stays the same
time_vs_abandon = (
    df_cart.groupby("Customer_Segment_Label")[["time_spent", "abandoned_cart"]]
    .mean()
    .reset_index()
)

# 👉 Scatter plot with color palette + plot size
fig = px.scatter(
    time_vs_abandon,
    x="time_spent",
    y="abandoned_cart",
    color="Customer_Segment_Label",  # color by segment
    size="time_spent",               # bubble size
    text="Customer_Segment_Label",
    color_discrete_sequence=px.colors.qualitative.Prism,  # custom color palette
    title="Avg Time Spent vs Abandonment Rate"
)

# ✅ Control plot size, axis titles, and legend
fig.update_layout(
    width=900,    # width in pixels
    height=600,   # height in pixels
    xaxis_title="Average Time Spent (seconds)",
    yaxis_title="Average Abandonment Rate",
    legend_title="Customer Segment"
)

fig.show()


#### **Insights from the plot above:**

* **Engaged Premium Buyer:** This segment spends the most time on the platform (over 900 seconds) and also has the highest abandonment rate (approximately 27%). This is a crucial finding. Customers who invest a significant amount of time in browsing and selecting items are ultimately not completing their purchases. This could be due to a poor checkout experience, hidden fees, complex forms, or technical issues that appear late in the process.

* **Window Shopper:** This segment spends a moderate amount of time (around 750-800 seconds) and has a relatively low abandonment rate (around 24%). This suggests that while they spend time browsing, their cart abandonment rate is close to the average, and they are not as negatively impacted by the same issues as the premium buyers.

* **Regular Buyer:** This group spends a moderate amount of time (around 600-650 seconds) with a moderate abandonment rate (around 25%). This places them in the middle of the pack for both metrics.

* **VIP Fast Buyer and Low Value Quick Buyer:** Both these segments spend the least amount of time on the platform (roughly 330 to 410 seconds) and have the lowest abandonment rates (around 24%). This is expected, as "quick buyers" are likely to have a specific item in mind and complete their purchase with minimal friction.

In [None]:
import plotly.express as px

# Group same as you have
sentiment_abandon = (
    df_cart.groupby("Customer_Segment_Label")[["sentiment_score", "abandoned_cart"]]
    .mean()
    .reset_index()
)

# Build scatter with custom size & color
fig = px.scatter(
    sentiment_abandon,
    x="sentiment_score",
    y="abandoned_cart",
    color="Customer_Segment_Label",
    size="sentiment_score",
    title="Avg Sentiment vs Abandonment Rate",
    text="Customer_Segment_Label",
    color_discrete_sequence=px.colors.qualitative.Set2  # Example: use a softer palette
)

# ✅ Control figure size & axes
fig.update_layout(
    width=800,    # Width in pixels
    height=500,   # Height in pixels
    xaxis_title="Average Sentiment Score",
    yaxis_title="Average Abandonment Rate",
    legend_title="Customer Segment",
)

# Optional: control marker border & text position
fig.update_traces(
    marker=dict(line=dict(width=1, color='DarkSlateGrey')),
    textposition='top center'
)

fig.show()


#### **Insights from the plot above:**

* **Engaged Premium Buyer** has the highest abandonment rate, but also one of the better sentiment scores — meaning they like the brand but still abandon carts. This may indicate they expect premium perks or targeted nudges to convert.

* **Low Value Quick Buyer** has relatively higher sentiment but lower abandonment — they may be impulsive buyers responding to quick incentives.

* **VIP Fast Buyer** has lower sentiment and low abandonment — they’re decisive, value speed, and rarely abandon carts.

* **Window Shoppers** have moderate sentiment and abandonment — they’re browsing more, buying less.

In [None]:
import plotly.express as px

# Group data
payment_ct = df_cart.groupby(["Customer_Segment_Label", "payment_method"]).size().reset_index(name="Count")

# Create sunburst with custom colors
fig = px.sunburst(
    payment_ct,
    path=["Customer_Segment_Label", "payment_method"],
    values="Count",
    title="Payment Methods Used by Segment",
    color="Customer_Segment_Label",  # 🟢 use the parent for color base
    color_discrete_sequence=px.colors.qualitative.Set3  # 🎨 example palette
)

# ✅ Control figure size
fig.update_layout(
    width=700,  # width in px
    height=550, # height in px
    margin=dict(t=50, l=0, r=0, b=0)
)

fig.show()


#### **Insights from the plot above:**

* **Regular Buyer:** This segment uses a wide variety of payment methods. The most prominent payment methods appear to be COD (Cash on Delivery), followed by Credit Card and PayPal. This indicates a diverse range of preferences within this large customer base, suggesting the importance of offering multiple payment options to cater to them.

* **Engaged Premium Buyer:** This segment shows a strong preference for Credit Card and Debit Card, followed by COD and PayPal. The emphasis on card payments suggests that these customers are comfortable with digital transactions and may value the convenience and security associated with them.

* **Low Value Quick Buyer:** This segment also uses a mix of payment methods. The chart shows a significant portion using COD and PayPal, along with Debit Card and Credit Card. The prominence of COD and PayPal for this group might indicate a preference for payment options that are quick to use without requiring detailed card information, which aligns with their "quick buyer" behavior.

* **VIP Fast Buyer:** This segment seems to favor a mix of PayPal, Credit Card, and Debit Card. They also use COD, but to a lesser extent than some other segments. Their preference for online payment methods like PayPal and credit/debit cards is consistent with the "fast buyer" profile, as these methods can facilitate quick, seamless checkouts.

* **Window Shopper:** This segment uses a balanced mix of all four payment methods: Credit Card, COD, PayPal, and Debit Card. This wide distribution suggests that the payment method is not a major differentiating factor for this segment and that providing all these options is necessary to capture their occasional purchases.

### **Visualizing Target column**

In [None]:
import plotly.express as px

# Make Abandoned Cart labels as strings
counts = df_cart['abandoned_cart'].value_counts().reset_index()
counts.columns = ['abandoned_cart', 'Count']
counts['abandoned_cart'] = counts['abandoned_cart'].astype(str)  # 👈 ensure it's treated as categorical

# Plot
fig = px.bar(
    counts,
    x='abandoned_cart',
    y='Count',
    text='Count',
    color='abandoned_cart',
    color_discrete_map={
        '0': 'lightseagreen',
        '1': 'royalblue'
    },  # ✅ this works now!
    title='Abandoned Cart Distribution'
)

fig.update_layout(
    width=800,
    height=500,
    xaxis_title='Abandoned Cart Label',
    yaxis_title='Abandoned Cart Counts',
    bargap=0.3
)

fig.update_traces(
    textposition='outside'
)

fig.show()


#### ***The plot above shows that the data is imbalanced: completed carts outnumber abandoned carts by roughly 3 to 1 (3,761 vs. 1,239). This is a crucial factor to consider for any subsequent analysis or predictive modeling.***
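
The imbalance can be quantified from the `abandoned_cart` counts shown earlier (3,761 completed, 1,239 abandoned). The sketch below also computes the accuracy of a trivial model that always predicts "not abandoned", which is a useful floor when judging any classifier trained on this target.

```python
# Counts taken from df_cart.abandoned_cart.value_counts() earlier in the notebook
completed, abandoned = 3761, 1239

imbalance_ratio = completed / abandoned
majority_baseline_accuracy = completed / (completed + abandoned)

print(f"completed : abandoned = {imbalance_ratio:.2f} : 1")
print(f"always-predict-0 accuracy = {majority_baseline_accuracy:.4f}")
```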

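#### **Converting a few features into numeric data types**

In [None]:
# Encode categories as integer codes (alphabetical order of the category names)
df_cart['payment_method'] = df_cart['payment_method'].astype('category').cat.codes

df_cart['gender'] = df_cart['gender'].astype('category').cat.codes

# Note: casting to int drops leading zeros (e.g., '01416' becomes 1416)
df_cart['user_id'] = df_cart['user_id'].astype('int')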


### **Heatmap (checking correlations between features)**

In [None]:
df_numeric = df_cart.select_dtypes(include=['number'])
corr = df_numeric.corr()

import plotly.express as px

fig = px.imshow(
    corr,
    text_auto=".2f",
    color_continuous_scale='RdBu_r',
    title="Correlation Heatmap (Numeric Columns Only)"
)

fig.update_layout(
    width=1000,
    height=800
)

fig.show()


#### **Insights from the heatmap above:**

* **Customer segment:** Customer_Segment_Type shows its clearest correlation with income_level, which is expected because income_level was one of the clustering inputs. Its correlation with abandoned_cart is close to zero (see the feature correlation chart in the next section). Cluster IDs are arbitrary labels rather than an ordered scale, so segment effects are better read from the per-segment plots than from this coefficient.

* **Add-to-Cart Behavior is a Strong Indicator:** Both the add_to_cart flag (binary, not an item count) and add_to_cart_ratio are highly correlated with purchase_decision. Encouraging customers to add items to their cart may therefore help, although correlation alone does not establish causation.

* **Discounting Strategy:** The negative correlation between price and discount_applied is expected. However, the correlation between Customer_Segment_Type and discount_applied suggests a nuanced discounting strategy is possible, targeting specific segments that are more price-sensitive.

* **Complex Time Spent Relationship:** The correlation between abandoned_cart and time_spent is slightly negative and very close to zero, so it does not support a strong conclusion in either direction. The segment-level plots, where the segment with the longest sessions also has the highest abandonment rate, point the other way. Further analysis would be needed to reconcile the two views.
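
One caveat when reading the heatmap: `Customer_Segment_Type` holds K-Means cluster IDs, which are arbitrary integers. The sketch below, on hypothetical toy data, shows that merely renumbering the same clusters changes the Pearson correlation (here even its sign), while per-cluster abandonment rates stay the same.

```python
import pandas as pd

# Hypothetical toy data: three clusters with different abandonment rates
toy = pd.DataFrame({
    "cluster":        [0, 0, 1, 1, 2, 2],
    "abandoned_cart": [0, 0, 1, 1, 0, 1],
})

corr_original = toy["cluster"].corr(toy["abandoned_cart"])

# Relabel the same clusters: the grouping is identical, only the integers change
toy["cluster_relabeled"] = toy["cluster"].map({0: 2, 1: 0, 2: 1})
corr_relabeled = toy["cluster_relabeled"].corr(toy["abandoned_cart"])

# Per-cluster rates do not depend on how clusters are numbered
rates = toy.groupby("cluster")["abandoned_cart"].mean()

print(round(corr_original, 3), round(corr_relabeled, 3))
print(rates)
```

For nominal features like this, group-wise rates or one-hot encoded columns are usually a safer basis for interpretation than a single correlation coefficient.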



### **How strongly each feature correlates with cart abandonment -**



In [None]:
corr_features = [
    'price', 'discount_applied',
    'payment_method', 'pages_visited', 'time_spent',
    'rating', 'sentiment_score', 'age', 'gender', 'income_level',
    'Customer_Segment_Type',
    'add_to_cart_ratio', 'discount_ratio',
    'abandoned_cart'
]

import plotly.express as px
import plotly.graph_objects as go

# Calculate correlation matrix again
corr_matrix = df_cart[corr_features].corr()

# Select only correlations with target column
target_corr = corr_matrix[['abandoned_cart']].drop(index='abandoned_cart').reset_index()
target_corr.columns = ['Feature', 'Correlation']
target_corr['AbsCorrelation'] = target_corr['Correlation'].abs()

# Sort by absolute correlation
target_corr = target_corr.sort_values(by='AbsCorrelation', ascending=False)

fig = px.bar(
    target_corr,
    x='Feature',
    y='Correlation',
    color='Correlation',
    title='Feature Correlations with Cart Abandonment',
    color_continuous_scale='earth',
    text=target_corr['Correlation'].round(3)
)

fig.update_layout(
    width=900,
    height=650,
    xaxis_title='Feature',
    yaxis_title='Correlation with Cart Abandoned',
    bargap=0.3
)

fig.update_traces(
    textposition='outside'
)

fig.show()


#### **Insights from the plot above:**

* **add_to_cart_ratio (0.317):** This is the strongest correlation. Because add_to_cart is a binary flag, add_to_cart_ratio is non-zero only for sessions in which an item was added to the cart, and only those sessions can end in abandonment. The positive correlation therefore largely reflects how the feature is constructed rather than a counter-intuitive behavioral effect.

* **discount_ratio (-0.022), discount_applied (-0.024), payment_method (-0.018):** These features show a very weak, slightly negative correlation with cart abandonment. This suggests that the applied discount and the choice of payment method have a negligible linear relationship with whether a cart is abandoned.

* **price (0.016):** This feature has a very weak positive correlation: higher prices are associated with only marginally more abandonment.

* **Other Features:** The remaining features (pages_visited, sentiment_score, time_spent, rating, age, income_level, gender, and Customer_Segment_Type) all have correlations very close to zero. The values range from -0.016 to 0.013. This indicates that these features have virtually no linear relationship with cart abandonment.

#### **Feature Selection**

In [None]:
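# Selected features.
# Caution: purchase_decision records the outcome of the session, and add_to_cart_ratio
# is derived from add_to_cart; both are closely tied to abandoned_cart and may leak the target.
selected_features = [
    'price', 'discount_applied', 'pages_visited',
    'payment_method', 'income_level',
    'time_spent', 'sentiment_score', 'purchase_decision',
    'add_to_cart_ratio', 'discount_ratio', 'Customer_Segment_Type'
]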

X = df_cart[selected_features]
y = df_cart['abandoned_cart']
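
Before training, it can help to check whether any selected feature is determined by the outcome itself. The sketch below uses hypothetical toy rows to show the kind of check meant here: a cross-tabulation of `purchase_decision` against `abandoned_cart`. An empty cell where both equal 1 would mean that knowing one flag constrains the other.

```python
import pandas as pd

# Hypothetical toy rows in which a purchase and an abandonment never coincide
toy = pd.DataFrame({
    "purchase_decision": [1, 0, 0, 1, 0, 0],
    "abandoned_cart":    [0, 1, 0, 0, 1, 0],
})

overlap = pd.crosstab(toy["purchase_decision"], toy["abandoned_cart"])
print(overlap)

# Number of sessions flagged as both purchased and abandoned
both = overlap.loc[1, 1]
print("purchased AND abandoned:", both)
```

Running the same `pd.crosstab` on `df_cart` would show how strongly the two flags are linked in the real data; if the link is near-deterministic, dropping `purchase_decision` from the feature list gives a more honest estimate of predictive performance.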


#### **Using SMOTE to handle class imbalance in the target column**

In [None]:
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE


# Split the combined data
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42
)

# Apply SMOTE to the training data
sm = SMOTE(random_state=42)
X_train_res, y_train_res = sm.fit_resample(X_train, y_train)
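
To make the oversampling step less of a black box: SMOTE builds each synthetic minority row by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration of that idea and not imblearn's implementation; `smote_sketch`, its parameters and the toy points are all assumptions made for this example.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (excluding base itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy minority-class points, made up for illustration only.
minority_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (3.0, 3.0)]
new_points = smote_sketch(minority_points, n_new=5)
print(len(new_points))
```

Resampling only the training split, as the cell above does, keeps synthetic rows out of the test set.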

## **Model Building : Neural Network**

In [None]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
import plotly.figure_factory as ff
import numpy as np

# SMOTE on training data
smote = SMOTE(random_state=42)
X_train_res, y_train_res = smote.fit_resample(X_train, y_train)

# Model — MLP Classifier
mlp = MLPClassifier(random_state=42, max_iter=500)
mlp.fit(X_train_res, y_train_res)

y_pred = mlp.predict(X_test)

print('----' * 16)
print(f"✅ MLP Classifier's Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print('----' * 16)

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Plot confusion matrix using Plotly
labels = ['Cart_Abandoned: No', 'Cart_Abandoned: Yes']
z_text = [[str(y) for y in x] for x in cm]

fig = ff.create_annotated_heatmap(
    z=cm,
    x=labels,
    y=labels,
    annotation_text=z_text,
    colorscale='Blues',
    showscale=True
)

fig.update_layout(
    title_text='Confusion Matrix - MLP Classifier',
    width=550,
    height=500
)

fig['data'][0]['showscale'] = True
fig.show()


----------------------------------------------------------------
✅ MLP Classifier's Accuracy: 1.00
----------------------------------------------------------------
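
Accuracy is a single number, and on a target that needed SMOTE it is useful to read precision and recall for the abandoned class from the confusion matrix as well. A minimal sketch follows; `class_metrics` is a hypothetical helper and the counts are invented for illustration, not taken from this notebook's run. It assumes scikit-learn's layout, where rows are true labels and columns are predicted labels.

```python
def class_metrics(cm):
    """Accuracy plus precision, recall and F1 for the positive class.

    cm is laid out as [[TN, FP], [FN, TP]] (rows = true, columns = predicted).
    """
    (tn, fp), (fn, tp) = cm
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented counts for illustration only.
example_cm = [[90, 0], [8, 2]]
metrics = class_metrics(example_cm)
print(metrics)
```

With these invented counts the accuracy is 0.92 while recall for the positive class is only 0.20, which is the kind of gap a single accuracy figure cannot show.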


In [None]:
# ✅ 1️⃣ Keep user_id as separate Series
user_ids = df_cart['user_id']

# ✅ 2️⃣ Do the SAME split
X_train, X_test, y_train, y_test, user_id_train, user_id_test = train_test_split(
    X, y, user_ids, test_size=0.2, random_state=42
)
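
The repeated split lines up with the first one because the same `random_state` produces the same row selection, so `user_id_test` matches the rows behind `y_pred`. The sketch below shows the principle with the standard library only; `split_indices` is a hypothetical stand-in and not scikit-learn's algorithm, and the lists are made up.

```python
import random

def split_indices(n_rows, test_size, seed):
    """Shuffle row positions with a fixed seed and cut off a test portion."""
    positions = list(range(n_rows))
    random.Random(seed).shuffle(positions)
    n_test = round(n_rows * test_size)
    return positions[n_test:], positions[:n_test]

# Toy parallel columns, made up for illustration only.
features = [f"features_{i}" for i in range(10)]
user_ids = [1000 + i for i in range(10)]

# First split: features only.
_, test_a = split_indices(len(features), test_size=0.2, seed=42)
# Second split: same seed, so the same positions are selected.
_, test_b = split_indices(len(user_ids), test_size=0.2, seed=42)

test_features = [features[i] for i in test_a]
test_user_ids = [user_ids[i] for i in test_b]
print(list(zip(test_user_ids, test_features)))
```

Changing the seed, the test size or the row order between the two calls would break this alignment, so the arguments need to stay identical.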



In [None]:
X_test['user_id'] = user_id_test.values

#### *Storing predictions in Test data*

In [None]:
X_test['Predicted_Cart_Abnd'] = y_pred

In [None]:
X_test.head()

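|      | price  | discount_applied | pages_visited | payment_method | income_level | time_spent | sentiment_score | purchase_decision | add_to_cart_ratio | discount_ratio | Customer_Segment_Type | user_id | Predicted_Cart_Abnd |
|:-----|:-------|:-----------------|:--------------|:---------------|:-------------|:-----------|:----------------|:------------------|:------------------|:---------------|:----------------------|:--------|:--------------------|
| 1501 | 215.57 | 22.77            | 1             | 1              | 1            | 607        | 0.18            | 0                 | 1.0               | 0.105627       | 4                     | 99093   | 1                   |
| 2586 | 112.53 | 21.38            | 5             | 2              | 2            | 598        | -0.17           | 0                 | 0.0               | 0.189994       | 0                     | 70499   | 0                   |
| 2653 | 339.14 | 32.06            | 1             | 1              | 3            | 377        | 0.05            | 1                 | 1.0               | 0.094533       | 2                     | 24134   | 0                   |
| 1055 | 407.74 | 4.86             | 6             | 2              | 3            | 1082       | -0.5            | 0                 | 0.0               | 0.011919       | 1                     | 22297   | 0                   |
| 705  | 104.7  | 24.1             | 11            | 2              | 2            | 750        | 0.21            | 0                 | 0.0               | 0.230181       | 0                     | 40053   | 0                   |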


#### *Reversing categorical features back to original form*

In [None]:
# Cluster mapping
Customer_Segment_Type_map = {
   4:  'Regular Buyer',
   1:  'Engaged Premium Buyer',
   0:  'Window Shopper',
   3:  'VIP Fast Buyer',
   2:  'Low Value Quick Buyer'
}

Income_Level_map = {
   1:  'Low',
   2:  'Medium',
   3:  'High'
}

payment_method_map = {
  0: 'COD',
  1: 'Credit Card',
  2: 'Debit Card',
  3: 'PayPal'
}


# Churn mapping
Cart_Abnd_map = {
    1: 'Cart_Abnd_Yes',
    0: 'Cart_Abnd_No'
}


# Apply maps
X_test['Customer_Segment_Type'] = X_test['Customer_Segment_Type'].map(Customer_Segment_Type_map)
X_test['Predicted_Cart_Abnd'] = X_test['Predicted_Cart_Abnd'].map(Cart_Abnd_map)
X_test['income_level'] = X_test['income_level'].map(Income_Level_map)
X_test['payment_method'] = X_test['payment_method'].map(payment_method_map)


In [None]:
X_test = X_test[['user_id','pages_visited','time_spent','sentiment_score',
                 'price','discount_applied','discount_ratio','income_level',
                 'purchase_decision','Customer_Segment_Type','add_to_cart_ratio',
                 'Predicted_Cart_Abnd']]

In [None]:
print(X_test.head(12).to_markdown(index=False, numalign="left", stralign="left"))
print('----' * 72)
print(X_test.tail(14).to_markdown(index=False, numalign="left", stralign="left"))

| user_id   | pages_visited   | time_spent   | sentiment_score   | price   | discount_applied   | discount_ratio   | income_level   | purchase_decision   | Customer_Segment_Type   | add_to_cart_ratio   | Predicted_Cart_Abnd   |
|:----------|:----------------|:-------------|:------------------|:--------|:-------------------|:-----------------|:---------------|:--------------------|:------------------------|:--------------------|:----------------------|
| 99093     | 1               | 607          | 0.18              | 215.57  | 22.77              | 0.105627         | Low            | 0                   | Regular Buyer           | 1                   | Cart_Abnd_Yes         |
| 70499     | 5               | 598          | -0.17             | 112.53  | 21.38              | 0.189994         | Medium         | 0                   | Window Shopper          | 0                   | Cart_Abnd_No          |
| 24134     | 1               | 377          | 0.05              | 339.14  | 32.06              | 0.094533         | High           | 1                   | Low Value Quick Buyer   | 1                   | Cart_Abnd_No          |

### **Visualizing Customer Segment and Cart Abandonment Prediction in context of User ID**

In [None]:
!pip install squarify

Collecting squarify
  Downloading squarify-0.4.4-py3-none-any.whl.metadata (600 bytes)
Downloading squarify-0.4.4-py3-none-any.whl (4.1 kB)
Installing collected packages: squarify
Successfully installed squarify-0.4.4


In [None]:
import plotly.express as px

# Group and sort
cluster_df = (
    X_test.groupby('Customer_Segment_Type')
    .agg({'user_id': 'count'})
    .reset_index()
    .rename(columns={'user_id': 'Count'})
    .sort_values(by='Count', ascending=False)
)

# 📊 Plotly Treemap
fig = px.treemap(
    cluster_df,
    path=['Customer_Segment_Type'],
    values='Count',
    color='Count',
    color_continuous_scale='darkmint',
    title="Treemap: Customer ID Based on Segmentation Distribution"
)

# ✅ Control figure size here
fig.update_layout(
    width=700,   # Width in pixels
    height=500   # Height in pixels
)

fig.show()


#### **From the above treemap we came to know :-**

*Each rectangle's size is proportional to the value of the data it represents. In this plot, the size of each rectangle corresponds to the number of customers in that segment. The color intensity also represents the count, with darker colors indicating a higher count and lighter colors indicating a lower count, as shown in the color bar legend.*

In [None]:
import plotly.express as px

cart_df = (
    X_test.groupby('Predicted_Cart_Abnd')
    .agg({'user_id': 'count'})
    .reset_index()
    .rename(columns={'user_id': 'Count'})
    .sort_values(by='Count', ascending=False)
)


fig = px.pie(
    cart_df,
    names='Predicted_Cart_Abnd',
    values='Count',
    title="Pie Chart: Predicted Cart Abandonment Distribution",
    color_discrete_sequence=px.colors.sequential.Teal
)

fig.update_traces(
    textinfo='percent+label+value'  # Show % + label + value on slices
)

fig.update_layout(
    width=700,
    height=500
)

fig.show()


***In summary, the pie chart provides a clear and simple overview of the proportion of carts that a predictive model anticipates will be abandoned versus those that will be completed.***

In [None]:
import plotly.express as px

# --------------------------------------------
# 1️⃣ Group your data
# --------------------------------------------
df_sunburst = (
    X_test.groupby(['Customer_Segment_Type', 'Predicted_Cart_Abnd'])
    .agg({'user_id': 'count'})
    .reset_index()
    .rename(columns={'user_id': 'Count'})
)

df_sunburst['Predicted_Cart_Abnd_Label'] = df_sunburst['Predicted_Cart_Abnd'].replace({1: 'Cart_Abnd_Yes', 0: 'Cart_Abnd_No'})

# --------------------------------------------
# 2️⃣ Build sunburst
# --------------------------------------------
fig = px.sunburst(
    df_sunburst,
    path=['Customer_Segment_Type', 'Predicted_Cart_Abnd_Label'],  # hierarchy: parent → child
    values='Count',
    color='Customer_Segment_Type',  # color by Cluster!
    color_discrete_sequence=px.colors.sequential.Darkmint,
    title="Sunburst: Customer Segments with their Cart_Abandoned No vs Yes Breakdown"
)

# --------------------------------------------
# 3️⃣ Style
# --------------------------------------------
fig.update_traces(
    textinfo='label+percent entry+value'
)

fig.update_layout(
    width=800,
    height=600
)

fig.show()


#### ***From the above sunburst plot we came to know that :-***

 * The "Regular Buyer" and "Engaged Premium Buyer" segments exhibit the highest rates of cart abandonment. This is a critical business concern, as it indicates that the company is losing a substantial amount of potential revenue from its core and most engaged customers.

 * The "Window Shopper" and "Low Value Quick Buyer" segments have the lowest abandonment rates. This suggests that while these groups may be less engaged or provide lower per-transaction value, their path from the decision to buy to a completed purchase is relatively smooth.
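
The comparison above is about rates rather than raw counts, so it helps to compute the share of predicted-abandoned carts within each segment. A minimal standard-library sketch follows; the records are made up for illustration and `abandonment_rates` is a hypothetical helper. In the notebook, the same figures could be derived from `X_test` grouped by `Customer_Segment_Type`.

```python
from collections import Counter

def abandonment_rates(records):
    """Share of 'Cart_Abnd_Yes' predictions per customer segment."""
    totals = Counter()
    abandoned = Counter()
    for segment, prediction in records:
        totals[segment] += 1
        if prediction == "Cart_Abnd_Yes":
            abandoned[segment] += 1
    return {seg: abandoned[seg] / totals[seg] for seg in totals}

# Made-up (segment, prediction) pairs for illustration only.
sample = [
    ("Regular Buyer", "Cart_Abnd_Yes"),
    ("Regular Buyer", "Cart_Abnd_No"),
    ("Window Shopper", "Cart_Abnd_No"),
    ("Window Shopper", "Cart_Abnd_No"),
    ("VIP Fast Buyer", "Cart_Abnd_Yes"),
    ("VIP Fast Buyer", "Cart_Abnd_No"),
    ("VIP Fast Buyer", "Cart_Abnd_No"),
    ("VIP Fast Buyer", "Cart_Abnd_No"),
]
rates = abandonment_rates(sample)
for seg, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{seg}: {rate:.0%}")
```

Ranking segments by rate avoids mistaking a large segment for a high-abandonment one simply because it has more users.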

In [None]:
X_test.user_id.duplicated().sum()

np.int64(7)

In [None]:
X_test = X_test.drop_duplicates(subset='user_id', keep='first')

In [None]:
!pip install langchain



## ***Building : AI Cart Recovery Agent.... (utilizing agentic framework)***

In [None]:
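from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.schema import StrOutputParser
from txtai.pipeline import LLM
from langchain.llms.base import LLM as LangChainLLM


# TxtaiLangChainLLM is a custom LangChainLLM subclass wrapping a txtai LLM pipeline.
# Its definition is not part of this cell and must be run beforehand.
llm = TxtaiLangChainLLM()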

# ✅ Prompt for rule-based cart action
cart_action_prompt = PromptTemplate.from_template("""
You are a strict rule-following assistant.

Choose the recommended cart recovery action based ONLY on the following matching rules:

RULES:
1. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'VIP Fast Buyer':
   → Send highly personalized cart recovery with exclusive VIP perks and express shipping offer

2. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Engaged Premium Buyer':
   → Send reminder with personalized recommendations and premium loyalty incentive

3. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Regular Buyer':
   → Send standard abandoned cart reminder with small discount

4. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Window Shopper':
   → Use retargeting ads and remind about viewed items with urgency

5. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Low Value Quick Buyer':
   → Send basic reminder email with a small incentive to complete purchase

6. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'VIP Fast Buyer':
   → Offer upsell or exclusive loyalty reward post-purchase

7. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Engaged Premium Buyer':
   → Maintain engagement with personalized recommendations

8. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Regular Buyer':
   → Send occasional offers and keep them engaged

9. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Window Shopper':
   → Encourage browsing with product suggestions

10. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Low Value Quick Buyer':
   → Monitor behavior and send targeted promotions if needed

Else:
   → No action needed, wait for organic return visit

INPUT:
- Customer_Segment_Type: {Customer_Segment_Type}
- Predicted_Cart_Abnd: {Predicted_Cart_Abnd}

ACTION:
""")

# ✅ Chain that returns the rule-based action
recommend_chain = LLMChain(
    llm=llm,
    prompt=cart_action_prompt,
    output_parser=StrOutputParser()
)

# ✅ Prompt to explain LLM reasoning
cart_explanation_prompt = PromptTemplate.from_template("""
You are a cart recovery assistant.

Customer Info:
- Customer_Segment_Type: {Customer_Segment_Type}
- Predicted_Cart_Abnd: {Predicted_Cart_Abnd}

User question: {question}

The system's recommended action is:
{action}

Explain in 1-2 sentences **why this action is appropriate** based on the customer info.
Avoid stating the rule directly. Keep it under 30 words.

Answer:
""")

# ✅ Chain that explains the recommended action
# (the name explain_chain is assumed here; it is not given elsewhere in the notebook)
explain_chain = LLMChain(
    llm=llm,
    prompt=cart_explanation_prompt,
    output_parser=StrOutputParser()
)


# ✅ Ask for user_id and process
individual_input = int(input("\n🛒 Enter user_id: ").strip())

# ✅ Lookup in X_test
result = X_test[X_test['user_id'] == individual_input]

if result.empty:
    print("❌ user_id not found.")
else:
    row = result.iloc[0]
    segment = row['Customer_Segment_Type']
    cart_abnd = row['Predicted_Cart_Abnd']

    # Get system-recommended action
    action = recommend_chain.run({
        "Customer_Segment_Type": segment,
        "Predicted_Cart_Abnd": cart_abnd
    })

    # Display basic info
    print("\nCustomer Info:")
    print(f"User ID: {row['user_id']}")
    print(f"Customer Segment Category: {segment}")
    print(f"Model Predicted: {cart_abnd}")

    # Ask user question
    question = input("\n❓ Ask your question to get the Cart Recovery LLM explanation: ").strip()

    # Get LLM explanation from the explanation chain (not the raw llm object)
    llm_explanation = explain_chain.run({
        "Customer_Segment_Type": segment,
        "Predicted_Cart_Abnd": cart_abnd,
        "question": question,
        "action": action
    })

    # Output
    print("\n✅ SYSTEM RECOMMENDED ACTION:")
    print(action)

    print("\n✅ LLM Explanation:")
    print(llm_explanation)


llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)



🛒 Enter user_id: 2700

Customer Info:
User ID: 2700
Customer Segment Category: Engaged Premium Buyer
Model Predicted: Cart_Abnd_No

❓ Ask your question to get the Cart Recovery LLM explanation: What action should we take?

✅ SYSTEM RECOMMENDED ACTION:
→ Maintain engagement with personalized recommendations


**Explanation:**

The rules state that if a customer is an "Engaged Premium Buyer" and their cart is not abandoned, then you should maintain engagement with personalized recommendations.

✅ LLM Explanation:
This approach ensures the customer feels valued and continues engaging with your site.
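
Because the ten rules in the prompt form a fixed table keyed on prediction and segment, the action could also be resolved with a plain dictionary lookup, leaving the LLM to write only the explanation. This is a sketch of that alternative and not what the notebook does; `CART_ACTIONS`, `DEFAULT_ACTION` and `recommend_action` are hypothetical names, and the action texts are copied from the prompt above.

```python
DEFAULT_ACTION = "No action needed, wait for organic return visit"

# (prediction, segment) -> action, mirroring rules 1-10 of the prompt.
CART_ACTIONS = {
    ("Cart_Abnd_Yes", "VIP Fast Buyer"):
        "Send highly personalized cart recovery with exclusive VIP perks and express shipping offer",
    ("Cart_Abnd_Yes", "Engaged Premium Buyer"):
        "Send reminder with personalized recommendations and premium loyalty incentive",
    ("Cart_Abnd_Yes", "Regular Buyer"):
        "Send standard abandoned cart reminder with small discount",
    ("Cart_Abnd_Yes", "Window Shopper"):
        "Use retargeting ads and remind about viewed items with urgency",
    ("Cart_Abnd_Yes", "Low Value Quick Buyer"):
        "Send basic reminder email with a small incentive to complete purchase",
    ("Cart_Abnd_No", "VIP Fast Buyer"):
        "Offer upsell or exclusive loyalty reward post-purchase",
    ("Cart_Abnd_No", "Engaged Premium Buyer"):
        "Maintain engagement with personalized recommendations",
    ("Cart_Abnd_No", "Regular Buyer"):
        "Send occasional offers and keep them engaged",
    ("Cart_Abnd_No", "Window Shopper"):
        "Encourage browsing with product suggestions",
    ("Cart_Abnd_No", "Low Value Quick Buyer"):
        "Monitor behavior and send targeted promotions if needed",
}

def recommend_action(prediction, segment):
    """Return the rule-table action, or the default when no rule matches."""
    return CART_ACTIONS.get((prediction, segment), DEFAULT_ACTION)

print(recommend_action("Cart_Abnd_No", "Engaged Premium Buyer"))
```

A lookup always returns exactly the rule text, whereas the sample run above shows the model appending an extra explanation to the action it was asked to output.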


In [None]:
X_test.head(13)

|      | user_id | pages_visited | time_spent | sentiment_score | price  | discount_applied | discount_ratio | income_level | purchase_decision | Customer_Segment_Type | add_to_cart_ratio | Predicted_Cart_Abnd |
|:-----|:--------|:--------------|:-----------|:----------------|:-------|:-----------------|:---------------|:-------------|:------------------|:----------------------|:------------------|:--------------------|
| 1501 | 99093   | 1             | 607        | 0.18            | 215.57 | 22.77            | 0.105627       | Low          | 0                 | Regular Buyer         | 1.0               | Cart_Abnd_Yes       |
| 2586 | 70499   | 5             | 598        | -0.17           | 112.53 | 21.38            | 0.189994       | Medium       | 0                 | Window Shopper        | 0.0               | Cart_Abnd_No        |
| 2653 | 24134   | 1             | 377        | 0.05            | 339.14 | 32.06            | 0.094533       | High         | 1                 | Low Value Quick Buyer | 1.0               | Cart_Abnd_No        |
| 1055 | 22297   | 6             | 1082       | -0.5            | 407.74 | 4.86             | 0.011919       | High         | 0                 | Engaged Premium Buyer | 0.0               | Cart_Abnd_No        |
| 705  | 40053   | 11            | 750        | 0.21            | 104.7  | 24.1             | 0.230181       | Medium       | 0                 | Window Shopper        | 0.0               | Cart_Abnd_No        |
| 106  | 89701   | 2             | 295        | -0.04           | 42.76  | 21.58            | 0.504677       | Medium       | 1                 | Window Shopper        | 0.5               | Cart_Abnd_No        |
| 589  | 52470   | 15            | 884        | -0.4            | 27.3   | 20.26            | 0.742125       | High         | 0                 | Engaged Premium Buyer | 0.0               | Cart_Abnd_No        |
| 2468 | 19920   | 2             | 817        | 0.1             | 67.51  | 40.29            | 0.5968         | Medium       | 0                 | Window Shopper        | 0.0               | Cart_Abnd_No        |
| 2413 | 92166   | 19            | 826        | 0.0             | 449.78 | 23.12            | 0.051403       | Low          | 0                 | Regular Buyer         | 0.0               | Cart_Abnd_No        |
| 1600 | 33161   | 20            | 219        | -0.25           | 317.67 | 6.34             | 0.019958       | Low          | 1                 | VIP Fast Buyer        | 0.05              | Cart_Abnd_No        |


Q What action should we take?

## ***Deploying 🛒 AI Cart Recovery Agent.... (Gradio)***

In [None]:
import gradio as gr
from langchain.prompts import PromptTemplate
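from langchain.chains import LLMChain
from langchain.schema import StrOutputParser
from txtai.pipeline import LLM
from langchain.llms.base import LLM as LangChainLLM


# TxtaiLangChainLLM is a custom LangChainLLM subclass wrapping a txtai LLM pipeline.
# Its definition is not part of this cell and must be run beforehand.
llm = TxtaiLangChainLLM()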


# ✅ Prompt for rule-based cart action
cart_action_prompt = PromptTemplate.from_template("""
You are a strict rule-following assistant.
Choose the recommended cart recovery action based ONLY on the following matching rules:

RULES:
1. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'VIP Fast Buyer':
   → Send highly personalized cart recovery with exclusive VIP perks and express shipping offer
2. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Engaged Premium Buyer':
   → Send reminder with personalized recommendations and premium loyalty incentive
3. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Regular Buyer':
   → Send standard abandoned cart reminder with small discount
4. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Window Shopper':
   → Use retargeting ads and remind about viewed items with urgency
5. If Predicted_Cart_Abnd == 'Cart_Abnd_Yes' AND Customer_Segment_Type == 'Low Value Quick Buyer':
   → Send basic reminder email with a small incentive to complete purchase
6. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'VIP Fast Buyer':
   → Offer upsell or exclusive loyalty reward post-purchase
7. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Engaged Premium Buyer':
   → Maintain engagement with personalized recommendations
8. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Regular Buyer':
   → Send occasional offers and keep them engaged
9. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Window Shopper':
   → Encourage browsing with product suggestions
10. If Predicted_Cart_Abnd == 'Cart_Abnd_No' AND Customer_Segment_Type == 'Low Value Quick Buyer':
   → Monitor behavior and send targeted promotions if needed
Else:
   → No action needed, wait for organic return visit

INPUT:
- Customer_Segment_Type: {Customer_Segment_Type}
- Predicted_Cart_Abnd: {Predicted_Cart_Abnd}

ACTION:
""")

# ✅ Chain that returns the rule-based action
recommend_chain = LLMChain(
    llm=llm,
    prompt=cart_action_prompt,
    output_parser=StrOutputParser()
)

# ✅ Prompt to explain LLM reasoning
cart_explanation_prompt = PromptTemplate.from_template("""
You are a cart recovery assistant.
Customer Info:
- Customer_Segment_Type: {Customer_Segment_Type}
- Predicted_Cart_Abnd: {Predicted_Cart_Abnd}

User question: {question}
The system's recommended action is:
{action}

Explain in 1-2 sentences **why this action is appropriate** based on the customer info.
Avoid stating the rule directly. Keep it under 30 words.

Answer:
""")

# ✅ Chain that explains the recommended action
# (the name explain_chain is assumed here; it is not given elsewhere in the notebook)
explain_chain = LLMChain(
    llm=llm,
    prompt=cart_explanation_prompt,
    output_parser=StrOutputParser()
)

# ✅ Stage 1: Fetch Customer Info
def fetch_customer_info(user_id):
    try:
        user_id = int(user_id)
    except (TypeError, ValueError):
        return "Invalid user_id", "", ""

    row = X_test[X_test['user_id'] == user_id]
    if row.empty:
        return "User not found", "", ""
    row = row.iloc[0]
    segment = row['Customer_Segment_Type']
    cart_abnd = row['Predicted_Cart_Abnd']

    info = f"User ID: {row['user_id']}\nCustomer Segment: {segment}\nCart Abandonment: {cart_abnd}"
    return info, "", ""  # Don't show action or explanation yet

# ✅ Stage 2: Run LLM on Question
def run_llm_logic(user_id, question):
    # Every return path yields exactly two values,
    # matching outputs=[system_action, llm_explanation] in the click handler.
    try:
        user_id = int(user_id)
    except (TypeError, ValueError):
        return "Invalid user_id", ""

    row = X_test[X_test['user_id'] == user_id]
    if row.empty:
        return "User not found", ""
    row = row.iloc[0]
    segment = row['Customer_Segment_Type']
    cart_abnd = row['Predicted_Cart_Abnd']

    action = recommend_chain.run({
        "Customer_Segment_Type": segment,
        "Predicted_Cart_Abnd": cart_abnd
    })

    # Use the explanation chain (not the raw llm object)
    explanation = explain_chain.run({
        "Customer_Segment_Type": segment,
        "Predicted_Cart_Abnd": cart_abnd,
        "question": question,
        "action": action
    })

    return action, explanation

# ✅ Gradio Interface
with gr.Blocks() as demo:
    gr.Markdown("# 🛒 Cart Recovery Assistant: LLM-Driven Strategy")

    with gr.Row():
        with gr.Column(scale=1):
            user_id = gr.Textbox(label="Enter Customer ID")
            get_customer_btn = gr.Button("Get Customer Info")

            question = gr.Textbox(label="Your Question")
            ask_question_btn = gr.Button("Ask LLM")

        with gr.Column(scale=2):
            customer_info = gr.Textbox(label="Customer Info", lines=3)
            system_action = gr.Textbox(label="LLM Recommended Action", lines=3)
            llm_explanation = gr.Textbox(label="LLM Explanation", lines=3)

    get_customer_btn.click(
        fetch_customer_info,
        inputs=[user_id],
        outputs=[customer_info, system_action, llm_explanation]
    )

    ask_question_btn.click(
        run_llm_logic,
        inputs=[user_id, question],
        outputs=[system_action, llm_explanation]
    )

# ✅ Launch the app
demo.launch()


llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)

The class `LLMChain` was deprecated in LangChain 0.1.17 and will be removed in 1.0. Use :meth:`~RunnableSequence, e.g., `prompt | llm`` instead.



It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://91bc70fe4a60e4e612.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


