# 🛒 Predicting E-commerce Cart Abandonment

Cart abandonment happens when a customer adds items to their cart but leaves the website without completing the purchase.  
In this project, we use **machine learning** 🤖 to analyze customer behavior and predict the likelihood of cart abandonment.

### 🎯 Project Goals
- 📊 Explore customer and session data  
- 🔍 Identify key factors influencing cart abandonment  
- 🧠 Build and evaluate predictive machine learning models  
- 💡 Provide insights to improve e-commerce conversion rates


In [1]:
import pandas as pd 
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt

In [2]:
data = pd.read_csv("ecommerce_cart_abandonment_5k.csv")

In [3]:
data.head()

| | UserID | SessionID | BrowserType | OperatingSystem | DeviceType | NumItemsInCart | CartTotalAmount | TimeSpentOnSiteMinutes | HasCouponApplied | IsReturningCustomer | LastPageViewedCategory | AbandonmentReason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8986 | 311935 | Edge | MacOS | Tablet | 2 | 87.96 | 7.39 | False | True | Clothing | Found Better Price |
| 1 | 1937 | 207006 | Opera | MacOS | Desktop | 1 | 120.0 | 11.37 | False | False | Beauty | Payment Issues |
| 2 | 3194 | 779500 | Edge | iOS | Desktop | 6 | 150.12 | 6.8 | True | False | Electronics | Found Better Price |
| 3 | 5125 | 980665 | Opera | iOS | Mobile | 3 | 88.2 | 7.52 | True | True | Electronics | Distraction |
| 4 | 7250 | 595333 | Edge | Linux | Tablet | 4 | 64.19 | 3.75 | True | False | Books | High Shipping Cost |


In [4]:
data.tail()

| | UserID | SessionID | BrowserType | OperatingSystem | DeviceType | NumItemsInCart | CartTotalAmount | TimeSpentOnSiteMinutes | HasCouponApplied | IsReturningCustomer | LastPageViewedCategory | AbandonmentReason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4995 | 2553 | 895977 | Edge | Linux | Desktop | 3 | 116.7 | 11.37 | True | True | Electronics | No Abandonment |
| 4996 | 2761 | 651563 | Chrome | Linux | Tablet | 3 | 84.6 | 12.56 | True | False | Toys | Payment Issues |
| 4997 | 9775 | 834262 | Safari | iOS | Desktop | 3 | 26.17 | 4.42 | False | False | Electronics | Payment Issues |
| 4998 | 7799 | 240027 | Safari | Linux | Tablet | 1 | 76.07 | 7.9 | True | False | Toys | High Shipping Cost |
| 4999 | 5310 | 542826 | Chrome | Android | Tablet | 6 | 11.01 | 3.55 | True | False | Toys | High Shipping Cost |


In [5]:
data.describe()

| | UserID | SessionID | NumItemsInCart | CartTotalAmount | TimeSpentOnSiteMinutes |
|---|---|---|---|---|---|
| count | 5000.0 | 5000.0 | 5000.0 | 5000.0 | 5000.0 |
| mean | 5524.798 | 546423.0836 | 3.0152 | 59.801418 | 8.08965 |
| std | 2600.325994 | 261066.96119 | 1.740912 | 42.437128 | 2.957523 |
| min | 1000.0 | 100013.0 | 0.0 | 0.51 | 0.0 |
| 25% | 3245.0 | 317856.5 | 2.0 | 28.3975 | 6.13 |
| 50% | 5531.5 | 546830.0 | 3.0 | 50.335 | 8.09 |
| 75% | 7802.5 | 771604.75 | 4.0 | 80.7875 | 10.12 |
| max | 9998.0 | 999978.0 | 11.0 | 338.77 | 19.15 |
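
The summary statistics report a minimum of 0 for both `NumItemsInCart` and `TimeSpentOnSiteMinutes`, which may be worth a closer look before modeling. A minimal sketch for counting such rows follows. The helper `count_zero_rows` and the tiny `sample` frame are hypothetical and only illustrate the check; in the notebook the same call could be made on `data`.

```python
import pandas as pd


def count_zero_rows(df: pd.DataFrame, columns: list[str]) -> dict[str, int]:
    """Return, for each listed column, how many rows hold exactly zero."""
    return {col: int((df[col] == 0).sum()) for col in columns}


# Hypothetical stand-in for the real dataset, using two of its column names.
sample = pd.DataFrame(
    {
        "NumItemsInCart": [2, 0, 4],
        "TimeSpentOnSiteMinutes": [7.39, 0.0, 0.0],
    }
)

zero_counts = count_zero_rows(sample, ["NumItemsInCart", "TimeSpentOnSiteMinutes"])
print(zero_counts)
```

Whether zero-valued sessions are genuine or data-entry artifacts cannot be told from the summary alone, so this only surfaces them for inspection.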


In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 12 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   UserID                  5000 non-null   int64  
 1   SessionID               5000 non-null   int64  
 2   BrowserType             5000 non-null   object 
 3   OperatingSystem         5000 non-null   object 
 4   DeviceType              5000 non-null   object 
 5   NumItemsInCart          5000 non-null   int64  
 6   CartTotalAmount         5000 non-null   float64
 7   TimeSpentOnSiteMinutes  5000 non-null   float64
 8   HasCouponApplied        5000 non-null   bool   
 9   IsReturningCustomer     5000 non-null   bool   
 10  LastPageViewedCategory  5000 non-null   object 
 11  AbandonmentReason       5000 non-null   object 
dtypes: bool(2), float64(2), int64(3), object(5)
memory usage: 400.5+ KB
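
Since the stated aim is to predict abandonment, one plausible next step is to derive a binary target from `AbandonmentReason`. The sketch below assumes that the value `No Abandonment` (visible in the `data.tail()` output) marks a completed purchase and that every other value marks an abandoned cart. The column name `Abandoned` and the helper `add_abandonment_target` are hypothetical, not part of the dataset.

```python
import pandas as pd


def add_abandonment_target(
    df: pd.DataFrame,
    reason_col: str = "AbandonmentReason",
    no_abandon_label: str = "No Abandonment",  # assumed marker for completed purchases
) -> pd.DataFrame:
    """Return a copy of df with a 0/1 'Abandoned' column derived from reason_col."""
    out = df.copy()
    out["Abandoned"] = (out[reason_col] != no_abandon_label).astype(int)
    return out


# Hypothetical stand-in rows; in the notebook this would be called on `data`.
sample = pd.DataFrame(
    {"AbandonmentReason": ["Found Better Price", "No Abandonment", "Payment Issues"]}
)

labeled = add_abandonment_target(sample)
# Share of abandoned vs. completed sessions, useful for spotting class imbalance.
print(labeled["Abandoned"].value_counts(normalize=True))
```

Working on a copy leaves the original frame untouched, so the raw `AbandonmentReason` values remain available for later exploration.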
