Project Name - Zepto Market Analysis

Project Type - Exploratory Data Analysis (EDA)

Project Summary -
This analysis examines the key factors that influence customer retention, repeat purchases, and marketing effectiveness in Zepto’s quick-commerce business model. It focuses on identifying patterns across customer demographics, order behavior, delivery performance, and campaign exposure. The findings can help quick-commerce businesses like Zepto optimize delivery operations, improve loyalty programs, and refine customer acquisition strategies.

This project involved exploring and cleaning a dataset to prepare it for analysis. The data exploration process included understanding the structure of the dataset, checking data types, identifying missing values, and analyzing the distribution of important variables such as delivery time, average order value, and order frequency. The data cleaning process involved handling inconsistencies, removing duplicate records, treating outliers, and ensuring the dataset was reliable for further analysis.
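The exploration and cleaning steps described above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up sample that reuses a few of the dataset's column names. The helper names `quality_report` and `iqr_bounds` are hypothetical, and the 1.5 × IQR clipping rule is an assumption, since the text does not say which outlier treatment was used.

```python
import numpy as np
import pandas as pd

# Made-up sample reusing a few of the dataset's column names;
# the values are illustrative only and are not taken from the real data.
sample = pd.DataFrame({
    "user_id": [1, 2, 2, 3, 4, 5],
    "delivery_time_min": [9.9, 7.7, 7.7, 8.3, np.nan, 60.0],
    "avg_order_value_inr": [500.0, 208.0, 208.0, 228.0, 340.0, 524.0],
})

def quality_report(df):
    """Summarise the dtype and missing-value count of each column (hypothetical helper)."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
    })

def iqr_bounds(series, k=1.5):
    """Return (lower, upper) fences using the k * IQR rule (assumed outlier rule)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# 1. Inspect structure and missing values.
report = quality_report(sample)

# 2. Remove exact duplicate records.
deduped = sample.drop_duplicates()

# 3. Treat outliers by clipping delivery time to the IQR fences.
lower, upper = iqr_bounds(deduped["delivery_time_min"].dropna())
cleaned = deduped.assign(
    delivery_time_min=deduped["delivery_time_min"].clip(lower, upper)
)

print(report)
print(cleaned)
```

Clipping keeps every row while limiting the influence of extreme values; dropping the affected rows or leaving them untouched are equally valid choices depending on the question being asked.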

Through this process, we were able to fix data quality issues and ensure that the dataset was ready for meaningful business insights. Data preparation is a crucial step in any analytics project, as it helps avoid biases and errors that could affect final results. The cleaned dataset was then used to answer key research questions related to Zepto’s customer behavior and marketing strategy.

Once the data was cleaned and prepared, we began exploring and summarizing it using descriptive statistics and visualization techniques. Various graphs and charts were created to uncover patterns and trends across multiple variables such as city-wise demand, delivery speed impact, campaign effectiveness, discount usage, and loyalty membership.

Visualization made customer engagement and retention patterns visible that would be difficult to identify in the raw data alone. For example, delivery time, Zepto Pass membership, order frequency, and marketing campaign exposure all appeared to be associated with repeat purchases. Additionally, metro cities showed higher customer activity, while Tier-II cities presented strong growth opportunities for expansion.
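One simple way to compare repeat-purchase behavior across segments is to compute the repeat rate per group, which is the quantity a bar chart of these patterns would display. The sketch below uses a small made-up sample, so its numbers are not findings from the real dataset, and `repeat_rate_by` is a hypothetical helper name.

```python
import pandas as pd

# Made-up sample using the dataset's column names; values are illustrative only.
sample = pd.DataFrame({
    "zepto_pass_member": [1, 1, 1, 0, 0, 0, 1, 0],
    "repeat_purchase":   [1, 1, 0, 0, 1, 0, 1, 0],
})

def repeat_rate_by(df, column):
    """Share of customers with a repeat purchase, per value of `column`."""
    return df.groupby(column)["repeat_purchase"].agg(
        repeat_rate="mean",   # mean of a 0/1 flag is the proportion of 1s
        customers="size",     # group size, to judge how reliable the rate is
    )

rates = repeat_rate_by(sample, "zepto_pass_member")
print(rates)
```

The same helper could be applied to `city`, `campaign_exposed`, or `age_group`. Reporting the group size alongside the rate helps avoid over-reading differences that come from very small segments.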

The observations and insights identified through this analysis are valuable for future decision-making in quick-commerce. This project demonstrates how Zepto’s speed-driven strategy, digital-first campaigns, and loyalty programs contribute to customer retention and competitive positioning in India’s growing quick-commerce market.



Problem Statements-


Importing the necessary libraries


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt     # plotting
import seaborn as sns               # statistical plots built on matplotlib
import warnings

%matplotlib inline
# Suppress all warnings to keep the notebook output clean
warnings.filterwarnings('ignore')

Import the dataset

In [14]:
pd.read_csv("C:/Users/DELL/Desktop/Zepto market analysis/zepto_market_analysis_dataset.csv")

Unnamed: 0,user_id,city,age_group,user_type,orders_per_month,avg_order_value_inr,delivery_time_min,campaign_exposed,discount_used,zepto_pass_member,order_category,repeat_purchase
0,1,Jaipur,35-44,Returning,3,500.88,9.96,Festival Offer,1,0,Household,0
1,2,Hyderabad,25-34,Returning,6,208.46,7.71,Influencer,1,1,Personal Care,0
2,3,Chennai,18-24,New,7,227.83,8.26,Influencer,0,1,Household,1
3,4,Jaipur,18-24,Returning,7,340.18,7.96,Meme Marketing,1,1,Snacks,1
4,5,Bangalore,25-34,New,8,523.63,9.48,Referral,1,0,Private Label,0
...,...,...,...,...,...,...,...,...,...,...,...,...
2495,2496,Surat,18-24,Returning,8,430.60,6.11,Influencer,1,0,Personal Care,1
2496,2497,Hyderabad,25-34,New,5,375.40,7.14,Zepto Pass,0,0,Personal Care,1
2497,2498,Delhi,25-34,Returning,5,496.69,7.84,Uncle Ji,1,1,Grocery,1
2498,2499,Bangalore,25-34,Returning,3,337.02,10.41,Referral,1,1,Private Label,1


Assign the dataset to a variable

In [15]:
zepto = pd.read_csv("C:/Users/DELL/Desktop/Zepto market analysis/zepto_market_analysis_dataset.csv")

In [16]:
zepto

Unnamed: 0,user_id,city,age_group,user_type,orders_per_month,avg_order_value_inr,delivery_time_min,campaign_exposed,discount_used,zepto_pass_member,order_category,repeat_purchase
0,1,Jaipur,35-44,Returning,3,500.88,9.96,Festival Offer,1,0,Household,0
1,2,Hyderabad,25-34,Returning,6,208.46,7.71,Influencer,1,1,Personal Care,0
2,3,Chennai,18-24,New,7,227.83,8.26,Influencer,0,1,Household,1
3,4,Jaipur,18-24,Returning,7,340.18,7.96,Meme Marketing,1,1,Snacks,1
4,5,Bangalore,25-34,New,8,523.63,9.48,Referral,1,0,Private Label,0
...,...,...,...,...,...,...,...,...,...,...,...,...
2495,2496,Surat,18-24,Returning,8,430.60,6.11,Influencer,1,0,Personal Care,1
2496,2497,Hyderabad,25-34,New,5,375.40,7.14,Zepto Pass,0,0,Personal Care,1
2497,2498,Delhi,25-34,Returning,5,496.69,7.84,Uncle Ji,1,1,Grocery,1
2498,2499,Bangalore,25-34,Returning,3,337.02,10.41,Referral,1,1,Private Label,1


About the Dataset 

This Zepto market analysis dataset contains 2,500 observations and 12 columns representing customer behavior, delivery performance, and marketing engagement.

The data includes both categorical and numerical variables, such as city, age group, campaign exposure, delivery time, and average order value.

This dataset is useful for analyzing trends and patterns in India’s quick-commerce market, especially focusing on customer retention and repeat purchases.

By exploring this data, we can gain insights into the impact of Zepto Pass membership, discounts, and marketing campaigns on customer loyalty.

Overall, this dataset helps understand how Zepto’s 10-minute delivery strategy and digital-first approach drive growth in the competitive quick-commerce industry.

Data Dictionary – Zepto Market Analysis Dataset

user_id – Unique identifier assigned to each customer in the dataset.

city – Customer’s location city, representing metro and Tier-II market coverage.

age_group – Age category of the customer, reflecting Zepto’s target demographic segments.

user_type – Indicates whether the customer is a new user or a returning customer.

orders_per_month – Number of orders placed by the customer in an average month.

avg_order_value_inr – Average spending amount per order in Indian Rupees (INR).

delivery_time_min – Time taken to deliver the order, in minutes.

campaign_exposed – Marketing campaign through which the customer was influenced or acquired.

discount_used – Binary flag showing whether the customer applied a discount coupon (1 = Yes, 0 = No).

zepto_pass_member – Indicates whether the customer is subscribed to the Zepto Pass loyalty program (1 = Yes, 0 = No).

order_category – Product category of the order such as grocery, snacks, household, or private label.

repeat_purchase – Target variable indicating whether the customer made a repeat purchase (1 = Yes, 0 = No).
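The data dictionary describes three binary flags coded as 1 = Yes and 0 = No. A quick check like the one below can confirm that a DataFrame actually follows that coding. It is a sketch on a made-up sample, and `invalid_binary_rows` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

# Columns the data dictionary describes as 0/1 flags.
BINARY_COLUMNS = ["discount_used", "zepto_pass_member", "repeat_purchase"]

def invalid_binary_rows(df, columns=BINARY_COLUMNS):
    """Return the rows where any listed flag holds a value other than 0 or 1."""
    present = [c for c in columns if c in df.columns]
    valid = df[present].isin([0, 1]).all(axis=1)
    return df[~valid]

# Made-up sample; the third row deliberately breaks the coding.
sample = pd.DataFrame({
    "user_id": [1, 2, 3],
    "discount_used": [1, 0, 2],
    "zepto_pass_member": [0, 1, 1],
    "repeat_purchase": [0, 1, 1],
})

bad_rows = invalid_binary_rows(sample)
print(bad_rows)
```

An empty result means every flag matches the documented coding. Missing values are also reported as invalid, because `NaN` is neither 0 nor 1.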

The number of columns and observations (rows) in the dataset.

In [20]:
zepto.shape[1]   # number of columns

12

In [21]:
zepto.shape[0]   # number of rows (observations)

2500
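Because `shape` is a `(rows, columns)` tuple, both counts can also be read in one step by unpacking it. The snippet below shows this on a small stand-in DataFrame, since the real CSV is not available here; the values are made up.

```python
import pandas as pd

# Stand-in for the real dataset; values are illustrative only.
sample = pd.DataFrame({
    "user_id": [1, 2, 3],
    "city": ["Jaipur", "Delhi", "Surat"],
})

# DataFrame.shape is a (rows, columns) tuple, so it unpacks directly.
n_rows, n_cols = sample.shape
print(f"{n_rows} rows x {n_cols} columns")
```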

Names of all the columns.

In [22]:
zepto.columns

Index(['user_id', 'city', 'age_group', 'user_type', 'orders_per_month',
       'avg_order_value_inr', 'delivery_time_min', 'campaign_exposed',
       'discount_used', 'zepto_pass_member', 'order_category',
       'repeat_purchase'],
      dtype='object')