A/B Testing Analysis: Statistical analysis of loyalty program promotion

DariaSavva/python_AB_test

A/B Testing Analysis: Loyalty Program Promotion

Project Overview

This project analyzes the results of an A/B test conducted for SkyCrossroads company's loyalty program promotion. The company ran a promotional campaign where customers could earn additional loyalty points for purchases.

Business Context

  • Control Group (A): Customers receive 1,000 additional loyalty points for purchases over 100 rubles
  • Test Group (B): Customers receive 2,000 additional loyalty points for purchases over 100 rubles (double the control)

The experiment was conducted across multiple trading points to determine the effectiveness of the enhanced promotion.

Dataset Description

The analysis uses the following variables:

Variable      Description
id_client     Unique customer ID
id_group      Group identifier (control = 1,000 points, test = 2,000 points)
sum_pay       Purchase amount
id_point      Trading point ID
months_reg    Duration of customer registration in the loyalty program (in months)

Project Structure

.
├── README.md                    # This file
├── ab_test_main.ipynb            # Main analysis notebook (English)
├── Dataset_AB_TEST.csv         # Original dataset
└── requirements.txt            # Python dependencies (if needed)

Analysis Tasks

Task 1: Statistical Analysis Function

Built a comprehensive statistical analysis function that:

  • Validates input data types and sample sizes
  • Calculates descriptive statistics (mean, variance, standard deviation)
  • Computes quantiles (including median, quartiles, and deciles)
  • Generates histogram visualizations
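
A minimal sketch of what such a function might look like is given here. The name `describe_sample`, the `min_size` parameter, and the returned keys are hypothetical and are not taken from the notebook; the histogram step is left out to keep the example short.

```python
import numpy as np


def describe_sample(values, min_size=2):
    """Validate a numeric sample and return basic descriptive statistics."""
    arr = np.asarray(values)
    # Reject non-numeric input (strings, mixed objects) before computing anything.
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError("sample must contain only numeric values")
    arr = arr.astype(float)
    if arr.ndim != 1 or arr.size < min_size:
        raise ValueError(f"sample must be one-dimensional with at least {min_size} values")
    return {
        "n": int(arr.size),
        "mean": float(arr.mean()),
        "variance": float(arr.var(ddof=1)),  # sample variance (n - 1 in the denominator)
        "std": float(arr.std(ddof=1)),
        "median": float(np.median(arr)),
        "quartiles": np.quantile(arr, [0.25, 0.5, 0.75]).tolist(),
        "deciles": np.quantile(arr, np.arange(1, 10) / 10).tolist(),
    }


# Illustrative values only, not taken from Dataset_AB_TEST.csv.
summary = describe_sample([120, 150, 180, 210, 240])
```

A histogram could then be drawn from the same validated array, for example with `matplotlib.pyplot.hist`.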

Task 2: Parametric Test (Student's t-test)

Implemented a t-test function to compare means between control and test groups, including:

  • t-statistic calculation
  • p-value determination
  • Statistical significance assessment at 5% alpha level
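
The three steps above could be sketched with `scipy.stats.ttest_ind` roughly as follows. The wrapper name `t_test` and its return format are assumptions, and the sample values are made up for illustration. The default `equal_var=True` corresponds to the classic Student's test; passing `equal_var=False` would switch to Welch's variant.

```python
import numpy as np
from scipy import stats


def t_test(control, test, alpha=0.05):
    """Two-sided Student's t-test for the difference in means of two samples."""
    t_stat, p_value = stats.ttest_ind(control, test)  # equal_var=True by default
    return {
        "t_stat": float(t_stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }


# Illustrative samples: the second is the first shifted up by 50.
control_sample = np.arange(100, 200, dtype=float)
test_sample = control_sample + 50
shifted = t_test(control_sample, test_sample)
identical = t_test(control_sample, control_sample)
```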

Task 3: Non-parametric Test (Mann-Whitney Test)

Created a Mann-Whitney test function for comparing distributions when normality assumptions may not hold.
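
As a rough illustration, such a function might wrap `scipy.stats.mannwhitneyu` as shown here. The name `mann_whitney_test` and the returned keys are hypothetical, and the two-sided alternative is an assumption about how the test was set up.

```python
from scipy import stats


def mann_whitney_test(control, test, alpha=0.05):
    """Two-sided Mann-Whitney U test comparing two independent samples."""
    u_stat, p_value = stats.mannwhitneyu(control, test, alternative="two-sided")
    return {
        "u_stat": float(u_stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }


# Illustrative samples with no overlap, so the ranks separate completely.
low = list(range(1, 31))
high = list(range(101, 131))
separated = mann_whitney_test(low, high)
same = mann_whitney_test(low, low)
```

Because the test works on ranks rather than raw values, a handful of very large purchases affects it far less than it affects a comparison of means.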

Task 4: Data Cleaning and Comparative Analysis

Comprehensive data preparation and analysis:

  • Removed null values and outliers
  • Created visualization functions for comparing distributions
  • Applied both parametric and non-parametric tests
  • Analyzed overall test results
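
The cleaning step could look roughly like the sketch below. The README does not say how outliers were detected, so the 1.5 × IQR rule used here is an assumption, as are the function name `clean_payments`, the `"control"`/`"test"` labels, and the sample rows.

```python
import pandas as pd


def clean_payments(df, column="sum_pay", k=1.5):
    """Drop rows with a missing payment, then rows outside the k * IQR fences."""
    cleaned = df.dropna(subset=[column])
    q1, q3 = cleaned[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    # Keep only payments inside [q1 - k * iqr, q3 + k * iqr].
    in_range = cleaned[column].between(q1 - k * iqr, q3 + k * iqr)
    return cleaned[in_range].reset_index(drop=True)


# Illustrative rows only, not taken from Dataset_AB_TEST.csv.
raw = pd.DataFrame({
    "id_client": [1, 2, 3, 4, 5, 6],
    "id_group": ["control", "test", "control", "test", "control", "test"],
    "sum_pay": [120.0, 150.0, None, 180.0, 210.0, 50000.0],
})
cleaned = clean_payments(raw)
```

In this toy example the row with the missing payment and the 50,000 purchase are both removed, leaving four rows.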

Task 5: Analysis by Trading Points

Segmented analysis across six trading points:

  • Visualized results for each location
  • Applied statistical tests per trading point
  • Ensured sample size comparability
  • Identified location-specific patterns

Task 6: User Segmentation by Registration Duration

Analyzed correlation between payment amounts and customer tenure:

  • Calculated Pearson and Spearman correlations
  • Created scatter plots for visualization
  • Examined correlation patterns in control vs. test groups
  • Generated business insights based on customer lifetime
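
The two coefficients can be computed with `scipy.stats.pearsonr` and `scipy.stats.spearmanr`. The sketch below uses made-up values and a hypothetical helper name, `tenure_correlations`; in the actual analysis it would be applied to `months_reg` and `sum_pay` within each group.

```python
import numpy as np
from scipy import stats


def tenure_correlations(months_reg, sum_pay):
    """Pearson (linear) and Spearman (rank) correlation of tenure and payment."""
    pearson_r, pearson_p = stats.pearsonr(months_reg, sum_pay)
    spearman_rho, spearman_p = stats.spearmanr(months_reg, sum_pay)
    return {
        "pearson_r": float(pearson_r),
        "pearson_p": float(pearson_p),
        "spearman_rho": float(spearman_rho),
        "spearman_p": float(spearman_p),
    }


# Illustrative data: payment grows with the square of tenure, so the relation
# is perfectly monotonic but not perfectly linear.
months = np.arange(1, 25, dtype=float)
payments = months ** 2
corr = tenure_correlations(months, payments)
```

Comparing the two values helps separate a monotonic relationship from a strictly linear one: here Spearman's coefficient is 1 while Pearson's falls slightly below it.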

Key Findings

Overall Results

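  • Mann-Whitney Test: No significant difference between the control and test distributions
  • T-Test: Significant difference in mean purchase amounts between groups
  • The two tests answer different questions (the t-test compares means, the Mann-Whitney test compares ranks), so they can disagree; here the discrepancy suggests heterogeneity across trading points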

Trading Point Analysis

Three major trading points (#1178, #1179, #1182) showed:

  • Large sample sizes (1,000+ observations)
  • Consistent patterns validating parametric test application
  • Varying effectiveness of the enhanced promotion

Customer Segmentation Insights

  • Strong correlation between purchase amounts and registration duration
  • Correlation is stronger in the test group than control group
  • Visual analysis (heatmap) confirms these patterns
  • Suggests targeted strategies based on customer lifetime

Technologies Used

  • Python 3.9+
  • Libraries:
    • pandas - Data manipulation and analysis
    • numpy - Numerical computations
    • scipy - Statistical tests
    • seaborn - Statistical data visualization
    • matplotlib - Plotting and visualization

Getting Started

Prerequisites

pip install pandas numpy scipy seaborn matplotlib jupyter openpyxl

Running the Analysis

  1. Clone this repository
  2. Ensure the dataset file (Dataset_AB_TEST.csv) is in the same directory as the notebook
  3. Open the Jupyter notebook:
    jupyter notebook ab_test_main.ipynb
  4. Run all cells sequentially

Methodology

The analysis follows a rigorous statistical approach:

  1. Data Validation: Check for data quality, missing values, and outliers
  2. Exploratory Analysis: Understand distributions and patterns
  3. Statistical Testing: Apply both parametric and non-parametric tests for robustness
  4. Segmentation: Analyze results by trading points and customer segments
  5. Business Insights: Translate statistical findings into actionable recommendations

Business Recommendations

Based on the analysis:

  1. Enhanced promotion effectiveness varies by location - Consider targeted implementation
  2. Customer tenure matters - Long-term customers show higher engagement with increased rewards
  3. Test group shows stronger correlation - Enhanced rewards program may improve customer lifetime value
  4. Sample size considerations - Focus implementation on high-traffic locations initially

Author

This project was completed as part of a data analysis course.

License

This project is available for educational and portfolio purposes.
