This project analyzes the results of an A/B test of a loyalty program promotion run by SkyCrossroads. During the promotional campaign, customers could earn additional loyalty points for purchases.
- Control Group (A): Customers receive 1,000 additional loyalty points for purchases over 100 rubles
- Test Group (B): Customers receive 2,000 additional loyalty points for purchases over 100 rubles (double the control)
The experiment was conducted across multiple trading points to determine the effectiveness of the enhanced promotion.
The analysis uses the following variables:
| Variable | Description |
|---|---|
| id_client | Unique customer ID |
| id_group | Group identifier (control = 1,000 points, test = 2,000 points) |
| sum_pay | Purchase amount |
| id_point | Trading point ID |
| months_reg | Duration of customer registration in the loyalty program (in months) |
.
├── README.md # This file
├── ab_test_main.ipynb # Main analysis notebook (English)
├── Dataset_AB_TEST.csv # Original dataset
└── requirements.txt # Python dependencies (if needed)
Built a comprehensive statistical analysis function that:
- Validates input data types and sample sizes
- Calculates descriptive statistics (mean, variance, standard deviation)
- Computes quantiles (including median, quartiles, and deciles)
- Generates histogram visualizations
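The checks and statistics listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual function: the name `describe_sample`, the `min_size` parameter, and the use of sample (`ddof=1`) variance are assumptions, and the histogram step is left out to keep the example dependency-light.

```python
import numpy as np


def describe_sample(values, min_size=2):
    """Return basic descriptive statistics for a 1-D numeric sample.

    Hypothetical helper: name, signature, and defaults are assumptions.
    """
    arr = np.asarray(values)
    # Validate the input: one dimension, numeric dtype, enough observations.
    if arr.ndim != 1 or not np.issubdtype(arr.dtype, np.number):
        raise TypeError("values must be a 1-D numeric sequence")
    if arr.size < min_size:
        raise ValueError(f"need at least {min_size} observations, got {arr.size}")

    arr = arr.astype(float)
    return {
        "n": int(arr.size),
        "mean": float(arr.mean()),
        "variance": float(arr.var(ddof=1)),  # sample variance (assumption)
        "std": float(arr.std(ddof=1)),       # sample standard deviation
        "median": float(np.median(arr)),
        "quartiles": np.quantile(arr, [0.25, 0.5, 0.75]).tolist(),
        "deciles": np.quantile(arr, np.linspace(0.1, 0.9, 9)).tolist(),
    }


# Toy input, not taken from the dataset.
stats_summary = describe_sample([120.0, 150.0, 180.0, 210.0, 240.0])
```

In the notebook, the same function would presumably be called on the `sum_pay` column of each group.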
Implemented a t-test function to compare means between control and test groups, including:
- t-statistic calculation
- p-value determination
- Statistical significance assessment at 5% alpha level
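A comparison of this kind can be sketched with `scipy.stats.ttest_ind`. The function name `compare_means` is hypothetical, and the choice of Welch's variant (`equal_var=False`) is an assumption, since the README does not say which form of the t-test the notebook uses. The samples below are synthetic.

```python
import numpy as np
from scipy import stats


def compare_means(control, test, alpha=0.05, equal_var=False):
    """Two-sample t-test on the means of two independent samples.

    Hypothetical helper; equal_var=False (Welch's t-test) is an assumption.
    """
    t_stat, p_value = stats.ttest_ind(control, test, equal_var=equal_var)
    return {
        "t_stat": float(t_stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),  # reject H0 at the alpha level
    }


# Synthetic samples for illustration only (not the project's data).
rng = np.random.default_rng(0)
control_sample = rng.normal(loc=500.0, scale=100.0, size=200)
test_sample = rng.normal(loc=560.0, scale=100.0, size=200)

ttest_result = compare_means(control_sample, test_sample)
same_sample_result = compare_means(control_sample, control_sample)
```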
Created a Mann-Whitney test function for comparing distributions when normality assumptions may not hold.
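A minimal sketch of such a function, built on `scipy.stats.mannwhitneyu`, might look like this. The name `compare_distributions` and the two-sided alternative are assumptions, and the input values are toy numbers rather than dataset rows.

```python
from scipy import stats


def compare_distributions(control, test, alpha=0.05):
    """Two-sided Mann-Whitney U test for two independent samples.

    Hypothetical helper; the two-sided alternative is an assumption.
    """
    u_stat, p_value = stats.mannwhitneyu(control, test, alternative="two-sided")
    return {
        "u_stat": float(u_stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }


# Toy samples with no overlap, so the test should clearly reject H0.
toy_control = list(range(100, 130))
toy_test = list(range(140, 170))

mw_result = compare_distributions(toy_control, toy_test)
```

Because the test works on ranks, it is less sensitive than the t-test to a few very large purchases, which is one reason the two tests can disagree on skewed payment data.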
Comprehensive data preparation and analysis:
- Removed null values and outliers
- Created visualization functions for comparing distributions
- Applied both parametric and non-parametric tests
- Analyzed overall test results
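The README does not state how outliers were identified. One common option is the interquartile-range rule, sketched below under that assumption. The function name `clean_payments` and the multiplier `k=1.5` are hypothetical, and only the `sum_pay` column is checked for nulls.

```python
import numpy as np
import pandas as pd


def clean_payments(df, value_col="sum_pay", k=1.5):
    """Drop rows with a missing value, then drop IQR-rule outliers.

    Hypothetical helper: the IQR rule and k=1.5 are assumptions,
    not necessarily the method used in the notebook.
    """
    cleaned = df.dropna(subset=[value_col])
    q1, q3 = cleaned[value_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep only rows inside [lower, upper] (bounds inclusive).
    mask = cleaned[value_col].between(lower, upper)
    return cleaned[mask].reset_index(drop=True)


# Toy frame with one missing value and one extreme purchase.
raw = pd.DataFrame(
    {"sum_pay": [100.0, 110.0, 120.0, 130.0, 140.0, np.nan, 10000.0]}
)
cleaned_payments = clean_payments(raw)
```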
Segmented analysis across six trading points:
- Visualized results for each location
- Applied statistical tests per trading point
- Ensured sample size comparability
- Identified location-specific patterns
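The per-location steps above can be sketched as a loop over `id_point` groups. Several details here are assumptions: the function name `compare_by_point`, the group labels `"control"` and `"test"` in `id_group`, the minimum sample size `min_n=30`, and Welch's form of the t-test. The data is synthetic, and point 9999 is a made-up small location included only to show the sample-size filter.

```python
import numpy as np
import pandas as pd
from scipy import stats


def compare_by_point(df, min_n=30, alpha=0.05):
    """Run both tests separately for each trading point.

    Hypothetical helper; group labels and min_n are assumptions.
    """
    rows = []
    for point, part in df.groupby("id_point"):
        control = part.loc[part["id_group"] == "control", "sum_pay"]
        test = part.loc[part["id_group"] == "test", "sum_pay"]
        # Skip locations where either group is too small to compare.
        if min(len(control), len(test)) < min_n:
            continue
        _, p_t = stats.ttest_ind(control, test, equal_var=False)
        _, p_u = stats.mannwhitneyu(control, test, alternative="two-sided")
        rows.append({
            "id_point": int(point),
            "n_control": len(control),
            "n_test": len(test),
            "p_ttest": float(p_t),
            "p_mannwhitney": float(p_u),
            "significant_ttest": bool(p_t < alpha),
        })
    return pd.DataFrame(rows)


# Synthetic data: a shifted test group at 1178, no shift at 1179,
# and a tiny made-up point 9999 that should be filtered out.
rng = np.random.default_rng(42)
frames = []
for point, n, shift in [(1178, 200, 80.0), (1179, 200, 0.0), (9999, 5, 0.0)]:
    for group, extra in [("control", 0.0), ("test", shift)]:
        frames.append(pd.DataFrame({
            "id_point": point,
            "id_group": group,
            "sum_pay": rng.normal(500.0 + extra, 100.0, n),
        }))
toy = pd.concat(frames, ignore_index=True)

report = compare_by_point(toy)
```

Reporting the group sizes next to each p-value makes it easier to judge whether locations are comparable before reading the test results.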
Analyzed correlation between payment amounts and customer tenure:
- Calculated Pearson and Spearman correlations
- Created scatter plots for visualization
- Examined correlation patterns in control vs. test groups
- Generated business insights based on customer lifetime
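The correlation step could be sketched as follows, using `scipy.stats.pearsonr` and `scipy.stats.spearmanr` per group. The function name `tenure_correlations` and the group labels are assumptions. The toy data is constructed so that the two coefficients differ: a linear relationship in one group and a monotone but curved one in the other.

```python
import numpy as np
import pandas as pd
from scipy import stats


def tenure_correlations(df):
    """Pearson and Spearman correlation of months_reg vs sum_pay per group.

    Hypothetical helper; not the notebook's actual function.
    """
    rows = []
    for group, part in df.groupby("id_group"):
        pearson_r, pearson_p = stats.pearsonr(part["months_reg"], part["sum_pay"])
        spearman_rho, spearman_p = stats.spearmanr(part["months_reg"], part["sum_pay"])
        rows.append({
            "id_group": group,
            "pearson_r": float(pearson_r),
            "pearson_p": float(pearson_p),
            "spearman_rho": float(spearman_rho),
            "spearman_p": float(spearman_p),
        })
    return pd.DataFrame(rows).set_index("id_group")


# Synthetic data: linear in "control", quadratic (still monotone) in "test".
months = np.arange(1, 21)
toy_tenure = pd.concat(
    [
        pd.DataFrame({"id_group": "control", "months_reg": months,
                      "sum_pay": 100.0 + 5.0 * months}),
        pd.DataFrame({"id_group": "test", "months_reg": months,
                      "sum_pay": months.astype(float) ** 2}),
    ],
    ignore_index=True,
)

corr = tenure_correlations(toy_tenure)
```

Pearson measures linear association while Spearman measures monotone association, so computing both helps show whether a relationship is strong but non-linear.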
- Mann-Whitney Test: Control and test samples show similar distributions
- T-Test: Significant difference in mean purchase amounts between groups
- The discrepancy between tests suggests heterogeneity across trading points
Three major trading points (#1178, #1179, #1182) showed:
- Large sample sizes (1,000+ observations)
- Consistent patterns that support the use of parametric tests
- Varying effectiveness of the enhanced promotion
- Strong correlation between purchase amounts and registration duration
- Correlation is stronger in the test group than control group
- Visual analysis (heatmap) confirms these patterns
- Suggests targeted strategies based on customer lifetime
- Python 3.9+
- Libraries:
  - pandas - Data manipulation and analysis
  - numpy - Numerical computations
  - scipy - Statistical tests
  - seaborn - Statistical data visualization
  - matplotlib - Plotting and visualization

Install the dependencies: `pip install pandas numpy scipy seaborn matplotlib jupyter openpyxl`

- Clone this repository
- Ensure the dataset file (`Dataset_AB_TEST.csv`) is in the same directory
- Open the Jupyter notebook: `jupyter notebook ab_test_main.ipynb`
- Run all cells sequentially
The analysis follows a rigorous statistical approach:
- Data Validation: Check for data quality, missing values, and outliers
- Exploratory Analysis: Understand distributions and patterns
- Statistical Testing: Apply both parametric and non-parametric tests for robustness
- Segmentation: Analyze results by trading points and customer segments
- Business Insights: Translate statistical findings into actionable recommendations
Based on the analysis:
- Enhanced promotion effectiveness varies by location - Consider targeted implementation
- Customer tenure matters - Long-term customers show higher engagement with increased rewards
- Test group shows stronger correlation - Enhanced rewards program may improve customer lifetime value
- Sample size considerations - Focus implementation on high-traffic locations initially
This project was completed as part of a data analysis course.
This project is available for educational and portfolio purposes.