# A/B Testing Analysis for Nykaa

This notebook demonstrates how to use the A/B testing framework for Nykaa e-commerce experiments.

## Table of Contents
1. Setup and Data Generation
2. Basic A/B Test Analysis
3. Visualization
4. Segment Analysis
5. Statistical Power and Sample Size Calculation

In [None]:
# Import required libraries
import sys
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from ab_testing import ABTestAnalyzer
from utils.data_loader import DataLoader
from utils.visualizer import ABTestVisualizer

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 4)

## 1. Setup and Data Generation

Generate a synthetic e-commerce dataset of 5,000 users. The fixed `random_seed` makes the output reproducible.

In [None]:
# Generate sample e-commerce data
loader = DataLoader()
data = loader.generate_ecommerce_data(n_users=5000, random_seed=42)

print("Dataset shape:", data.shape)
print("\nFirst few rows:")
data.head()

In [None]:
# Check group distribution
print("Group distribution:")
print(data['group'].value_counts())

print("\nConversion by group:")
print(data.groupby('group')['converted'].agg(['sum', 'mean', 'count']))

## 2. Basic A/B Test Analysis

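Compare conversion rates between control (A) and treatment (B) at a 5% significance level. The results include the z-statistic, p-value, effect size (Cohen's h), statistical power, and a confidence interval for each group.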

In [None]:
# Initialize analyzer
analyzer = ABTestAnalyzer(data, control_group='A', treatment_group='B')

# Run A/B test
results = analyzer.run_ab_test(alpha=0.05)

# Display results
print("=" * 60)
print("A/B TEST RESULTS")
print("=" * 60)
print(f"\nConversion Rates:")
print(f"  Control (A): {results['conversion_rates']['control']:.2%}")
print(f"  Treatment (B): {results['conversion_rates']['treatment']:.2%}")
print(f"  Lift: {results['conversion_rates']['lift']:.2%}")

print(f"\nSample Sizes:")
print(f"  Control (A): {results['sample_sizes']['control']:,}")
print(f"  Treatment (B): {results['sample_sizes']['treatment']:,}")

print(f"\nStatistical Test:")
print(f"  Z-Statistic: {results['z_statistic']:.4f}")
print(f"  P-Value: {results['p_value']:.4f}")
print(f"  Significance Level: {results['alpha']}")
print(f"  Is Significant: {'Yes ✓' if results['is_significant'] else 'No ✗'}")

print(f"\nEffect Size & Power:")
print(f"  Cohen's h: {results['effect_size']:.4f}")
print(f"  Statistical Power: {results['power']:.2%}")

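# Derive the confidence level from alpha so the label stays correct if alpha changes
print(f"\nConfidence Intervals ({1 - results['alpha']:.0%}):")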
print(f"  Control: [{results['control_ci'][0]:.4f}, {results['control_ci'][1]:.4f}]")
print(f"  Treatment: [{results['treatment_ci'][0]:.4f}, {results['treatment_ci'][1]:.4f}]")
print("=" * 60)
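
To make the reported statistics less of a black box, here is a minimal sketch of a pooled two-proportion z-test with Cohen's h. It is an assumption about the kind of computation behind `run_ab_test`, not the framework's actual code. The function name `two_proportion_ztest` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Cohen's h: difference of arcsine-transformed proportions
    h = 2 * math.asin(math.sqrt(p_b)) - 2 * math.asin(math.sqrt(p_a))
    return {
        "z_statistic": z,
        "p_value": p_value,
        "is_significant": p_value < alpha,
        "effect_size": h,
        "lift": (p_b - p_a) / p_a,  # relative lift over control
    }


# Hypothetical counts, not taken from the generated dataset
example = two_proportion_ztest(conv_a=300, n_a=2500, conv_b=350, n_b=2500)
print(example)
```

The dictionary keys mirror the ones printed above only for readability; the framework may compute some of these quantities differently (for example, with an unpooled standard error).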

## 3. Visualization

Create visual representations of the results.

In [None]:
# Initialize visualizer
visualizer = ABTestVisualizer()

# Plot conversion rates
visualizer.plot_conversion_rates(results)
plt.show()

In [None]:
# Plot sample sizes
visualizer.plot_sample_sizes(results)
plt.show()

In [None]:
# Create comprehensive summary report
visualizer.create_summary_report(results)
plt.show()

## 4. Segment Analysis

Break the results down by device and by user type, then look at the purchase funnel.

In [None]:
# Segment analysis by device
device_results = analyzer.segment_analysis('device')
print("Segment Analysis by Device:")
print(device_results.to_string())

In [None]:
# Visualize segment analysis
visualizer.plot_segment_analysis(device_results)
plt.show()
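
A segment breakdown of this kind can be approximated directly in pandas. The sketch below is an illustration on a tiny hypothetical DataFrame, assuming `group`, `device`, and `converted` columns like those used earlier; it does not reproduce the output format of `segment_analysis`.

```python
import pandas as pd

# Tiny hypothetical dataset; column names mirror those used above
toy = pd.DataFrame({
    "group": ["A", "A", "B", "B", "A", "A", "B", "B"],
    "device": ["mobile", "mobile", "mobile", "mobile",
               "desktop", "desktop", "desktop", "desktop"],
    "converted": [0, 1, 1, 1, 0, 1, 0, 1],
})

# Conversion rate per (segment, group), pivoted so each group becomes a column
rates = (
    toy.groupby(["device", "group"])["converted"]
    .mean()
    .unstack("group")
)

# Relative lift of treatment (B) over control (A) within each segment
rates["lift"] = (rates["B"] - rates["A"]) / rates["A"]
print(rates)
```

Segments contain fewer users than the full experiment, so per-segment estimates are noisier, and the more segments are inspected, the more likely one of them looks significant by chance.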

In [None]:
# Segment analysis by user type
user_type_results = analyzer.segment_analysis('user_type')
print("\nSegment Analysis by User Type:")
print(user_type_results.to_string())

visualizer.plot_segment_analysis(user_type_results)
plt.show()

In [None]:
# Funnel analysis
funnel_stages = ['viewed_product', 'added_to_cart', 'initiated_checkout', 'completed_purchase']
visualizer.plot_funnel_analysis(data, funnel_stages)
plt.show()
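
The numbers behind a funnel chart are stage counts and the drop-off between consecutive stages. The sketch below assumes each stage column holds a 0/1 flag per user, which is an assumption about the generated data. The toy values are hypothetical.

```python
import pandas as pd

funnel_stages = ["viewed_product", "added_to_cart",
                 "initiated_checkout", "completed_purchase"]

# Hypothetical per-user stage flags (1 = the user reached the stage)
toy = pd.DataFrame({
    "viewed_product":     [1, 1, 1, 1, 1, 1, 1, 1],
    "added_to_cart":      [1, 1, 1, 1, 0, 0, 0, 0],
    "initiated_checkout": [1, 1, 0, 0, 0, 0, 0, 0],
    "completed_purchase": [1, 0, 0, 0, 0, 0, 0, 0],
})

# Number of users reaching each stage, in funnel order
counts = toy[funnel_stages].sum()
# Share of users from the previous stage who reach each stage (NaN for the first)
step_rate = counts / counts.shift(1)
# Share of users from the first stage who reach each stage
overall_rate = counts / counts.iloc[0]

funnel = pd.DataFrame({
    "users": counts,
    "step_rate": step_rate,
    "overall_rate": overall_rate,
})
print(funnel)
```

Computing the same table separately for groups A and B shows at which stage a treatment effect appears.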

## 5. Statistical Power and Sample Size Calculation

Calculate required sample sizes for future experiments.

In [None]:
# Calculate the required sample size per group
baseline_rate = 0.12  # 12% baseline conversion rate
mde = 0.10            # minimum detectable effect: 10% relative lift (12% -> 13.2%)
alpha = 0.05          # significance level
power = 0.80          # desired statistical power

required_size = analyzer.calculate_sample_size(
    baseline_rate=baseline_rate,
    mde=mde,
    alpha=alpha,
    power=power
)

print("Sample Size Calculation:")
print(f"  Baseline Conversion Rate: {baseline_rate:.2%}")
print(f"  Minimum Detectable Effect (relative): {mde:.2%}")
print(f"  Significance Level (α): {alpha}")
print(f"  Desired Power: {power:.0%}")
print(f"\n  Required Sample Size per Group: {required_size:,}")
print(f"  Total Required Sample Size: {required_size * 2:,}")
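
As a cross-check, the textbook normal-approximation formula for a two-sided two-proportion test can be written in a few lines. This is a sketch under the assumption that `mde` is a relative lift, as the comments above suggest. The function `sample_size_per_group` is hypothetical and may not match how `calculate_sample_size` is implemented.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, mde, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided two-proportion z-test.

    `mde` is treated as a relative lift over `baseline_rate`.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # unpooled variance terms
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


n = sample_size_per_group(baseline_rate=0.12, mde=0.10)
print(f"Approximate sample size per group: {n:,}")
```

With these inputs the formula gives roughly 12,000 users per group. The framework may use a different variant (for example, a pooled variance or Cohen's h) and return a somewhat different number.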

In [None]:
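# Power analysis for different sample sizes
# np.arange excludes the stop value, so this covers 500 to 4,500 users per group
sample_sizes = np.arange(500, 5000, 500)
powers = []

for n in sample_sizes:
    # Simulate one dataset with n users per group. Each point on the curve comes
    # from a single simulated sample, so the curve may not be perfectly smooth.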
    test_data = loader.generate_sample_data(
        n_control=n,
        n_treatment=n,
        control_rate=0.12,
        treatment_rate=0.132,  # 10% relative lift
        random_seed=42
    )
    test_analyzer = ABTestAnalyzer(test_data)
    test_results = test_analyzer.run_ab_test()
    powers.append(test_results['power'])

# Plot power analysis
plt.figure(figsize=(10, 6))
plt.plot(sample_sizes, powers, marker='o', linewidth=2, markersize=8)
plt.axhline(y=0.8, color='r', linestyle='--', label='Target Power (80%)')
plt.xlabel('Sample Size per Group', fontsize=12)
plt.ylabel('Statistical Power', fontsize=12)
plt.title('Power Analysis: Sample Size vs Statistical Power', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.legend()
plt.tight_layout()
plt.show()
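
A simulation-free alternative is to compute power analytically from the normal approximation. The sketch below assumes a two-sided test with an unpooled standard error. `analytic_power` is a hypothetical helper, not part of the framework.

```python
import math
from statistics import NormalDist


def analytic_power(n_per_group, p1, p2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    se = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    shift = abs(p2 - p1) / se  # true difference in standard-error units
    # Probability of rejecting in either tail when the true rates are p1 and p2
    return NormalDist().cdf(shift - z_alpha) + NormalDist().cdf(-shift - z_alpha)


for n in (500, 2000, 4500, 12000, 20000):
    print(f"n = {n:>6,}: power = {analytic_power(n, 0.12, 0.132):.2f}")
```

Under this approximation, power at 4,500 users per group (the largest size in the grid above) is only about 0.4, and roughly 12,000 users per group are needed to reach 0.8 for a 12% to 13.2% lift. Extending the grid in that direction would show where the curve crosses the target line.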

## Conclusion

This notebook demonstrated how to:
1. Generate and load A/B test data
2. Run the statistical analysis (significance test, effect size, power, and confidence intervals)
3. Visualize the results
4. Analyze results by segment
5. Calculate sample sizes and analyze statistical power

The framework can be easily adapted for various A/B testing scenarios at Nykaa.