Imagine you've just joined the analytics team at a growing software company. Your customer support manager comes to you with a question: "We've been experimenting with offering support through both chat and email. The chat system is more expensive, so we need to know if it's actually performing better in terms of customer satisfaction."

She shares the following data from the past month:
**Chat Support**: 280 satisfied customers out of 320 total responses
**Email Support**: 410 satisfied customers out of 500 total responses



In [1]:
# Calculate sample proportions
chat_prop = 280/320   # 0.875, i.e. 87.5%
email_prop = 410/500  # 0.82, i.e. 82%

# Check the success-failure condition: each group needs at least
# 10 observed successes and 10 observed failures
def check_conditions(successes, n):
    failures = n - successes
    return {
        'np': successes,       # n * p_hat is just the success count
        'n(1-p)': failures,    # n * (1 - p_hat) is just the failure count
        'conditions_met': successes >= 10 and failures >= 10
    }

chat_check = check_conditions(280, 320)
email_check = check_conditions(410, 500)

print("Chat conditions:")
print(f"np = {chat_check['np']:.1f}")
print(f"n(1-p) = {chat_check['n(1-p)']:.1f}")
print(f"Conditions met: {chat_check['conditions_met']}\n")

print("Email conditions:")
print(f"np = {email_check['np']:.1f}")
print(f"n(1-p) = {email_check['n(1-p)']:.1f}")
print(f"Conditions met: {email_check['conditions_met']}")

Chat conditions:
np = 280.0
n(1-p) = 40.0
Conditions met: True

Email conditions:
np = 410.0
n(1-p) = 90.0
Conditions met: True


**Hypothesis Test**

In [5]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Prepare data for statsmodels' proportions_ztest
count = np.array([280, 410])  # successes for chat and email
nobs = np.array([320, 500])   # total observations for each group

# Perform the test
stat, pvalue = proportions_ztest(count, nobs)

print(f"\nZ-statistic: {stat:.4f}")
print(f"P-value: {pvalue:.4f}")

# Calculate the effect size (difference in proportions)
effect_size = (280/320) - (410/500)
print(f"Effect size: {effect_size:.4f}")


Z-statistic: 2.1035
P-value: 0.0354
Effect size: 0.0550
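
To make the test less of a black box, the z-statistic above can be reproduced by hand. This is a minimal sketch that assumes `proportions_ztest` was run with its default settings (two-sided alternative, pooled variance under the null hypothesis). The variable names are illustrative and it uses only the standard library.

```python
import math

# Observed counts from the scenario
chat_success, chat_n = 280, 320
email_success, email_n = 410, 500

p_chat = chat_success / chat_n
p_email = email_success / email_n

# Pooled proportion: the single satisfaction rate assumed under H0
p_pooled = (chat_success + email_success) / (chat_n + email_n)

# Standard error of the difference in proportions under H0
se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / chat_n + 1 / email_n))

# z = observed difference divided by its standard error
z_manual = (p_chat - p_email) / se_pooled

# Two-sided p-value from the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
p_manual = math.erfc(abs(z_manual) / math.sqrt(2))

print(f"Manual z-statistic: {z_manual:.4f}")
print(f"Manual p-value: {p_manual:.4f}")
```

The two printed values should agree with the library output shown above, up to rounding.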


**Confidence Interval**

In [7]:
import scipy.stats as stats

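# Calculate pooled proportion
p_pooled = (280 + 410) / (320 + 500)

# Calculate standard error
# Note: this is the pooled standard error from the hypothesis test.
# Intervals for a difference in proportions more commonly use the
# unpooled standard error, which gives a slightly different interval.
se = np.sqrt(p_pooled * (1 - p_pooled) * (1/320 + 1/500))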

# Calculate 95% confidence interval
z_critical = stats.norm.ppf(0.975)  # for 95% CI
margin_error = z_critical * se
ci_lower = effect_size - margin_error
ci_upper = effect_size + margin_error

print(f"\n95% Confidence Interval: ({ci_lower:.4f}, {ci_upper:.4f})")


95% Confidence Interval: (0.0038, 0.1062)
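
The interval above reuses the pooled standard error from the hypothesis test. A confidence interval for a difference in proportions is more often built with the unpooled standard error, where each group contributes its own estimated variance. The sketch below shows that alternative for comparison. It is not the calculation used in the cell above, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Observed counts from the scenario
chat_success, chat_n = 280, 320
email_success, email_n = 410, 500

p_chat = chat_success / chat_n
p_email = email_success / email_n
diff = p_chat - p_email

# Unpooled standard error: each group keeps its own variance estimate
se_unpooled = math.sqrt(
    p_chat * (1 - p_chat) / chat_n + p_email * (1 - p_email) / email_n
)

# Critical value for a 95% two-sided interval
z_crit = NormalDist().inv_cdf(0.975)
margin = z_crit * se_unpooled

ci_unpooled = (diff - margin, diff + margin)
print(f"95% CI (unpooled SE): ({ci_unpooled[0]:.4f}, {ci_unpooled[1]:.4f})")
```

With these counts the unpooled interval comes out slightly narrower than the pooled one, roughly 0.0055 to 0.1045, and it still excludes zero.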


**Sample Presentation of Results**

"I've analyzed the satisfaction rates between our chat and email support channels. Here's what I found:
Chat support had a satisfaction rate of 87.5% (280 out of 320 responses), while email support had a satisfaction rate of 82% (410 out of 500 responses). This represents a 5.5 percentage point difference in favor of chat support.

Our statistical analysis shows this difference is statistically significant (p = 0.035), meaning a gap this large would be unlikely if the two channels truly had the same satisfaction rate. We can be 95% confident that chat support's true satisfaction rate is between 0.4 and 10.6 percentage points higher than email support's.

**Business Implications:**

The data support the conclusion that chat provides better customer satisfaction.
The minimum improvement we're confident about (0.4 percentage points) might be too small to justify chat's higher cost.
However, the potential improvement could be as high as 10.6 percentage points, which might make the investment worthwhile.

**Recommendations:**

Continue collecting data to narrow down the true difference.
Consider analyzing the cost per satisfied customer for each channel.
Look for patterns in the types of issues where chat particularly excels."
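
To give a rough sense of how much further data collection could narrow the interval, the sketch below recomputes the 95% margin of error for larger samples. It rests on two loudly hypothetical assumptions: the observed satisfaction rates (87.5% and 82%) stay the same, and each additional month brings the same number of responses as the month analyzed here. The helper `margin_of_error` is an illustrative function, not part of any library, and it uses the unpooled standard error.

```python
import math
from statistics import NormalDist

def margin_of_error(p1, n1, p2, n2, confidence=0.95):
    """Half-width of the unpooled confidence interval for p1 - p2."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z * se

# Hypothetical: rates hold steady and monthly response volume is constant
margins = {}
for months in (1, 2, 4):
    margins[months] = margin_of_error(0.875, 320 * months, 0.82, 500 * months)
    print(f"{months} month(s) of data: +/- {margins[months] * 100:.1f} percentage points")
```

Because the margin shrinks with the square root of the sample size, quadrupling the data only halves the width of the interval.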