
When the confidence intervals of two groups overlap, it is tempting to conclude that there is no statistically significant difference between the groups. Overlap alone does not establish that: two intervals can overlap while a direct test of the difference is still statistically significant. Here are a few factors to consider:

Sample size: With small samples, the confidence intervals are wide, so they can overlap even when a real difference exists. In such cases, the overlap does not imply that there is no difference. Increasing the sample size gives more precise estimates and may reveal a statistically significant difference.
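
As a rough illustration of the sample-size effect, the sketch below computes the half-width of a normal-approximation 95% interval for a proportion at several sample sizes. The 12% rate, the list of sample sizes, and the helper name `margin_of_error` are assumptions chosen for illustration only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical conversion rate observed at increasing sample sizes
p = 0.12
for n in (100, 500, 2000, 8000):
    print(f"n = {n:5d}  margin of error = {margin_of_error(p, n):.4f}")
```

Because the margin of error scales with 1/sqrt(n), quadrupling the sample size halves the interval width, which is why larger samples leave less room for overlap.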

Variability: If the variability within each group is large, the confidence intervals are wider and more likely to overlap. However, if the difference between the group means is large relative to the standard error of that difference, a statistically significant difference may still exist despite the overlap.

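Hypothesis test: Per-group confidence intervals and a hypothesis test on the difference answer different questions. Each interval reflects the uncertainty of one group's estimate, whereas the test uses the standard error of the difference, sqrt(SE_A^2 + SE_B^2), which is always smaller than the sum SE_A + SE_B that an overlap check implicitly relies on. As a result, a hypothesis test may detect a statistically significant difference at the chosen significance level (e.g., 0.05) even when the intervals overlap. A hypothesis test assesses how likely data at least as extreme as those observed would be under the assumption of no difference between the groups.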
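
To make this concrete, here is a minimal sketch with hypothetical counts (120/1000 versus 155/1000, not taken from any experiment in this notebook). It builds normal-approximation 95% intervals for each group and a pooled two-proportion z-test by hand, using only the standard library. The helper names `wald_ci` and `two_proportion_ztest` are made up for this example.

```python
import math
from statistics import NormalDist

def wald_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for one proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def two_proportion_ztest(s_a, n_a, s_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = s_a / n_a, s_b / n_b
    pooled = (s_a + s_b) / (n_a + n_b)
    se_diff = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se_diff
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Hypothetical data, chosen so that the intervals overlap
s_a, n_a = 120, 1000
s_b, n_b = 155, 1000

ci_a = wald_ci(s_a, n_a)
ci_b = wald_ci(s_b, n_b)
overlap = ci_a[1] > ci_b[0] and ci_b[1] > ci_a[0]
z_stat, p_value = two_proportion_ztest(s_a, n_a, s_b, n_b)

print("CI A: [{:.4f}, {:.4f}]".format(*ci_a))
print("CI B: [{:.4f}, {:.4f}]".format(*ci_b))
print("Intervals overlap:", overlap)
print("z = {:.4f}, p = {:.4f}".format(z_stat, p_value))
```

With these numbers the two intervals overlap, yet the p-value falls below 0.05. The overlap check compares the gap between estimates with the sum of the two margins of error, while the test compares it with the smaller standard error of the difference.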

Directional hypotheses: In some cases, the research question involves a directional hypothesis, where you expect one group to be higher (or lower) than the other, and a one-sided test is appropriate. For a z- or t-test with the effect in the hypothesized direction, the one-sided p-value is half the two-sided p-value, so a difference can reach significance even though the two-sided 95% intervals overlap. The direction must be fixed before looking at the data.
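
The sketch below illustrates the one-sided case with hypothetical counts (120/1000 versus 147/1000), assuming the alternative "Group A converts less often than Group B" was fixed in advance. It computes the pooled z statistic by hand with the standard library; in statsmodels, `proportions_ztest` accepts an `alternative` argument for the same purpose.

```python
import math
from statistics import NormalDist

# Hypothetical data; the alternative hypothesis is p_a < p_b
s_a, n_a = 120, 1000
s_b, n_b = 147, 1000

p_a, p_b = s_a / n_a, s_b / n_b
pooled = (s_a + s_b) / (n_a + n_b)
se_diff = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se_diff

# Two-sided: probability of a statistic at least this extreme in either tail
two_sided = 2 * NormalDist().cdf(-abs(z))
# One-sided (H1: p_a < p_b): probability of a statistic at least this negative
one_sided = NormalDist().cdf(z)

print("z = {:.4f}".format(z))
print("two-sided p = {:.4f}".format(two_sided))
print("one-sided p = {:.4f}".format(one_sided))
```

Here the two-sided p-value stays above 0.05 while the one-sided p-value drops below it. That is only a legitimate conclusion when the direction was specified before the data were seen.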

Multiple comparisons: If you compare several groups pairwise, the chance of at least one false positive grows with the number of comparisons, and judging overlap by eye becomes even less reliable. Adjust for multiple comparisons with an appropriate method (e.g., Bonferroni correction) before deciding whether any differences are statistically significant.
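
A Bonferroni correction can be sketched in a few lines: with m comparisons, each p-value is compared against alpha / m instead of alpha. The p-values and pair labels below are invented for illustration, and the helper name `bonferroni` is an assumption. For real analyses, `statsmodels.stats.multitest.multipletests` offers this and other correction methods.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag each comparison as significant at the Bonferroni-adjusted level."""
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}

# Hypothetical p-values from three pairwise comparisons
p_values = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.200}

decisions = bonferroni(p_values)
for name, significant in decisions.items():
    print(f"{name}: p = {p_values[name]:.3f}, significant = {significant}")
```

In this example "A vs C" would pass an unadjusted 0.05 threshold but not the adjusted threshold of 0.05 / 3, which shows how the correction guards against false positives at the cost of some power.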

It's important to interpret overlapping confidence intervals cautiously and consider additional factors, such as sample size, variability, hypothesis tests, directional hypotheses, and multiple comparisons, to assess the potential for statistical differences between groups. Consulting with a statistician can provide valuable insights specific to your experiment and help determine the appropriate interpretation.

In [1]:
from statsmodels.stats.proportion import proportions_ztest

# Specify the number of successes (events of interest) and the sample size for each group
successes_a = 60  # Number of successes in Group A
n_a = 500        # Sample size of Group A

successes_b = 75  # Number of successes in Group B
n_b = 500        # Sample size of Group B

# Perform the two-proportion z-test
successes = [successes_a, successes_b]
samples = [n_a, n_b]
z_stat, p_value = proportions_ztest(successes, samples)
print('z stat: ', z_stat)
print('p value: ',  p_value)

# Check if the p-value is less than the significance level (e.g., 0.05)
significance_level = 0.05
if p_value < significance_level:
    print("The groups are statistically different.")
else:
    print("There is no significant difference between the groups.")

z stat:  -1.3880858307767148
p value:  0.16511091065405215
There is no significant difference between the groups.


In [3]:
# proportion_confint is provided by statsmodels (scipy.stats has no function of that name)
from statsmodels.stats.proportion import proportion_confint

In [6]:
# Specify the number of successes (events of interest) and the sample size for each group
successes_a = 60  # Number of successes in Group A
n_a = 500        # Sample size of Group A

successes_b = 75  # Number of successes in Group B
n_b = 500        # Sample size of Group B

# Calculate the proportion (success rate) for each group
p_a = successes_a / n_a
p_b = successes_b / n_b

# Calculate the confidence interval for Group A at the desired confidence level
confidence_level = 0.95
lower_bound, upper_bound = proportion_confint(successes_a, n_a, alpha=1-confidence_level, method='normal')

# The normal-approximation interval is symmetric, so the margin of error is its half-width
margin_of_error = upper_bound - p_a

# Print the margin of error and the confidence interval
print('Margin of Error: ', round(margin_of_error, 4))
print('p_a: ', p_a)
print("Confidence Interval: [{:.4f}, {:.4f}]".format(lower_bound, upper_bound))


Margin of Error:  0.0285
p_a:  0.12
Confidence Interval: [0.0915, 0.1485]


In [4]:
# Specify the number of successes (events of interest) and the sample size for each group
successes_a = 60  # Number of successes in Group A
n_a = 500        # Sample size of Group A

successes_b = 75  # Number of successes in Group B
n_b = 500        # Sample size of Group B

# Calculate the proportion (success rate) for each group
p_a = successes_a / n_a
p_b = successes_b / n_b

# Calculate the confidence interval for Group B at the desired confidence level
confidence_level = 0.95
lower_bound, upper_bound = proportion_confint(successes_b, n_b, alpha=1-confidence_level, method='normal')

# The normal-approximation interval is symmetric, so the margin of error is its half-width
margin_of_error = upper_bound - p_b

# Print the margin of error and the confidence interval
print('Margin of Error: ', round(margin_of_error, 4))
print('p_b: ', p_b)
print("Confidence Interval: [{:.4f}, {:.4f}]".format(lower_bound, upper_bound))


Margin of Error:  0.0313
p_b:  0.15
Confidence Interval: [0.1187, 0.1813]


In [5]:
import statsmodels.api as sm
import numpy as np

# Sample sizes and number of conversions for each design
n_a = 500
n_b = 500
successes_a = 60
successes_b = 75

# Conversion rates for each design
p_a = successes_a / n_a
p_b = successes_b / n_b

# Calculate the standard errors
se_a = np.sqrt(p_a * (1 - p_a) / n_a)
se_b = np.sqrt(p_b * (1 - p_b) / n_b)

# Calculate the confidence intervals
z = 1.96  # 95% confidence level (for a two-tailed test)
ci_a = (p_a - z * se_a, p_a + z * se_a)
ci_b = (p_b - z * se_b, p_b + z * se_b)

# Perform the two-proportion z-test
count = np.array([successes_a, successes_b])
nobs = np.array([n_a, n_b])
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Print the results
print("Design A:")
print("Conversion Rate: {:.2%}".format(p_a))
print("Confidence Interval: ({:.2%}, {:.2%})".format(ci_a[0], ci_a[1]))
print()

print("Design B:")
print("Conversion Rate: {:.2%}".format(p_b))
print("Confidence Interval: ({:.2%}, {:.2%})".format(ci_b[0], ci_b[1]))
print()

print("Two-Proportion Z-Test:")
print("Z-Score: {:.4f}".format(z_stat))
print("P-Value: {:.4f}".format(p_value))


Design A:
Conversion Rate: 12.00%
Confidence Interval: (9.15%, 14.85%)

Design B:
Conversion Rate: 15.00%
Confidence Interval: (11.87%, 18.13%)

Two-Proportion Z-Test:
Z-Score: -1.3881
P-Value: 0.1651


In [7]:
import statsmodels.api as sm
import numpy as np

# Sample sizes and number of conversions for each design
n_a = 500
n_b = 500
successes_a = 60
successes_b = 75

# Conversion rates for each design
p_a = successes_a / n_a
p_b = successes_b / n_b

# Perform the two-proportion z-test
count = np.array([successes_a, successes_b])
nobs = np.array([n_a, n_b])
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Print the results
print("Two-Proportion Z-Test:")
print("Z-Score: {:.4f}".format(z_stat))
print("P-Value: {:.4f}".format(p_value))


Two-Proportion Z-Test:
Z-Score: -1.3881
P-Value: 0.1651


In [9]:
import statsmodels.api as sm

# Sample sizes and number of conversions for each group
n_a = 500
n_b = 700
successes_a = 80
successes_b = 110

# Conversion rates for each group
p_a = successes_a / n_a
p_b = successes_b / n_b

# Perform the two-proportion z-test
count = [successes_a, successes_b]
nobs = [n_a, n_b]
z_stat, p_value = sm.stats.proportions_ztest(count, nobs)

# Calculate the confidence intervals
ci_a = sm.stats.proportion_confint(successes_a, n_a)
ci_b = sm.stats.proportion_confint(successes_b, n_b)

# Print the results
print("Group A:")
print("Conversion Rate: {:.2%}".format(p_a))
print("Confidence Interval: [{:.2%}, {:.2%}]".format(ci_a[0], ci_a[1]))
print()

print("Group B:")
print("Conversion Rate: {:.2%}".format(p_b))
print("Confidence Interval: [{:.2%}, {:.2%}]".format(ci_b[0], ci_b[1]))
print()

print("Two-Proportion Z-Test:")
print("Z-Score: {:.4f}".format(z_stat))
print("P-Value: {:.4f}".format(p_value))


Group A:
Conversion Rate: 16.00%
Confidence Interval: [12.79%, 19.21%]

Group B:
Conversion Rate: 15.71%
Confidence Interval: [13.02%, 18.41%]

Two-Proportion Z-Test:
Z-Score: 0.1337
P-Value: 0.8937


In [10]:
import scipy.stats as stats
import numpy as np

# Group A
n_a = 100
mean_a = 75
std_a = 10

# Group B
n_b = 150
mean_b = 80
std_b = 15

# Calculate the 95% confidence intervals for the means (normal approximation)
ci_a = stats.norm.interval(0.95, loc=mean_a, scale=std_a / np.sqrt(n_a))
ci_b = stats.norm.interval(0.95, loc=mean_b, scale=std_b / np.sqrt(n_b))

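# Perform a two-sample t-test from summary statistics
# (the default equal_var=True pools the variances; pass equal_var=False for Welch's test)
t_stat, p_value = stats.ttest_ind_from_stats(mean_a, std_a, n_a, mean_b, std_b, n_b)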

# Print the results
print("Group A:")
print("Mean score: {:.2f}".format(mean_a))
print("Confidence Interval: [{:.2f}, {:.2f}]".format(ci_a[0], ci_a[1]))
print()

print("Group B:")
print("Mean score: {:.2f}".format(mean_b))
print("Confidence Interval: [{:.2f}, {:.2f}]".format(ci_b[0], ci_b[1]))
print()

print("Two-Sample T-Test:")
print("T-Statistic: {:.4f}".format(t_stat))
print("P-Value: {:.4f}".format(p_value))


Group A:
Mean score: 75.00
Confidence Interval: [73.04, 76.96]

Group B:
Mean score: 80.00
Confidence Interval: [77.60, 82.40]

Two-Sample T-Test:
T-Statistic: -2.9269
P-Value: 0.0037
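
Since the two groups above have different standard deviations and sample sizes, it may be worth checking the result without the equal-variance assumption. The sketch below recomputes the statistic by hand in the style of Welch's t-test, using the same summary statistics and only the standard library. It stops at the statistic and the approximate degrees of freedom rather than a p-value, because the t distribution is not available in the standard library.

```python
import math

# Summary statistics from the cell above
n_a, mean_a, std_a = 100, 75, 10
n_b, mean_b, std_b = 150, 80, 15

# Squared standard error of each group's mean
var_mean_a = std_a ** 2 / n_a
var_mean_b = std_b ** 2 / n_b

# Standard error of the difference, without pooling the variances
se_diff = math.sqrt(var_mean_a + var_mean_b)
t_welch = (mean_a - mean_b) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df = (var_mean_a + var_mean_b) ** 2 / (
    var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
)

print("Welch t-statistic: {:.4f}".format(t_welch))
print("Approximate degrees of freedom: {:.1f}".format(df))
```

The unpooled statistic is somewhat larger in magnitude than the pooled one reported above, so the conclusion of a significant difference does not hinge on the equal-variance assumption here.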
