# A/B Testing on Email Subject Line

<br>
  
We tested two subject lines for an event invitation to see which achieved a better open rate and click-through rate. Each version was sent to 1,076 doctors at the same date and time.


### **Subject Line A**: Register today! 2 International Speakers - Event Title

>**Open Rate**: 24.7%
>
>**Click-Through Rate**: 1.7%

<br>

### **Subject Line B**: Meeting Invite: 2 International Speakers - Event Title (Limited seating)

>**Open Rate**: 22.5%
>
>**Click-Through Rate**: 2.9%

<br>

In [24]:
import plotly.graph_objects as go
headline=["Open Rate", "Click-Through Rate"]

fig = go.Figure(data=[
    go.Bar(name='Subject Line A', x=headline, y=[0.247, 0.017]),
    go.Bar(name='Subject Line B', x=headline, y=[0.225, 0.029])
])
# Change the bar mode
fig.update_layout(barmode='group')
fig.show()

# Analysing Effect on Open Rate

<br>

We need to test whether there was a statistical difference between the open rates of Email A vs Email B.

>H0: Open rate of Email A = Open rate of Email B
>
>H1: Open rate of Email A is statistically different from Open rate of Email B

In [34]:
import numpy as np
from scipy.stats import mannwhitneyu

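# Number of recipients per subject line
num_a, num_b = 1076, 1076
# Number of recipients who opened each email
open_a, open_b = 266, 242
rate_a, rate_b = open_a / num_a, open_b / num_b

# Represent each recipient as 1 (opened) or 0 (did not open)
a_dist = np.zeros(num_a)
a_dist[:open_a] = 1
b_dist = np.zeros(num_b)
b_dist[:open_b] = 1

# Note: SciPy versions before 1.7 return a one-sided p-value by default;
# later versions default to alternative="two-sided"
stat, p_value = mannwhitneyu(a_dist, b_dist)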

print(f"p-value is {p_value:0.3f}")

if p_value > 0.05:
    print("No significant difference in open rates (fail to reject H0)")
else:
    print("Significant difference in open rates (reject H0)")

p-value is 0.112
No significant difference in open rates (fail to reject H0)
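As a cross-check, the same comparison can be run as a pooled two-proportion z-test, which for 0/1 outcomes is closely related to the Mann-Whitney U test used above. This is a minimal sketch using only the standard library; the helper name `two_proportion_z_test` is hypothetical and not part of the original analysis. It returns both a one-sided and a two-sided p-value, because the hypotheses above are stated as two-sided while the p-value printed above (0.112) is close to the one-sided figure from this approximation.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test (normal approximation, no continuity correction).

    Returns (z, one_sided_p, two_sided_p). Hypothetical helper for illustration.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under H0 (both emails share the same underlying rate)
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # P(|Z| >= |z|) for a standard normal variable
    two_sided_p = math.erfc(abs(z) / math.sqrt(2))
    # One-sided p-value in the direction of the observed difference
    one_sided_p = two_sided_p / 2
    return z, one_sided_p, two_sided_p


# Open counts from the analysis above: 266 of 1076 (A) vs 242 of 1076 (B)
z, p_one, p_two = two_proportion_z_test(266, 1076, 242, 1076)
print(f"z = {z:0.3f}, one-sided p = {p_one:0.3f}, two-sided p = {p_two:0.3f}")
```

Under this approximation the open-rate difference is not significant at alpha = 0.05 on either basis, which agrees with the result above.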


### Conclusion

>Because the p-value is 0.112 (greater than our significance level, alpha = 0.05), we cannot reject the null hypothesis that the open rates of Email A and Email B are the same.
>
>**Therefore, there was not a significant difference between the open rate results of Email A vs Email B.**
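An interval estimate can complement the test result by showing how large the difference in open rates could plausibly be. The sketch below computes a Wald (normal approximation) confidence interval for the difference; the helper name `diff_confidence_interval` is hypothetical, and the 95% level is an assumption chosen to match alpha = 0.05.

```python
import math
from statistics import NormalDist


def diff_confidence_interval(success_a, n_a, success_b, n_b, level=0.95):
    """Wald confidence interval for (rate_a - rate_b). Hypothetical helper."""
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Unpooled standard error of the difference between two proportions
    std_err = math.sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    # Two-sided critical value, about 1.96 for a 95% interval
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    diff = rate_a - rate_b
    return diff - z_crit * std_err, diff + z_crit * std_err


# Open counts from the analysis above: 266 of 1076 (A) vs 242 of 1076 (B)
low, high = diff_confidence_interval(266, 1076, 242, 1076)
print(f"95% CI for open-rate difference (A - B): [{low:+.3f}, {high:+.3f}]")
```

An interval that contains zero is consistent with failing to reject H0.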

<br>

# Analysing Effect on Click-Through Rate

<br>

We need to test whether there was a statistical difference between the click-through rates of Email A vs Email B.

>H0: Click-through rate of Email A = Click-through rate of Email B
>
>H1: Click-through rate of Email A is statistically different from Click-through rate of Email B

In [35]:
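# Number of recipients per subject line
num_a, num_b = 1076, 1076
# Number of recipients who clicked through from each email
click_a, click_b = 18, 31
rate_a, rate_b = click_a / num_a, click_b / num_b

# Represent each recipient as 1 (clicked) or 0 (did not click)
a_dist = np.zeros(num_a)
a_dist[:click_a] = 1
b_dist = np.zeros(num_b)
b_dist[:click_b] = 1

stat, p_value = mannwhitneyu(a_dist, b_dist)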

print(f"p-value is {p_value:0.3f}")

if p_value > 0.05:
    print("No significant difference in click-through rates (fail to reject H0)")
else:
    print("Significant difference in click-through rates (reject H0)")

p-value is 0.030
Significant difference in click-through rates (reject H0)
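With click counts as small as 18 and 31, an exact test is a common alternative to a rank-based or normal approximation. The sketch below implements a two-sided Fisher exact test with the standard library only; the helper name `fisher_exact_two_sided` is hypothetical and this test is not part of the original analysis. Because it is two-sided, in line with H1 as stated, its p-value will generally be larger than a one-sided figure.

```python
from math import comb


def fisher_exact_two_sided(success_a, n_a, success_b, n_b):
    """Two-sided Fisher exact test for a 2x2 table. Hypothetical helper.

    Conditions on the total number of successes and sums the probabilities
    of all tables that are no more likely than the observed one.
    """
    total_success = success_a + success_b
    denom = comb(n_a + n_b, total_success)

    def table_prob(k):
        # Hypergeometric probability that group A holds k of the successes
        return comb(n_a, k) * comb(n_b, total_success - k) / denom

    k_min = max(0, total_success - n_b)
    k_max = min(n_a, total_success)
    observed = table_prob(success_a)
    # Small relative tolerance so ties with the observed table are included
    p_value = sum(
        p
        for p in (table_prob(k) for k in range(k_min, k_max + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Click counts from the analysis above: 18 of 1076 (A) vs 31 of 1076 (B)
click_p = fisher_exact_two_sided(18, 1076, 31, 1076)
print(f"Two-sided Fisher exact p-value: {click_p:0.3f}")
```

Comparing this value with the p-value printed above indicates how sensitive the conclusion is to the choice of a one-sided or two-sided test.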


### Conclusion 

>Because the p-value is 0.030 (less than our significance level, alpha = 0.05), we can reject the null hypothesis that the click-through rates of Email A and Email B are the same.
>
>**Therefore, there WAS a significant difference between the click-through rates of Email A and Email B.**

<br>

# Subject Line B achieved a higher click-through rate than Subject Line A

<br>

- The different subject lines did not result in a statistically different open rate.

- However, the click-through rate of Subject Line B was statistically different from that of Subject Line A.

- Overall, Subject Line B achieved a higher click-through rate (2.9%) than Subject Line A (1.7%).

- **It seems that including "(Limited seating)" may have encouraged a higher click-through rate.** However, the two subject lines also differed in their opening phrase ("Register today!" vs "Meeting Invite:"), so the effect cannot be attributed to that phrase alone.

<br>

# Recommendations

- Run a new A/B test in which the only difference between the two subject lines is the phrase "(Limited seating)".
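When planning the follow-up test, it may help to estimate how many recipients each subject line needs. The sketch below uses a standard sample-size approximation for a two-sided two-proportion z-test. The helper name `sample_size_per_group`, the 80% power target, and the use of the observed rates (1.7% and 2.9%) as the expected rates are all assumptions for illustration, not figures from the original analysis.

```python
import math
from statistics import NormalDist


def sample_size_per_group(rate_a, rate_b, alpha=0.05, power=0.80):
    """Approximate recipients needed per group for a two-sided
    two-proportion z-test. Hypothetical helper for illustration."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    mean_rate = (rate_a + rate_b) / 2
    numerator = (
        z_alpha * math.sqrt(2 * mean_rate * (1 - mean_rate))
        + z_beta * math.sqrt(rate_a * (1 - rate_a) + rate_b * (1 - rate_b))
    ) ** 2
    return math.ceil(numerator / (rate_a - rate_b) ** 2)


# Assumption: the follow-up test sees click-through rates similar to
# the observed 1.7% and 2.9%
n_required = sample_size_per_group(0.017, 0.029)
print(f"Recipients needed per subject line: {n_required}")
```

If the true effect of the phrase alone is smaller than the difference observed here, the required sample size would be larger.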