# A/B Testing - Headline Engagement Analysis

## Table of Contents
- [Problem Statement](#Problem-Statement)
- [Datasets](#Datasets)
- [Research Questions](#Research-Questions)
- [Data Exploration](#Data-Exploration)
- [Question 1](#Question-1)
- [Question 2](#Question-2)
- [Question 3](#Question-3)
- [Question 4](#Question-4)
- [Question 5](#Question-5)
- [Question 6](#Question-6)
- [Key Takeaways](#Key-Takeaways)

## Problem Statement

I am working with the [`Upworthy`](https://www.upworthy.com/) `Research Archive`, a dataset of **30,000+ A/B headline experiments** between 2013 and 2015. Each experiment includes multiple headline variations, impressions, and clicks. The business team wants to understand what types of headlines drive more engagement, and to identify unusual results that may represent either breakthrough ideas or data quality issues.

### My Task
My task here is to explore these experiments and answer research questions about headline effectiveness. Specifically, the project investigates whether features like numbers, question framing, length, sentiment, or timing impact engagement, and whether results replicate across exploratory and confirmatory experiments.

# Datasets

The data comes from the Upworthy Research Archive, which contains tens of thousands of randomized headline tests. For this project, I use four curated datasets:

- `upworthy-archive-confirmatory-packages-03.12.2020.csv` — confirmatory experiments designed to validate prior findings
- `upworthy-archive-exploratory-packages-03.12.2020.csv` — exploratory experiments designed to generate hypotheses
- `upworthy-archive-holdout-packages-03.12.2020.csv` — holdout data for validation and robustness checks
- `upworthy-archive-undeployed-packages.01.12.2021.csv` — experiments that were not deployed to production


In [1]:
import pandas as pd
confirmatory = pd.read_csv("upworthy-archive-confirmatory-packages-03.12.2020.csv",low_memory=False, index_col=0)
exploratory = pd.read_csv("upworthy-archive-exploratory-packages-03.12.2020.csv",low_memory=False, index_col=0)
holdout = pd.read_csv("upworthy-archive-holdout-packages-03.12.2020.csv",low_memory=False, index_col=0)
undeployed = pd.read_csv("upworthy-archive-undeployed-packages.01.12.2021.csv",low_memory=False, index_col=0)

# Research Questions

These are the questions I will investigate:

1. `Numeric Impact:` Do headlines containing numbers (e.g., listicles, statistics) achieve a higher click-through rate compared to headlines without numbers?
2. `Question Framing:` Does posing a headline as a direct question lead to increased user engagement over declarative headline formats?
3. `Length Efficiency:` Is there an optimal headline length, and are shorter headlines generally more effective than longer ones in driving clicks?
4. `Sentiment Influence:` Do headlines with strong positive or negative emotional sentiment outperform neutral headlines in engagement metrics?
5. `Text-Only Effect:` When the accompanying image is held constant, do textual differences alone significantly influence click-through rates?
6. `Result Robustness:` Do findings observed in exploratory A/B tests consistently replicate in subsequent confirmatory experiments, ensuring reliability and generalizability?

# Data Exploration

In [2]:
confirmatory.head()

Unnamed: 0,created_at,updated_at,clickability_test_id,excerpt,headline,lede,slug,eyecatcher_id,impressions,clicks,significance,first_place,winner,share_text,square,test_week
11,2014-11-20 11:33:26.475,2016-04-02 16:25:54.046,546dd17e26714c82cc00001c,Things that matter. Pass 'em on.,"Let’s See … Hire Cops, Pay Teachers, Buy Books...",<p>Iff you start with the basic fact that inno...,let-s-see-hire-cops-pay-teachers-buy-books-for...,546dce659ad54ec65b000041,3118,8,0.1,False,False,,,201446
12,2014-11-20 15:00:01.032,2016-04-02 16:25:54.128,546e01d626714c6c4400004e,Things that matter. Pass 'em on.,People Sent This Lesbian Questions And Her Rai...,<p>I'll be honest. I've wondered about 7.</p>,people-sent-this-lesbian-questions-and-her-rai...,546d1b4bfd3617f091000041,4587,130,55.8,False,False,,,201446
13,2014-11-20 11:33:51.973,2016-04-02 16:25:54.069,546dd17e26714c82cc00001c,Things that matter. Pass 'em on.,$3 Million Is What It Takes For A State To Leg...,<p>Iff you start with the basic fact that inno...,3-million-is-what-it-takes-for-a-state-to-lega...,546dce659ad54ec65b000041,3017,19,26.9,False,False,,,201446
14,2014-11-20 11:34:12.107,2016-04-02 16:25:54.049,546dd17e26714c82cc00001c,Things that matter. Pass 'em on.,The Fact That Sometimes Innocent People Are Ex...,<p>Iff you start with the basic fact that inno...,the-fact-that-sometimes-innocent-people-are-ex...,546dce659ad54ec65b000041,2974,26,100.0,True,False,,,201446
15,2014-11-20 11:34:33.935,2016-04-02 16:25:54.072,546dd17e26714c82cc00001c,Things that matter. Pass 'em on.,Reason #351 To End The Death Penalty: It Costs...,<p>Iff you start with the basic fact that inno...,reason-351-to-end-the-death-penalty-it-costs-3...,546dce659ad54ec65b000041,3050,10,0.2,False,False,,,201446


In [3]:
confirmatory.info()

<class 'pandas.core.frame.DataFrame'>
Index: 105551 entries, 11 to 150815
Data columns (total 16 columns):
 #   Column                Non-Null Count   Dtype  
---  ------                --------------   -----  
 0   created_at            105551 non-null  object 
 1   updated_at            105551 non-null  object 
 2   clickability_test_id  105551 non-null  object 
 3   excerpt               94335 non-null   object 
 4   headline              105551 non-null  object 
 5   lede                  105485 non-null  object 
 6   slug                  105551 non-null  object 
 7   eyecatcher_id         105420 non-null  object 
 8   impressions           105551 non-null  int64  
 9   clicks                105551 non-null  int64  
 10  significance          105551 non-null  float64
 11  first_place           105551 non-null  bool   
 12  winner                105551 non-null  bool   
 13  share_text            14632 non-null   object 
 14  square                34850 non-null   object 
 15  test

In [4]:
exploratory.head()

Unnamed: 0,created_at,updated_at,clickability_test_id,excerpt,headline,lede,slug,eyecatcher_id,impressions,clicks,significance,first_place,winner,share_text,square,test_week
0,2014-11-20 06:43:16.005,2016-04-02 16:33:38.062,546d88fb84ad38b2ce000024,Things that matter. Pass 'em on.,They're Being Called 'Walmart's Worst Nightmar...,"<p>When I saw *why* people are calling them ""W...",theyre-being-called-walmarts-worst-nightmare-a...,546d6fa19ad54eec8d00002d,3052,150,100.0,True,True,Anyone who's ever felt guilty about shopping a...,,201446
1,2014-11-20 06:43:44.646,2016-04-02 16:25:54.021,546d88fb84ad38b2ce000024,Things that matter. Pass 'em on.,They're Being Called 'Walmart's Worst Nightmar...,"<p>When I saw *why* people are calling them ""W...",theyre-being-called-walmarts-worst-nightmare-a...,546d6fa19ad54eec8d00002d,3033,122,14.0,False,False,Walmart is getting schooled by another retaile...,,201446
2,2014-11-20 06:44:59.804,2016-04-02 16:25:54.024,546d88fb84ad38b2ce000024,Things that matter. Pass 'em on.,They're Being Called 'Walmart's Worst Nightmar...,"<p>When I saw *why* people are calling them ""W...",theyre-being-called-walmarts-worst-nightmare-a...,546d6fa19ad54eec8d00002d,3092,110,1.8,False,False,Walmart may not be crapping their pants over t...,,201446
3,2014-11-20 06:54:36.335,2016-04-02 16:25:54.027,546d902c26714c6c44000039,Things that matter. Pass 'em on.,This Is What Sexism Against Men Sounds Like,<p>DISCLOSURE: I'm a dude. I have cried on mul...,this-is-what-sexism-against-men-sounds-like-am...,546bc55335992b86c8000043,3526,90,4.1,False,False,"If you ever wondered, ""but what about the men?...",,201446
4,2014-11-20 06:54:57.878,2016-04-02 16:31:45.671,546d902c26714c6c44000039,Things that matter. Pass 'em on.,This Is What Sexism Against Men Sounds Like,<p>DISCLOSURE: I'm a dude. I have cried on mul...,this-is-what-sexism-against-men-sounds-like-am...,546d900426714cd2dd00002e,3506,120,100.0,True,False,"If you ever wondered, ""but what about the men?...",,201446


In [5]:
exploratory.info()

<class 'pandas.core.frame.DataFrame'>
Index: 22666 entries, 0 to 150816
Data columns (total 16 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   created_at            22666 non-null  object 
 1   updated_at            22666 non-null  object 
 2   clickability_test_id  22666 non-null  object 
 3   excerpt               20249 non-null  object 
 4   headline              22666 non-null  object 
 5   lede                  22654 non-null  object 
 6   slug                  22666 non-null  object 
 7   eyecatcher_id         22644 non-null  object 
 8   impressions           22666 non-null  int64  
 9   clicks                22666 non-null  int64  
 10  significance          22666 non-null  float64
 11  first_place           22666 non-null  bool   
 12  winner                22666 non-null  bool   
 13  share_text            3208 non-null   object 
 14  square                7446 non-null   object 
 15  test_week             2

In [6]:
holdout.head()

Unnamed: 0,created_at,updated_at,clickability_test_id,excerpt,headline,lede,slug,eyecatcher_id,impressions,clicks,significance,first_place,winner,share_text,square,test_week
37,2014-11-20 12:59:23.366,2016-04-02 16:31:40.704,546de5a784ad3834f000004a,Things that matter. Pass 'em on.,The Fact That Sometimes Innocent People Are Ex...,<p>Iff you start with the basic fact that inno...,the-fact-that-sometimes-innocent-people-are-ex...,546de0d084ad380b59000031,2996,53,11.4,False,False,,,201446
38,2014-11-20 12:59:49.054,2016-04-02 16:25:54.099,546de5a784ad3834f000004a,Things that matter. Pass 'em on.,The Simple Fact That Some Innocent People Are ...,<p>Iff you start with the basic fact that inno...,the-simple-fact-that-some-innocent-people-are-...,546de0d084ad380b59000031,2991,35,0.0,False,False,,,201446
39,2014-11-20 13:00:19.893,2016-04-02 16:31:40.707,546de5a784ad3834f000004a,Things that matter. Pass 'em on.,Sometimes Innocent People Are Executed. I Thin...,<p>Iff you start with the basic fact that inno...,sometimes-innocent-people-are-executed-i-think...,546de0d084ad380b59000031,3100,49,0.5,False,False,,,201446
40,2014-11-20 13:00:34.29,2016-04-02 16:31:40.71,546de5a784ad3834f000004a,Things that matter. Pass 'em on.,The Fact That Sometimes Innocent People Are Ex...,<p>Iff you start with the basic fact that inno...,the-fact-that-sometimes-innocent-people-are-ex...,546de0d084ad380b59000031,3010,44,0.1,False,False,,,201446
41,2014-11-20 13:01:35.944,2016-04-02 16:25:54.102,546de5a784ad3834f000004a,Things that matter. Pass 'em on.,Sometimes Innocent People Are Executed. That's...,<p>Iff you start with the basic fact that inno...,sometimes-innocent-people-are-executed-thats-e...,546de0d084ad380b59000031,3119,73,100.0,True,False,,,201446


In [7]:
holdout.info()

<class 'pandas.core.frame.DataFrame'>
Index: 22600 entries, 37 to 150809
Data columns (total 16 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   created_at            22600 non-null  object 
 1   updated_at            22600 non-null  object 
 2   clickability_test_id  22600 non-null  object 
 3   excerpt               20206 non-null  object 
 4   headline              22599 non-null  object 
 5   lede                  22574 non-null  object 
 6   slug                  22600 non-null  object 
 7   eyecatcher_id         22572 non-null  object 
 8   impressions           22600 non-null  int64  
 9   clicks                22600 non-null  int64  
 10  significance          22600 non-null  float64
 11  first_place           22600 non-null  bool   
 12  winner                22600 non-null  bool   
 13  share_text            3060 non-null   object 
 14  square                7504 non-null   object 
 15  test_week             

In [8]:
undeployed.head()

_id,created_at,updated_at,clickability_test_id,excerpt,headline,lede,slug,eyecatcher_id,impressions,clicks,significance,first_place,winner,share_text,square
555a2d113665340022530200,2015-05-18 18:18:57.908,2016-04-02 16:32:37.13,555a2c24366534000c020200,,When companies add tiny plastic beads to perso...,<p>Yuck!</p>,when-companies-add-tiny-plastic-beads-to-perso...,55566ab6653430002c590000,0.0,0.0,56.5,False,False,,
56f2bd000e77ce00360000a8,2016-03-23 15:57:52.706,2016-04-02 16:34:00.648,56f2bcfd0e77ce00270000b4,It's about who you are inside.,Show this comic to people when they're confuse...,,show-this-comic-to-people-when-theyre-confused...,56f1b6938f7a76001b000047,,,28.2,False,False,,
555a2d6c36653400170d0200,2015-05-18 18:20:28.684,2016-04-02 16:32:30.623,555a2c24366534000c020200,,When companies add tiny plastic beads to perso...,<p>This is one crazy life cycle.</p>,when-companies-add-tiny-plastic-beads-to-perso...,55566ab6653430002c590000,0.0,0.0,100.0,True,True,,
5707ab68ffaff9002900001b,2016-04-08 13:00:24.131,2016-04-08 13:42:55.901,57065b9059651f0030000002,"""You won't silence me.""",19 powerful photos of queer people refusing to...,,19-powerful-photos-of-queer-people-refusing-to...,57072cd14d4c7c0030000015,,,100.0,True,False,,
5707efbfc70bbe002a00001c,2016-04-08 17:51:59.674,2016-04-08 19:22:02.976,5707c5b24d4c7c002a000035,,"My wife came out at work, and her coworkers th...",,my-wife-came-out-at-work-and-her-coworkers-thr...,5707cba2ffaff90029000036,,,100.0,True,False,,


In [9]:
undeployed.info()

<class 'pandas.core.frame.DataFrame'>
Index: 78232 entries, 555a2d113665340022530200 to 555a2ce0613831001ce50100
Data columns (total 15 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   created_at            78232 non-null  object 
 1   updated_at            78232 non-null  object 
 2   clickability_test_id  78175 non-null  object 
 3   excerpt               73132 non-null  object 
 4   headline              78224 non-null  object 
 5   lede                  20165 non-null  object 
 6   slug                  78232 non-null  object 
 7   eyecatcher_id         78231 non-null  object 
 8   impressions           24021 non-null  float64
 9   clicks                24021 non-null  float64
 10  significance          78232 non-null  float64
 11  first_place           78232 non-null  bool   
 12  winner                78232 non-null  bool   
 13  share_text            18883 non-null  object 
 14  square                356 non-nul

# Question 1

`Scenario:` Some headlines in the dataset include numbers (e.g., “10 ways to…”), while others do not. 

Marketing teams often believe that numbers catch attention and improve engagement.

`Do headlines containing numbers have a higher click-through rate (CTR) than those without numbers?`



In [10]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

confirmatory['has_number'] = confirmatory['headline'].str.contains(r"\d", na=False)
confirmatory['ctr'] = confirmatory['clicks'] / confirmatory['impressions']
by_number = confirmatory.groupby('has_number')[['clicks', 'impressions']].sum()

# groupby sorts its keys, so position 0 is has_number=False and position 1 is has_number=True
num_clicks = np.array(by_number['clicks'])
num_impr = np.array(by_number['impressions'])
# proportions_ztest compares the first group minus the second: CTR(no number) - CTR(number)
num_stat, num_pval = proportions_ztest(num_clicks, num_impr)

alpha = 0.05
print("Z-test stat:", num_stat)
print("p-value:", num_pval)

if num_pval < alpha:
    print("Result: Statistically significant difference.")
    # A negative statistic means the second group (headlines with numbers) has the higher CTR
    if num_stat < 0:
        print("Interpretation: Headlines containing numbers have a significantly higher CTR.")
    else:
        print("Interpretation: Headlines containing numbers have a significantly lower CTR.")
else:
    print("Result: No statistically significant difference in CTR between headlines with and without numbers.")

Z-test stat: -9.426786510364918
p-value: 4.2284502580909284e-21
Result: Statistically significant difference.
Interpretation: Headlines containing numbers have a significantly higher CTR.
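The sign of a two-proportion z statistic depends on the order in which the groups are passed. The sketch below re-implements the pooled test by hand, using only the standard library, to make that convention visible: the statistic is the first group's rate minus the second's. The counts are hypothetical and the helper name `two_proportion_ztest` is mine; it is meant as an illustration of the convention, not a replacement for `proportions_ztest`.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(clicks, impressions):
    """Pooled two-sample z-test for proportions (group 0 minus group 1)."""
    p0 = clicks[0] / impressions[0]
    p1 = clicks[1] / impressions[1]
    pooled = (clicks[0] + clicks[1]) / (impressions[0] + impressions[1])
    se = math.sqrt(pooled * (1 - pooled) * (1 / impressions[0] + 1 / impressions[1]))
    z = (p0 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value


# Hypothetical counts, ordered the way groupby orders a boolean key:
# position 0 = has_number False, position 1 = has_number True
toy_clicks = [1500, 1600]
toy_impr = [100_000, 100_000]

z, p = two_proportion_ztest(toy_clicks, toy_impr)
print(f"z = {z:.3f}, p = {p:.4f}")

# A negative z means the second group (with numbers) has the higher rate
if z < 0:
    print("The group with numbers has the higher CTR in this toy example.")
else:
    print("The group without numbers has the higher CTR in this toy example.")
```

Reading the sign this way, a negative statistic with the no-number group listed first points to a higher CTR for headlines with numbers.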


# Question 2

`Scenario:` Editors sometimes frame headlines as questions to spark curiosity, while others are written as statements. The business team wants to know if this style difference matters.

`Does the presence of a question mark in a headline increase CTR compared to declarative headlines?`



In [11]:
confirmatory['has_question'] = confirmatory['headline'].str.contains(r'\?', na=False)
by_question = confirmatory.groupby('has_question')[['clicks', 'impressions']].sum()

# groupby sorts its keys, so position 0 is has_question=False and position 1 is has_question=True
quest_clicks = np.array(by_question['clicks'])
quest_impr = np.array(by_question['impressions'])

quest_ctr = by_question['clicks'] / by_question['impressions']
# proportions_ztest compares the first group minus the second: CTR(no question mark) - CTR(question mark)
quest_stat, quest_pval = proportions_ztest(quest_clicks, quest_impr)

print("CTR means (by question mark presence):")
print(quest_ctr)
print("\nZ-test stat:", quest_stat)
print("p-value:", quest_pval)

if quest_pval < alpha:
    print("Result: Statistically significant difference.")
    # A negative statistic means the second group (headlines with question marks) has the higher CTR
    if quest_stat < 0:
        print("Interpretation: Headlines with question marks have a significantly higher CTR.")
    else:
        print("Interpretation: Headlines with question marks have a significantly lower CTR.")
else:
    print("Result: No statistically significant difference in CTR between headlines with and without question marks.")

CTR means (by question mark presence):
has_question
False    0.015561
True     0.013515
dtype: float64

Z-test stat: 116.472976585741
p-value: 0.0
Result: Statistically significant difference.
Interpretation: Headlines with question marks have a significantly lower CTR.
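With this many impressions the p-value underflows to 0.0, so the size of the gap is more informative than its significance. A small sketch of the arithmetic, using the two pooled CTRs printed above (the variable names are mine):

```python
# Pooled CTRs copied from the output above
ctr_no_question = 0.015561
ctr_question = 0.013515

# Absolute gap in CTR, and the gap relative to the no-question baseline
abs_diff = ctr_question - ctr_no_question
rel_diff = abs_diff / ctr_no_question

print(f"Absolute difference: {abs_diff * 100:.3f} percentage points")
print(f"Relative difference: {rel_diff:.1%}")
```

The relative difference works out to roughly 13% fewer clicks per impression for headlines with a question mark.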


# Question 3

`Scenario:` There is an ongoing debate whether concise headlines drive better engagement or if longer, descriptive headlines perform better.

`Are shorter headlines more effective (higher CTR) than longer ones?`

In [12]:
confirmatory['length'] = confirmatory['headline'].str.len()
median_len = confirmatory['length'].median()

confirmatory['is_short'] = confirmatory['length'] <= median_len
by_length = confirmatory.groupby('is_short')[['clicks', 'impressions']].sum()

# groupby sorts its keys, so position 0 is is_short=False (long) and position 1 is is_short=True (short)
len_clicks = np.array(by_length["clicks"])
len_impr = np.array(by_length["impressions"])
len_ctr = by_length["clicks"] / by_length["impressions"]

# proportions_ztest compares the first group minus the second: CTR(long) - CTR(short)
len_stat, len_pval = proportions_ztest(len_clicks, len_impr)

print("CTR means (short vs long):")
print(len_ctr)
print("\nZ-test stat:", len_stat)
print("p-value:", len_pval)

if len_pval < alpha:
    print("Result: Statistically significant difference.")
    # A negative statistic means the second group (short headlines) has the higher CTR
    if len_stat < 0:
        print("Interpretation: Short headlines have a significantly higher CTR.")
    else:
        print("Interpretation: Short headlines have a significantly lower CTR.")
else:
    print("Result: No statistically significant difference in CTR between short and long headlines.")

CTR means (short vs long):
is_short
False    0.015724
True     0.014775
dtype: float64

Z-test stat: 75.17670227464309
p-value: 0.0
Result: Statistically significant difference.
Interpretation: Short headlines have a significantly lower CTR.
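A median split only compares two halves, so it cannot show whether some middle length performs best. One possible extension, sketched here on hypothetical toy data, is to bin headline length into quantiles with `pd.qcut` and compute the pooled CTR per bin. The data, the choice of three bins, and the bin labels are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data: one row per headline package
toy = pd.DataFrame({
    "headline": [
        "Short one",
        "A somewhat longer headline here",
        "Tiny",
        "This headline is quite a bit longer than the others",
        "Medium length title",
        "Another fairly long headline for the toy example",
    ],
    "impressions": [1000, 1200, 900, 1100, 1000, 1050],
    "clicks": [12, 20, 9, 18, 15, 16],
})

# Character length of each headline
toy["length"] = toy["headline"].str.len()

# Split lengths into three equally populated bins
toy["length_bin"] = pd.qcut(toy["length"], q=3, labels=["short", "medium", "long"])

# Pooled CTR per bin: total clicks divided by total impressions
by_bin = toy.groupby("length_bin", observed=True)[["clicks", "impressions"]].sum()
by_bin["ctr"] = by_bin["clicks"] / by_bin["impressions"]
print(by_bin)
```

On the real data, more bins (for example deciles) would give a finer view of how CTR changes with length.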


# Question 4

`Scenario:` Emotional appeal is often used in media headlines. Some headlines are neutral, while others carry strong positive or negative sentiment.

`Do emotionally charged headlines (positive or negative sentiment) perform better than neutral ones?`

In [13]:
from textblob import TextBlob
from scipy.stats import chi2_contingency

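# Score each headline with TextBlob; polarity ranges from -1 (negative) to +1 (positive)
confirmatory["polarity"] = confirmatory["headline"].dropna().apply(lambda x: TextBlob(x).sentiment.polarity)
# Bin polarity into three groups. pd.cut intervals are right-closed by default,
# so a polarity of exactly -1 falls outside the first bin and is left as NaN
confirmatory["sentiment"] = pd.cut(
    confirmatory["polarity"], bins=[-1, -0.1, 0.1, 1], labels=["negative", "neutral", "positive"]
)

# Pooled clicks and impressions per sentiment group
by_sentiment = confirmatory.groupby("sentiment", observed=True)[["clicks", "impressions"]].sum()
by_sentiment["sent_ctr"] = by_sentiment["clicks"] / by_sentiment["impressions"]

# Caveat: chi2_contingency treats both columns as outcome counts. Passing impressions
# instead of non-clicks (impressions - clicks) slightly distorts the statistic
sent_chi2, sent_pval, sent_dof, sent_exp = chi2_contingency(by_sentiment[["clicks", "impressions"]])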

print("CTR by sentiment group:")
print(by_sentiment["sent_ctr"])
print("\nChi-square:", sent_chi2)
print("p-value:", sent_pval)

if sent_pval < alpha:
    print("Result: Statistically significant difference in CTR across sentiment groups.")
else:
    print("Result: No statistically significant difference in CTR across sentiment groups.")

CTR by sentiment group:
sentiment
negative    0.016119
neutral     0.015113
positive    0.014944
Name: sent_ctr, dtype: float64

Chi-square: 4497.1320774237
p-value: 0.0
Result: Statistically significant difference in CTR across sentiment groups.
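A chi-square test of independence expects each row of the table to hold mutually exclusive outcome counts, here clicks and non-clicks. The sketch below computes the Pearson statistic by hand for three groups, using hypothetical counts and only the standard library. For a 3x2 table the test has 2 degrees of freedom, and the chi-square survival function with 2 degrees of freedom is exactly `exp(-x / 2)`, which is what the p-value line relies on.

```python
import math

# Hypothetical pooled counts per sentiment group
groups = ["negative", "neutral", "positive"]
clicks = [160, 150, 150]
impressions = [10_000, 10_000, 10_000]

# Each row must hold mutually exclusive outcomes: clicks and non-clicks
table = [[c, n - c] for c, n in zip(clicks, impressions)]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells
chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (observed - expected) ** 2 / expected

dof = (len(table) - 1) * (len(table[0]) - 1)  # 2 for a 3x2 table
p_value = math.exp(-chi2 / 2)  # exact survival function only when dof == 2

print(f"chi2 = {chi2:.4f}, dof = {dof}, p = {p_value:.4f}")
```

Passing the same clicks and non-clicks table to `chi2_contingency` should give the same statistic, since SciPy applies its continuity correction only when there is a single degree of freedom.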


# Question 5

`Scenario:` Upworthy experiments often tested the same image (eyecatcher) with multiple headline variations. The business team wants to understand whether text alone can explain which version wins.

`When the same image is used across multiple headline variations, which textual features most often determine the winning variation?`

In [14]:
# Pick one eyecatcher (image) used across multiple headline variations
image_id = confirmatory["eyecatcher_id"].iloc[0]
same_image = confirmatory[confirmatory["eyecatcher_id"] == image_id]

# Build table of clicks vs non-clicks for chi-square test
img_table = same_image[["clicks", "impressions"]].copy()
img_table["non_clicks"] = img_table["impressions"] - img_table["clicks"]

# Run chi-square test
img_chi2, img_pval, img_dof, img_exp = chi2_contingency(img_table[["clicks", "non_clicks"]])

print("Observed CTRs per headline variation:")
print((same_image["clicks"] / same_image["impressions"]).head())

print("\nChi-square:", img_chi2)
print("p-value:", img_pval)

if img_pval < alpha:
    print("Result: Statistically significant difference in headline performance (same image).")
else:
    print("Result: No statistically significant difference between headline variations (same image).")

Observed CTRs per headline variation:
11    0.002566
13    0.006298
14    0.008742
15    0.003279
16    0.006534
dtype: float64

Chi-square: 20.310194815491222
p-value: 0.0024383167359220085
Result: Statistically significant difference in headline performance (same image).
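The test above looks at a single image. To study the text-only effect more broadly, one option is to first find the experiments in which every variant shares one image while the headlines differ. The sketch below does this on hypothetical toy rows, assuming that `clickability_test_id` identifies an experiment and `eyecatcher_id` identifies the image; the toy values are invented.

```python
import pandas as pd

# Hypothetical toy rows: one row per package (headline/image combination)
toy = pd.DataFrame({
    "clickability_test_id": ["t1", "t1", "t2", "t2", "t3"],
    "eyecatcher_id": ["img_a", "img_a", "img_b", "img_c", "img_d"],
    "headline": ["Headline one", "Headline two", "Same headline", "Same headline", "Only variant"],
})

# Count distinct images and distinct headlines within each experiment
per_test = toy.groupby("clickability_test_id").agg(
    n_images=("eyecatcher_id", "nunique"),
    n_headlines=("headline", "nunique"),
)

# Text-only experiments: one shared image, more than one headline
text_only_ids = per_test[(per_test["n_images"] == 1) & (per_test["n_headlines"] > 1)].index
text_only = toy[toy["clickability_test_id"].isin(text_only_ids)]
print(text_only)
```

Each experiment selected this way could then be tested on its own, which avoids mixing packages from unrelated experiments that happen to reuse an image.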


# Question 6

`Scenario:` Upworthy first ran exploratory experiments to generate ideas, then confirmatory experiments to validate them. Stakeholders want to know if results are consistent across both phases.

`Do patterns observed in exploratory experiments replicate in confirmatory experiments?`

In [15]:
def test_replication(df, feature_col):
    """Pooled two-proportion z-test for a boolean feature.

    groupby sorts its keys, so the statistic is CTR(feature=False) - CTR(feature=True).
    """
    by_feature = df.groupby(feature_col)[["clicks", "impressions"]].sum()
    rep_clicks = np.array(by_feature["clicks"])
    rep_impr = np.array(by_feature["impressions"])
    rep_stat, rep_pval = proportions_ztest(rep_clicks, rep_impr)
    return rep_stat, rep_pval, by_feature["clicks"] / by_feature["impressions"]

for dataset, dataset_name in [(exploratory, "Exploratory"), (confirmatory, "Confirmatory")]:
    dataset["has_digit"] = dataset["headline"].str.contains(r"\d", na=False)
    dataset["dataset_ctr"] = dataset["clicks"] / dataset["impressions"]
    
    rep_stat, rep_pval, rep_ctr = test_replication(dataset, "has_digit")
    
    print(f"=== {dataset_name} ===")
    print("CTR means (has number vs no number):")
    print(rep_ctr)
    print("Z-test stat:", rep_stat)
    print("p-value:", rep_pval)
    
    if rep_pval < 0.05:
        print("Result: Statistically significant difference.")
        # A negative statistic means the second group (headlines with numbers) has the higher CTR
        if rep_stat < 0:
            print("Interpretation: Headlines with numbers have a significantly higher CTR.")
        else:
            print("Interpretation: Headlines with numbers have a significantly lower CTR.")
    else:
        print("Result: No statistically significant difference in CTR between headlines with and without numbers.")
    print()

=== Exploratory ===
CTR means (has number vs no number):
has_digit
False    0.015258
True     0.014928
dtype: float64
Z-test stat: 9.491400426076385
p-value: 2.279509714241023e-21
Result: Statistically significant difference.
Interpretation: Headlines with numbers have a significantly lower CTR.

=== Confirmatory ===
CTR means (has number vs no number):
has_digit
False    0.015221
True     0.015375
dtype: float64
Z-test stat: -9.426786510364918
p-value: 4.2284502580909284e-21
Result: Statistically significant difference.
Interpretation: Headlines with numbers have a significantly higher CTR.
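Two significant results replicate only if they point the same way. A minimal sketch of that check, using the pooled CTRs printed above (the dictionary layout and names are mine):

```python
# Pooled CTRs copied from the output above
reported = {
    "Exploratory": {"no_number": 0.015258, "number": 0.014928},
    "Confirmatory": {"no_number": 0.015221, "number": 0.015375},
}

# Effect of including a number: CTR(with number) - CTR(without number)
effects = {name: ctr["number"] - ctr["no_number"] for name, ctr in reported.items()}

# The finding replicates in direction only if both effects share a sign
same_direction = (effects["Exploratory"] > 0) == (effects["Confirmatory"] > 0)

for name, effect in effects.items():
    print(f"{name}: effect = {effect:+.6f}")
print("Same direction in both phases:", same_direction)
```

Here the effect is negative in the exploratory data and positive in the confirmatory data, so the direction does not replicate even though both tests are significant.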



# Key Takeaways

**Q1: Numbers Slightly Raise CTR**
In the confirmatory data, headlines containing digits (such as “7 Ways to…”) had a slightly higher pooled CTR than those without (1.54% vs. 1.52%). The difference is statistically significant (z ≈ -9.43 with the no-number group listed first, p ≈ 4e-21) but small in absolute terms, so it offers only weak support for the “listicle” intuition.

**Q2: Questions Lower CTR**
Headlines containing a question mark had a lower pooled CTR than those without (1.35% vs. 1.56%; z ≈ 116.5 with the no-question group listed first). In this dataset, question framing did not deliver the curiosity-driven lift that editors often expect.

**Q3: Brevity Does Not Win**
Headlines at or below the median length had a lower pooled CTR than longer ones (1.48% vs. 1.57%; z ≈ 75.2 with the long group listed first). Because the median split compares only two halves, this result does not identify an optimal length.

**Q4: Negative Sentiment Leads**
CTR differed significantly across sentiment groups (chi-square ≈ 4497). Negative headlines had the highest pooled CTR (1.61%), followed by neutral (1.51%) and positive (1.49%). Emotion alone therefore did not guarantee better performance: negative sentiment outperformed neutral headlines, while positive sentiment was marginally below them.

**Q5: Text Matters Even With the Image Fixed**
For the one image examined, CTR varied significantly across headline variations (chi-square ≈ 20.3, p ≈ 0.002). This shows that wording alone can move engagement, but a single image is not enough to say which textual features most often decide the winner.

**Q6: Exploration ≠ Validation**
The number effect reversed direction between phases. In the exploratory data, headlines with digits had a lower pooled CTR (1.49% vs. 1.53%); in the confirmatory data they had a higher one (1.54% vs. 1.52%). Both differences are statistically significant, yet they point in opposite directions. What appears significant in early-stage tests may not hold under validation, which argues for separating hypothesis generation from hypothesis testing in any experimentation program.

