Data-Driven Insights on User Behavior and Ad Clicks

Ad Click Analysis: Insights Through A/B Testing

A/B testing, also called split testing or bucket testing, compares two versions of content to see which one performs better with visitors. It tests a control (A) version against a variant (B) version to measure which one is more successful on your key metrics.

The number of ad clicks generated by a campaign is often seen as a measure of its success. A higher click-through rate (CTR), which is the ratio of ad clicks to impressions (the number of times an ad is displayed), indicates a greater level of user engagement and a more effective ad. Ad clicks play an important role in various online advertising models, including affiliate marketing, paid search and display advertising.
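
The CTR definition above can be sketched in a few lines of Python. The function name and the example counts are hypothetical, chosen only to illustrate the ratio.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction: ad clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Hypothetical counts for illustration: 50 clicks out of 1,000 impressions.
ctr = click_through_rate(50, 1000)
print(f"CTR: {ctr:.1%}")  # prints "CTR: 5.0%"
```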

## DATA UNDERSTANDING

Dataset: https://statso.io/wp-content/uploads/2023/01/ctr.zip

| **Feature Name**            | **Description**                                                 | **Type**           | **Examples**               |
|-----------------------------|---------------------------------------------------------------|--------------------|----------------------------|
| Daily Time Spent on Site    | Time spent by a user on the site daily (in minutes)           | Continuous Numeric | 68.95, 80.23               |
| Age                         | Age of the user                                               | Continuous Numeric | 35, 23                     |
| Area Income                 | Average income of the user's geographical area (in dollars)  | Continuous Numeric | 55000, 72000               |
| Daily Internet Usage        | Daily usage of the internet by the user (in minutes)         | Continuous Numeric | 120.5, 96.8                |
| Ad Topic Line               | Topic headline of the ad viewed                              | Categorical Text   | "Top Ad Offer", "Great Deal" |
| City                        | City where the user resides                                   | Categorical Text   | "New York", "San Francisco" |
| Gender                      | Gender of the user                                            | Categorical        | Male, Female               |
| Country                     | Country of the user                                           | Categorical Text   | "USA", "India"             |
| Timestamp                   | Date and time of the ad interaction                          | DateTime           | 2024-10-31 14:53:00        |
| Clicked on Ad               | Whether the user clicked on the ad (1 = Yes, 0 = No)         | Binary Categorical | 1, 0                       |


In [1]:
import pandas as pd

In [3]:
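# Load the dataset (assumes Clicked_On_AD.csv is in the working directory) and preview the first rows
data = pd.read_csv("Clicked_On_AD.csv")
data.head()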

| Unnamed: 0 | Daily Time Spent on Site | Age | Area Income | Daily Internet Usage | Ad Topic Line | City | Gender | Country | Timestamp | Clicked on Ad |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 62.26 | 32 | 69481.85 | 172.83 | Decentralized real-time circuit | Lisafort | Male | Svalbard & Jan Mayen Islands | 6/9/2016 21:43 | 0 |
| 1 | 41.73 | 31 | 61840.26 | 207.17 | Optional full-range projection | West Angelabury | Male | Singapore | 1/16/2016 17:56 | 0 |
| 2 | 44.4 | 30 | 57877.15 | 172.83 | Total 5thgeneration standardization | Reyesfurt | Female | Guadeloupe | 6/29/2016 10:50 | 0 |
| 3 | 59.88 | 28 | 56180.93 | 207.17 | Balanced empowering success | New Michael | Female | Zambia | 6/21/2016 14:32 | 0 |
| 4 | 49.21 | 30 | 54324.73 | 201.58 | Total 5thgeneration standardization | West Richard | Female | Qatar | 7/21/2016 10:54 | 1 |


- Daily Time Spent on Site: Does more time spent on the site lead to more ad clicks?
- Age Group: Are younger users more likely to click on ads?
- Gender: Do males or females click on ads more frequently?
- Country/City: Are users from specific locations more likely to engage with ads?

- Null Hypothesis (H₀): There is no difference in ad clicks between groups (e.g., groups defined by gender or by time spent on site).
- Alternative Hypothesis (H₁): There is a statistically significant difference in ad clicks between groups.

2. Segment the Data

Create groups for comparison:

- Group A: Users who meet one criterion (e.g., low daily time spent on the site).
- Group B: Users who meet the other criterion (e.g., high daily time spent on the site).

For example:

- Split users into age groups (e.g., below 30 vs. 30 and above).
- Compare users by daily time spent on site (e.g., <60 minutes vs. ≥60 minutes).
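
The two example splits can be sketched with pandas as follows. The rows are made-up toy values, not taken from the real dataset; only the column names `Age` and `Daily Time Spent on Site` come from it, and the new `Age Group` and `Time Group` columns are hypothetical names introduced here.

```python
import pandas as pd

# Toy rows for illustration only (not real data); column names match the dataset.
toy = pd.DataFrame({
    "Age": [22, 28, 30, 35, 41, 26],
    "Daily Time Spent on Site": [45.0, 72.5, 60.0, 58.3, 80.1, 39.9],
})

# Age split: below 30 vs. 30 and above.
toy["Age Group"] = (toy["Age"] < 30).map({True: "below 30", False: "30 and above"})

# Time split: <60 minutes vs. >=60 minutes.
toy["Time Group"] = (toy["Daily Time Spent on Site"] < 60).map(
    {True: "<60 min", False: ">=60 min"}
)

print(toy["Age Group"].value_counts())
print(toy["Time Group"].value_counts())
```

The same expressions should work on `data` once the CSV is loaded, assuming the column names are unchanged.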


3. Choose a Metric

Define the metric you want to measure, such as:

- Click Rate: Percentage of users who clicked on ads.
- Time Spent: Average time spent on site for each group.
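
Both metrics can be computed per group with a single `groupby`. This is a minimal sketch on made-up toy rows; the `Time Group` column and its `low`/`high` labels are hypothetical, while `Clicked on Ad` and `Daily Time Spent on Site` are the dataset's column names. Because `Clicked on Ad` is coded 0/1, its mean is the click rate.

```python
import pandas as pd

# Toy rows for illustration only (not real data).
toy = pd.DataFrame({
    "Time Group": ["low", "low", "low", "low", "high", "high", "high", "high"],
    "Daily Time Spent on Site": [40.0, 45.0, 50.0, 55.0, 65.0, 70.0, 75.0, 80.0],
    "Clicked on Ad": [1, 1, 1, 0, 0, 1, 0, 0],
})

# Named aggregation: one row per group, one column per metric.
summary = toy.groupby("Time Group").agg(
    users=("Clicked on Ad", "count"),
    click_rate=("Clicked on Ad", "mean"),  # mean of a 0/1 column = share of clickers
    avg_time=("Daily Time Spent on Site", "mean"),
)
print(summary)
```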

4. Perform the Test

Perform statistical tests to compare groups:

- T-Test or Z-Test: For continuous variables like daily time spent on site.
- Chi-Square Test: For categorical variables like gender or city.
- Mann-Whitney U Test: For continuous variables when the data is not normally distributed.
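
A minimal sketch of how the three tests might be run with `scipy.stats`, assuming SciPy is installed. The rows are made-up toy values, so the resulting statistics carry no meaning for the real dataset; using Welch's variant of the t-test (`equal_var=False`) is a choice made here, not something the analysis prescribes.

```python
import pandas as pd
from scipy import stats

# Toy rows for illustration only (not real data); column names match the dataset.
toy = pd.DataFrame({
    "Daily Time Spent on Site": [41.7, 44.4, 49.2, 52.0, 55.3, 59.9,
                                 62.3, 66.0, 70.5, 74.1, 78.8, 81.2],
    "Gender": ["Male", "Female"] * 6,
    "Clicked on Ad": [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
})

clicked = toy.loc[toy["Clicked on Ad"] == 1, "Daily Time Spent on Site"]
not_clicked = toy.loc[toy["Clicked on Ad"] == 0, "Daily Time Spent on Site"]

# T-test on a continuous variable (Welch's version, no equal-variance assumption).
t_stat, t_p = stats.ttest_ind(clicked, not_clicked, equal_var=False)

# Mann-Whitney U test: rank-based alternative when normality is doubtful.
u_stat, u_p = stats.mannwhitneyu(clicked, not_clicked, alternative="two-sided")

# Chi-square test of independence on a categorical variable (Gender vs. click).
table = pd.crosstab(toy["Gender"], toy["Clicked on Ad"])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"t-test p-value:        {t_p:.4f}")
print(f"Mann-Whitney p-value:  {u_p:.4f}")
print(f"chi-square p-value:    {chi_p:.4f}")
```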

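5. Analyze the Results

- Check the p-value against the chosen significance level (e.g., p < 0.05).
- If the p-value is below that level, reject the null hypothesis and conclude that there is a statistically significant difference. Otherwise, the data do not provide enough evidence of a difference between the groups.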
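
The decision rule can be written as a small helper. The function name `interpret` and the constant `ALPHA` are hypothetical; 0.05 is the example threshold from the text, not a fixed requirement.

```python
ALPHA = 0.05  # example significance level from the text


def interpret(p_value: float, alpha: float = ALPHA) -> str:
    """Turn a p-value into a plain-language decision about H0."""
    if p_value < alpha:
        return "Reject H0: the difference between groups is statistically significant."
    return "Fail to reject H0: not enough evidence of a difference between groups."


print(interpret(0.01))
print(interpret(0.20))
```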

Example Analysis: Daily Time Spent on Site and Ad Clicks

To test whether users who spend more time on the site are more likely to click ads:

1. Divide users into two groups:
   - Group A: Users with daily time ≤ the median.
   - Group B: Users with daily time > the median.
2. Calculate the click rates for both groups.
3. Perform a statistical test to compare the two groups.
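
The steps above can be sketched end to end as follows. This is an illustration on made-up toy rows, not a result from the real dataset. The comparison uses a two-proportion z-test written out by hand, which is one reasonable choice among several; with samples this small the normal approximation is rough, and the sketch assumes the pooled click rate is neither 0 nor 1.

```python
import math

import pandas as pd

# Toy rows for illustration only (not real data); column names match the dataset.
toy = pd.DataFrame({
    "Daily Time Spent on Site": [41.7, 44.4, 49.2, 52.0, 55.3, 59.9,
                                 62.3, 66.0, 70.5, 74.1, 78.8, 81.2],
    "Clicked on Ad": [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
})

# Step 1: split at the median of daily time spent on site.
median_time = toy["Daily Time Spent on Site"].median()
group_a = toy[toy["Daily Time Spent on Site"] <= median_time]
group_b = toy[toy["Daily Time Spent on Site"] > median_time]

# Step 2: click rate per group (clicks / users).
n_a, n_b = len(group_a), len(group_b)
x_a, x_b = group_a["Clicked on Ad"].sum(), group_b["Clicked on Ad"].sum()
rate_a, rate_b = x_a / n_a, x_b / n_b

# Step 3: two-proportion z-test with a pooled standard error.
p_pool = (x_a + x_b) / (n_a + n_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (rate_a - rate_b) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under N(0, 1)

print(f"Group A click rate: {rate_a:.2f}, Group B click rate: {rate_b:.2f}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

To run this on the real data, replace `toy` with `data`, assuming the column names shown by `data.columns`.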


In [5]:
data.columns

Index(['Daily Time Spent on Site', 'Age', 'Area Income',
       'Daily Internet Usage', 'Ad Topic Line', 'City', 'Gender', 'Country',
       'Timestamp', 'Clicked on Ad'],
      dtype='object')