# Determination of Efficacy
## of Changes to the New York Times App/Page

This report evaluates the impact of a recent update to the New York Times app designed to increase user engagement with its Games, Cooking, and Audio sections. The goal of the change was to increase interaction and time spent within these sections, and ultimately to boost overall app usage and retention. To determine the efficacy of the change, we compare key engagement metrics (time on page, number of clicks, and number of likes) between users who saw the updated version and users who saw the original. The findings are intended to show whether the update succeeded and to guide future decisions about the user experience. The report details the data preparation, the analysis methods, and the conclusions drawn from the results.

The steps below walk through the process of evaluating the efficacy of the changes. The hypothesis tests use synthetic data, so the results are not to be used for any actual determination; they serve only to illustrate the testing process, which should be rerun on real data collected from actual testing of the changes to the app.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [3]:
NYT_eff_df = pd.read_csv('NYT_app_data.csv')
NYT_eff_df.head()

| | id | first_name | last_name | email | gender | ip_address | location | interests | time_end | no_clicks | no_likes | test_group | time_on_page |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Ives | Sheppey | isheppey0@cafepress.com | Male | 212.87.187.119 | Luntas | | 7/27/2024 | 19 | 7 | A | 25 |
| 1 | 2 | Val | Scrammage | vscrammage1@umich.edu | Male | 165.115.52.15 | Chojnice | | 4/9/2024 | 8 | 3 | A | 21 |
| 2 | 3 | Carmelia | Durrance | cdurrance2@amazon.co.jp | Female | 206.152.91.174 | Shangbahe | | 3/19/2024 | 8 | 5 | B | 34 |
| 3 | 4 | Glennis | Grovier | ggrovier3@loc.gov | Female | 121.163.201.146 | Pulo | | 6/21/2024 | 2 | 6 | B | 26 |
| 4 | 5 | Pippy | Jenman | pjenman4@earthlink.net | Female | 98.12.142.172 | Kedungdoro | | 1/19/2024 | 2 | 8 | B | 33 |


To protect users' data, the dataframe is reduced to remove unnecessary and personal data. The reduced dataframe is named NYT_red_df.

In [4]:
NYT_red_df = NYT_eff_df[['id', 'no_clicks', 'no_likes', 'time_on_page', 'test_group']]
NYT_red_df.head()


| | id | no_clicks | no_likes | time_on_page | test_group |
|---|---|---|---|---|---|
| 0 | 1 | 19 | 7 | 25 | A |
| 1 | 2 | 8 | 3 | 21 | A |
| 2 | 3 | 8 | 5 | 34 | B |
| 3 | 4 | 2 | 6 | 26 | B |
| 4 | 5 | 2 | 8 | 33 | B |
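The column selection above can be paired with a guard that fails if a personal column slips through. The sketch below is a minimal illustration, assuming the first row shown by `NYT_eff_df.head()` as stand-in data; the names `sample_df`, `pii_columns`, `analysis_columns`, and `reduced_df` are hypothetical, and the list of columns treated as personal is an assumption rather than a stated policy.

```python
import pandas as pd

# First row shown by NYT_eff_df.head() above, used as a stand-in for the full dataset.
sample_df = pd.DataFrame([{
    "id": 1, "first_name": "Ives", "last_name": "Sheppey",
    "email": "isheppey0@cafepress.com", "gender": "Male",
    "ip_address": "212.87.187.119", "location": "Luntas", "interests": None,
    "time_end": "7/27/2024", "no_clicks": 19, "no_likes": 7,
    "test_group": "A", "time_on_page": 25,
}])

# Columns treated as personal data here; this list is an assumption, not a policy.
pii_columns = ["first_name", "last_name", "email", "gender", "ip_address", "location"]
analysis_columns = ["id", "no_clicks", "no_likes", "time_on_page", "test_group"]

# .copy() returns an independent frame, so later edits do not touch the raw data.
reduced_df = sample_df[analysis_columns].copy()

# Guard: fail loudly if any personal column is still present.
leaked = set(pii_columns) & set(reduced_df.columns)
assert not leaked, f"Personal columns still present: {leaked}"
print(list(reduced_df.columns))
```

Note that `id` is kept; if it can be linked back to a person in another system, it may need to be replaced with an anonymous key as well.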


Next, we use three metrics to determine the efficacy of the changes: 'no_clicks' (the number of clicks by the users of each group), 'no_likes' (the number of likes by the users of each group), and 'time_on_page' (the time spent on the page by the users of each group).

Group A represents users of the page after the change, and Group B represents users of the original format.
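Before running any tests, it can help to look at how many users fall in each group and at the per-group mean of each metric. The sketch below uses only the five rows displayed by `NYT_red_df.head()` as stand-in data, so its numbers describe those five rows and not the full dataset; `sample_df`, `group_sizes`, and `group_means` are hypothetical names.

```python
import pandas as pd

# Stand-in for NYT_red_df: only the five rows displayed by NYT_red_df.head() above.
sample_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "no_clicks": [19, 8, 8, 2, 2],
    "no_likes": [7, 3, 5, 6, 8],
    "time_on_page": [25, 21, 34, 26, 33],
    "test_group": ["A", "A", "B", "B", "B"],
})

metrics = ["no_clicks", "no_likes", "time_on_page"]

# Rows per group: shows how balanced the A/B split is.
group_sizes = sample_df["test_group"].value_counts().sort_index()

# Mean of each metric within each group.
group_means = sample_df.groupby("test_group")[metrics].mean()

print(group_sizes)
print(group_means)
```

With the real data, replacing `sample_df` with `NYT_red_df` would give the same summary for all users.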

## Analysis Based on Time Spent on the Page

Null Hypothesis: the change did not produce any statistically significant change based on 'time_on_page' (the amount of time the users spent on the app/page).

Alternative Hypothesis: At a significance level of 0.05 (95% confidence), there was a statistically significant change based on 'time_on_page' (the amount of time the users spent on the app/page).
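Written formally, the two hypotheses compare the group means. The symbols $\mu_A$, $\mu_B$ (population means), $\bar{x}$, $s^2$, and $n$ (sample mean, sample variance, and size of each group) are notation introduced here for illustration. The statistic shown is the pooled-variance form, which is what `stats.ttest_ind` computes with its default `equal_var=True`.

```latex
\begin{aligned}
H_0 &: \mu_A = \mu_B \\
H_1 &: \mu_A \neq \mu_B \\
t &= \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}},
\qquad
s_p^2 = \frac{(n_A - 1)\, s_A^2 + (n_B - 1)\, s_B^2}{n_A + n_B - 2}
\end{aligned}
```

The null hypothesis is rejected when the two-sided p-value for $t$, with $n_A + n_B - 2$ degrees of freedom, falls below $\alpha = 0.05$.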


In [8]:

import scipy.stats as stats

# Separate the data into two groups
# Note: data is split 50-50
group_A = NYT_red_df[NYT_red_df['test_group'] == 'A']['time_on_page']
group_B = NYT_red_df[NYT_red_df['test_group'] == 'B']['time_on_page']

# Perform an independent t-test
t_stat, p_value = stats.ttest_ind(group_A, group_B)


# Calculate and print the average time on page for each group
avg_time_A = group_A.mean()
avg_time_B = group_B.mean()

print(f"Average time on page for Group A: {avg_time_A:.2f}")
print(f"Average time on page for Group B: {avg_time_B:.2f}")

# Display the t-statistic and p-value
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for time on page.")
else:
    print("Fail to reject the null hypothesis: There is no statistically significant difference between group_A and group_B for time on page.")


Average time on page for Group A: 30.14
Average time on page for Group B: 28.04
T-statistic: 6.184878172487526
P-value: 9.052292506694058e-10
Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for time on page.
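A small p-value shows that the difference is unlikely to be chance, but not how large it is in practical terms. One common complement is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is a minimal illustration using only the standard library; `cohens_d` is a hypothetical helper and the two short lists are made-up values, not data from `NYT_app_data.csv`.

```python
import math
import statistics


def cohens_d(sample_a, sample_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd


# Hypothetical values, for illustration only (not taken from NYT_app_data.csv).
time_a = [2, 4, 6]
time_b = [1, 3, 5]
print(f"Cohen's d: {cohens_d(time_a, time_b):.2f}")  # prints "Cohen's d: 0.50"
```

With the real data, the call would be `cohens_d(group_A.tolist(), group_B.tolist())`. By Cohen's rough conventions, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects.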


## Analysis Based on the Number of Clicks on the Page

Null Hypothesis: the change did not produce any statistically significant change based on 'no_clicks' (the number of clicks by the users of the app/page).

Alternative Hypothesis: At a significance level of 0.05 (95% confidence), there was a statistically significant change based on 'no_clicks' (the number of clicks by the users of the app/page).

In [9]:

import scipy.stats as stats

# Separate the data into two groups
# Note: data is split 50-50
group_A = NYT_red_df[NYT_red_df['test_group'] == 'A']['no_clicks']
group_B = NYT_red_df[NYT_red_df['test_group'] == 'B']['no_clicks']

# Calculate the average values for each group
mean_A = group_A.mean()
mean_B = group_B.mean()

# Display the averages
print(f"Average no_clicks for Group A: {mean_A}")
print(f"Average no_clicks for Group B: {mean_B}")

# Perform an independent t-test
t_stat, p_value = stats.ttest_ind(group_A, group_B)

# Display the t-statistic and p-value
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for number of clicks.")
else:
    print("Fail to reject the null hypothesis: There is no statistically significant difference between group_A and group_B for number of clicks.")


Average no_clicks for Group A: 8.36779324055666
Average no_clicks for Group B: 5.237424547283702
T-statistic: 11.0648471671839
P-value: 6.346578116512094e-27
Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for number of clicks.
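Click counts are discrete and often right-skewed, so it may be worth checking that the conclusion does not depend on the equal-variance and normality assumptions of the standard t-test. The sketch below shows two possible robustness checks, Welch's t-test and the Mann-Whitney U test, both from `scipy.stats`. The lists `clicks_a` and `clicks_b` are hypothetical values chosen for illustration, not data from `NYT_app_data.csv`.

```python
from scipy import stats

# Hypothetical click counts, for illustration only (not taken from NYT_app_data.csv).
clicks_a = [19, 8, 12, 9, 15, 11, 10, 14]
clicks_b = [8, 2, 2, 5, 4, 6, 3, 7]

# Welch's t-test: does not assume equal variances in the two groups.
welch_t, welch_p = stats.ttest_ind(clicks_a, clicks_b, equal_var=False)

# Mann-Whitney U: rank-based, does not assume normally distributed values.
u_stat, mw_p = stats.mannwhitneyu(clicks_a, clicks_b, alternative="two-sided")

print(f"Welch t-test:   t = {welch_t:.3f}, p = {welch_p:.4g}")
print(f"Mann-Whitney U: U = {u_stat:.1f}, p = {mw_p:.4g}")
```

If both checks agree with the standard t-test on the real data, the result is less likely to be an artifact of the test's assumptions.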


## Analysis Based on the Number of Likes on the Page

Null Hypothesis: the change did not produce any statistically significant change based on 'no_likes' (the number of 'likes' by the users of the app/page).

Alternative Hypothesis: At a significance level of 0.05 (95% confidence), there was a statistically significant change based on 'no_likes' (the number of 'likes' by the users of the app/page).

In [10]:

import scipy.stats as stats

# Separate the data into two groups
# Note: data is split 50-50
group_A = NYT_red_df[NYT_red_df['test_group'] == 'A']['no_likes']
group_B = NYT_red_df[NYT_red_df['test_group'] == 'B']['no_likes']

# Perform an independent t-test
t_stat, p_value = stats.ttest_ind(group_A, group_B)

mean_A = group_A.mean()
mean_B = group_B.mean()

# Display the averages
print(f"Average no_likes for Group A: {mean_A}")
print(f"Average no_likes for Group B: {mean_B}")

# Display the t-statistic and p-value
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for number of likes.")
else:
    print("Fail to reject the null hypothesis: There is no statistically significant difference between group_A and group_B for number of likes.")


Average no_likes for Group A: 6.013916500994036
Average no_likes for Group B: 4.959758551307847
T-statistic: 6.6379719842228315
P-value: 5.2114357175856565e-11
Reject the null hypothesis: There is a statistically significant difference between group_A and group_B for number of likes.
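Because three tests are run on the same users, each at alpha = 0.05, the chance of at least one false positive across the set is higher than 0.05. One way to account for this is the Holm-Bonferroni step-down procedure, sketched below in plain Python. The helper `holm_bonferroni` is hypothetical, and the p-values are the ones printed in the three outputs above, which come from synthetic data.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return {name: reject?} using the Holm-Bonferroni step-down procedure."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    decisions = {}
    still_rejecting = True
    for rank, (name, p) in enumerate(ordered):
        # Compare the (rank + 1)-th smallest p-value against alpha / (m - rank).
        if still_rejecting and p <= alpha / (m - rank):
            decisions[name] = True
        else:
            # Once one test fails, all tests with larger p-values also fail.
            still_rejecting = False
            decisions[name] = False
    return decisions


# P-values reported in the three outputs above (synthetic data).
reported_p_values = {
    "time_on_page": 9.052292506694058e-10,
    "no_clicks": 6.346578116512094e-27,
    "no_likes": 5.2114357175856565e-11,
}

for metric, reject in holm_bonferroni(reported_p_values).items():
    print(f"{metric}: {'reject' if reject else 'fail to reject'} the null hypothesis")
```

With p-values this small, all three null hypotheses remain rejected after the correction; the adjustment matters more when some p-values sit close to 0.05.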
