This is the final project for Udacity's A/B Testing course, in which a hypothetical experiment is run to test a few metrics on a hypothetical online platform.

imnikhilanand/AB-Testing-Udacity-Project

AB-Testing-Udacity-Project

Experiment Design

At the time of the experiment, Udacity course overview pages offer two options:

  1. Start Free Trial
  2. Access Course Materials

If the student clicks "Start Free Trial", they are asked to enter their credit card details and are enrolled in a free trial of the paid version of the course. After 14 days, they are automatically charged unless they cancel first.

If the student clicks "Access Course Materials", they can view the videos and take the quizzes for free, but they do not receive coaching support or a verified certificate, and they cannot submit their project for feedback.

For the experiment, Udacity tests a change in which a student who clicks "Start Free Trial" is asked how much time they are willing to dedicate to the course. If the student indicates 5 or more hours per week, they are taken through checkout as usual. If they indicate fewer than 5 hours per week, a message appears indicating that Udacity courses usually require a greater time commitment. At this point the student has the option to either continue enrolling in the free trial or access the course materials for free instead.

The hypothesis was that this might set clearer expectations for students upfront, thus reducing the number of frustrated students who left the free trial because they didn't have enough time—without significantly reducing the number of students to continue past the free trial and eventually complete the course. If this hypothesis held true, Udacity could improve the overall student experience and improve coaches' capacity to support students who are likely to complete the course.

The unit of diversion is a cookie, although if the student enrolls in the free trial, they are tracked by user-id from that point forward. The same user-id cannot enroll in the free trial twice. For users that do not enroll, their user-id is not tracked in the experiment, even if they were signed in when they visited the course overview page.


User Flow

Metric Choice

| Metric | Description |
| --- | --- |
| Number of Cookies | Number of unique cookies to view the course overview page |
| Number of User-Ids | Number of users who enroll in the free trial |
| Number of Clicks | Number of unique cookies to click the "Start Free Trial" button (which happens before the free trial screener is triggered) |
| Click-Through Probability | Number of unique cookies to click the "Start Free Trial" button divided by the number of unique cookies to view the course overview page |
| Gross Conversion | Number of user-ids to complete checkout and enroll in the free trial divided by the number of unique cookies to click the "Start Free Trial" button |
| Retention | Number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of user-ids to complete checkout |
| Net Conversion | Number of user-ids to remain enrolled past the 14-day boundary (and thus make at least one payment) divided by the number of unique cookies to click the "Start Free Trial" button |

Choosing Invariant Metrics

  1. Number of Cookies: Since the unit of diversion is a cookie and cookies are assigned to a group before the screener is shown, the number of cookies should be approximately the same in the control and treatment groups.
  2. Number of Clicks: The click on "Start Free Trial" happens before the free trial screener is triggered, so the number of clicks should be approximately the same in the two groups.
  3. Click-Through Probability: This metric is the ratio of the two metrics above (clicks divided by cookies), so it should also remain the same.

Choosing Evaluation Metrics

  1. Gross Conversion: (# of user-ids enrolled / # of unique cookies to click the "Start Free Trial" button). The two groups should have about the same number of clicking cookies, but the number that go on to complete checkout may differ because the treatment group is shown the screener.
  2. Retention: (# of user-ids paid / # of user-ids enrolled). Students who enroll after seeing the screener are expected to be more determined to complete the course, so the retention rate is expected to be higher in the treatment group.
  3. Net Conversion: (# of user-ids paid / # of unique cookies to click the "Start Free Trial" button). Because behavior after joining the free trial differs between the two groups, the share of clicking cookies that remain enrolled past 14 days can differ as well. The expectation here is that it is higher in the treatment group.

Measuring Standard deviation

For each metric selected as an evaluation metric, make an analytical estimate of its standard deviation, given a sample size of 5000 cookies visiting the course overview page. Enter each estimate to 4 decimal places.

The following metric values were provided by Udacity

| Metric | Values | Min Values |
| --- | --- | --- |
| Number of Cookies | 40,000 | 3000 |
| Number of user-ids | 660 | 50 |
| Number of clicks on 'Start Free Trial' | 3200 | 240 |
| Click-Through Probability on 'Start Free Trial' | 0.08 | 0.01 |
| Gross Conversion | 0.2062 | 0.01 |
| Retention | 0.53 | 0.01 |
| Net Conversion | 0.1093 | 0.0075 |

Since the given sample size is 5000 cookies, we scale the count metrics down to 5000 cookies and compute the standard deviation of each probability on the scaled data. This estimates how far a sample proportion is likely to fall from the true proportion.

| Metric | Values | Scaled Estimates | Min Values |
| --- | --- | --- | --- |
| Number of Cookies | 40,000 | 5000 | 3000 |
| Number of user-ids | 660 | 82.5 | 50 |
| Number of clicks on 'Start Free Trial' | 3200 | 400 | 240 |
| Click-Through Probability on 'Start Free Trial' | 0.08 | NA | 0.01 |
| Gross Conversion | 0.2062 | NA | 0.01 |
| Retention | 0.53 | NA | 0.01 |
| Net Conversion | 0.1093 | NA | 0.0075 |
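The scaling step can be sketched in Python as follows. The names `BASELINE` and `scale_counts` are illustrative assumptions, not part of the course materials. Only the daily baseline counts are taken from the table above.

```python
# Daily baseline counts provided by Udacity (from the table above).
BASELINE = {"cookies": 40_000, "clicks": 3_200, "enrollments": 660}


def scale_counts(baseline, sample_cookies):
    """Scale every daily count to a sample of `sample_cookies` pageviews."""
    factor = sample_cookies / baseline["cookies"]
    return {name: count * factor for name, count in baseline.items()}


scaled = scale_counts(BASELINE, 5_000)
print(scaled)
```

Clicks are the denominator for Gross Conversion and Net Conversion, while enrollments (user-ids) are the denominator for Retention, so these two scaled counts are the `n` values used in the standard deviation estimates.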

We can calculate the standard deviation analytically by assuming a binomial distribution for Gross Conversion, Net Conversion, and Retention, i.e. SE = sqrt(p(1-p)/n), where n is the number of units in the metric's denominator. Since n is large, the binomial distribution can be approximated by a normal distribution (Central Limit Theorem).

To approximate a binomial distribution as a normal distribution, np and n(1-p) should be greater than 5.

  1. Gross Conversion: 400 * 0.2062 > 5 and 400 * (1-0.2062) > 5

  2. Retention: 82.5 * 0.53 > 5 and 82.5 * (1-0.53) > 5

  3. Net Conversion: 400 * 0.1093 > 5 and 400 * (1-0.1093) > 5

| Metric | Values | Standard Error |
| --- | --- | --- |
| Gross Conversion | 0.2062 | 0.0202 |
| Retention | 0.53 | 0.0549 |
| Net Conversion | 0.1093 | 0.0156 |
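As a minimal sketch of the analytical estimate, the snippet below checks the normal-approximation condition and computes the binomial standard error for each evaluation metric. The function names and the `METRICS` dictionary are assumptions made for illustration. The probabilities and scaled counts come from the tables above.

```python
import math


def binomial_se(p, n):
    """Analytical standard error of a proportion p estimated from n units."""
    return math.sqrt(p * (1 - p) / n)


def normal_approx_ok(p, n, threshold=5):
    """Rule of thumb used above: both n*p and n*(1-p) must exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold


# (baseline probability, scaled denominator for 5000 cookies)
METRICS = {
    "gross_conversion": (0.2062, 400),   # denominator: clicks
    "retention": (0.53, 82.5),           # denominator: enrollments
    "net_conversion": (0.1093, 400),     # denominator: clicks
}

for name, (p, n) in METRICS.items():
    print(name, normal_approx_ok(p, n), round(binomial_se(p, n), 4))
```

With n = 82.5 enrollments the Retention estimate is about 0.0549. Rounding the denominator down to 82 gives about 0.0551 instead.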

Sizing

Number of Samples vs. Power

The alpha value for the experiment is 0.05 and the power is 0.80 (i.e. beta is set to 0.2).

Using an online sample size calculator, we obtain the required sample size per group for each metric and convert it to total pageviews:

Gross Conversion: 25830 * 2 * 40000 / 3200 = 645750

Retention: 39115 * 2 * 40000 / 660 = 4741213

Net Conversion: 27411 * 2 * 40000 / 3200 = 685275
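The conversion from a per-group sample size to total pageviews can be sketched as below. The helper name `total_pageviews` is hypothetical. The per-group sizes (25830, 39115, 27411) are taken as given from the calculations above rather than recomputed.

```python
def total_pageviews(per_group_size, unit_count, daily_cookies=40_000, groups=2):
    """Convert a per-group sample size (in clicks or enrollments) to pageviews.

    `unit_count` is the daily baseline count of the metric's denominator:
    3200 clicks for Gross/Net Conversion, 660 enrollments for Retention.
    """
    numerator = per_group_size * groups * daily_cookies
    return -(-numerator // unit_count)  # integer ceiling division


print(total_pageviews(25830, 3200))  # Gross Conversion
print(total_pageviews(39115, 660))   # Retention
print(total_pageviews(27411, 3200))  # Net Conversion
```

Rounding up keeps the requirement conservative: a fractional pageview cannot be collected, so the next whole number is needed.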

Duration vs. Exposure

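Since the daily traffic on the website is 40,000 pageviews, the number of days required for each of the three metrics (rounded up) is:

| Metric | Days Required |
| --- | --- |
| Gross Conversion | 645750 / 40000 = 16.1, rounded up to 17 |
| Retention | 4741213 / 40000 = 118.5, rounded up to 119 |
| Net Conversion | 685275 / 40000 = 17.1, rounded up to 18 |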
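A small sketch of the duration calculation follows. The `fraction` parameter (the share of traffic diverted to the experiment) and the function name are assumptions added for illustration. At `fraction=1.0` it reproduces the day counts above.

```python
import math


def days_required(pageviews, daily_traffic=40_000, fraction=1.0):
    """Days needed to collect `pageviews` when diverting `fraction` of traffic."""
    return math.ceil(pageviews / (daily_traffic * fraction))


for metric, pageviews in [
    ("gross_conversion", 645_750),
    ("retention", 4_741_213),
    ("net_conversion", 685_275),
]:
    print(metric, days_required(pageviews))
```

Lowering `fraction` lengthens the experiment proportionally. For example, diverting half of the traffic roughly doubles the number of days needed for Net Conversion.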

Experimental Analysis

If we run the experiment to measure all three metrics, we will need ~119 days to get the results, even when diverting 100% of the traffic. If we track only Gross Conversion and Net Conversion, the experiment needs ~18 days, which is feasible.

There are two major problems with measuring Retention:

  1. We cannot run any other experiment during the same period.

  2. Running an experiment for that long carries business risk.

Because Net Conversion is the product of Gross Conversion and Retention, we can still make inferences about Retention from these two metrics.
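The relationship follows directly from the metric definitions, and the baseline values are consistent with it:

$$
\text{Net Conversion}
= \frac{\text{payments}}{\text{clicks}}
= \frac{\text{enrollments}}{\text{clicks}} \times \frac{\text{payments}}{\text{enrollments}}
= \text{Gross Conversion} \times \text{Retention}
$$

$$
0.2062 \times 0.53 \approx 0.1093
$$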

Since exposing all of the traffic to the change is also risky, we can instead divert only a proportion of the traffic, at the cost of increasing the duration of the experiment.

Sanity Checks

A sanity check verifies that participants were split between the two groups in the expected proportion, i.e. that the invariant metrics do not differ significantly between control and treatment. Because assignment to the two groups is random, the number of units in the control (or treatment) group follows a binomial distribution with a probability of 0.5. As n is large, we can approximate this distribution as normal and build a confidence interval around 0.5 for the observed fraction in one group. For Click-Through Probability, we instead build a confidence interval around 0 for the difference between the two groups, using a two-proportion z-test.

For the invariant metrics:

| Metric | CI Lower | CI Upper | Observed | Passed |
| --- | --- | --- | --- | --- |
| Pageviews | 0.4988 | 0.5011 | 0.5006 | Yes |
| Clicks | 0.4958 | 0.5041 | 0.5004 | Yes |
| Click-Through Probability | -0.00129 | 0.00129 | 0.0000566 | Yes |
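The count-based sanity check described above can be sketched as follows. The function name and the example counts (5050 and 4950) are hypothetical and are not the experiment's data. The 1.96 multiplier assumes a 95% confidence level, matching alpha = 0.05.

```python
import math


def count_sanity_check(control, treatment, z=1.96, p=0.5):
    """Check whether the control share of a count is consistent with p = 0.5.

    Returns (ci_lower, ci_upper, observed_fraction, passed).
    """
    total = control + treatment
    se = math.sqrt(p * (1 - p) / total)  # normal approximation to the binomial
    margin = z * se
    observed = control / total
    lower, upper = p - margin, p + margin
    return lower, upper, observed, lower <= observed <= upper


# Hypothetical counts, for illustration only.
lower, upper, observed, passed = count_sanity_check(5050, 4950)
print(round(lower, 4), round(upper, 4), round(observed, 4), passed)
```

A metric passes when the observed fraction lies inside the interval. The same rule applies to a difference-based check such as Click-Through Probability, where the interval is centered on 0.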

Result Analysis

As with Click-Through Probability, we test a hypothesis for each evaluation metric by computing the confidence interval of the difference between the two groups and checking whether the observed difference is significant.

Recall our hypotheses, where CG is Gross Conversion and CN is Net Conversion:

  • H0: CG_treatment = CG_control

  • H1: CG_treatment != CG_control

  • H0: CN_treatment = CN_control

  • H1: CN_treatment != CN_control

Since students are charged only after the 14-day trial, payment data is complete for only 37 - 14 = 23 days of the experiment. We therefore use only the data from those 23 days to compute the evaluation metrics.

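| Metric | CI Lower | CI Upper | d | D min | Statistically Significant? | Practically Significant? |
| --- | --- | --- | --- | --- | --- | --- |
| Gross Conversion | -0.0291 | -0.0119 | -0.0205 | ±0.01 | Yes | Yes |
| Net Conversion | -0.0116 | 0.0018 | -0.0048 | ±0.0075 | No | No |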
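The decision rule behind the last two columns can be sketched as below. Both function names are hypothetical, and the counts passed to `diff_ci` are invented for illustration. One assumption is labeled explicitly: a result is treated as practically significant only when the whole confidence interval lies beyond the practical significance boundary.

```python
import math


def diff_ci(x_control, n_control, x_treatment, n_treatment, z=1.96):
    """CI for (treatment - control) difference in proportions, pooled SE."""
    p_pool = (x_control + x_treatment) / (n_control + n_treatment)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treatment))
    d = x_treatment / n_treatment - x_control / n_control
    return d - z * se, d + z * se, d


def significance(ci_lower, ci_upper, d_min):
    """Return (statistically significant, practically significant)."""
    statistically = ci_lower > 0 or ci_upper < 0              # CI excludes 0
    practically = ci_lower > d_min or ci_upper < -d_min       # CI beyond +/- d_min
    return statistically, practically


# Intervals reported in the table above.
print(significance(-0.0291, -0.0119, 0.01))    # Gross Conversion
print(significance(-0.0116, 0.0018, 0.0075))   # Net Conversion

# Hypothetical counts, for illustration only.
lo, hi, d = diff_ci(200, 1000, 150, 1000)
```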

Effective Test Size

From the calculations above, we need at least 685,275 pageviews for the experiment, but the total number of pageviews collected is 423,525. Nothing can be done about this at this point, so the results should be read with this shortfall in mind.

Sign Test

From the sign test for number of pageviews for both the groups, we can observe that the :

To be continued

Summary

The analysis covers 23 days of the experiment. From the above analysis of the A/B test, we observe that Gross Conversion decreased significantly, while Net Conversion decreased slightly but not significantly. We were able to estimate the change in only two of the metrics, as measuring Retention would require ~119 days.

Recommendation

From the above analysis, Gross Conversion has dropped, which means that enrollment has dropped significantly, both statistically and practically. It dropped by about 2.05 percentage points.

Although we cannot reject the null hypothesis for Net Conversion, the observed Net Conversion rate has also dropped. The true change most likely lies between -1.16 and +0.18 percentage points, an interval that includes the negative practical significance boundary of -0.75 percentage points.

Given the results, we can assume that introducing the free trial screener may indeed help set clearer expectations for students upfront. However, the result for Net Conversion is incompatible with the assumptions and business needs. I would recommend not implementing the change, as it might lower sales of the paid courses.
