# Assignment - Evaluating A/B Tests 

As you go through this notebook, you will find the symbol **???** in certain places. To complete this assignment, you must replace all the **???** with appropriate values, expressions, or statements to ensure that the notebook runs properly end-to-end. 

**Guidelines**

1. Make sure to run all the code cells in order. Otherwise, you may get errors like `NameError` for undefined variables.
2. Do not change variable names, delete cells, or disturb other existing code. It may cause problems during evaluation.
3. In some cases, you may need to add some code cells or new statements before or after the line of code containing the **???**. 
4. Since you'll be using a temporary online service for code execution, save your work by running `jovian.commit` at regular intervals.
5. Questions marked **(Optional)** will not be considered for evaluation and can be skipped. They are for your learning.
6. If you are stuck, you can ask for help on the bootcamp Slack group. Post errors, ask for hints, and help others, but **please don't share the complete solution code on Slack** to give others a chance to write the code themselves.
7. There are some tests included with this notebook to help you test your implementation. However, after submission, your code will be tested with some hidden test cases. Make sure to test your code exhaustively to cover all edge cases.



### How to Run the Code and Save Your Work

**Option 1: Running using free online resources (1-click, recommended)**: Click the **Run** button at the top of this page and select **Run on Binder**. You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on [Google Colab](https://colab.research.google.com) or [Kaggle](https://kaggle.com) to use these platforms.


**Option 2: Running on your computer locally**: To run the code on your computer locally, you'll need to set up [Python](https://www.python.org) & [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/), download the notebook and install the required libraries. Click the **Run** button at the top of this page, select the **Run Locally** option, and follow the instructions.

**Saving your work**: You can save a snapshot of the assignment to your [Jovian](https://jovian.ai) profile, so that you can access it later and continue your work. Keep saving your work by running `jovian.commit` from time to time.

In [5]:
project_name='evaluating-ab-tests-assignment'

In [6]:
!pip install jovian --upgrade --quiet

In [7]:
import jovian

In [8]:
jovian.commit(project=project_name, privacy='secret')

[jovian] Attempting to save notebook..
[jovian] Creating a new project "aakashns/evaluating-ab-tests-assignment"
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ai/aakashns/evaluating-ab-tests-assignment

'https://jovian.ai/aakashns/evaluating-ab-tests-assignment'

Let's import some modules which might be useful later.

In [13]:
import math
import numpy as np
from scipy.stats import norm

## Problem Statement - A/B Testing

> **QUESTION**: In preparation for the upcoming batch of the Zero to Data Analyst Bootcamp, the Jovian team is looking to improve the course registration page. In particular, we're interested in trying out two variations of the banner text:
>
> ![](https://i.imgur.com/cSu1RI3.png)
>
> Variant A is what we've used for previous batches, while Variant B is the proposed "improved" version. Instead of choosing one or the other, we decided to test both options by showing different versions of the page to different website visitors (hence the name A/B testing) and make a data-driven decision.
>
> Over a week of testing, Variant A was shown to 85% of visitors who came to the site, and Variant B was shown to 15% of visitors. Here are the results produced by the experiment:
>
> <img src="https://i.imgur.com/ym1Os3U.png" width="360">
>
> Does Variant B produce a statistically significant improvement in the conversion rate? Should we switch to variant B and discard variant A completely? Use a significance level of 0.01 for the test.

### Step 1. State the Null Hypothesis And the Alternate Hypothesis

The quantity we're interested in is the average conversion rate, i.e. the proportion of website visitors who registered for the program.

(Optional) State the null hypothesis and the alternate hypothesis in your own words:

- **Null Hypothesis**: ???

- **Alternate Hypothesis**: ???

The two hypotheses can be stated mathematically as follows (can you see why?):

$H_0: \mu \le 52/17000$

$H_1: \mu > 52/17000$

Here, $\mu$ represents the average conversion rate. We'll start by assuming that the null hypothesis is true.



Let's save our work before continuing

In [9]:
jovian.commit()


### Step 2. Compute the Z Statistic


<img src="https://i.imgur.com/AUJX4qi.png" width="120">

where:

* $\overline{X}$ is the sample mean (computed using the observed values)
* $\mu$ is the population mean (stated in the null hypothesis)
* $\sigma$ is the population standard deviation (if unavailable, use sample standard deviation as an approximation)
* $n$ is the number of samples collected
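
Assuming the image above shows the standard one-sample Z statistic, $Z = \dfrac{\overline{X} - \mu}{\sigma / \sqrt{n}}$, the computation can be sketched as follows. The function name `z_score` and every number below are hypothetical and unrelated to the data in this assignment.

```python
import math

def z_score(x_bar, mu, sigma, n):
    """One-sample Z statistic: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (x_bar - mu) / standard_error

# Made-up values for illustration only (not the assignment's data)
example_z = z_score(x_bar=0.12, mu=0.10, sigma=0.3, n=900)
print(example_z)  # close to 2
```

A larger sample shrinks the standard error, so the same difference between $\overline{X}$ and $\mu$ yields a larger Z statistic.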



> **QUESTION 1**: Compute the sample mean (conversion rate) $\overline{X}$ using the observations for Variant B.

In [None]:
sample_mean = ???

In [None]:
print('The sample mean is', sample_mean)

> **QUESTION 2**: Estimate the population mean (conversion rate) $\mu$ using the observations for Variant A

In [None]:
population_mean = ???

In [None]:
print('The population mean is', population_mean)

Note that each page visit is a Bernoulli trial with two possible outcomes: the visitor registers (success) or the visitor does not register (failure). The conversion rate is the probability of success (i.e. registration).

As discussed in the lesson on [Hypothesis Testing](https://jovian.ai/aakashns/hypothesis-testing-and-statistical-significance), the population standard deviation for a Bernoulli trial can be computed as $\sigma = \sqrt{\mu(1 - \mu)}$.
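
A quick simulation can make the Bernoulli formula concrete. The sketch below draws simulated visits with a hypothetical conversion rate of 0.2 (not taken from the assignment data) and compares the empirical standard deviation with $\sqrt{\mu(1 - \mu)}$. The seed, the rate, and the number of draws are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p_demo = 0.2  # hypothetical conversion rate, for illustration only

# Each draw is one simulated visit: 1 = registered, 0 = did not register
draws = rng.binomial(n=1, p=p_demo, size=200_000)

empirical_std = draws.std()                   # estimated from the simulated visits
formula_std = np.sqrt(p_demo * (1 - p_demo))  # sqrt(0.2 * 0.8)
print(empirical_std, formula_std)
```

The two printed values should agree closely, which is why the standard deviation can be obtained from the conversion rate alone.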

> **QUESTION 3**: Compute the population standard deviation $\sigma$ using the formula $\sigma = \sqrt{\mu(1 - \mu)}$, where $\mu$ is the population mean (conversion rate).

In [None]:
std = ???

In [None]:
print('The population standard deviation is', std)

> **QUESTION 3**: What is the sample size of the set of observations for Variant B?

In [None]:
sample_size = ???

In [None]:
print('The sample size is', sample_size)

We are now ready to compute the Z statistic.

<img src="https://i.imgur.com/AUJX4qi.png" width="120">

where:

* $\overline{X}$ is the sample mean (computed using the observed values)
* $\mu$ is the population mean (stated in the null hypothesis)
* $\sigma$ is the population standard deviation (if unavailable, use sample standard deviation as an approximation)
* $n$ is the number of samples collected


> **QUESTION 4**: Compute the Z statistic using the above formula.

In [None]:
z_statistic = ???

In [None]:
print('The Z statistic for the A/B test is', z_statistic)

Let's save our work before continuing

In [None]:
jovian.commit()



### Step 3. Identify whether the test is left-tailed, right-tailed or two-tailed

Use this chart to identify whether you're doing a left-tailed, right-tailed or two-tailed test:

![](https://i.imgur.com/rtLYm3c.png)

> **QUESTION 5**: Is the A/B test we're conducting left-tailed, right-tailed or two-tailed? Set the value of the variable `test_type` to `"left-tailed"`, `"right-tailed"` or `"two-tailed"` to answer this question.

In [15]:
# Should have the value "left-tailed", "right-tailed" or "two-tailed"
test_type = ???

In [None]:
print("The A/B test we're conducting is a {} test".format(test_type))

Let's save our work before continuing

In [None]:
jovian.commit()



### Step 4. Calculate the $p$ value using the Z statistic

The $p$ value for a statistical test is the probability of obtaining a sample “equally or more extreme” than the observed data, assuming that the null hypothesis is true. Use the following guidelines to compute the p-value:

* **Left tailed**: In this case, the Z statistic is negative, and the p-value is the area to the left of the observed Z statistic, so it can be computed simply as `norm.cdf(z)`.

* **Right tailed**: In this case, the Z statistic is positive, and the p-value is the area to the right of the observed Z statistic, so it can be computed as `1 - norm.cdf(z)` (since the total area under the curve, representing the probability of all possible z values, is 1).

* **Two tailed**: In this case, we need to consider both the positive and negative values of the Z statistic. The p-value is the sum of the area to the left of the negative Z statistic and the area to the right of the positive Z statistic, so it can be computed as `norm.cdf(-z) + (1 - norm.cdf(z))` (where `z` indicates the absolute value of the Z statistic).
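
The three guidelines above can be gathered into one helper, as a minimal sketch. The function name `p_value_from_z`, its `tail` argument, and the Z values used are hypothetical; only `norm.cdf` comes from the guidelines.

```python
from scipy.stats import norm

def p_value_from_z(z, tail):
    """Return the p-value for a Z statistic under the given tail convention."""
    if tail == "left-tailed":
        return norm.cdf(z)                 # area to the left of z
    if tail == "right-tailed":
        return 1 - norm.cdf(z)             # area to the right of z
    if tail == "two-tailed":
        z_abs = abs(z)
        return norm.cdf(-z_abs) + (1 - norm.cdf(z_abs))  # both tails
    raise ValueError(f"unknown tail: {tail!r}")

# Made-up Z statistics for illustration only (not the assignment's result)
print(p_value_from_z(-1.96, "left-tailed"))   # about 0.025
print(p_value_from_z(1.96, "right-tailed"))   # about 0.025
print(p_value_from_z(1.96, "two-tailed"))     # about 0.05
```

The result is statistically significant when the p-value falls below the chosen significance level.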

> **QUESTION 6**: Compute the $p$ value for the A/B test described in the problem statement above.

In [None]:
p_value = ???

In [None]:
print('The p value for the A/B test is', p_value)

> **QUESTION 7**: Are the results of the A/B test statistically significant? Use a significance level of 0.01.

In [None]:
# Should be set to True or False
is_significant = ???

In [None]:
if is_significant:
    print('The results of the A/B test are statistically significant')
else:
    print('The results of the A/B test are NOT statistically significant')

### Computing Uplift and Confidence Level

Uplift is defined as the relative increase in the conversion rate:


Uplift = (New conversion rate - Old Conversion Rate) / Old Conversion Rate
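
As a small illustration of the uplift formula, the sketch below applies it to made-up conversion rates. The function name `relative_uplift` and both rates are hypothetical and do not come from the experiment's results.

```python
def relative_uplift(old_rate, new_rate):
    """Uplift = (new conversion rate - old conversion rate) / old conversion rate."""
    return (new_rate - old_rate) / old_rate

# Made-up conversion rates for illustration only
example_uplift = relative_uplift(old_rate=0.04, new_rate=0.05)
print(example_uplift)  # roughly 0.25, i.e. a 25% relative increase
```

Uplift is a relative measure: moving from 4% to 5% is an absolute gain of one percentage point but a relative gain of about 25%.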

> **QUESTION 8**: Compute the uplift in conversion due to Variant B. Use Variant A to compute the old conversion rate

In [None]:
uplift = ???

In [None]:
print(uplift)

> **QUESTION 9**: Compute the confidence level of the results.

In [None]:
confidence_level = ???

In [None]:
print(confidence_level)

Let's save our work before continuing

In [None]:
jovian.commit()



## Make a Submission

Run the following code cell to make a submission. Alternatively, you can also submit your Jovian notebook link on the [assignment page](https://jovian.ai/learn/zero-to-data-analyst-bootcamp/assignment/evaluating-a-b-tests). 




In [None]:
jovian.submit('zerotoanalyst-a3')

You can make any number of submissions, but only your final submission will be considered for evaluation.