# A/B Testing at Nosh Mish Mosh

Nosh Mish Mosh is a recipe and ingredient meal delivery service: they ship the raw ingredients, and you cook them at home. The company has decided to hire a data analyst to help make product and interface decisions. Start by helping them figure out how much data they will need to make meaningful decisions.

***

# Nosh Mish Mosh: An Assortment of Edible Aliments

1. We have collected customer data for the past week and exposed it through a Python library, so first import `noshmishmosh`.

In [1]:
import noshmishmosh

2. Next, we will need to do a little bit of data analysis — let us use `numpy` to help. Import `numpy` into your workspace.

In [2]:
import numpy as np

## A/B Testing at Nosh Mish Mosh

3. Nosh Mish Mosh wants to run an experiment to see if we can convince more people to purchase meal plans if we use a more artisanal-looking vegetable selection. We have photographed these modern meals with blush tomatoes and graffiti eggplants, but are not sure if this strategy will sell enough units to benefit from establishing a business relationship with a new provider.

Before running this experiment, of course, we need to know the sample size that will be required to detect the difference we are hoping for. There are three things we need to know before we can determine that number.

* the **Baseline Conversion Rate**
* desired **Lift** (minimum detectable effect)
* and the **Statistical Significance Threshold**

4. Let us get the ball rolling on finding those numbers! In order to get our baseline, we need to first know how many users visit the site in a typical week. Let us grab that logged information, which is stored in `noshmishmosh.customer_visits`. Assign that to a new variable called `all_visitors`.

In [3]:
all_visitors = noshmishmosh.customer_visits

5. Next, we need to know how many visitors to the site ultimately end up buying a meal or set of meals in a typical week. That information is saved in the `purchasing_customers` field of `noshmishmosh`. Save it into a variable called `paying_visitors`.

In [4]:
paying_visitors = noshmishmosh.purchasing_customers

6. Calculate the lengths of the two lists, saving the results into variables called `total_visitor_count` and `paying_visitor_count`, respectively.

In [5]:
total_visitor_count = len(all_visitors)
paying_visitor_count = len(paying_visitors)

7. Now to get the baseline: divide the number of purchasing visitors by the number of total visitors. Since we want a percentage as our answer, multiply the result by `100.0`. Save the final value in a variable called `baseline_percent`.

In [6]:
baseline_percent = paying_visitor_count / total_visitor_count * 100

8. Print out the `baseline_percent` so we know what to use for our baseline percentage in the A/B Sample Size Calculator.

In [7]:
baseline_percent

18.6
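As a cross-check, the baseline calculation can be wrapped in a small helper. This is only a minimal sketch: `conversion_rate_percent` is a hypothetical name, not part of `noshmishmosh`, and the counts 93 and 500 are assumptions inferred from the outputs printed in this notebook, not values read from the library.

```python
def conversion_rate_percent(paying_count, total_count):
    """Return the share of visitors who purchased, as a percentage."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return paying_count / total_count * 100.0


# Assumed counts (inferred from the printed outputs, not taken from noshmishmosh)
example_baseline = conversion_rate_percent(93, 500)
print(round(example_baseline, 1))
```

Guarding against a zero visitor count keeps an empty week of logs from surfacing as a bare `ZeroDivisionError`.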

## Mish Mosh B'Gosh: the Lift

9. These rainbow fingerling potatoes do not come cheap. We would like to know for sure that, with this change, we will be pulling in at least \$$1240$ more every week. In order to figure out how many more customers we need, we will have to investigate the average revenue generated from a given sale. Luckily we have a list of the money spent by each customer in a typical week: `noshmishmosh.money_spent`. Save that list into a variable called `payment_history`.

In [8]:
payment_history = noshmishmosh.money_spent

10. We need to find how many purchases it would take to reach \$$1240$ in additional revenue using our historical data.

    Let us start with computing the average payment per paying customer using `np.mean`, saving it as `average_payment`.

In [11]:
average_payment = np.mean(payment_history)

average_payment

26.543655913978498

11. We want to know how many of these "usual" payments it would take to clear our \$$1240$ mark. Round the number up using `np.ceil` (we cannot gain a fraction of a customer, so rounding up gives the number of new customers needed to bring in at least \$$1240$). Save that value into a `new_customers_needed` variable.

In [10]:
additional_sales_target = 1240

new_customers_needed = np.ceil(additional_sales_target / average_payment)

new_customers_needed

47.0
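Note that `np.ceil` returns a float, which is why the output reads `47.0`. If a whole-number count is preferred, one option is `math.ceil` from the standard library, which returns an `int`. The sketch below hard-codes the average payment printed earlier rather than reading from `noshmishmosh`, and `new_customers_needed_int` is a name introduced here for illustration.

```python
import math

additional_sales_target = 1240
average_payment = 26.543655913978498  # value printed earlier in this notebook

# math.ceil rounds up and returns an int rather than a float
new_customers_needed_int = math.ceil(additional_sales_target / average_payment)
print(new_customers_needed_int)  # 47
```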

12. Now find the additional percent of weekly visitors who must make a purchase in order to make this change worthwhile. Do this by dividing the number of new customers needed (`new_customers_needed`) by the total visitor count for a typical week (`total_visitor_count`, calculated earlier), and multiplying by `100.0`. Save the result in a variable called `percentage_point_increase`. Print `percentage_point_increase` to see what it is.

In [12]:
percentage_point_increase = new_customers_needed / total_visitor_count * 100

percentage_point_increase

9.4

13. In order to find our desired lift, we need to express `percentage_point_increase` as a percent of `baseline_percent`. You can do this by dividing `percentage_point_increase` by `baseline_percent` and multiplying by `100.0`.

    Store the results in a variable called `lift`.

In [13]:
lift = percentage_point_increase / baseline_percent * 100

14. Print out the result `lift`.

In [14]:
lift

50.53763440860215
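Written out with the values printed above, the lift is the required percentage-point increase expressed relative to the baseline:

$$
\text{lift} = \frac{\text{percentage-point increase}}{\text{baseline}} \times 100 = \frac{9.4}{18.6} \times 100 \approx 50.5\%
$$

In other words, the conversion rate would have to rise from $18.6\%$ to $18.6\% + 9.4\% = 28.0\%$, which is roughly a $50.5\%$ relative improvement over the baseline.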

## Nosh Mish Mosh: Tying It All Together

15. The last thing we need to calculate the sample size for Nosh Mish Mosh's artisanal rebranding is our statistical significance threshold. We would like to be fairly certain, but this is not going to be a million dollar decision, so let us go with a significance level of $10\%$ (equivalently, $90\%$ confidence).

16. Now put it all together! Enter the **baseline**, the minimum desired **lift**, and the **statistical significance threshold** into the <a href="https://content.codecademy.com/courses/learn-hypothesis-testing/a_b_sample_size/index4.html">calculator</a> to find out how many people need to be shown the new assets before we can check whether the results are a significant improvement. Save the result in a variable called `ab_sample_size`.

In [15]:
ab_sample_size = 507
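For intuition about how the three inputs interact, here is a rough sketch of a textbook sample-size approximation for comparing two proportions. It is not the calculator's method: the function name `sample_size_per_variant`, the two-sided test, and the 80% power are all assumptions made for illustration, so its output need not agree with the value entered above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_percent, lift_percent, alpha=0.10, power=0.80):
    """Approximate visitors needed per variant (two-sided, two-proportion z-test).

    alpha is the significance level; power=0.80 is an assumed value.
    """
    p1 = baseline_percent / 100.0            # baseline conversion rate
    p2 = p1 * (1 + lift_percent / 100.0)     # conversion rate after the lift
    if not 0 < p1 < 1 or not 0 < p2 < 1 or p1 == p2:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2                    # pooled rate under the null
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


estimate = sample_size_per_variant(18.6, 50.53763440860215)
print(estimate)
```

The general pattern holds regardless of the exact formula: a smaller lift or a stricter significance level pushes the required sample size up.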