# A/B Testing at Nosh Mish Mosh

Nosh Mish Mosh is a recipe and ingredient meal delivery service. They ship the raw ingredients and you cook them at home! They’ve decided to hire a data analyst to help make product and interface decisions. Your first job is to help them figure out how much data they’ll need to make meaningful decisions.

Note that a `solution.py` file is loaded for you in the workspace, which contains solution code for this project. We highly recommend that you complete the project on your own without checking the solution, but feel free to take a look if you get stuck or want to check your answers when you’re done!


## Tasks

### Nosh Mish Mosh: An Assortment of Edible Aliments

1. We’ve collected customer data for the past week and exposed it through a Python library, so first import `noshmishmosh`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Importing a module in Python is as simple as

    ```python
    import noshmishmosh
    ```
    </details>

2. Next, we’ll need to do a little bit of data analysis — let’s use `numpy` to help. Import `numpy` into your workspace as `np`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    In the pursuit of brevity, we like to rename NumPy to something a little more manageable.

    ```python
    import numpy as np
    ```
    </details>



### A/B Testing at Nosh Mish Mosh

3. Nosh Mish Mosh wants to run an experiment to see if we can convince more people to purchase meal plans if we use a more artisanal-looking vegetable selection. We’ve photographed these modern meals with blush tomatoes and graffiti eggplants, but aren’t sure if this strategy will sell enough units to benefit from establishing a business relationship with a new provider.

    Before running this experiment, of course, we need to know the sample size that will be required to detect the difference we are hoping for. There are three things we need to know before we can determine that number.

    - the **Baseline Conversion Rate**
    - **Minimum Detectable Effect** (desired lift)
    - and the **Statistical Significance Threshold**

4. Let’s get the ball rolling on finding those numbers! In order to get our baseline, we need to first know how many users visit the site in a typical week. Let’s grab that logged information, which is stored in `noshmishmosh.customer_visits`. Assign that to a new variable called `all_visitors`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    all_visitors = noshmishmosh.customer_visits
    ```
    </details>

5. Next, we need to know how many visitors to the site ultimately end up buying a meal or set of meals in a typical week. We have that information saved in the `purchasing_customers` field on `noshmishmosh`. Save that information into a variable called `paying_visitors`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    paying_visitors = noshmishmosh.purchasing_customers
    ```
    </details>

6. Calculate the lengths of the two lists, saving the results into variables called `total_visitor_count` and `paying_visitor_count`, respectively.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    Calculate length using the `len` function:

    ```python
    total_visitor_count = len(all_visitors)
    paying_visitor_count = len(paying_visitors)
    ```
    </details>

7. Now to get the baseline: Divide the number of purchasing visitors by the number of total visitors. Save the result in a variable called `baseline_percent`. Since we want a percentage as our answer, multiply the result by `100.0`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    baseline_percent = paying_visitor_count / total_visitor_count * 100
    ```
    </details>

8. Print out the `baseline_percent` so we know what to use for our baseline percentage in the A/B Sample Size Calculator.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    print(baseline_percent)
    ```
    </details>
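
If the `noshmishmosh` module isn't available in your environment, the baseline calculation from steps 4 through 8 can be tried on stand-in data. The sketch below is only an illustration: the visitor IDs are hypothetical, the real shape of `customer_visits` and `purchasing_customers` may differ, and `conversion_rate_percent` is a helper name introduced here, not part of the library.

```python
def conversion_rate_percent(visitors, purchasers):
    """Return the number of purchasers as a percentage of all visitors."""
    if not visitors:
        raise ValueError("need at least one visitor to compute a rate")
    return len(purchasers) / len(visitors) * 100


# Hypothetical visitor IDs standing in for noshmishmosh.customer_visits
all_visitors = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"]
# Hypothetical subset standing in for noshmishmosh.purchasing_customers
paying_visitors = ["v2", "v5"]

baseline_percent = conversion_rate_percent(all_visitors, paying_visitors)
print(baseline_percent)  # 25.0
```

The guard against an empty visitor list avoids a `ZeroDivisionError` if a week of data happens to be missing.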


### Mish Mosh B'Gosh: The Effect Size

9. These rainbow fingerling potatoes don’t come cheap. We’d like to know for sure that, with this change, we’ll be pulling in at least $1240 more every week. In order to figure out how many more customers we need, we’ll have to investigate the average revenue generated from a given sale. Luckily we have a list of the money spent by each customer in a typical week: `noshmishmosh.money_spent`. Save that list into a variable called `payment_history`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    payment_history = noshmishmosh.money_spent
    ```
    </details>

10. We need to find how many purchases it would take to reach $1240 in additional revenue using our historical data.

    Let’s start with computing the average payment per paying customer using `np.mean`, saving it as `average_payment`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    If you shortened `numpy` to `np` this would be:

    ```python
    average_payment = np.mean(payment_history)
    ```
    </details>

11. We want to know how many of these “usual” payments it would take to clear our $1240 mark. Round the number up using `np.ceil` (because that’s how many new customers it takes to bring in more than $1240). Save that value into a `new_customers_needed` variable.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    It’s possible to print out the divided value and round it up ourselves, but NumPy has a function that can do this for us:

    ```python
    new_customers_needed = np.ceil(1240 / average_payment)
    ```
    </details>

12. Now find the additional percentage of weekly visitors who must make a purchase in order to make this change worthwhile. Do this by dividing `new_customers_needed` by the total visitor count for a typical week (calculated earlier), and multiplying by 100. Save the result in a variable called `percentage_point_increase`. Print `percentage_point_increase` to see what it is.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    ```python
    percentage_point_increase = new_customers_needed / total_visitor_count * 100
    ```
    </details>

13. In order to find our minimum detectable effect/desired lift, we need to express `percentage_point_increase` as a percent of `baseline_percent`. You can do this by dividing `percentage_point_increase` by `baseline_percent` and multiplying by `100.0`.

    Store the results in a variable called `mde`.

    <details>
        <summary>Stuck? Get a hint</summary>
    
    We’ve calculated a lot of percentages so far in this project, and the fact that these are both already percentages does not change the formula.

    ```python
    mde = percentage_point_increase / baseline_percent * 100
    ```
    </details>

14. Print out the result `mde`.
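
The effect-size steps above chain together three small calculations, and it can help to see them with round numbers. The sketch below uses hypothetical stand-in values (1,000 visitors, 200 purchasers, three made-up payments), not the real `noshmishmosh` data; only the $1240 target comes from the project. It also converts the result of `np.ceil` to `int`, since `np.ceil` returns a float.

```python
import numpy as np

# Hypothetical stand-in values, NOT the real noshmishmosh data.
total_visitor_count = 1000
paying_visitor_count = 200
payment_history = [20.0, 25.0, 30.0]  # made-up payments in dollars
revenue_target = 1240                 # weekly target from the project

# Baseline conversion rate, as a percentage of all visitors.
baseline_percent = paying_visitor_count / total_visitor_count * 100

# Average payment per paying customer.
average_payment = np.mean(payment_history)

# Round up: a fraction of a customer cannot close the revenue gap.
new_customers_needed = int(np.ceil(revenue_target / average_payment))

# Absolute change, in percentage points of all weekly visitors.
percentage_point_increase = new_customers_needed / total_visitor_count * 100

# Relative change, as a percentage of the baseline rate.
mde = percentage_point_increase / baseline_percent * 100

print(new_customers_needed)       # 50
print(percentage_point_increase)  # 5.0
print(mde)                        # 25.0
```

With these numbers, moving from a 20% to a 25% conversion rate is a 5 percentage point increase but a 25% relative lift, and the relative lift is the figure used as the minimum detectable effect.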



### Nosh Mish Mosh: Tying It All Together

15. The last thing we need in order to calculate the sample size for Nosh Mish Mosh’s artisanal rebranding is our statistical significance threshold. We’d like to be fairly certain, but this isn’t going to be a million-dollar decision, so let’s go with 10% (equivalently, a 90% significance level).

16. Now put it all together! Enter the *baseline*, the *minimum detectable effect*, and the *statistical significance threshold* into the calculator to find out how many people need to be shown the new assets before we can check whether the results are a significant improvement. Save the result in a variable called `ab_sample_size`.
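
For intuition about what a sample size calculator does with these three inputs, here is a minimal sketch of one common textbook approach: the normal-approximation formula for a two-sided, two-proportion z-test. It is not necessarily the method the embedded calculator uses, so its output may differ from the calculator's. The function name `sample_size_per_variant` and the 80% power default are assumptions introduced here; the project text does not specify a power level.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_percent, mde_percent,
                            alpha=0.10, power=0.80):
    """Approximate visitors needed in EACH variant of an A/B test.

    baseline_percent: current conversion rate, in percent
    mde_percent:      relative lift to detect, in percent of the baseline
    alpha:            significance threshold (false-positive rate)
    power:            chance of detecting a real lift (assumed, not from
                      the project text)
    """
    p1 = baseline_percent / 100.0
    p2 = p1 * (1.0 + mde_percent / 100.0)  # rate after the relative lift
    if not 0.0 < p1 < p2 < 1.0:
        raise ValueError("rates must satisfy 0 < baseline < lifted rate < 1")

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0

    numerator = (z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_power * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Smaller lifts are harder to detect, so they need more visitors.
for lift in (50.0, 25.0, 10.0):
    print(lift, sample_size_per_variant(18.6, lift))
```

The loop shows the main trade-off: halving the lift you want to detect roughly quadruples the number of visitors required, because the difference between the two rates is squared in the denominator.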


```python
from html import escape


class myDisplayObject:
    def __init__(self, html, width="100%", height="200px", ratio=1):
        self.width = width
        self.height = height
        self.ratio = ratio
        self.html = html

    # cribbed from branca Py package
    def _repr_html_(self, **kwargs):
        """Displays the HTML in an iframe in a Jupyter notebook."""
        html = escape(self.html)
        # Keep a space between attributes; boolean attributes are unquoted.
        iframe = (
            '<iframe srcdoc="{html}" width="{width}" height="{height}" '
            'style="border:none !important;" '
            'allowfullscreen webkitallowfullscreen mozallowfullscreen>'
            '</iframe>'
        ).format(html=html, width=self.width, height=self.height)
        return iframe


# Read the calculator page and embed it below the cell.
with open('../assets/ab_ss_calculator.html', 'r') as f:
    calculator_html = f.read()

myDisplayObject(calculator_html, height="220px")
```

### Solution

```python
# Step 1
import noshmishmosh

# Step 2
import numpy as np

# Step 4
all_visitors = noshmishmosh.customer_visits

# Step 5
paying_visitors = noshmishmosh.purchasing_customers

# Step 6
total_visitor_count = len(all_visitors)
paying_visitor_count = len(paying_visitors)

# Step 7
baseline_percent = paying_visitor_count / total_visitor_count * 100

# Step 8
print("Baseline percent:")
print(baseline_percent)

# Step 9
payment_history = noshmishmosh.money_spent

# Step 10
average_payment = np.mean(payment_history)

# Step 11
new_customers_needed = np.ceil(1240 / average_payment)

# Step 12
percentage_point_increase = new_customers_needed / total_visitor_count * 100
print("Percentage point increase:")
print(percentage_point_increase)

# Step 13
mde = percentage_point_increase / baseline_percent * 100

# Step 14
print("Minimum Detectable Effect:")
print(mde)

# Step 16
ab_sample_size = 490
```

Output:

```
Baseline percent:
18.6
Percentage point increase:
9.4
Minimum Detectable Effect:
50.53763440860215
```
