<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Challenge: A/B Testing Hypothesis Tests

---

### Scenario

You are a data science team working for a web-based company that is planning to roll out a new website design. Each of two competing designs was presented to a random sample of users, and each user's final purchase total (if any) was recorded.

Your task is to determine which of the two designs yields higher total purchases and whether the difference is statistically significant.

###### Remember to label your plots (both axes and a title).

In [1]:
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats
import seaborn as sns

%matplotlib inline
np.random.seed(42)

In [2]:
## generate some data and randomize

# design A: 50 users bought nothing; the other 150 spent
# amounts drawn from a normal distribution (mean 14, sd 4)
data1 = [0] * 50
data1.extend(np.random.normal(14, 4, 150))
np.random.shuffle(data1)

# the second design hooked fewer people, 
# but those who were hooked bought more
data2 = [0] * 100
data2.extend(np.random.normal(20, 5, 100))
np.random.shuffle(data2)

# make a DataFrame with one column per design
df = pd.DataFrame({"A": data1, "B": data2})

df.head()

|   | A | B |
|---|---|---|
| 0 | 14.685473 | 25.66671 |
| 1 | 20.152146 | 0.0 |
| 2 | 14.274252 | 18.370134 |
| 3 | 12.122102 | 26.632519 |
| 4 | 18.228489 | 25.862179 |


#### Plot out the distributions of group A and group B.

- Plot a histogram (or another suitable graph) of the group A column ONLY, and then of the group B column ONLY.
- Use pandas to make a density plot for each.
- Put an appropriate title and axis labels on each plot.
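
As a rough illustration of the plotting steps above, the sketch below draws a labeled histogram and density plot for a single column. The `spend` series is a hypothetical stand-in built to resemble group A, not the notebook's actual `df["A"]`, and the titles and bin count are arbitrary choices.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical stand-in for one column of the challenge data:
# 50 non-buyers plus 150 buyers, mirroring how group A is generated.
rng = np.random.default_rng(0)
spend = pd.Series(
    np.concatenate([np.zeros(50), rng.normal(14, 4, 150)]), name="A"
)

fig, (ax_hist, ax_kde) = plt.subplots(1, 2, figsize=(12, 4))

# Histogram: counts of users per purchase-total bin
spend.plot(kind="hist", bins=30, ax=ax_hist)
ax_hist.set_title("Group A: histogram of purchase totals")
ax_hist.set_xlabel("Purchase total")
ax_hist.set_ylabel("Number of users")

# Density plot: kernel density estimate (pandas uses SciPy for this)
spend.plot(kind="density", ax=ax_kde)
ax_kde.set_title("Group A: density of purchase totals")
ax_kde.set_xlabel("Purchase total")
ax_kde.set_ylabel("Estimated density")

fig.tight_layout()
```

The same pattern should carry over to group B by swapping the series and the titles.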

In [None]:
# let's plot the data for group A first


In [None]:
# Create a density plot for group A

In [None]:
# make the same plot for data set B


In [None]:
# Create a density plot for group B

#### Make a box plot of the two groups

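#### Use seaborn to plot the two distributions together

In [None]:
# Put both plots in this cell and seaborn will overlay them.
# sns.distplot works on older seaborn versions but has been deprecated since 0.11;
# on newer versions try sns.histplot or sns.kdeplot instead.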


#### Are our data sets (approximately) normal? 


<a id="statistical-tests"></a>
### Statistical Tests

There are a few good statistical tests for A/B testing:
* [ANOVA](https://en.wikipedia.org/wiki/Analysis_of_variance)
* [Welch's t-test](https://en.wikipedia.org/wiki/Welch's_t-test)
* [Mann-Whitney test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test)

**Each test makes various assumptions:**
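* ANOVA assumes the residuals are normally distributed and the groups have equal variances.
* Welch's t-test assumes each group is approximately normally distributed but does not require equal variances, which makes it more reliable than the pooled-variance t-test when group variances or sample sizes differ.
* The Mann-Whitney test does not assume normality, only independent observations that can be ranked. The normal approximation used for its p-value is usually considered reliable with roughly 20 or more data points in each set, and the test typically has somewhat less power than a t-test when the data really are normal.

Typically you need to choose the most appropriate test. Tests that make more assumptions tend to have more statistical power (they detect a real difference more often) when those assumptions hold, but they can be misleading on data sets that don't satisfy the assumptions.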
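
One informal way to check the normality assumption is the Shapiro-Wilk test in `scipy.stats.shapiro`, whose null hypothesis is that the sample comes from a normal distribution. The sketch below is only an illustration: both samples are hypothetical, and the 0.05 cutoff is a conventional choice rather than anything this challenge prescribes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical samples: one purely normal, one with a block of zero
# purchases mixed in (the same shape as the challenge data).
samples = {
    "normal only": rng.normal(14, 4, 200),
    "zero-inflated": np.concatenate([np.zeros(50), rng.normal(14, 4, 150)]),
}

shapiro_p = {}
for name, values in samples.items():
    stat, p_value = stats.shapiro(values)
    shapiro_p[name] = p_value
    verdict = "reject normality" if p_value < 0.05 else "no evidence against normality"
    print(f"{name}: W = {stat:.3f}, p = {p_value:.3g} -> {verdict}")
```

A small p-value here suggests that tests relying on normality may be a poor fit, which is worth weighing alongside what the histograms show.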

In statistics, **one-way analysis of variance** (abbreviated one-way **ANOVA**) is a technique used to compare the means of two or more samples (using the **F distribution**). In practice it is mostly used for three or more samples, because the two-sample case is already covered by the t-test.

The **ANOVA** tests the *null hypothesis* (the default position that there is no relationship) that samples in two or more groups are drawn from populations with the same mean values. 
- *One-way* ANOVA: tests the difference in population means based on one characteristic or factor.
- *Two-way* ANOVA: tests the difference in population means based on two characteristics or factors, optionally including their interaction.
> - When there are only two means to compare, we use the **t-test**.
> - When testing for differences among at least three groups, the **ANOVA** is used. 
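
The link between the two tests can be checked numerically: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled-variance (equal-variance) t statistic, and the p-values match. The sketch below uses two hypothetical normal samples to show this.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two hypothetical groups with slightly different means
group_a = rng.normal(10, 5, 100)
group_b = rng.normal(12, 5, 100)

# Student's t-test with pooled variance (equal_var=True)
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=True)

# One-way ANOVA on the same two groups
f_stat, f_p = stats.f_oneway(group_a, group_b)

print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
print(f"t-test p = {t_p:.4g}, ANOVA p = {f_p:.4g}")
```

The equivalence holds only for the equal-variance t-test; Welch's t-test will generally give a slightly different answer.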

#### Which test is most appropriate for our data?

In [None]:
# Answer:


#### Use the Mann-Whitney test on our data.

- Look up the function in SciPy [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html).
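- Statistic (float): the Mann-Whitney U statistic. In SciPy 1.7 and later this is the U statistic for the first sample, `x`. Older releases returned `min(U for x, U for y)` when `alternative` was `None` (deprecated, kept for backward compatibility) and U for `y` otherwise.
- P value (float): the one-sided or two-sided p-value, depending on the choice of `alternative`. Older releases always assumed an asymptotic normal distribution; newer releases can also compute an exact p-value through the `method` parameter.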

The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW) test, the Wilcoxon rank-sum test, or the Wilcoxon–Mann–Whitney test) is a nonparametric test of the null hypothesis that a randomly selected value from one sample is equally likely to be less than or greater than a randomly selected value from a second sample.

Unlike the t-test, it does not require the assumption of normal distributions. It is also nearly as efficient as the t-test on normal distributions.
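
A minimal sketch of calling the test is shown below. The two samples are hypothetical stand-ins shaped like the challenge data (a block of zero purchases plus normally distributed spend), and `alternative="two-sided"` is passed explicitly so that the behavior does not depend on the SciPy version's default.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical zero-inflated samples shaped like groups A and B
sample_a = np.concatenate([np.zeros(50), rng.normal(14, 4, 150)])
sample_b = np.concatenate([np.zeros(100), rng.normal(20, 5, 100)])

# Two-sided Mann-Whitney U test
u_a, p_value = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")

# Swapping the arguments returns the complementary U statistic;
# the two always add up to len(sample_a) * len(sample_b).
u_b, _ = stats.mannwhitneyu(sample_b, sample_a, alternative="two-sided")

# Welch's t-test on the same data, for comparison only
t_stat, t_p = stats.ttest_ind(sample_a, sample_b, equal_var=False)

print(f"U statistics: {u_a:.1f} and {u_b:.1f} (sum = {u_a + u_b:.0f})")
print(f"Mann-Whitney p = {p_value:.4f}, Welch p = {t_p:.4f}")
```

Because the data contain many tied zeros, the rank-based test and the t-test can disagree, which is part of why the choice of test matters here.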

<a id="interpret-your-results"></a>
### Interpret Your Results
* Compute the total customer spend for each group.
* Is there a significant difference in the mean total purchases in the two designs?
* Which design do you recommend? Why? 
* Write two sentences explaining your results and your recommendation.
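
The summary figures asked for above could be computed along the following lines. This is a sketch on hypothetical stand-in data (`df_demo` is not the notebook's `df`), and the column names in `summary` are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in with the same shape as the challenge DataFrame
df_demo = pd.DataFrame({
    "A": np.concatenate([np.zeros(50), rng.normal(14, 4, 150)]),
    "B": np.concatenate([np.zeros(100), rng.normal(20, 5, 100)]),
})

# One row per design: total spend, mean spend per user,
# and the share of users with a non-zero purchase
summary = pd.DataFrame({
    "total_spend": df_demo.sum(),
    "mean_spend": df_demo.mean(),
    "share_with_purchase": (df_demo != 0).mean(),
})
print(summary.round(2))
```

Looking at the totals next to the share of buyers helps separate "more people bought" from "buyers spent more".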

In [None]:
#Was there a large difference in the customer spend? Compute this.


In [None]:
# Given the lack of statistical significance (p-value of about 0.4) and the small difference in overall spend, I would not update the site


## Testing more than 2 means
Now let's create some new data sets.
- Let's make them rather different from each other...
- ...and normally-distributed

In [None]:
# some people bought less
data1 = np.random.normal(10, 5, 100)

# some people bought a medium amount
data2 = np.random.normal(20, 5, 100)

# some people bought more
data3 = np.random.normal(30, 5, 100)

# turn into a DataFrame, 
# as we did above, with column headers "A", "B", "C"
three_means_df = None 


# Verify the data looks like you expect it to


#### Are our data sets (approximately) normal? 
- Create a histogram for each group to decide. Use pandas `.hist()`.
- Don't forget to label the axes and add a title.

In [None]:
# create stacked histograms

In [None]:
# sometimes it's easier to see in seaborn by overlaying the three distributions in one plot


In [None]:
# What is your finding?


#### Are the variances of our variables similar?
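
One possible way to approach this question is to compare the sample variances directly and then run Levene's test (`scipy.stats.levene`), whose null hypothesis is that all groups have equal variance. The groups below are hypothetical and generated the same way as the three data sets above, but with a different random generator.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical groups mirroring the three data sets: same sd, different means
groups = {
    "A": rng.normal(10, 5, 100),
    "B": rng.normal(20, 5, 100),
    "C": rng.normal(30, 5, 100),
}

# Sample variances (ddof=1 gives the unbiased estimate)
for name, values in groups.items():
    print(f"{name}: variance = {values.var(ddof=1):.2f}")

# Levene's test; SciPy's default centers on the median
levene_stat, levene_p = stats.levene(*groups.values())
print(f"Levene statistic = {levene_stat:.3f}, p = {levene_p:.3f}")
```

A large p-value means there is no evidence against equal variances, which is what the one-way ANOVA assumes.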

#### Use the one-way ANOVA to test for differences in our data.

- Look up the function in SciPy [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html).
- Statistic: (Float) The computed F-value of the test.
- P value: (Float) The associated p-value from the F-distribution.
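
A minimal sketch of the call is shown below, using hypothetical groups generated like the three data sets above. The p-value is printed in scientific notation because it is expected to be extremely small for means this far apart.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical groups: means 10, 20, 30 with a common sd of 5
low = rng.normal(10, 5, 100)
medium = rng.normal(20, 5, 100)
high = rng.normal(30, 5, 100)

# One-way ANOVA: the null hypothesis is that all group means are equal
f_stat, anova_p = stats.f_oneway(low, medium, high)

print(f"F = {f_stat:.2f}")
print(f"p = {anova_p:.3e}")  # scientific notation for a very small number
```

A significant result only says that at least one mean differs; it does not say which one.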

In [None]:
# use the one-way ANOVA to test for differences


In [None]:
# Python uses scientific notation for very large or very small numbers


### Interpret Your Results
* Is there a significant difference in the mean of these three groups?
* Which design do you recommend? Why? 
* Write two sentences explaining your results and your recommendation.

## Optional Practice: Acme Shopping

#### Research Question:
Are the spending amounts of men and women different at Acme?

In [None]:
# Generating Data
np.random.seed(123)
df_m = pd.DataFrame({
    'sex': 'M',
    'amount': np.random.normal(loc=60, scale=3, size=100)
})

df_f = pd.DataFrame({
    'sex': 'F',
    'amount': np.random.normal(loc=70, scale=4, size=100)
})

df = pd.concat([df_m, df_f], axis=0)
df.head()

###### Plot the data for each sex. What do you see?


###### Formulate a hypothesis test:

###### Run a t-test
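
A sketch of one way to run the test is given below. It regenerates the Acme data with the same seed as the cell above and uses Welch's variant (`equal_var=False`), which is an assumption made here because the two groups were generated with different standard deviations.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Regenerate the Acme data as in the data-generation cell
np.random.seed(123)
df_m = pd.DataFrame({
    "sex": "M",
    "amount": np.random.normal(loc=60, scale=3, size=100),
})
df_f = pd.DataFrame({
    "sex": "F",
    "amount": np.random.normal(loc=70, scale=4, size=100),
})
acme = pd.concat([df_m, df_f], axis=0, ignore_index=True)

# Split the spending amounts by sex
men = acme.loc[acme["sex"] == "M", "amount"]
women = acme.loc[acme["sex"] == "F", "amount"]

# Welch's t-test; the null hypothesis is that the two mean amounts are equal
t_stat, p_value = stats.ttest_ind(men, women, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3e}")
```

A negative t statistic here means the first group (men) has the lower sample mean.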

###### Make a conclusion

###### ANSWER: