### Experimental design:

Designing an experiment is the only way to get data that is perfectly fitted to the question(s) at hand.
Properly designed experiments will have all the variables you care about, measured in the way you think is best, collected in a context that emphasizes the features you want to understand.

Experimental design can be a powerful tool in your data science arsenal, and like all powerful tools, it needs to be handled the <b>right way</b>.

In this lesson, we'll discuss A/B testing and how to:
- evaluate an experiment,
- write a research proposal, and
- choose analytical techniques to use when an experiment yields non-normal data.

> When designing your own experiment, it is particularly important to make sure that the two groups, A and B, are <b>as comparable as possible</b>. The only difference should be the treatment they receive.

The components of an A/B test are:

1. Two versions of something whose effects will be compared.
1. A sample, divided into two groups. 
1. A hypothesis. Your hypothesis is what you expect to happen. 
1. Outcome(s) of interest. What you expect will change as a result of using version A or version B, and how you will measure that change.
1. Other measured variables. This includes information about the two groups that can be used to ensure they are similar ...
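
As a rough illustration of component 2, the sketch below randomly divides a sample into two groups. The function name `assign_groups`, the seed, and the subject IDs are assumptions made up for this example; they are not part of the lesson.

```python
import random


def assign_groups(subject_ids, seed=0):
    """Randomly split subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)      # seeded so the split can be reproduced
    shuffled = list(subject_ids)   # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


# Hypothetical subject IDs, for illustration only.
groups = assign_groups(range(1, 11), seed=42)
print(len(groups["A"]), len(groups["B"]))
```

Shuffling before splitting means that no characteristic of the subjects decides which version they receive, which is what keeps the two groups comparable on average.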


### DRILL: Getting Testy...
For each of the following questions, outline how you could use an A/B test to find an answer. Be sure to identify all five key components of an A/B test we outlined above.

1. Does a new supplement help people sleep better?
2. Will new uniforms help a gym's business?
3. Will a new homepage improve my online exotic pet rental business?
4. If I put 'please read' in the email subject will more people read my emails?

### Q1: Does a new supplement help people sleep better?

<b>1.Two versions:</b>

    The control version is no supplement (ideally a placebo, so that participants don't know which group they are in), and the test version is the new supplement.

<b>2.A sample:</b>

    We need one sample of participants, randomly divided into two groups, one for each version. The groups should be similar, so that any difference in the outcome can be attributed to the supplement.
    
<b>3.A hypothesis:</b>   

    I expect that the group taking the new supplement will sleep better than the group that does not take it.

<b>4.An outcome:</b>

    The key metric is the number of hours slept per night, measured for both groups. We expect the average to be higher in the test group than in the control group.
    
<b>5.Other measured variable:</b>    

    To check that the result depends only on the new supplement, we should ask the subjects whether they take any other medication that could help them sleep, whether they are on holiday during the test period, whether they had a stressful week at work, what the weather was like, and so on.

With these answers we can discover if there could be any other factor that could influence our result.
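
One possible way to compare hours slept between the two groups, without assuming the data are normally distributed, is a permutation test. The sketch below is only an illustration: the function `permutation_test` is a hypothetical helper, and the sleep values are made up rather than taken from any real study.

```python
import random


def permutation_test(control, test, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(test) > mean(control)."""
    rng = random.Random(seed)
    observed = sum(test) / len(test) - sum(control) / len(control)
    pooled = list(control) + list(test)
    n_control = len(control)
    n_test = len(test)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Reshuffle the group labels: under the null hypothesis they are arbitrary.
        rng.shuffle(pooled)
        diff = sum(pooled[n_control:]) / n_test - sum(pooled[:n_control]) / n_control
        if diff >= observed:
            at_least_as_large += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up hours of sleep per subject, for illustration only.
control_hours = [6.1, 5.8, 6.5, 7.0, 6.2, 5.9, 6.4, 6.8]
test_hours = [6.9, 7.2, 6.6, 7.5, 7.1, 6.8, 7.4, 7.0]

observed, p_value = permutation_test(control_hours, test_hours)
print(f"difference in means: {observed:.3f} hours, p-value: {p_value:.4f}")
```

A small p-value here would suggest that a difference this large is unlikely to arise from the random split alone.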

### Q2: Will new uniforms help a gym's business?

From the question, I am guessing that the trainers will wear the new uniform. The gym could offer a free hour of training in which the trainers wear the old uniform, and record whether each participant subscribes afterwards.
The same offer would then be repeated with the trainers wearing the new uniform, to compare the subscription rates.
The free hour could also be held outside the gym, in a mall or another crowded area, to increase exposure.

Another way: the trainers wear the old uniform one month and the new uniform the next. The indicator would be the number of clients who extend their monthly pass. Note that this compares two different months, so seasonal effects could be mixed in with the effect of the uniform.

<b>1.Two versions:</b>

    The control version: the trainers wear the former uniform during the free training hour.
    The test version: the trainers wear the new uniform during the free training hour.

<b>2.A sample:</b>

    We need one sample of participants, randomly divided into two groups, one for each version.
    The groups should be similar, so that any difference in the outcome can be attributed to the uniform.
    
<b>3.A hypothesis:</b>   

    I expect that the group exposed to the new uniform will subscribe at a higher rate than the control group.

<b>4.An outcome:</b>

    The key metric is the number of new subscriptions bought by each group, as a share of its participants. We expect this to be higher in the test group than in the control group.
    
<b>5.Other measured variable:</b>    

    To check that the result depends only on the new uniform, we should ask the subjects to fill in a form about their age, gender, education, proximity to the gym, how long ago they last trained, and so on.

With these answers we can discover if there could be any other factor that could influence our result, due to the difference in the subjects from our test groups.
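
One simple way to use these other measured variables is to check whether the two groups are balanced on them. The sketch below computes a standardized mean difference for age. The helper name and the ages are made up for illustration, and the threshold of 0.1 is only a common rule of thumb, not a requirement stated in the lesson.

```python
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(group_a) + variance(group_b)) / 2) ** 0.5
    return (mean(group_b) - mean(group_a)) / pooled_sd


# Made-up ages of participants, for illustration only.
control_ages = [23, 35, 41, 29, 52, 38, 27, 44]
test_ages = [25, 33, 40, 31, 49, 36, 30, 47]

smd = standardized_mean_difference(control_ages, test_ages)
# Rule of thumb (an assumption here): |SMD| below 0.1 suggests good balance.
print(f"standardized mean difference in age: {smd:.3f}")
print("balanced on age" if abs(smd) < 0.1 else "possible imbalance on age")
```

The same check could be repeated for each numeric variable collected on the form.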

### Q3: Will a new homepage improve my online exotic pet rental business?

<b>1.Two versions:</b>

    The control version is the former homepage, and the test version is the new homepage.

<b>2.A sample:</b>

    We need one sample of visitors, randomly divided into two groups, one for each version. The groups should be similar, so that any difference in the outcome can be attributed to the homepage.

<b>3.A hypothesis:</b>

    I expect that more visitors in the test group than in the control group will finish their visit by clicking the 'buy' button to request a purchase.

<b>4.An outcome:</b>

    The key metric is the number of purchase orders from each group, relative to its number of visitors. We expect it to be higher in the test group than in the control group.

<b>5.Other measured variable:</b>

    To check that the result depends only on the new homepage, we should:
- make sure the visitors are unique, and
- check cookies to see whether they have visited the website before, along with any other data that could classify the user.


With these answers we can discover if there could be any other factor that could influence our result.
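
To compare purchase rates between the two homepages, one common option is a two-proportion z-test. The sketch below is a minimal version that assumes samples large enough for the normal approximation. The function name and the visitor and order counts are invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(orders_a, visitors_a, orders_b, visitors_b):
    """One-sided z-test for whether version B converts better than version A."""
    rate_a = orders_a / visitors_a
    rate_b = orders_b / visitors_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (orders_a + orders_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Made-up counts, for illustration only.
z, p_value = two_proportion_z_test(
    orders_a=40, visitors_a=1000,   # control: former homepage
    orders_b=62, visitors_b=1000,   # test: new homepage
)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

Comparing rates rather than raw counts keeps the comparison fair if the two groups end up with different numbers of visitors.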


### Q4: If I put 'please read' in the email subject will more people read my emails?

<b>1.Two versions:</b>

    The control version is the email with the old subject line, and the test version is the same email with 'please read' added to the subject line.

<b>2.A sample:</b>

    We need one sample of subscribers, randomly divided into two groups, one for each version. The groups should be similar, so that any difference in the outcome can be attributed to the subject line.

<b>3.A hypothesis:</b>

    I expect that a larger share of recipients will open my email in the test group than in the control group.

<b>4.An outcome:</b>

    The key metric is the number of opened emails in each group, relative to the number sent. We expect it to be higher in the test group than in the control group.

<b>5.Other measured variable:</b>

    To check that the result depends only on the new subject line, we should make sure the subscribers are unique, so that nobody receives both versions.


With these answers we can discover if there could be any other factor that could influence our result.
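
One way to keep subscribers unique is to normalise and de-duplicate the addresses before splitting them. The sketch below then assigns each address to a version by hashing it, which is a pseudo-random stand-in for a random draw that always gives the same address the same version. The function names, the experiment label, and the addresses are all hypothetical.

```python
import hashlib


def assign_variant(email, experiment="please-read-subject"):
    """Deterministically map an email address to version 'A' or 'B'."""
    normalized = email.strip().lower()
    key = f"{experiment}:{normalized}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Even hash value -> control (A), odd hash value -> test (B).
    return "B" if int(digest, 16) % 2 else "A"


def split_subscribers(emails, experiment="please-read-subject"):
    """De-duplicate addresses, then place each one in exactly one group."""
    unique = sorted({e.strip().lower() for e in emails})
    groups = {"A": [], "B": []}
    for email in unique:
        groups[assign_variant(email, experiment)].append(email)
    return groups


# Made-up addresses; the first two are the same subscriber written differently.
subscribers = ["Ann@Example.com", "ann@example.com ", "bob@example.com", "cy@example.com"]
groups = split_subscribers(subscribers)
print(len(groups["A"]) + len(groups["B"]), "unique subscribers assigned")
```

With this approach the two groups are not guaranteed to be exactly the same size, so open rates should be compared rather than raw counts.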
