# 8. Estimates, Hypothesis Tests and Experiments
*Module: Experimental Design (Sprint 2 of 2)*

*Experiments and the scientific method are at the heart of how we “know” what we know when it comes to data analysis. But how do they translate to the different situations we encounter in practice, and what are some common pitfalls to be aware of?*

|Data Journalist| Data Engineer | Statistical Modeler| Business Analyst |
|:----------------:|:----:|:------------------:|:----:|
|I need to understand how **causation is established** in scientific studies so that I can interpret studies and focus my analyses|I need to be able to **implement experiment-driven algorithms** (such as A/B testing and Epsilon Greedy) so that I can provide a testing capability|I need to understand how to **isolate factors and design appropriate experiments** so that I can answer a wide range of research questions|I need to identify opportunities to test and **optimize with techniques such as A/B Testing and Epsilon Greedy** so that my organization can continuously improve|
|I need to understand **how population characteristics are inferred** from their samples so I can draw accurate conclusions about third party research as well as my own analysis|I need to understand the computing and analytical performance **tradeoffs between different levels of sampling** so that I can optimize for different objectives|I need to understand the **kinds of statistical hypotheses I can make as well as the tests they apply to** so that I can answer a variety of research questions|I need to understand how to **construct a testable hypothesis** about the populations represented by my business data so that I can drive strategic decisions about novel scenarios|

## Analytical Process Big Picture
![Curriculum Summary](../curriculum_summary.png)

## Tests are powerful in any kind of learning
- How have we used tests so far, intentionally or not?
- How does that apply to analysis of data?
- Do tests and experimental design matter if we are only analyzing pre-collected data?



## Key Questions
- How do we estimate characteristics of the real world using data?
- How do we express the uncertainty in our estimates?
- What are the different ways our estimate can be wrong?
- How do we choose our estimates?
- What kinds of models can we create from data?
- How do we create models from the results of our experiments?

## Key Concepts and Definitions
- effect size
- percentile
- quantile
- model
- analytic distribution
- empirical distribution
- standard normal distribution

- confidence interval
- standard error
- p-value
- null hypothesis
- alternative hypothesis

- estimation
- estimator
- mean squared error
- maximum likelihood estimator

- false positive, false negative
- type 1 / type 2 error
- internal / external validity
- threats to validity
- experimental designs
- quasi-experimental designs
- natural experiment


## Themes of this Sprint
- Estimates
- Uncertainty
- Hypothesis Tests
- Validity
- Tests / Experiments / Knowledge
- Causality and Relationships
- Sources of Error



# Experimental Design Video Resources and Courses

https://www.coursera.org/learn/real-life-data-science/lecture/Getan/experimental-design-and-observational-analysis

https://www.coursera.org/learn/data-scientists-tools/lecture/NUYrv/experimental-design

https://www.youtube.com/watch?v=vSXOJnGNtM4

https://www.coursera.org/learn/real-life-data-science/lecture/LU8XW/a-b-testing

https://www.udacity.com/course/ab-testing--ud257

## Related / Correlation / Causation
>If variables X and Y (e.g., the number of televisions (X) in various countries and the infant mortality rate (Y) of those countries) are found to be associated, then there are three basic possibilities. 
- First X could be causing Y (televisions lead to more health awareness, which leads to better prenatal care) 
- or Y could be causing X (high infant mortality leads to attraction of funds from richer countries, which leads to more televisions) 
- or unknown factor Z could be causing both X and Y (higher wealth in a country leads to more televisions and more prenatal care clinics). 

>It is worth memorizing these three cases, because they should always be considered when association is found in an observational study as opposed to a randomized experiment. (It is also possible that X and Y are related in more complicated ways including in large networks of variables with feedback loops.)

>Causation (“X causes Y”) can be logically claimed if X and Y are associated, and X precedes Y, and no plausible alternative explanations can be found, particularly those of the form “X just happens to vary along with some real cause of changes in Y” (called confounding).
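
The third case above (an unknown factor Z driving both X and Y) can be illustrated with a small simulation. This is only a sketch on synthetic data: the variable names, the unit-variance Gaussian noise, and the sample size are assumptions made for illustration and are not taken from any real study. X and Y never influence each other here, yet they come out clearly correlated. Once the linear effect of Z is removed from both, the association largely disappears.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    mz, my = statistics.fmean(zs), statistics.fmean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [y - (my + slope * (z - mz)) for z, y in zip(zs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical standardized scores; Z is the confounder.
wealth = [rng.gauss(0, 1) for _ in range(n)]
televisions = [z + rng.gauss(0, 1) for z in wealth]    # X depends on Z only
prenatal_care = [z + rng.gauss(0, 1) for z in wealth]  # Y depends on Z only

# Association between X and Y, ignoring Z.
raw_corr = pearson(televisions, prenatal_care)

# Association between X and Y after removing the linear effect of Z from both.
adjusted_corr = pearson(residuals(televisions, wealth),
                        residuals(prenatal_care, wealth))

print(f"correlation ignoring wealth:      {raw_corr:.2f}")
print(f"correlation adjusting for wealth: {adjusted_corr:.2f}")
```

Adjusting for Z only works when Z is known and measured. Randomized experiments are attractive because random assignment balances confounders we have not thought of.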

## Experimental Design
http://www.statisticshowto.com/experimental-design/
    
> Experimental design is a way to carefully plan experiments in advance so that your results are both objective and valid. The terms “Experimental Design” and “Design of Experiments” are used interchangeably and mean the same thing. However, the medical and social sciences tend to use the term “Experimental Design” while engineering, industrial and computer sciences favor the term “Design of experiments.”

## Confirmatory vs Exploratory Research
https://en.wikipedia.org/wiki/Research_design

>Confirmatory research tests a priori hypotheses — outcome predictions that are made before the measurement phase begins. Such a priori hypotheses are usually derived from a theory or the results of previous studies. The advantage of confirmatory research is that the result is more meaningful, in the sense that it is much harder to claim that a certain result is generalizable beyond the data set. The reason for this is that in confirmatory research, one ideally strives to reduce the probability of falsely reporting a coincidental result as meaningful. This probability is known as α-level or the probability of a type I error.

>Exploratory research on the other hand seeks to generate a posteriori hypotheses by examining a data-set and looking for potential relations between variables. It is also possible to have an idea about a relation between variables but to lack knowledge of the direction and strength of the relation. If the researcher does not have any specific hypotheses beforehand, the study is exploratory with respect to the variables in question (although it might be confirmatory for others). The advantage of exploratory research is that it is easier to make new discoveries due to the less stringent methodological restrictions. Here, the researcher does not want to miss a potentially interesting relation and therefore aims to minimize the probability of rejecting a real effect or relation; this probability is sometimes referred to as β and the associated error is of type II. In other words, if the researcher simply wants to see whether some measured variables could be related, he would want to increase the chances of finding a significant result by lowering the threshold of what is deemed to be significant.

>Sometimes, a researcher may conduct exploratory research but report it as if it had been confirmatory ('Hypothesizing After the Results are Known', HARKing—see Hypotheses suggested by the data); this is a questionable research practice bordering on fraud.
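
The trade-off between α (type I error) and β (type II error) described above can be explored by simulating many experiments. The sketch below is a minimal illustration under stated assumptions: two groups of 50 observations, normally distributed outcomes with a known standard deviation of 1, a two-sample z-test, and a hypothetical true effect of 0.4 standard deviations. None of these numbers come from the quoted text.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p_value(sample_a, sample_b, sigma=1.0):
    """Two-sample z-test p-value, assuming both groups share a known sigma."""
    se = sigma * (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (fmean(sample_b) - fmean(sample_a)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_effect, alpha, n_per_group=50, n_experiments=2000, seed=0):
    """Fraction of simulated experiments in which the null hypothesis is rejected."""
    rng = random.Random(seed)  # same seed: each call sees the same noise draws
    rejections = 0
    for _ in range(n_experiments):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        if two_sided_p_value(control, treated) < alpha:
            rejections += 1
    return rejections / n_experiments


# No real effect: every rejection is a false positive (type I error).
type_1_strict = rejection_rate(true_effect=0.0, alpha=0.05)
type_1_loose = rejection_rate(true_effect=0.0, alpha=0.20)

# Real effect present: every non-rejection is a miss (type II error).
type_2_strict = 1 - rejection_rate(true_effect=0.4, alpha=0.05)
type_2_loose = 1 - rejection_rate(true_effect=0.4, alpha=0.20)

print(f"alpha=0.05: type I rate {type_1_strict:.3f}, type II rate {type_2_strict:.3f}")
print(f"alpha=0.20: type I rate {type_1_loose:.3f}, type II rate {type_2_loose:.3f}")
```

Under these assumptions, loosening the significance threshold lowers the miss rate but raises the false-positive rate. This is why a relation found in an exploratory analysis is better treated as a hypothesis to confirm on new data than as a finding.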


## Experimental Design and Modeling relationships

>Most formal (confirmatory) statistical analyses are based on models. Statistical models are ideal, mathematical representations of observable characteristics. Models are best divided into two components. **The structural component of the model (or structural model) specifies the relationships between explanatory variables and the mean (or other key feature) of the outcome variables**. The **“random” or “error” component of the model (or error model) characterizes the deviations of the individual observations from the mean**. (Here, “error” does not indicate “mistake”.) The two model components are also called **“signal” and “noise”** respectively.

>Statisticians realize that no mathematical models are perfect representations of the real world, but some are close enough to reality to be useful. A full description of a model should include all assumptions being made because statistical inference is impossible without assumptions, and sufficient deviation of reality from the assumptions will invalidate any statistical inferences.

>A slightly different point of view says that models describe how the distribution of the outcome varies with changes in the explanatory variables.

 
> **Statistical models have both a structural component and a random component which describe means and the pattern of deviation from the mean, respectively.**
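
The split into a structural component and a random component can be made concrete with a small simulation. The sketch below assumes a hypothetical straight-line structural model (mean outcome = 2.0 + 0.5·x) and Gaussian noise with standard deviation 1.0. These values are invented for illustration. It fits a least-squares line and separates each observation into a fitted mean ("signal") and a residual ("noise").

```python
import random
from statistics import fmean, pstdev

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical "true" model used to generate the data.
TRUE_INTERCEPT, TRUE_SLOPE, NOISE_SD = 2.0, 0.5, 1.0

xs = [rng.uniform(0, 10) for _ in range(2000)]
ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0, NOISE_SD) for x in xs]

# Structural component: least-squares estimate of the line for the mean of y.
mx, my = fmean(xs), fmean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# Each observation = fitted mean (signal) + residual (noise).
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

print(f"estimated structural model: mean(y) = {intercept:.2f} + {slope:.2f} * x")
print(f"residual mean: {fmean(residuals):.4f}, residual sd: {pstdev(residuals):.2f}")
```

Inference from the fitted line is only as trustworthy as its assumptions. If the true relationship were curved, or the noise grew with x, the residuals would show a pattern rather than looking like unstructured scatter.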


## Construct Validity
> Construct validity is a characteristic of devised measurements that describes how well the measurement can stand in for the scientific concepts or “constructs” that are the real targets of scientific learning and inference.

# Internal Validity (concept for Next Sprint)
http://www.indiana.edu/~educy520/sec5982/week_9/520in_ex_validity.pdf

> Why is Internal Validity Important?
>
> We often conduct research in order to determine cause-and-effect relationships.
> - Can we conclude that changes in the independent variable caused the observed changes in the dependent variable?
> - Is the evidence for such a conclusion good or poor?
> - If a study shows a high degree of internal validity then we can conclude we have strong evidence of causality.
> - If a study has low internal validity, then we must conclude we have little or no evidence of causality.


# Necessary Conditions for Causality
> Three conditions that are necessary to claim that variable A causes changes in variable B:
> - Relationship condition: Variable A and variable B must be related.
> - Temporal Antecedence condition: Proper time order must be established.
> - Lack of Alternative Explanation condition: Relationship between variable A and variable B must not be attributable to a confounding, extraneous variable.


>Threats to internal validity compromise our confidence in saying that a relationship exists between the independent and dependent variables.

>Threats to external validity compromise our confidence in stating whether the study’s results are applicable to other groups.