# The Scientific Method

## Table of Contents

- [The Scientific Method](#smet)
    - [Step 1: Research Your Study Area](#step1)
    - [Step 2: Operationalize Your Study](#step2)
    - [Step 3: Data Collection](#step3)
    - [Step 4: Statistical Analysis](#step4)
    - [Step 5: Writing the Results](#step5)
- [Experimental Methods](#exp)
    - [True Randomized Experiments](#true)
    - [Quasi-Experiments](#qua)
    - [Pros and Cons of Quasi-Experiments](#comp)
        - [Observational Studies](#obs)
- [Resources](#res)

---
<a id='smet'></a>

## The Scientific Method

The **scientific method** is a proven procedure for expanding knowledge through experimentation and analysis. It is a process that uses careful planning, rigorous methodology, and thorough assessment. Statistical analysis plays an essential role in this process.

When you’re talking about scientific studies or experiments, you’re invariably using **inferential statistics**. Science doesn’t generally care if an effect exists only in the sample.

In an experiment that includes statistical analysis, the analysis is at the end of a long series of events. To obtain valid results, it’s crucial that you carefully plan and conduct a scientific study for all steps up to and including the analysis. 

---
<a id='step1'></a>

## Step 1: Research Your Study Area

Good scientific research depends on gathering a lot of information before you even start collecting data. You’ll need to investigate your subject-area to write a research question that your study can reasonably answer. Then, you’ll need to develop in-depth knowledge about other studies to devise a plan for conducting your study.

### Define Your Research Question

The first step of your study is to formulate a **research question**. This is the question you want your study to answer. Research questions focus your experiment, help guide your decision-making process, and help prevent side issues from distracting you from your goal.

Your **research question** should be appropriate for your discipline. Consequently, the properties of suitable research questions vary significantly by subject area. For example, acceptable research questions look different for physics, psychology, biology, and political science. However, they have some common qualities.

**Research questions** must be clear and concise. Readers of your short research question should clearly understand the goal of your study. Additionally, ensure the scope of the inquiry is narrow enough that your research can reasonably answer it using available time and resources.

After you devise your question, you’ll need to conduct a much more in-depth review of the literature. And, you will likely perform some iterative fine-tuning. During the literature review, you might find yourself tweaking the research question.

### Literature Review

A **literature review** is a very extensive background investigation into your research question. There are two primary goals of a literature review for a scientific study that involves statistical analysis.

- First, you need to understand fully the subject-area that contains your research question. Define the current state of scientific knowledge surrounding your research question. This process helps you determine how your study fits within the field, enables you to understand the thought processes behind similar studies, and provides you with a general sense of the findings thus far.

- Secondly, you need information that helps you operationalize your study. **Operationalization** is the process of taking the general idea of your research question and creating an actionable plan that allows an experiment to answer the question. If your study includes statistical analysis, you’ll need to determine how other studies have used statistics to answer similar questions.

`The research phase should produce a research question`, in-depth knowledge of the subject-area and relevant findings, and a thorough understanding of how other researchers have operationalized similar studies. This background information helps you design your own experiment.

---
<a id='step2'></a>

## Step 2: Operationalize Your Study

**Operationalizing** a study is the process of taking your **research question**, using the background information you gathered, and formulating an **actionable plan**. This plan includes everything from defining variables to how you’ll analyze the data.

### Variables: What Will You Measure?

Studies that use statistics to answer questions require you to **collect data** in the form of variables that you’ll analyze. Consequently, you must **define the variables** that you will measure and **decide how you’ll measure** them. If you do not collect the correct data, or if you measure it inaccurately, you might not be able to answer your research question. In fact, thanks to **confounding variables**, the variables you do not measure can impact the results for the variables that you do measure! Take your time determining which variables you’ll need to measure to answer your research question.

### Types of Variables and Treatments

Typically, studies want to understand how changes in one or more variables affect the outcome variable. Depending on the type of experiment, the researchers will either control or not control the variables. If you control the variables, you’ll need to decide on the settings for the controllable variables.

Most studies include a treatment, intervention, or some other comparison that the researchers want to make. You’ll need to define the treatment and ensure a system is in place to deliver it as required. That’s true not only for medical treatments but for any intervention.

### Measurement Methodology: How Will You Take Measurements?

You’ll also need to specify how you will take measurements. What equipment will you use? How will you reduce other sources of variation? **Precision** and **accuracy** are essential in research. Ensure that your plan describes how to obtain good measurements.
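One way to make the distinction between accuracy and precision concrete is to take repeated readings of a known reference and summarize them. The sketch below is illustrative only: the function name `measurement_quality` and the calibration readings are hypothetical, and it treats accuracy as the average offset from the reference value and precision as the sample standard deviation of the readings.

```python
import statistics


def measurement_quality(readings, reference_value):
    """Summarize repeated readings of a known reference.

    Returns (bias, spread):
      bias   relates to accuracy: average reading minus the true value
      spread relates to precision: scatter of readings around their own mean
    """
    bias = statistics.mean(readings) - reference_value
    spread = statistics.stdev(readings)  # sample standard deviation
    return bias, spread


if __name__ == "__main__":
    # Hypothetical readings of a 10.0-unit calibration standard.
    readings = [10.1, 9.9, 10.0, 10.2, 9.8]
    bias, spread = measurement_quality(readings, reference_value=10.0)
    print(f"bias = {bias:+.3f}, spread = {spread:.3f}")
```

A bias near zero with a large spread suggests an accurate but imprecise instrument, while a small spread with a large bias suggests the opposite.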

### Create a Sampling Plan: How Will You Collect Samples for Studying?

Researchers must specify the particular **population** they’re studying.

After you define your **population**, you need to devise a plan for collecting a sample from that population. Your sample contains the people or objects that your study assesses. Studies that use **inferential statistics** take sample data and draw inferences about a population. However, these studies must gather samples in a manner that produces **unbiased estimates**. This process often involves **random sampling** because a convenience method can introduce bias.
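The difference between random sampling and a convenience method can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is simply the numbers 0 to 999, ordered so that the easiest-to-reach units happen to have the lowest values, and the helper names are hypothetical.

```python
import random
import statistics


def convenience_sample(population, n):
    """Take the first n units, i.e. whoever is easiest to reach."""
    return population[:n]


def simple_random_sample(population, n, rng):
    """Give every unit the same chance of being selected."""
    return rng.sample(population, n)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is repeatable
    # Hypothetical population in which the easy-to-reach units have low values.
    population = list(range(1000))

    print("population mean :", statistics.mean(population))
    print("convenience mean:", statistics.mean(convenience_sample(population, 50)))
    print("random mean     :", statistics.mean(simple_random_sample(population, 50, rng)))
```

The convenience estimate is far from the population mean because the sampled units differ systematically from the rest, whereas the random sample misses only by chance.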

Literature reviews often reveal sample collection methodologies other researchers have used in your study area. Determine where and how you’ll collect the sample, including the date and time, location, and so on.

Finally, how much data should you collect? On the one hand, you want to collect enough data to have a reasonable chance of detecting a practically significant effect. On the other hand, you don’t want to obtain such a large sample that it wastes your time and resources. A **power analysis** helps you choose a sample size that strikes a balance between these two competing goals. However, to perform a **power analysis**, you need estimates for effect size and variability in the data. Again, look at your literature review!
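As a rough illustration of what a power analysis computes, the sketch below estimates the sample size per group for comparing two group means. It is a simplification under stated assumptions: a two-sided test, equal group sizes, and the normal approximation n = 2((z_{1-α/2} + z_{power}) / d)², where d is the standardized effect size (difference in means divided by the standard deviation). The function name is hypothetical, and dedicated statistical software will typically give slightly larger answers because it uses the t-distribution.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size: standardized difference (mean difference / standard deviation).
    Uses the normal approximation, so treat the result as a ballpark figure.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: about {sample_size_per_group(d)} subjects per group")
```

With these defaults, a medium effect size of 0.5 calls for about 63 subjects per group, and smaller effects require many more. That is why the estimates of effect size and variability from the literature matter so much.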

### Design the Experimental Methods

You’ll need to define your hypothesis in a form amenable to statistical analysis and choose the appropriate analysis. Your **hypothesis** must be testable, which means that the data you collect will either support or contradict the hypothesis. Determine the statistical analyses that can adequately **test your hypotheses**. These methodology decisions start at a very high level, such as choosing between a randomized experiment or an observational study. From there, you can work your way down to more fundamental questions.

Additionally, there are the nuts and bolts for each type of analysis that you’ll need to decide. What significance level will you use? One-tailed or two-tailed hypothesis tests? If you use ANOVA, will you follow up with a post hoc test? If so, which one?

Your plan should `limit the number of analyses and models you’ll use`. Each statistical test has an **error rate**. The more tests you perform, the higher the overall chances of a false result. Making these methodology decisions in advance helps reduce data mining. It prevents you from using multiple techniques and then cherry picking the best results. In this manner, a data analysis plan lowers the probability of **false positives** caused by running into **chance correlations**.
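The way error rates accumulate across tests can be shown with a short calculation. The sketch below assumes the tests are independent and that every null hypothesis is true, in which case the chance of at least one false positive across k tests at significance level α is 1 − (1 − α)^k. The function name is hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return 1 - (1 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:2d} tests at alpha = 0.05 -> overall false positive chance {rate:.1%}")
```

Under these assumptions, ten tests at a 0.05 significance level carry roughly a 40% chance of at least one false positive, which is the motivation for deciding on a small set of analyses in advance.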

The **operationalization stage** should produce a plan that tells you what you’ll measure, how you’ll measure it, how you will collect a sample, the size of the sample, and how you’ll analyze the data.

---
<a id='step3'></a>

## Step 3: Data Collection

At this point, you’ve operationalized your study and have a plan of action. After you make the necessary arrangements, you should be ready to **collect data.** Depending on the nature of your research, this can be quite a long process. Whether you’re in the lab measuring, out administering surveys in the field, or working with human subjects, **data collection** is often the portion of the study that takes the most time and work.

Often, you’ll need to set up the **proper conditions** to take measurements and verify that everything is working correctly. Perhaps you need to get the lab conditions just right and ensure the equipment is functioning properly to obtain valid measurements. Or, you’re going through a detailed process to obtain a truly random sample.

While you’re generally working from your operational plan, it’s not uncommon to encounter **surprises**, and you’ll need to adapt. Hopefully, your subject-area knowledge and literature review help you anticipate most surprises, but the thing about science is that you’re often studying something that researchers haven’t fully studied before. Expect surprises.

---
<a id='step4'></a>

## Step 4: Statistical Analysis

Like the data collection stage of your study, you should already have the analysis phase defined. In a nutshell, be sure that you’re **analyzing the data correctly**, **satisfying the assumptions** where necessary, and **drawing the proper conclusions**.

However, there is a vital point to make here. Problems anywhere in this process can prevent you from making discoveries or invalidate the findings well before you even get to the statistical analysis. As the old saying goes, **garbage in, garbage out**. If you put garbage data into the statistical analysis, it’ll spit out garbage results. If all the steps leading up to your analysis are not carefully thought out and performed, you might not be able to trust the results, or you might miss important findings. Science is all about getting all the details correct.

---
<a id='step5'></a>

## Step 5: Writing the Results

After you collect the data and analyze it, you need to **write up the results** to inform other researchers about what you’ve found. Indicate which hypotheses the data support, the overall conclusions, and what they represent in the framework of the scientific field or real-world setting. However, it involves more than just writing up the findings.

The **scientific method** `works by replicating results - or the failure to do so`. The scientific process tends to cause the correct answers for research questions to rise to the top over time through successful replication. Conversely, it weeds out incorrect results after they fail to replicate.

Consequently, you’ll need to provide enough information about how you conducted your study so other researchers can repeat it and, hopefully, replicate the results. Typically, you’ll include aspects of the first four steps (background research, operationalization, data collection, and analysis) in the final write-up. The standards vary by field, so you should see how studies in your area document themselves. In this manner, your research becomes part of the knowledge base for future studies to build on - just like the studies you drew on during your literature review! Additionally, all the details help other researchers determine the strengths and weaknesses of your study so they can interpret the results while understanding the context.

---
<a id='exp'></a>

## Experimental Methods

An **experiment** is a procedure to make a discovery or test a hypothesis. To conduct an experiment, researchers manipulate, measure, and control variables. Typically, the goal is to establish a causal relationship between variables. Researchers want to influence, predict, and explain outcomes. Experiments push back the boundaries of science by creating new knowledge. 

For experiments to create new knowledge, scientists use designs that include procedures and variables that allow them to identify relationships that answer their research questions.

The term experiment covers many different types of design. The strictest definition of an experiment is known as a true experiment. In a true experiment, the researchers randomly select subjects from the population, randomly assign subjects to the treatment groups, and control the treatments and all relevant conditions they experience in a lab setting.

However, there are broader definitions of experiments. These definitions include quasi-experimental designs that don’t use representative samples or random assignment. Researchers might not even control the treatments and other conditions that the subjects experience.

How people within a field think about experiments depends on the area.

### Types of Variables in an Experiment

Let’s define the two fundamental types of variables that you’ll include in your experiment.

#### Dependent Variables

The **dependent variable** is a variable in the experiment that you want to explain or predict. The values of this variable depend on other variables. It’s also known as the **response variable** or **outcome variable**, and it is commonly denoted using a `Y`. Traditionally, analysts graph dependent variables on the vertical, or Y, axis. Frequently, you’ll compare the outcome variable between groups to estimate the effect size of your treatment, intervention, or process.

#### Independent Variables

**Independent variables** are the variables that you include in the experiment to explain or predict changes in the dependent variable. In true experiments, independent variables are systematically set and changed by the researchers. However, in observational studies, the values of the independent variables are not set by researchers but observed instead. These variables are also known as **predictor variables**, **input variables**, or **experimental factors**, and are commonly denoted using `Xs`. On graphs, analysts place independent variables on the horizontal, or X, axis.

### Causation versus Correlation

When you conduct an experiment, you typically want to identify **causal relationships**. Does event A cause outcome B? However, determining that an event causes an outcome, rather than merely being correlated, requires your experiment to include design elements that control or rule out other possible explanations. **Causation** indicates that an event affects an outcome. `Correlation does not imply causation`.

In statistics, causation is a bit tricky. `Correlation doesn’t necessarily imply causation`. An association or correlation between variables indicates that the values vary together. It does not necessarily suggest that changes in one variable cause changes in the other variable. Proving causality can be difficult.
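To see what a correlation does and does not measure, it can help to compute one by hand. The sketch below calculates the Pearson correlation coefficient, one common measure of linear association, for a small hypothetical data set of hours studied and test scores. The function name and the numbers are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data: hours studied and test scores for six students.
    hours = [1, 2, 3, 4, 5, 6]
    scores = [52, 58, 61, 70, 74, 79]
    print(f"r = {pearson_r(hours, scores):.3f}")
```

The calculation uses only how the two columns vary together. Nothing in it knows how the data were collected, which is why a large coefficient on its own cannot distinguish a causal relationship from one produced by some third variable.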

If correlation does not prove causation, what statistical test do you use to assess causality? That’s a trick question because no statistical analysis can make that determination.

### Confounding Variables

As a critical component of the scientific method, experiments typically set up contrasts between a **control group** and one or more **treatment groups**. The idea is to determine whether the effect, which is often the difference between a treatment group and the control group, is **statistically significant**. If the effect is significant, group assignment correlates with different outcomes.

However, as you have read, correlation does not necessarily imply causation. In other words, the experimental groups can have different mean outcomes, but the treatment might not be causing those differences even when the differences are statistically significant.

The difficulty in **establishing causality** is the potential existence of **confounding variables** or **confounders**. **Confounders** are alternative explanations for differences between the experimental groups.

**Confounding variables** correlate with both the experimental groups and the outcome variable. In this situation, confounding variables can be the actual cause for the outcome differences rather than the treatments themselves. `If an experiment does not account for confounding variables, they can bias the results and make them untrustworthy`.
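A small simulation can show how a confounder manufactures an apparent treatment effect. In this hypothetical setup, the treatment has no effect at all on the outcome, but a binary confounder makes subjects both more likely to receive the treatment and more likely to have a high outcome. All names and numbers are assumptions chosen for illustration.

```python
import random
import statistics


def simulate(n, rng):
    """Return rows of (confounder, treated, outcome).

    The treatment has NO effect: the outcome depends only on the confounder.
    """
    rows = []
    for _ in range(n):
        confounder = rng.random() < 0.5
        # Subjects with the confounder are far more likely to end up treated.
        treated = rng.random() < (0.8 if confounder else 0.2)
        outcome = (10.0 if confounder else 0.0) + rng.gauss(0, 1)
        rows.append((confounder, treated, outcome))
    return rows


def mean_difference(rows):
    """Mean outcome of treated subjects minus mean outcome of untreated subjects."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return statistics.mean(treated) - statistics.mean(control)


if __name__ == "__main__":
    rows = simulate(4000, random.Random(1))
    with_confounder = [r for r in rows if r[0]]
    print(f"naive difference, all subjects     : {mean_difference(rows):.2f}")
    print(f"difference among confounder == True: {mean_difference(with_confounder):.2f}")
```

The naive comparison shows a large difference between groups, yet the difference nearly vanishes when the comparison is restricted to subjects who share the same confounder value. The groups differ because of who ended up in them, not because of the treatment.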

### Why Determining Causality Is Important

What is the big deal in the difference between correlation and causation?

If you’re only predicting events, not trying to understand why they happen, and do not want to alter the outcomes, correlation can be perfectly fine. However, if you want to change an outcome, such as reducing how often an unwanted event occurs, you’ll need to find something that genuinely causes a change in that outcome.

For intentional changes in one variable to affect the outcome variable, there must be a causal relationship between the variables. After all, if studying does not cause an increase in test scores, there’s no point in studying. If the medicine doesn’t cause an improvement in your health or ward off disease, there’s no reason to take it.

Before you can state that some course of action will improve your outcomes, you must be sure that a causal relationship exists between your variables.

### Causation and Hypothesis Tests

Let’s take a moment to reflect on why `statistically significant hypothesis test results do not signify causation`.

**Hypothesis tests** are inferential procedures. They allow you to use relatively small samples to draw conclusions about entire populations. For the topic of **causation**, we need to understand what **statistical significance** means.

Use a **hypothesis test** to determine whether your data provide sufficient evidence to conclude that a relationship in your sample exists in the population. Tests exist for correlation coefficients, differences between group means, and regression coefficients among many other relationships. You might observe a relationship in your sample, but you need to know whether it exists in the population. **Random sampling error** (i.e., the luck of the draw) might have created the appearance of a “relationship” in your sample.

**Statistical significance** indicates that you have sufficient evidence to conclude that the relationship you observe in the sample also exists in the population. That’s it. `It doesn’t address causality at all`.
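To make the idea of ruling out random sampling error concrete, here is a sketch of one simple hypothesis test for a difference between two group means: a permutation test. It is only one of many possible tests and is used here because it needs nothing beyond the standard library. The function name and the data are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, num_permutations=5000, rng=None):
    """Two-sided permutation test for a difference in group means.

    Repeatedly reshuffles the group labels to see how often chance alone
    produces a difference at least as large as the observed one.
    """
    rng = rng or random.Random()

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        shuffled_diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (at_least_as_extreme + 1) / (num_permutations + 1)


if __name__ == "__main__":
    # Hypothetical outcome measurements for two groups.
    control = [12.1, 11.8, 13.0, 12.4, 11.5, 12.9, 12.2, 11.9]
    treatment = [13.4, 13.9, 12.8, 14.1, 13.6, 13.0, 14.3, 13.2]
    p = permutation_p_value(control, treatment, rng=random.Random(0))
    print(f"p-value = {p:.4f}")
```

A small p-value says the observed difference would be unlikely if the group labels were irrelevant. Note that nothing in the calculation refers to how subjects ended up in their groups, so the test by itself cannot tell you why the groups differ.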

There’s a critical separation between **significance** and **causality**:

- Statistical procedures indicate whether you have sufficient evidence to conclude that a sample effect exists in the population.
- Experimental designs determine how confidently you can assume that a treatment causes the effect.

How do experiments determine that a relationship is causal?

In short, `to have a chance at asserting that a relationship is causal, your study must have a design that helps rule out other explanations for the association`. Scientific studies commonly use the following **two methods to handle confounders**:

- Use random assignment in a true experiment to reduce the likelihood that systematic differences exist between experimental groups when the investigation begins.
- Statistically control for them in quasi-experiments and observational studies.

---
<a id='true'></a>

## True Randomized Experiments

To classify as a **true experiment**, the researchers must do the following:

- Use a representative sample of the population under study.
- Randomly assign subjects to the experimental groups.
- Have a control group.
- Control the treatment or process that they are testing.

True experiments are the best way to identify causal relationships. These studies often occur in lab settings that control other sources of variation effectively. 

**Random assignment** uses chance to assign subjects to the **control and treatment groups** in an experiment. This process helps ensure that the groups are equivalent at the beginning of the study. Having comparable groups increases your confidence that the treatments caused the differences between groups at the end of the study.

Additionally, researchers must be able to **control the treatment or process** that each group experiences and **control other sources of variation**. `True experiments typically have a control group that serves as a baseline to compare to the outcomes of the treatment groups`. If the infection rate in a vaccinated group is 10%, you don’t know if that’s an improvement unless you can compare it to an unvaccinated control group.

**Random assignment** is the magic ingredient that gives `true experiments powerful abilities to detect causal relationships`.

Note that **random assignment** is different from **random sampling**. **Random sampling** is a process for obtaining a sample that accurately represents a population. **Random assignment** uses a chance process to assign subjects to experimental groups (for example, a coin toss can assign each person to either the control group or the treatment group). Using random assignment requires that the experimenters can control the group assignment for all study subjects.
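The mechanics of random assignment are simple enough to sketch in a few lines. The version below shuffles the subject list and splits it in half instead of tossing a coin for each person. Both are chance processes, but the shuffle guarantees groups of (nearly) equal size. The function name and subject labels are hypothetical.

```python
import random


def randomly_assign(subjects, rng):
    """Assign subjects to control and treatment groups by chance.

    Shuffles a copy of the subject list and splits it in half, so every
    subject has the same chance of landing in either group.
    """
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


if __name__ == "__main__":
    subjects = [f"subject_{i:02d}" for i in range(1, 11)]
    groups = randomly_assign(subjects, random.Random(2024))
    for name, members in groups.items():
        print(name, members)
```

Because chance alone decides group membership, neither the experimenters nor the subjects can steer particular kinds of people into a particular group.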

The **random assignment process** distributes **confounding properties** amongst your experimental groups equally. In other words, `randomness helps eliminate systematic differences between groups`.

When a study ends, we compare outcomes between groups to see if there are differences. For example, we might use a hypothesis test to determine whether the differences between group means are **statistically significant**.

**Random assignment** is a simple, elegant solution to a complex problem. For any given study area, there can be a long list of **confounding variables** to worry about. However, `using random assignment, you don’t need to know what they are, how to detect them, or even measure them`. Instead, use random assignment to equalize them across your experimental groups so they’re not a problem.
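The claim that random assignment equalizes characteristics you never measured can be checked with a quick simulation. In this hypothetical sketch, each subject has an age that the experimenters never look at. The assignment step ignores it completely, yet the two groups end up with nearly the same average age.

```python
import random
import statistics


def covariate_gap_after_random_assignment(covariate_values, rng):
    """Randomly split subjects into two groups and return the difference
    in the groups' mean covariate (treatment mean minus control mean).

    The covariate is never used to decide the assignment.
    """
    values = list(covariate_values)
    rng.shuffle(values)
    half = len(values) // 2
    control, treatment = values[:half], values[half:]
    return statistics.mean(treatment) - statistics.mean(control)


if __name__ == "__main__":
    rng = random.Random(11)
    # Hypothetical unmeasured characteristic: age, mean 40, standard deviation 10.
    ages = [rng.gauss(40, 10) for _ in range(4000)]
    gap = covariate_gap_after_random_assignment(ages, rng)
    print(f"difference in mean age between groups: {gap:.2f} years")
```

The remaining gap is due to chance and shrinks as the sample grows. The same balancing applies to every other characteristic of the subjects, measured or not.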

Because **random assignment** helps ensure that the groups are comparable when the experiment begins, you can be more confident that the treatments caused the post-study differences.

---
<a id='qua'></a>

## Quasi-Experiments

**Quasi-experiments** are similar to true experiments in their attempt to establish causality, but they do not meet all the requirements of a true experiment. Quasi-experiments can bear a strong resemblance to true experiments. They typically have an **outcome/dependent variable**, at least one **independent variable**, and designs that compare **treatment groups** to a **control group**.

There are a wide variety of types of **quasi-experiments** to handle different situations. However, the common characteristic for all quasi-experiments is that `they do not use random assignment`. In some cases, it is impossible or unethical to assign subjects to treatment groups randomly. Perhaps the researchers cannot assign subjects to the experimental groups. Or, maybe they can assign them to the groups but cannot use a random process.

Typically, the researchers do not control all the relevant variables in a quasi-experiment. In fact, they might not control the treatment or intervention itself. Instead, the subjects might choose the treatment groups themselves or be assigned by a non-random process - which might not be under the experimenter’s control.

**Quasi-experiments** often include **pre-tests** that allow researchers to determine whether there are differences between the experimental groups at the beginning of the experiment that might affect the outcomes. Unlike true experiments, **quasi-experimental** designs often require the researchers to observe and measure **confounding variables** and then account for them using a statistical model.

---
<a id='comp'></a>

## Pros and Cons of Quasi-Experiments

**True experiments** tend to occur in labs where researchers control all conditions, but generalizability to the real world might suffer. On the other hand, **quasi-experiments** frequently occur in more natural settings outside of the lab. Consequently, statisticians refer to this type of experiment as a **natural experiment**. Subjects are in their natural environments and often making their own decisions about the treatments and other factors that are relevant to the researcher’s outcome variable. As a result, generalizability to the real world is less of a concern for **quasi-experiments**.

However, moving away from **random assignment** increases questions about **causality**. Differences in outcomes might be attributable to **confounding variables** and **alternative explanations** rather than the treatment itself.

**True experiments** are typically more expensive and complicated to set up. Consequently, limited resources can prevent researchers from conducting a true experiment, but they might be able to afford a less stringent design. In cases where **random assignment** is impossible or unethical, **true experiments** are not an option. 

---
<a id='obs'></a>

## Observational Studies

**Observational studies** are a type of **quasi-experiment** that you use `when you can’t assign subjects to the groups randomly, and you do not control the treatment or intervention`. The researchers simply observe. It’s the very definition of a natural experiment. Observational studies reduce the problem of **confounding variables** by incorporating confounders into a statistical model of the experimental design.

For a myriad of reasons, researchers might not be able to use random assignment. **Observational studies** use samples to draw conclusions about a population when the researchers do not control the treatment, or independent variable, that relates to the primary research question.

In an **observational study**, the researchers only observe the subjects and do not interfere or try to influence the outcomes. In other words, the researchers do not control the treatments or assign subjects to experimental groups. Instead, they observe and measure variables of interest and look for relationships between them. Usually, researchers conduct observational studies when it is difficult, impossible, or unethical to assign study participants to the experimental groups randomly. If you can’t randomly assign subjects, then you observe them in their self-selected states.

**Randomized studies** are better, and you should usually randomize whenever possible. However, if randomization is not possible, science should not come to a halt. After all, we still want to learn things, discover relationships, and make discoveries. For these cases, observational studies are a good alternative.

**Observational studies** don’t use **random assignment**, so **confounders** can be distributed disproportionately between the groups. Consequently, experimenters need to know which variables are confounders, measure them, and then use a method to account for them. **Trait matching** and statistically controlling confounders using **multivariate procedures** are two standard approaches for incorporating **confounding variables**.
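Statistically controlling for a measured confounder can be sketched with a small regression example. The simulation below is hypothetical: subjects with higher values of a confounder are more likely to be treated, the true treatment effect is set to 1.0, and the confounder also raises the outcome. The adjusted estimate is the treatment coefficient from a least-squares model of the outcome on treatment and confounder. To stay within the standard library, it is computed by first removing the confounder's linear influence from both the treatment and the outcome, which gives the same coefficient as fitting the two-predictor model directly.

```python
import random
import statistics


def residuals(y, x):
    """Residuals from a simple least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def adjusted_effect(treatment, confounder, outcome):
    """Treatment coefficient in the model: outcome ~ treatment + confounder."""
    t_res = residuals(treatment, confounder)
    y_res = residuals(outcome, confounder)
    return sum(t * y for t, y in zip(t_res, y_res)) / sum(t * t for t in t_res)


def naive_effect(treatment, outcome):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def simulate(n, true_effect, rng):
    """Hypothetical observational data with self-selection into treatment."""
    confounder = [rng.gauss(0, 1) for _ in range(n)]
    treatment = [1 if c + rng.gauss(0, 1) > 0 else 0 for c in confounder]
    outcome = [true_effect * t + 2.0 * c + rng.gauss(0, 1)
               for t, c in zip(treatment, confounder)]
    return treatment, confounder, outcome


if __name__ == "__main__":
    t, c, y = simulate(5000, true_effect=1.0, rng=random.Random(7))
    print(f"naive estimate   : {naive_effect(t, y):.2f}")
    print(f"adjusted estimate: {adjusted_effect(t, c, y):.2f}  (true effect is 1.0)")
```

The adjustment works here only because the confounder was measured and its relationship with the outcome really is linear. A confounder that nobody recorded would still bias the adjusted estimate.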

### Matching

**Matching** is a technique that involves selecting study participants with similar characteristics outside the variable of interest or treatment. Rather than using **random assignment** to equalize the experimental groups, the experimenters do it by **matching observable characteristics**. The researchers use subject-area knowledge to identify characteristics that are critical to match. For every participant in the **treatment group**, the researchers find a participant with comparable traits to include in the **control group**. `Matching facilitates valid comparisons between similar groups`.
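The pairing step can be sketched as a nearest-neighbor search. The version below is a deliberately minimal assumption: it matches on a single numeric trait (age in the example), pairs each treated subject with the closest unused control, and works through the treated subjects in the order given. Real studies usually match on several traits at once. The function name and subject data are hypothetical.

```python
def nearest_neighbor_match(treated, controls):
    """Greedy one-to-one matching, without replacement, on one numeric trait.

    treated, controls: lists of (subject_id, trait_value) tuples.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = list(controls)  # controls not yet used in a pair
    pairs = []
    for treated_id, treated_value in treated:
        if not available:
            break  # no controls left to match against
        best = min(available, key=lambda c: abs(c[1] - treated_value))
        pairs.append((treated_id, best[0]))
        available.remove(best)
    return pairs


if __name__ == "__main__":
    # Hypothetical subjects with their ages.
    treated = [("T1", 30), ("T2", 45)]
    controls = [("C1", 29), ("C2", 50), ("C3", 44)]
    print(nearest_neighbor_match(treated, controls))
```

In this example, each treated subject is paired with a control one year apart in age, and the unmatched control (`C2`) is left out of the comparison.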

**Matching** has some drawbacks. The experimenters might not be aware of all the relevant characteristics they need to match. In other words, the groups might be different in an essential aspect that the researchers don’t recognize.



---
<a id='res'></a>

## Resources

- [Statistics by Jim](https://statisticsbyjim.com/)