<img src="./images/banner.png" width="800">

# Types of Variables

In the world of statistics and data analysis, understanding the concept of variables is crucial. A **variable** is a characteristic or property that can take on different values. For example, when studying a group of people, variables could include age, height, weight, gender, or income.


Variables are essential because they help us:
- Organize and categorize data
- Identify patterns and relationships
- Make predictions and draw conclusions
- Communicate findings effectively


To work with variables effectively, it's important to understand the different types of variables and their properties. This knowledge will guide you in choosing the appropriate statistical methods and techniques for analyzing your data.


In this lecture, we'll explore the main types of variables:
1. Qualitative and Quantitative Variables
2. Discrete and Continuous Variables
3. Independent and Dependent Variables
4. Confounding Variables

We'll also discuss observational studies, where confounding variables are a particular concern.


By the end of this lecture, you'll have a solid foundation in understanding the different types of variables and their roles in statistical analysis. This knowledge will empower you to work with data more effectively and make informed decisions based on your findings.


Let's dive in! 🌟

**Table of contents**<a id='toc0_'></a>    
- [Qualitative and Quantitative Variables](#toc1_)    
  - [Qualitative (Categorical) Variables](#toc1_1_)    
  - [Quantitative (Numerical) Variables](#toc1_2_)    
- [Discrete and Continuous Variables](#toc2_)    
  - [Discrete Variables](#toc2_1_)    
  - [Continuous Variables](#toc2_2_)    
  - [Approximate Numbers and Rounding Off](#toc2_3_)    
- [Independent and Dependent Variables](#toc3_)    
  - [Independent Variables](#toc3_1_)    
  - [Dependent Variables](#toc3_2_)    
  - [Identifying Independent and Dependent Variables](#toc3_3_)    
- [Observational Studies and Confounding Variables](#toc4_)    
  - [Observational Studies](#toc4_1_)    
  - [Confounding Variables](#toc4_2_)    
- [Summary](#toc5_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc1_'></a>[Qualitative and Quantitative Variables](#toc0_)

As with data, which we covered in the previous lecture, variables can be classified into two main types: qualitative and quantitative. Let's explore each type in more detail.

<img src="./images/types-of-data.png" width="800">

### <a id='toc1_1_'></a>[Qualitative (Categorical) Variables](#toc0_)


Qualitative variables, also known as categorical variables, represent characteristics or attributes that are described by categories rather than measured as numbers. These variables are usually expressed as words or labels.

Qualitative variables can be further divided into two subcategories:

1. **Nominal Variables**: Nominal variables have no inherent order or ranking. Examples include:
   - Gender (male, female, non-binary)
   - Eye color (blue, brown, green)
   - Marital status (single, married, divorced)

2. **Ordinal Variables**: Ordinal variables have a natural order or ranking, but the differences between values are not necessarily equal. Examples include:
   - Education level (high school, bachelor's, master's, doctorate)
   - Income bracket (low, medium, high)
   - Likert scale responses (strongly disagree, disagree, neutral, agree, strongly agree)
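
The practical difference between nominal and ordinal variables shows up in which summaries make sense. The following is a minimal sketch using made-up values: the sample lists and the `rank` lookup are illustrative assumptions, not part of the lecture's data. For a nominal variable we can only count categories, while for an ordinal variable we can also sort and pick a middle category.

```python
from collections import Counter

# Nominal variable: categories have no order, so counting is the natural summary.
eye_color = ["blue", "brown", "green", "brown", "brown", "blue"]
color_counts = Counter(eye_color)
most_common_color = color_counts.most_common(1)[0][0]

# Ordinal variable: categories have an order, which we must state explicitly.
education_order = ["high school", "bachelor's", "master's", "doctorate"]
rank = {level: position for position, level in enumerate(education_order)}

education = ["master's", "high school", "bachelor's", "doctorate", "bachelor's"]
sorted_education = sorted(education, key=rank.get)

# The middle element of the sorted list is the median category.
median_education = sorted_education[len(sorted_education) // 2]

print(most_common_color)   # brown
print(median_education)    # bachelor's
```

Note that the ranks are used only for ordering. Subtracting or averaging them would wrongly assume that the steps between categories are equal.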


### <a id='toc1_2_'></a>[Quantitative (Numerical) Variables](#toc0_)


Quantitative variables, also known as numerical variables, represent characteristics that can be counted or measured and expressed as numbers. These variables can be further divided into two subcategories:

1. **Discrete Variables**: Discrete variables can only take on specific, separate values, often integers. Examples include:
   - Number of siblings (0, 1, 2, 3, ...)
   - Number of cars owned (0, 1, 2, ...)
   - Number of students in a class (25, 26, 27, ...)

2. **Continuous Variables**: Continuous variables can take on any value within a specific range, including fractional or decimal values. Examples include:
   - Height (1.65 m, 1.78 m, 1.82 m, ...)
   - Weight (65.3 kg, 72.1 kg, 80.5 kg, ...)
   - Time (2.5 seconds, 3.8 seconds, 4.2 seconds, ...)


Here's a simple example in Python to illustrate the difference between discrete and continuous variables:


```python
# Discrete variable: Number of siblings
siblings = [0, 1, 2, 3, 1, 2, 0, 3, 2, 1]

# Continuous variable: Height (in meters)
height = [1.65, 1.78, 1.82, 1.60, 1.75, 1.68, 1.84, 1.72, 1.80, 1.77]
```
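
As a possible follow-up, the sketch below summarizes the same two lists in the way that usually suits each type: a frequency table for the discrete variable, and a mean and range for the continuous one. The choice of summaries is our own illustration rather than a fixed rule.

```python
from collections import Counter
from statistics import mean

siblings = [0, 1, 2, 3, 1, 2, 0, 3, 2, 1]
height = [1.65, 1.78, 1.82, 1.60, 1.75, 1.68, 1.84, 1.72, 1.80, 1.77]

# Discrete: only a few distinct values occur, so a frequency table is informative.
sibling_counts = Counter(siblings)
for value in sorted(sibling_counts):
    print(f"{value} siblings: {sibling_counts[value]} people")

# Continuous: every recorded height here is different, so we summarize instead.
mean_height = mean(height)
height_range = max(height) - min(height)
print(f"mean height: {mean_height:.3f} m, range: {height_range:.2f} m")
```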

In the next section, we'll explore discrete and continuous variables in more detail.

## <a id='toc2_'></a>[Discrete and Continuous Variables](#toc0_)

Let's take a closer look at the two types of quantitative variables: discrete and continuous variables.


<img src="./images/quantitative-data.png" width="800">

### <a id='toc2_1_'></a>[Discrete Variables](#toc0_)


- **Definition**: Discrete variables can only take on specific, separate values, usually whole numbers. These variables typically arise from counting.

- **Examples**:
  - Number of pets owned (0, 1, 2, 3, ...)
  - Number of students absent in a class (0, 1, 2, ...)
  - Number of cars in a parking lot (10, 11, 12, ...)

- **Characteristics of Discrete Variables**:
  - Values are distinct and separate
  - Often represented by integers or whole numbers
  - Gaps exist between values (e.g., you can't have 1.5 pets)


### <a id='toc2_2_'></a>[Continuous Variables](#toc0_)


- **Definition**: Continuous variables can take on any value within a specific range, including fractional or decimal values. These variables usually involve measuring.

- **Examples**:
  - Height (1.65 m, 1.78 m, 1.82 m, ...)
  - Weight (65.3 kg, 72.1 kg, 80.5 kg, ...)
  - Time taken to complete a task (2.5 seconds, 3.8 seconds, 4.2 seconds, ...)

- **Characteristics of Continuous Variables**:
  - Values can take on any number within a range
  - Often represented by real numbers (including fractions and decimals)
  - No gaps exist between values (e.g., between 1.75 m and 1.76 m, a height could be 1.753 m, 1.7531 m, and so on)
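
Because a continuous variable can take any value in a range, every recorded measurement is an approximation rounded to some precision. The sketch below illustrates this with hypothetical "true" heights (invented for the example) recorded to the nearest centimeter.

```python
# Hypothetical "true" heights in meters; no instrument records them exactly.
true_heights = [1.7461, 1.7523, 1.7549]

# Recording to the nearest centimeter maps all three to the same value.
recorded = [round(h, 2) for h in true_heights]
print(recorded)  # [1.75, 1.75, 1.75]

# A recorded 1.75 m therefore stands for an interval of possible true values.
precision = 0.01
lower = 1.75 - precision / 2
upper = 1.75 + precision / 2
print(f"1.75 m represents roughly {lower:.3f} m to {upper:.3f} m")
```

The level of precision is a choice made when measuring or recording. It does not turn a continuous variable into a discrete one.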


In the next section, we'll explore the concepts of independent and dependent variables in the context of experiments and observational studies.

## <a id='toc3_'></a>[Independent and Dependent Variables](#toc0_)

When conducting experiments or observational studies, it's essential to understand the roles of independent and dependent variables.


<img src="./images/1.png" width="800">

<img src="./images/2.png" width="800">

<img src="./images/3.png" width="800">

### <a id='toc3_1_'></a>[Independent Variables](#toc0_)


- **Definition**: An independent variable is a variable that is manipulated or controlled by the investigator in an experiment. It is believed to have an effect on the dependent variable.

- **Role in Experiments**:
  - The investigator deliberately changes or manipulates the independent variable to observe its effect on the dependent variable.
  - Different levels or conditions of the independent variable are assigned to different groups of subjects.

- **Manipulated by the Investigator**:
  - The investigator has control over the independent variable and can decide which subjects receive which levels or conditions.
  - Example: In a study on the effect of sleep duration on memory, the investigator might assign participants to either a 6-hour sleep group or an 8-hour sleep group (independent variable).


### <a id='toc3_2_'></a>[Dependent Variables](#toc0_)


- **Definition**: A dependent variable is a variable that is measured, counted, or recorded by the investigator in an experiment. It is believed to be affected by the independent variable.


- **Role in Experiments**:
  - The dependent variable is the outcome or response that the investigator measures to determine the effect of the independent variable.
  - Changes in the dependent variable are presumed to be caused by the manipulation of the independent variable.

- **Measured, Counted, or Recorded by the Investigator**:
  - The investigator observes and records the values of the dependent variable for each subject or group in the experiment.
  - Example: In the sleep duration study, the investigator might measure the participants' memory performance (dependent variable) using a memory test.


### <a id='toc3_3_'></a>[Identifying Independent and Dependent Variables](#toc0_)


To identify the independent and dependent variables in a study, ask yourself:
- What is being manipulated or changed by the investigator? (Independent variable)
- What is being measured or observed as a result of the manipulation? (Dependent variable)
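
To make the two questions concrete, here is a small sketch based on the sleep duration example. The records and scores are invented purely for illustration. The sleep condition plays the role of the independent variable and the memory score the role of the dependent variable.

```python
from statistics import mean

# Hypothetical records: (hours of sleep assigned, memory test score).
# The scores are invented for illustration and are not real results.
records = [(6, 11), (6, 13), (6, 12), (8, 15), (8, 14), (8, 16)]

# Independent variable: the sleep condition the investigator assigns.
# Dependent variable: the memory score measured afterwards.
scores_by_condition = {}
for sleep_hours, memory_score in records:
    scores_by_condition.setdefault(sleep_hours, []).append(memory_score)

# Compare the dependent variable across levels of the independent variable.
mean_score = {hours: mean(scores) for hours, scores in scores_by_condition.items()}
for hours, score in mean_score.items():
    print(f"{hours}-hour sleep group: mean memory score {score}")
```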


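In an observational study nothing is manipulated, so the independent variable is instead the one thought to explain or predict the other. Understanding the roles of independent and dependent variables is crucial for designing and interpreting both experiments and observational studies.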


## <a id='toc4_'></a>[Observational Studies and Confounding Variables](#toc0_)

In addition to experiments, researchers often conduct observational studies to investigate relationships between variables. However, observational studies have limitations and can be affected by confounding variables.


<img src="./images/observation-experiment.png" width="400">

### <a id='toc4_1_'></a>[Observational Studies](#toc0_)


- **Definition**: An observational study is a type of study where the investigator observes and measures variables without manipulating them. The investigator does not control or assign the independent variable.

- **Purpose**:
  - To examine relationships between variables as they naturally occur.
  - To generate hypotheses for future experimental research.

- **Limitations in Determining Cause-Effect Relationships**:
  - Observational studies cannot definitively establish cause-effect relationships because the investigator does not manipulate the independent variable.
  - Other factors (confounding variables) may influence the relationship between the variables, making it difficult to determine the true cause of the observed effects.


### <a id='toc4_2_'></a>[Confounding Variables](#toc0_)


- **Definition**: A confounding variable is an extraneous variable that is related to both the independent and dependent variables in a study. It can influence the outcome of the study and make it difficult to interpret the results accurately.

- **Impact on Study Interpretation**:
  - Confounding variables can lead to misleading conclusions about the relationship between the independent and dependent variables.
  - They can create the appearance of a relationship between variables when none exists, or they can mask a true relationship.

- **Avoiding Confounding Variables**:
  - *Random Assignment*: In experiments, randomly assigning subjects to different levels of the independent variable tends to balance potential confounding variables across groups, minimizing their impact.
  - *Standardization*: Keeping all other variables constant across groups (except for the independent variable) helps to control for potential confounding variables.
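
Random assignment is simple to carry out in code. The following is a minimal sketch, assuming 20 hypothetical participant IDs and the two sleep conditions from the earlier example. The fixed seed is there only to make the sketch repeatable.

```python
import random

# Hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 21)]

# A seeded generator makes the assignment reproducible for this illustration.
rng = random.Random(42)

# Shuffle a copy so that chance alone decides who lands in which group.
shuffled = participants[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
groups = {
    "6-hour sleep": shuffled[:half],
    "8-hour sleep": shuffled[half:],
}

for condition, members in groups.items():
    print(condition, sorted(members))
```

Because neither the investigator nor the participants choose the groups, characteristics such as age or caffeine habits are unlikely to differ systematically between them.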


Here's an example of how confounding variables can affect the interpretation of a study:

> Suppose an observational study finds a positive correlation between ice cream sales and drowning incidents. It might be tempting to conclude that eating ice cream causes drowning. However, a confounding variable, such as hot weather, could be responsible for both increased ice cream sales and more people swimming (leading to more drowning incidents). In this case, the hot weather is the confounding variable that influences both the independent variable (ice cream sales) and the dependent variable (drowning incidents).


<img src="./images/confounding-variable.png" width="800">

Hot weather (confounding variable) influences both ice cream sales (independent variable) and drowning incidents (dependent variable), creating a spurious relationship between the two variables.
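
The ice cream example can be reproduced with a small simulation. This is a sketch under stated assumptions: all numbers are simulated, the coefficients and noise levels are arbitrary choices, and the "drownings" values are stylized continuous scores rather than real counts. Temperature drives both variables and neither affects the other. The partial correlation used at the end is a standard formula for the correlation between two variables with a third held fixed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
n_days = 2000

# Simulated data: temperature influences both variables independently.
temperature = [rng.uniform(10, 35) for _ in range(n_days)]
ice_cream_sales = [20 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

r_sales_drown = pearson(ice_cream_sales, drownings)
r_sales_temp = pearson(ice_cream_sales, temperature)
r_drown_temp = pearson(drownings, temperature)

# Partial correlation: sales vs. drownings with temperature held fixed.
partial = (r_sales_drown - r_sales_temp * r_drown_temp) / math.sqrt(
    (1 - r_sales_temp ** 2) * (1 - r_drown_temp ** 2)
)

print(f"raw correlation:     {r_sales_drown:.2f}")
print(f"partial correlation: {partial:.2f}")
```

The raw correlation comes out clearly positive even though the simulation contains no causal link between sales and drownings. Once temperature is accounted for, the correlation falls to around zero.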


Understanding the limitations of observational studies and the impact of confounding variables is crucial for accurately interpreting research findings and drawing appropriate conclusions.

## <a id='toc5_'></a>[Summary](#toc0_)


In this lecture, we explored the different types of variables and their roles in statistical analysis and research. Let's recap the key points:

- Variables are characteristics or properties that can take on different values, and they are essential for organizing data, identifying patterns, and making informed decisions.

- Variables can be classified into two main categories:
  - **Qualitative (Categorical) Variables**: Variables that represent characteristics or attributes that cannot be quantified numerically. They can be further divided into nominal (no inherent order) and ordinal (natural order, but differences not necessarily equal) variables.
  - **Quantitative (Numerical) Variables**: Variables that represent characteristics that can be measured and expressed numerically. They can be further divided into discrete (specific, separate values) and continuous (any value within a range) variables.

- When working with continuous variables, it's important to consider the level of precision required and be aware that the values are often approximations due to rounding off.

- In experiments and observational studies, variables can be classified as:
  - **Independent Variables**: Variables that are manipulated or controlled by the investigator, believed to have an effect on the dependent variable.
  - **Dependent Variables**: Variables that are measured, counted, or recorded by the investigator, believed to be affected by the independent variable.

- Observational studies are used to examine relationships between variables as they naturally occur, but they are limited in determining cause-effect relationships because confounding variables may be present.

- **Confounding Variables**: Extraneous variables that are related to both the independent and dependent variables, which can influence the outcome of a study and lead to misleading conclusions.

- To minimize the impact of confounding variables, researchers can use random assignment (in experiments) and standardization (keeping other variables constant).


Understanding the different types of variables and their roles is crucial for effective data analysis and interpretation. By correctly identifying and classifying variables, researchers can:
- Choose appropriate statistical methods and techniques
- Design experiments and observational studies effectively
- Control for confounding variables
- Accurately interpret research findings and draw valid conclusions


Mastering the concepts of variable types empowers researchers and data analysts to make informed decisions, uncover meaningful insights, and communicate their findings effectively.
