### Crash Course in Causality

### Concepts to Know Before Learning Causal ML

**Causality**: Understanding cause-and-effect relationships and distinguishing between correlation and causation.

**Machine Learning (ML)**: Techniques and algorithms that enable computers to learn from data and make predictions or decisions without being explicitly programmed.

**Observational Data**: Data collected from real-world observations or studies where treatments or interventions are not randomly assigned.

**Counterfactuals**: Hypothetical outcomes that represent what would have happened under different treatment conditions, essential for estimating causal effects.

**Confounding**: When the observed association between a treatment and an outcome is influenced by a third variable, leading to biased estimates of causal effects if not properly accounted for.

### What is Causal ML? 

**Causal machine learning** (Causal ML) is a field that combines elements of **causal inference and machine learning** to estimate **causal effects** from observational data. Traditional machine learning methods focus on prediction tasks, such as classification and regression, without explicitly considering causal relationships between variables. In contrast, **causal ML** aims to go beyond prediction by identifying and estimating the **causal effects** of **interventions or treatments on outcomes of interest**.


 <div style="text-align:center">
    <img src="https://raw.githubusercontent.com/uber/causalml/master/docs/_static/img/logo/causalml_logo.png" alt="Image Alt Text" width="500"/>
</div>


### Causal Inference 

Let's now learn about **causal inference**.

**Causal inference** is the process of drawing conclusions about **cause-and-effect relationships** between variables based on observed data. It involves determining whether changes in one variable directly lead to changes in another variable, beyond mere correlation.


 <div style="text-align:center">
    <img src="https://miro.medium.com/v2/resize:fit:1200/1*7qYeeKBkiW9IPLqwIqW_1A.png" alt="Image Alt Text" width="500"/>
</div>


#### Key aspects of causal inference include:

**Identification of Causal Effects**: Determining which variables are causes and which are effects, and estimating the magnitude and direction of the causal relationships.

**Control of Confounding**: Addressing confounding variables that may distort the observed association between the variables of interest.

**Counterfactual Reasoning**: Considering what would have happened under different conditions or interventions to estimate causal effects.

**Causal Mechanisms**: Understanding the underlying mechanisms through which causal effects occur.


**Causal inference** methods include **randomized controlled trials** (RCTs) in experimental settings, as well as observational studies, structural equation modeling, and causal graphical models in observational settings. The goal of causal inference is to provide **reliable and actionable insights into the effects of interventions or policies on outcomes of interest, aiding decision-making** in various fields such as healthcare, economics, and social sciences.


### Now let's learn the terminology related to Causal ML

**Treatment**: The variable or intervention being studied for its causal effect on an outcome. It can be binary (e.g., receiving a drug or not) or continuous (e.g., dosage level).

**Outcome**: The variable of interest whose response is being studied to determine the causal effect of the treatment.

**Randomized Controlled Trial (RCT)**: A type of experimental study design commonly used in medical and scientific research to evaluate the effectiveness of interventions or treatments. In an RCT, participants are randomly assigned to either a treatment group, which receives the intervention being studied, or a control group, which does not (typically receiving a placebo or standard treatment instead). Randomization helps to ensure that the groups are comparable at baseline, reducing the risk of confounding variables influencing the results. RCTs are considered the gold standard for assessing causal relationships between treatments and outcomes because they provide strong evidence of causality when properly conducted.

**Confounding**: The presence of variables, observed or unobserved, that affect both the treatment assignment and the outcome, leading to biased estimates of causal effects if not properly controlled for.

 <div style="text-align:center">
    <img src="https://i.ytimg.com/vi/Od6oAz1Op2k/hqdefault.jpg?v=61d2166f" alt="Image Alt Text" width="500"/>
</div>

**Selection Bias**: Selection bias occurs when the process of selecting participants for a study systematically excludes certain groups or individuals, resulting in a sample that is not representative of the target population. This bias can distort the results of a study, leading to erroneous conclusions about causal relationships between variables.

**Counterfactuals**: Counterfactuals are hypothetical scenarios representing what would have happened under different conditions or interventions. They are essential for estimating causal effects by comparing observed outcomes to hypothetical outcomes. In causal inference, counterfactuals allow researchers to assess the impact of treatments or interventions on outcomes of interest. Estimating counterfactuals involves imagining what the outcome would have been if individuals had received different treatments, enabling the evaluation of causal relationships in observational studies. Counterfactual analysis helps researchers make informed decisions and draw valid causal conclusions from observed data.

**Average Treatment Effect (ATE)**: The average causal effect of the treatment on the outcome across the entire population.
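As a rough sketch in potential-outcomes notation (the symbols are an assumption here, since the text does not fix any): write $Y_i(1)$ for person $i$'s outcome with the treatment and $Y_i(0)$ for their outcome without it, over $N$ people. The ATE can then be written as:

$$ ATE = \mathbb{E}[Y(1) - Y(0)] \approx \frac{1}{N}\sum_{i=1}^{N} \big(Y_i(1) - Y_i(0)\big) $$

Only one of the two potential outcomes is ever observed for a given person, which is why the other one (the counterfactual) has to be estimated.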

**Five Steps of RCTs**: 

The video outlines a five-step process for conducting an RCT, which includes **selecting participants, dividing them into control and treatment groups, administering the treatment to one group, monitoring outcomes, and analyzing the results** to determine the effect of the treatment.

### Now let's go through an example

Suppose we conduct a **randomized controlled test** to determine the effect of a specific action, in this case **sending emails to potential customers to see if it increases purchase conversion**. The process involves five steps:

**Selecting Participants**: The first step is to select users who will participate in the test. It's crucial to choose these individuals based on uniform criteria to ensure fairness and accuracy in the test results.

**Splitting Users into Groups**: Once the participants are selected, they are divided into two groups evenly. This division is essential for comparing the outcomes of the two groups later on.

 <div style="text-align:center">
    <img src="groups.png" alt="Image Alt Text" width="500"/>
</div>


**Intervention**: One group receives the intervention (in this case, the emails), while the other group does not. This setup allows for a comparison between the two groups to see if the intervention had any effect.

**Observation and Analysis**: After the intervention, the behavior of both groups is observed and analyzed to determine if there was a significant difference in purchase conversion between the group that received the emails and the group that did not.

**Conclusion**: Finally, conclusions are drawn about the effectiveness of the emails in increasing purchase conversion. This step involves interpreting the data collected and determining whether the emails helped, hurt, or had no effect on purchase conversion.

This example illustrates the concept of causal inference and how randomized controlled tests can be used to infer causality between an action (sending emails) and an outcome (an increase in purchase conversion).
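The splitting and analysis steps might be sketched in Python as follows. This is a minimal illustration, not a production experiment framework: the user IDs, the purchase flags, and the helper names `split_into_groups` and `conversion_rate` are all hypothetical.

```python
import random

def split_into_groups(users, seed=0):
    """Randomly split users into two equal-sized groups (treatment, control)."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:2 * half]

def conversion_rate(purchases):
    """Fraction of users who purchased (purchases is a list of 0/1 flags)."""
    return sum(purchases) / len(purchases)

# Hypothetical participants selected for the test.
users = [f"user_{i}" for i in range(10)]
treatment, control = split_into_groups(users, seed=42)

# Hypothetical observed purchases (1 = purchased), for illustration only.
treatment_purchases = [1, 0, 1, 1, 0]   # group that received the emails
control_purchases = [0, 0, 1, 0, 0]     # group that did not

lift = conversion_rate(treatment_purchases) - conversion_rate(control_purchases)
print(f"Estimated lift in conversion: {lift:+.2f}")
```

A real analysis would also test whether the difference is statistically significant before drawing a conclusion.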

### Some drawbacks of RCTs

**Experiment Setup and Feasibility**: Setting up RCTs can be practically impossible for certain experiments due to logistical, ethical, or financial constraints, limiting the scope of questions that can be addressed through this methodology.

**Time Constraints**: Some RCTs require a long duration to show results, making them unsuitable for scenarios that demand quick decision-making or where market dynamics evolve rapidly.

**Ethical Considerations**: Ethical dilemmas arise, especially in medical trials, where withholding treatment from a control group could lead to ethical concerns, making some RCTs infeasible.

**Generalizability Issues**: The controlled environment of RCTs might not accurately represent real-world scenarios, raising questions about the applicability and generalization of the findings outside the study context.

**Resource Intensiveness**: RCTs can be significantly resource-intensive, requiring substantial time, money, and manpower to conduct, which may not be feasible for all researchers or organizations.









### Main Challenges to Causal Inferencing 

**1. Confounders** 

#### Let's take a medical example

<div style="text-align:center">
    <img src="FLU.jpeg" alt="Image Alt Text" height="300" width="300"/>
</div>

Suppose flu is the problem and I have developed a vaccine that may be a cure. Before releasing it to the public, I want to conduct a test to check whether it works:

1. I take some people who have the flu and ask them to take the vaccine. This is the **Treatment Group**.
2. I take another set of people who have the flu and do not treat them. This is the **Control Group**.

Now let's say that 75% of the **Treatment Group** recovered and only 50% of the **Control Group** recovered.

<div style="text-align:center">
    <img src="confounder.png" alt="Image Alt Text" height="500" width="500"/>
</div>


This would seem to mean that the vaccine is working, but not necessarily. Suppose the **control group has an average age of 65** while the **treatment group has an average age of 35**.

People in the treatment group might have recovered even without the vaccine, but the test doesn't prove that either. In this case, **Age** is a **confounding variable**.


That is why **randomization** is important when conducting **A/B tests** such as this one: it ensures that **age and other confounding variables** are balanced between the two groups.
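To see how age alone could produce a 75% versus 50% gap, here is a minimal sketch. The counts are hypothetical and deliberately constructed so that the vaccine has no effect within either age band; `recovery_rate` is an illustrative helper, not a library function.

```python
# Hypothetical counts chosen so the overall rates match the 75% / 50% example.
# Each row: (group, age_band, number_of_people, number_recovered)
rows = [
    ("treatment", "young", 70, 56),
    ("treatment", "old",   10,  4),
    ("control",   "young", 20, 16),
    ("control",   "old",   60, 24),
]

def recovery_rate(rows, group, age_band=None):
    """Recovery rate for a group, optionally restricted to one age band."""
    selected = [r for r in rows
                if r[0] == group and (age_band is None or r[1] == age_band)]
    people = sum(r[2] for r in selected)
    recovered = sum(r[3] for r in selected)
    return recovered / people

# Comparing the groups as a whole mixes the vaccine effect with the age effect.
naive_gap = recovery_rate(rows, "treatment") - recovery_rate(rows, "control")

# Comparing like with like (same age band) removes the age effect.
young_gap = (recovery_rate(rows, "treatment", "young")
             - recovery_rate(rows, "control", "young"))
old_gap = (recovery_rate(rows, "treatment", "old")
           - recovery_rate(rows, "control", "old"))

print(f"Naive gap: {naive_gap:+.2f}")
print(f"Gap among the young: {young_gap:+.2f}")
print(f"Gap among the old: {old_gap:+.2f}")
```

In this constructed data the whole 25-point gap comes from the treatment group being younger, which is the situation randomization is meant to prevent.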

**2. Selection Bias** 

**Selection bias** arises when the treatment group selected does not accurately represent the entire population. In the above scenario, the treatment group consists solely of young individuals, failing to encompass the entire demographic spectrum. Therefore, this situation exhibits a clear instance of selection bias.

**3. Counterfactuals** 

**Counterfactuals** represent hypothetical scenarios: they describe what would have happened if an individual had not received the vaccine (or, for an untreated individual, if they had received it). When analyzing previous data, we estimate counterfactuals for each person to ensure a like-for-like comparison.

<div style="text-align:center">
    <img src="Counterfactuals.jpg" alt="Image Alt Text" height="500" width="500"/>
</div>

In this context, the **red data points** represent the **counterfactuals**, which are computed using **matching or imputation techniques via machine learning**. More details about these methods will be provided in the subsequent sections.

### Assumptions 

Let's talk about **Assumptions** we need to make for **Causality**

1. We must make certain assumptions to adjust previous data so that it resembles a randomized controlled trial (RCT) as closely as possible.
2. There will be some confounders that may have unexpected and unintended effects on the outcomes.

<div style="text-align:center">
    <img src="assumptions.png" alt="Image Alt Text" height="300" width="300"/>
</div>

Causal inference from historical data should mimic a randomized controlled trial (RCT).

### 1. Causal Markov Condition (Markov Assumption)

The **Causal Markov Condition** is a fundamental assumption underlying the framework of causal inference. This condition pertains to the structure of causal relationships and how we can represent and understand these relationships through **causal graphs**.

The **Causal Markov Condition** states that a variable is independent of its non-effects given its direct causes. In simpler terms, once you account for a variable's immediate causes, the variable is independent of every other variable that is not one of its effects, including variables further up the causal chain. This assumption allows researchers to simplify complex causal relationships into more manageable segments that can be analyzed individually.
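One standard consequence can be sketched with notation that is not in the original text: $X_1, \dots, X_n$ for the variables and $\mathrm{pa}(X_i)$ for the direct causes of $X_i$ in the causal graph. Under the condition, the joint distribution factorizes over direct causes:

$$ P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{pa}(X_i)\big) $$

For the flu example, assuming a graph with $\text{Age} \to \text{Vaccine}$, $\text{Age} \to \text{Recovery}$ and $\text{Vaccine} \to \text{Recovery}$, this reads:

$$ P(\text{Age}, \text{Vaccine}, \text{Recovery}) = P(\text{Age})\, P(\text{Vaccine} \mid \text{Age})\, P(\text{Recovery} \mid \text{Age}, \text{Vaccine}) $$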

<div style="text-align:center">
    <img src="graph.png" alt="Image Alt Text" height="300" width="500"/>
</div>

For our medical example, we have a graph that looks like the one above, but it is convoluted.

In the context of causal graphs, which are diagrams that depict causal relationships between variables, the Causal Markov Condition is stated with respect to a Directed Acyclic Graph (DAG). A DAG is a representation in which variables are nodes connected by arrows that indicate the direction of causation, and there are no loops: you cannot start at one node and follow the arrows back to the starting point.

<div style="text-align:center">
    <img src="Causality.png" alt="Image Alt Text" height="300" width="500"/>
</div>

Here, the confounding variables directly influence the treatment, and the treatment itself impacts the outcome.
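A causal graph like this can be represented as a plain dictionary and checked for cycles. The sketch below uses `graphlib` from the Python standard library (Python 3.9 or later); the node names and edges are assumptions for the flu example, and `is_dag` is an illustrative helper.

```python
from graphlib import TopologicalSorter, CycleError

# Assumed edges for the flu example: each key maps to its direct causes (parents).
causal_graph = {
    "Age": set(),
    "Vaccine": {"Age"},
    "Recovery": {"Age", "Vaccine"},
}

def is_dag(parents):
    """Return True if the parent map contains no directed cycle."""
    try:
        # static_order() raises CycleError when the graph has a cycle.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

print(is_dag(causal_graph))          # True

# A loop (A causes B and B causes A) is not a valid causal DAG.
cyclic_graph = {"A": {"B"}, "B": {"A"}}
print(is_dag(cyclic_graph))          # False
```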


### 2. SUTVA (Stable Unit Treatment Value Assumption)

Essentially, this implies that the treatment given to one individual **does not influence** the outcome of any other individual. For example, a sample from the treatment group does not affect a sample from the control group. This condition is necessary to avoid any potential interaction effects.

In our case, the people who took the vaccine did not interact with the people who didn't, since they were not in the same group.

### 3. Ignorability

This assumption states that there are no unmeasured confounders that affect both the treatment and the outcome.

The key points about ignorability explained using the flu vaccine example:

1. The goal is to estimate the causal effect of the flu vaccine on recovery using observational data.
   
2. Ignorability assumes that, after adjusting for age (an observed variable), treatment assignment (receiving the vaccine or not) is effectively randomized.
   
3. Age is a confounder, affecting both treatment assignment and recovery, so ignoring it would bias the comparison. Ignorability can hold only once age is adjusted for.
   
4. Ignorability is violated if other, unmeasured confounders also influence both treatment and outcome.
   
5. When ignorability is violated, causal effect estimates from observational data may be biased due to unmeasured confounders.
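In potential-outcomes notation, ignorability is commonly written as a conditional independence statement. The symbols here are assumptions, not taken from the text: $T$ is the treatment, $X$ the observed covariates such as age, and $Y(1)$, $Y(0)$ the outcomes with and without the vaccine.

$$ \big(Y(0),\, Y(1)\big) \perp\!\!\!\perp T \mid X $$

That is, among people who share the same $X$, who gets the vaccine is unrelated to how they would have fared either way.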

<div style="text-align:center">
    <img src="ignorability.jpg" alt="Image Alt Text" height="300" width="500"/>
</div>


### Measuring Average Treatment Effect :

Let's say we want to answer the question: does the vaccine help people?
Consider the following hypothetical data.

The first column is the person, and the second and third columns record whether the person got better or not, depending on whether they received the treatment.

<div style="text-align:center">
    <img src="measure.jpg" alt="Image Alt Text" height="500" width="500"/>
</div>

Here Ajay and Akash received the vaccine and got better. Tony and Goku didn't get the vaccine and still got better.

To answer the question of whether the vaccine helps people:

First, let's count the people who got the vaccine and got better, and divide by the number of people who took the vaccine:

$$Mean(Treatment)=(1+0+1+0+1)/5 = +0.6 $$

Now we count the people who didn't get the vaccine and got better, and divide by the number of people who didn't get the vaccine:

$$Mean(Control)=(1+1+0+0+0)/5 = +0.4$$

We subtract these numbers to get the effect:

$$Effect = Mean(Treatment) - Mean(Control) = 0.6 - 0.4 = +0.2$$
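The same arithmetic as a short Python sketch, using the outcome values from the two equations above (the variable names are illustrative):

```python
# Observed outcomes from the example above (1 = got better, 0 = did not).
treated_outcomes = [1, 0, 1, 0, 1]
control_outcomes = [1, 1, 0, 0, 0]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# Naive effect: difference in recovery rates, ignoring any confounders.
naive_effect = mean(treated_outcomes) - mean(control_outcomes)
print(f"Naive effect: {naive_effect:+.1f}")  # prints "Naive effect: +0.2"
```

This is only the raw difference between groups; it does not yet account for differences such as age.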

Here, it looks like the vaccine has a positive effect, but let's see what happens when we add an age column:

<div style="text-align:center">
    <img src="addage.jpg" alt="Image Alt Text" height="500" width="500"/>
</div>

It looks like the average age of those who received the treatment is 48, while for those who didn't it is 29.5 years:

$$Mean(Age | Treatment) = (26+51+67)/3 = 48 $$

$$Mean(Age | Control) = (24+35)/2 = 29.5 $$

That's a big difference, so age could be having some effect on the outcome.

At this stage, we need to calculate counterfactuals for each individual. This involves determining whether those who received the vaccine would have recovered without it, and whether those who didn't receive the vaccine would have also recovered.

There are 2 ways to do that :
1. **Matching**
2. **Machine learning**

**Matching**: Here you essentially find people of the same age who received the other treatment and use their outcome as the counterfactual estimate. In this example, Sam (24) and Varun (24) are the same age:

Sam (24) --- 0   1

Varun (24) --- 0   1

Here the 0 for Sam and the 1 for Varun are imputed using matching.
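A minimal sketch of exact matching on age. Which of the two received the vaccine is not stated in the text, so the records below assume that Sam was treated and recovered while Varun was untreated and did not; `impute_counterfactual` is a hypothetical helper.

```python
# Hypothetical records: who received the vaccine is an assumption.
people = [
    {"name": "Sam",   "age": 24, "treated": True,  "recovered": 1},
    {"name": "Varun", "age": 24, "treated": False, "recovered": 0},
]

def impute_counterfactual(person, people):
    """Average outcome of same-age people who received the other treatment.

    Returns None when no exact age match exists in the opposite group.
    """
    matches = [p["recovered"] for p in people
               if p["age"] == person["age"] and p["treated"] != person["treated"]]
    if not matches:
        return None
    return sum(matches) / len(matches)

for person in people:
    print(person["name"], impute_counterfactual(person, people))
```

Exact matching breaks down when nobody of the same age received the other treatment, which is one reason a model-based approach is also used.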

**Machine Learning**:
We build a model that takes age and treatment as input and predicts the outcome. We train it on the factual data and use it to predict the counterfactuals.

<div style="text-align:center">
    <img src="MLway.jpeg" alt="Image Alt Text" height="300" width="300"/>
</div>


Let's assume the counterfactuals, shown in red, are populated using ML:

<div style="text-align:center">
    <img src="ite.jpg" alt="Image Alt Text" height="500" width="500"/>
</div>

To calculate the **ATE**, we first need the Individual Treatment Effect (ITE) for each person: the outcome in the case where the person got (or would have gotten) the vaccine, minus the outcome in the case where they did not (or would not have) gotten it.

We now take the average of the ITEs to get the ATE:


$$ ATE = (0+(-1)+0+0+0+1+1+(-1)+0+1)/10 = +0.1 $$ 
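As a quick check of this average, here is a small Python sketch using the ten ITE values from the equation (the function name is illustrative):

```python
# ITE values from the equation above:
# outcome with the vaccine minus outcome without it, one value per person.
ites = [0, -1, 0, 0, 0, 1, 1, -1, 0, 1]

def average_treatment_effect(ites):
    """ATE is the mean of the individual treatment effects."""
    return sum(ites) / len(ites)

ate = average_treatment_effect(ites)
print(f"ATE: {ate:+.1f}")  # prints "ATE: +0.1"
```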

It looks like the vaccine does help, even when accounting for age.

So, on average, people who have the flu would benefit from taking the vaccine.

Now let's look at the ATE conditioned on age, since age is a confounding variable. This is known as the **Conditional Average Treatment Effect (CATE)**.

For this, we simply average the ITEs within each condition:

$$ CATE(Age \geq 35) = (0+0+1+1+0)/5 = +0.4$$

$$ CATE(Age < 35) = (0+(-1)+0+(-1)+1)/5 = -0.2$$
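The per-group averages can be sketched in Python as below. It assumes five people in each age group, which is what the denominators and the ten ITEs behind the ATE of +0.1 imply; the exact assignment of ITE values to groups is inferred from those totals.

```python
# ITE values split by age group (assumes five people per group).
ites_by_group = {
    "age >= 35": [0, 0, 1, 1, 0],
    "age < 35": [0, -1, 0, -1, 1],
}

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# CATE: average ITE within each subgroup.
cate = {group: mean(values) for group, values in ites_by_group.items()}
for group, effect in cate.items():
    print(f"CATE({group}) = {effect:+.1f}")

# The size-weighted average of the CATEs recovers the overall ATE.
total = sum(len(values) for values in ites_by_group.values())
ate = sum(cate[g] * len(v) for g, v in ites_by_group.items()) / total
print(f"ATE = {ate:+.1f}")
```

The last step is a useful consistency check: the subgroup effects, weighted by group size, should average back to the overall ATE.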


#### The treatment affects different age groups differently. This is called treatment heterogeneity.


#### So we can conclude that the vaccine has a better impact on older people than on younger people.

**Summary:**

Causal inferencing simulates randomized controlled tests (RCTs) using past data when RCTs are impractical.

**Challenges in Causal Inferencing:**

* **Confounders:** Variables that influence both the treatment and outcome, biasing results.
* **Selection Bias:** Unrepresentative sample groups due to differences in treatment assignment.
* **Counterfactuals:** Determining what would have happened if individuals had not received the treatment.

**Assumptions for Causal Inferencing:**

* **Causal Markov Condition:** Each variable is independent of its non-effects, given its direct causes.
* **Stable Unit Treatment Value Assumption (SUTVA):** One individual's treatment does not influence other individuals' outcomes.
* **Ignorability:** No unmeasured confounders influence both the treatment and the outcome.

**Measuring Average Treatment Effect (ATE):**

* Compare outcomes for individuals who received and did not receive the treatment, accounting for counterfactuals to estimate causal effects.

**Conditional Average Treatment Effect (CATE):**

* Determines how the treatment affects different subgroups (e.g., age groups) by conditioning the ATE on specific variables.

**Conclusion:**

Causal inferencing allows for the determination of causal relationships from past data, even without RCTs. However, it requires careful consideration of assumptions, confounders, and counterfactuals to ensure reliable results.