Causal Inference in Data Science
A computational introduction to causality and counterfactual reasoning with Python
This repository contains the exercises and data for the Causal Inference in Data Science Live Training. This training is a hands-on guide to applying causal inference to real-world data science tasks. Using an end-to-end example, we will walk through the process of posing a causal hypothesis, modeling our beliefs with causal graphs, estimating causal effects with the DoWhy library in Python, and finally evaluating the soundness of our results. Rather than taking an abstract, mathematical approach to these steps, the training focuses on accessible computational methods for answering causal questions within a data science workflow.
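As a rough illustration of those four steps, here is a minimal sketch that uses only the Python standard library. It is not the course's DoWhy code. The graph, variable names (`Z`, `T`, `Y`), probabilities, and the true effect of 2.0 are all assumptions made up for this simulation, and plain stratification stands in for whichever estimator the training actually uses.

```python
import random
from statistics import mean

# Step 1 (hypothesis): treatment T changes outcome Y.
# Step 2 (graph): hypothetical DAG where Z confounds T and Y.
GRAPH = {"Z": ["T", "Y"], "T": ["Y"], "Y": []}
TRUE_EFFECT = 2.0  # assumed ground truth, known only because we simulate


def common_causes(graph, treatment, outcome):
    """Nodes with a direct edge into both the treatment and the outcome."""
    return sorted(
        node for node, children in graph.items()
        if treatment in children and outcome in children
    )


def simulate(n, seed=0):
    """Draw rows from the assumed graph: Z -> T, Z -> Y, T -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0
        y = TRUE_EFFECT * t + 3.0 * z + rng.gauss(0, 1)
        rows.append({"Z": z, "T": t, "Y": y})
    return rows


def naive_difference(rows, treatment="T"):
    """Difference in mean outcome between treated and untreated rows."""
    treated = [r["Y"] for r in rows if r[treatment] == 1]
    control = [r["Y"] for r in rows if r[treatment] == 0]
    return mean(treated) - mean(control)


def backdoor_adjusted(rows, treatment, confounder):
    """Step 3 (estimate): stratify on the confounder, then average the
    within-stratum differences weighted by stratum size."""
    effect = 0.0
    for level in sorted({r[confounder] for r in rows}):
        stratum = [r for r in rows if r[confounder] == level]
        effect += naive_difference(stratum, treatment) * len(stratum) / len(rows)
    return effect


def placebo_estimate(rows, confounder, seed=1):
    """Step 4 (evaluate): replace T with a shuffled copy. A sound
    estimator should report an effect near zero for this placebo."""
    rng = random.Random(seed)
    shuffled = [r["T"] for r in rows]
    rng.shuffle(shuffled)
    placebo_rows = [dict(r, P=p) for r, p in zip(rows, shuffled)]
    return backdoor_adjusted(placebo_rows, "P", confounder)


rows = simulate(20000)
confounder = common_causes(GRAPH, "T", "Y")[0]
naive = naive_difference(rows)
adjusted = backdoor_adjusted(rows, "T", confounder)
placebo = placebo_estimate(rows, confounder)

print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
print(f"placebo estimate:  {placebo:.2f}")
```

Under these assumed parameters the naive difference is biased upward (in expectation 2.0 + 3.0 × (0.8 − 0.2) = 3.8), while the adjusted estimate should land close to the simulated effect of 2.0 and the placebo estimate close to zero.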
Please do not hesitate to reach out to me directly via email at firstname.lastname@example.org or on Twitter @jonathandinu.
If you find any errors in the code or materials, please open a GitHub issue in this repository.
What you'll learn and how you can apply it
- Understand how to reason causally and why it is necessary for modern data science.
- Use the DoWhy library to build, estimate, and evaluate causal models.
- Learn how to practically apply causal inference to real-world data science problems.
This training course is for you because...
- You have taken an introductory data science course or statistics course but want to take the next step to understand the foundations of causal inference and how to effectively apply the theory to real-world problems.
- You have heard about the power of causal reasoning, but do not know how to get started learning its basics or applying it to your own problems.
- You are an aspiring data scientist looking to break into the field and need to learn the practical skills necessary for what you will encounter on the job.
- You are a quantitative researcher interested in applying theory to real projects by taking a computational approach to causal inference.
- You are a software engineer interested in leveraging analytics to augment your application development process.
- Experience with an object-oriented programming language, e.g., Python (all code demos during the training will be in Python)
- Familiarity with basic probability and statistics (e.g., distributions and hypothesis testing).
- A working knowledge of the scientific Python libraries (numpy, pandas and scikit-learn) is helpful but not required.
Download the appropriate Python 3.7 Anaconda Distribution for your operating system: https://www.anaconda.com/distribution/
- Data Science Fundamentals Part 2: Machine Learning and Statistical Analysis (Lessons 7 and 8): https://learning.oreilly.com/videos/data-science-fundamentals/9780134778877
- Inside Airbnb
The time frames are only estimates and may vary depending on how the class is progressing.
Identifying Causal Effects (50min)
- Randomized Controlled Trials
- Counterfactuals and Potential Outcomes
- Causal Graphical Models
Estimating Causal Effects (50min)
- Propensity Score Matching
- Instrumental Variables
- Causal Effect Inference with Machine Learning
Break (10min)
Evaluating Causal Models (30min)
- Random Confounders and Placebos
- Cross Validation
- Sensitivity Analysis
Discovering Causal Structure (25min)
- Guess and Test
- Automated Graph Discovery
- Fairness and Machine Learning: Chapter 4
- Causal Inference: What If
- Causality: Models, Reasoning, and Inference
- Causal Inference for Statistics, Social, and Biomedical Sciences
- Counterfactuals and Causal Inference: Methods and Principles for Social Research
- Advanced Data Analysis from an Elementary Point of View
- Chapter 18: Graphical Models
- Chapter 19: Graphical Causal Models
- Chapter 20: Identifying Causal Effects from Observations
- Chapter 21: Estimating Causal Effects from Observations
- Chapter 22: Discovering Causal Structure from Observations
- Graphical & Latent Variable Modeling
- Applied Causality (Columbia)
- Intermediate Statistics (CMU): Causal Inference
- Causal Inference and Learning (UIC)
- KDD Tutorial on Causal Inference and Counterfactual Reasoning