# Causal inference via deep learning

In this tutorial, we demonstrate causal inference (CI) with deep learning (DL) models and other estimators tailored for CI, including:
- TAR-Net [1]
- Dragon-Net [2]
- Double ML [3]



[1] U. Shalit, F. D. Johansson, and D. Sontag, “Estimating individual treatment effect: generalization bounds and algorithms,” in ICML 2017.

[2] C. Shi, D. Blei, and V. Veitch, “Adapting neural networks for the estimation of treatment effects,” in NeurIPS 2019.

[3] V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins, “Double/debiased machine learning for treatment and structural parameters,” The Econometrics Journal, vol. 21, no. 1, pp. C1–C68, 2018.

In [None]:
import numpy as np
import tensorflow
import os


# Download the dataset

In [None]:
!mkdir -p /content/datasets/IHDP-100
!wget -P /content/datasets/IHDP-100 https://www.fredjo.com/files/ihdp_npci_1-100.train.npz
!wget -P /content/datasets/IHDP-100 https://www.fredjo.com/files/ihdp_npci_1-100.test.npz

--2023-05-04 20:30:29--  https://www.fredjo.com/files/ihdp_npci_1-100.train.npz
Resolving www.fredjo.com (www.fredjo.com)... 185.199.108.153, 185.199.109.153, 185.199.110.153, ...
Connecting to www.fredjo.com (www.fredjo.com)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16129570 (15M) [application/octet-stream]
Saving to: ‘/content/datasets/IHDP-100/ihdp_npci_1-100.train.npz’


2023-05-04 20:30:29 (114 MB/s) - ‘/content/datasets/IHDP-100/ihdp_npci_1-100.train.npz’ saved [16129570/16129570]

--2023-05-04 20:30:30--  https://www.fredjo.com/files/ihdp_npci_1-100.test.npz
Resolving www.fredjo.com (www.fredjo.com)... 185.199.108.153, 185.199.109.153, 185.199.110.153, ...
Connecting to www.fredjo.com (www.fredjo.com)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1801570 (1.7M) [application/octet-stream]
Saving to: ‘/content/datasets/IHDP-100/ihdp_npci_1-100.test.npz’


2023-05-04 20:30:30 (23.2 MB/s) - ‘
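
The `!wget` commands above only run inside a notebook shell. As a rough alternative, the sketch below fetches the same two files with Python's standard library and skips any file that is already on disk. The helper names `download_if_missing` and `download_ihdp_100` are hypothetical and not part of the original notebook; the URLs and the target folder are the ones used above.

```python
import os
import urllib.request

BASE_URL = 'https://www.fredjo.com/files'
TARGET_FOLDER = '/content/datasets/IHDP-100'
FILENAMES = ['ihdp_npci_1-100.train.npz', 'ihdp_npci_1-100.test.npz']

def download_if_missing(url, folder):
  # Hypothetical helper: fetch `url` into `folder` unless the file already exists.
  os.makedirs(folder, exist_ok=True)
  target = os.path.join(folder, os.path.basename(url))
  if not os.path.exists(target):
    urllib.request.urlretrieve(url, target)
  return target

def download_ihdp_100():
  # Download both splits of IHDP-100 and return their local paths.
  return [download_if_missing(f'{BASE_URL}/{name}', TARGET_FOLDER)
          for name in FILENAMES]
```

Calling `download_ihdp_100()` in a cell would then fetch both splits. Because existing files are skipped, re-running the cell does not download them again.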

In [None]:
def _show_details(dataset_name, split, data):
  print('========================================================')
  print(f"The details of the {split} split of {dataset_name} dataset:")
  for k in data.keys():
    print(k, data[k].shape, data[k].dtype)
  print('========================================================')

def load_dataset(dataset_name = 'IHDP-100', split = 'train', details = False):
  # Load one split ('train' or 'test') from the global `datasets_folder`.
  # Returns the loaded .npz archive, or None if loading fails.
  if dataset_name == 'IHDP-100':
    dataset_filename = f'ihdp_npci_1-100.{split}.npz'
  elif dataset_name == 'IHDP-1000':
    dataset_filename = f'ihdp_npci_1-1000.{split}.npz'
  elif dataset_name == 'Jobs':
    dataset_filename = f'jobs_DW_bin.new.10.{split}.npz'
  else:
    print("The given dataset is not supported. Use IHDP-100 by default.")
    # Fall back on both the folder and the file name so the path stays consistent.
    dataset_name = 'IHDP-100'
    dataset_filename = f'ihdp_npci_1-100.{split}.npz'
  dataset_path = os.path.join(datasets_folder, dataset_name)

  data = None
  try:
    data = np.load(os.path.join(dataset_path, dataset_filename))
    if details: _show_details(dataset_name, split, data)
  except FileNotFoundError:
    print(f'Cannot find the file of the {split} split of dataset {dataset_name}.')
  except Exception as e:
    # Report the underlying error instead of hiding it behind a bare except.
    print(f"Loading dataset failed: {e}")

  return data

datasets_folder = '/content/datasets'
dataset_name = 'IHDP-100'

training_data = load_dataset(dataset_name, 'train', True)
test_data = load_dataset(dataset_name, 'test', True)

The details of the train split of IHDP-100 dataset:
ate () int64
mu1 (672, 100) float64
mu0 (672, 100) float64
yadd () int64
yf (672, 100) float64
ycf (672, 100) float64
t (672, 100) float64
x (672, 25, 100) float64
ymul () int64
The details of the test split of IHDP-100 dataset:
ate () int64
mu1 (75, 100) float64
mu0 (75, 100) float64
yadd () int64
yf (75, 100) float64
ycf (75, 100) float64
t (75, 100) float64
x (75, 25, 100) float64
ymul () int64
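
The shapes printed above suggest that the first axis indexes units, the last axis (size 100) indexes the replications of the benchmark, and `x` holds 25 covariates per unit. Under that reading, and assuming `mu1` and `mu0` are the noiseless potential outcomes under treatment and control, one replication can be sliced out and its true average treatment effect computed as sketched below. The helpers `select_replication` and `true_ate` are hypothetical, and the arrays are random stand-ins with the same shapes as the train split (with an arbitrary constant effect of 4.0), so the sketch runs without the downloaded files.

```python
import numpy as np

def select_replication(data, i):
  # Hypothetical helper: slice replication i along the last axis of each array.
  keys = ['x', 't', 'yf', 'ycf', 'mu0', 'mu1']
  return {k: data[k][..., i] for k in keys}

def true_ate(rep):
  # Average treatment effect from the (assumed) noiseless potential outcomes.
  return float(np.mean(rep['mu1'] - rep['mu0']))

# Random stand-in data with the same shapes as the train split (not real IHDP values).
rng = np.random.default_rng(0)
n_units, n_covariates, n_replications = 672, 25, 100
fake = {
  'x': rng.normal(size=(n_units, n_covariates, n_replications)),
  't': rng.integers(0, 2, size=(n_units, n_replications)).astype(np.float64),
  'mu0': rng.normal(size=(n_units, n_replications)),
}
fake['mu1'] = fake['mu0'] + 4.0  # arbitrary constant effect, for illustration only
fake['yf'] = np.where(fake['t'] == 1, fake['mu1'], fake['mu0'])   # factual outcome
fake['ycf'] = np.where(fake['t'] == 1, fake['mu0'], fake['mu1'])  # counterfactual outcome

rep = select_replication(fake, 0)
print(rep['x'].shape, rep['t'].shape)
print(true_ate(rep))
```

The same helper should work on the loaded archive, e.g. `select_replication(training_data, 0)`, since it only indexes the arrays by key.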


In [None]:
print(training_data['ymul'])

1
