# Reproducibility study - Counterfactual Generative Networks

## Setup
We first need to install the required packages.

In [None]:
import sys
py = sys.executable

In [None]:
!{py} -m pip install -r requirements.txt

In [None]:
import gdown

In [None]:
gdown.download('https://drive.google.com/u/0/uc?export=download&confirm=rHtT&id=1NSv4RCSHjcHois3dXjYw_PaLIoVlLgXu', 'colored_mnist.tar.gz')
gdown.download("https://drive.google.com/u/0/uc?id=1VkKexkWh5SeB8fgxAZxLKgmmvDXhVYUy&export=downloadl", "u2net.pth")
gdown.download("https://drive.google.com/u/0/uc?id=12yVFHPUjKmUFGnO2D4xVlTSpF8CUj136&export=download", "cgn.pth")

In [None]:
gdown.download("https://drive.google.com/uc?id=1ft5tjOh9Rx_6OBkqyPL4NaqC70Rl0kxK", "imagenet-mini.zip")
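
Google Drive downloads can fail, for example when a download quota is exceeded, so it may be worth checking that the files are present before moving on. A minimal sketch; the helper name `missing_files` is hypothetical and not part of the repository:

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths that do not exist or are empty files."""
    return [p for p in paths if not Path(p).is_file() or Path(p).stat().st_size == 0]


# File names taken from the download cells above.
expected = ["colored_mnist.tar.gz", "u2net.pth", "cgn.pth", "imagenet-mini.zip"]
missing = missing_files(expected)
if missing:
    print("Missing or empty downloads:", missing)
else:
    print("All downloads are present.")
```

Run this before the extraction cells, since those delete the archives once they are unpacked.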

In [None]:
%%sh
mkdir -p imagenet/data/imagenet-mini
unzip -q -o imagenet-mini.zip -d imagenet/data/imagenet-mini
rm imagenet-mini.zip


In [None]:
%%sh
#!/usr/bin/env bash
# Move Colored MNIST
tar -xzf colored_mnist.tar.gz 
mv colored_mnist mnists/data
rm colored_mnist.tar.gz

# Download BG challenge dataset
wget https://github.com/MadryLab/backgrounds_challenge/releases/download/data/backgrounds_challenge_data.tar.gz
tar -xzf backgrounds_challenge_data.tar.gz
mkdir -p imagenet/data/in9
mv bg_challenge/* imagenet/data/in9
rmdir bg_challenge
rm backgrounds_challenge_data.tar.gz

# Download the Cue Conflict dataset
git clone --quiet https://github.com/rgeirhos/texture-vs-shape/
mkdir -p imagenet/data/cue_conflict
mv texture-vs-shape/stimuli/style-transfer-preprocessed-512/* imagenet/data/cue_conflict
rm -rf texture-vs-shape


In [None]:
%%sh
#!/usr/bin/env bash

# Make sure the weights directory exists before moving files into it
mkdir -p imagenet/weights
mv u2net.pth imagenet/weights

mv cgn.pth imagenet/weights

wget -q "https://s3.amazonaws.com/models.huggingface.co/biggan/biggan-deep-256-pytorch_model.bin"
mv biggan-deep-256-pytorch_model.bin imagenet/weights/biggan256.pth

## MNIST

Before doing any MNIST operations we need to generate the non-counterfactual datasets.

In [None]:
!{py} -u mnists/generate_data.py --dataset colored_MNIST
!{py} -u mnists/generate_data.py --dataset double_colored_MNIST
!{py} -u mnists/generate_data.py --dataset wildlife_MNIST

### Table 2

Note: generating Table 2 can take up to 5 hours.

In [None]:
!{py} experiments/table2.py

In [None]:
!cat table2_data_2.json
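
The raw JSON can be hard to read in a single `cat` dump. As an alternative, the file can be loaded and pretty-printed from Python. This is a sketch that makes no assumption about the structure of the results; the helper name `load_results` is hypothetical:

```python
import json
from pathlib import Path


def load_results(path):
    """Load a JSON results file; return None if it has not been written yet."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open() as f:
        return json.load(f)


results = load_results("table2_data_2.json")
if results is None:
    print("table2_data_2.json not found; run experiments/table2.py first.")
else:
    # Indentation and sorted keys make nested results easier to scan.
    print(json.dumps(results, indent=2, sort_keys=True))
```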

### Heatmaps

NOTE: this experiment depends on the counterfactual dataset generated in the previous experiment.

In [None]:
!{py} mnists/generate_10_colored.py

In [None]:
!{py} mnists/train_classifier.py --dataset double_colored_MNIST --grad_cam
!{py} mnists/train_classifier.py --dataset double_colored_MNIST_counterfactual --grad_cam
!{py} mnists/train_classifier.py --dataset double_colored_MNIST_counterfactual --grad_cam  --original

A sample heatmap (double colored MNIST):

In [None]:
!{py} mnists/plot_grad_cam.py

![heatmap](https://github.com/MundVetter/FACT_CGN/blob/main/mnists/data/grad_cam/double_colored_MNIST_counterfactual_False_False/heatmap.png?raw=1)

Note that the ordering is OS-specific.
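
If the OS-specific ordering comes from directory listings (an assumption; `os.listdir` and `glob` return entries in arbitrary order), sorting the file names makes the order reproducible across machines. A sketch with a hypothetical helper `sorted_images`, not part of the repository:

```python
from pathlib import Path


def sorted_images(directory, pattern="*.png"):
    """Return matching files in a deterministic, name-sorted order."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(directory.glob(pattern), key=lambda p: p.name)


# The directory below is an assumption based on the heatmap URL above.
for path in sorted_images("mnists/data/grad_cam"):
    print(path)
```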

## ImageNet

In [None]:
from datetime import datetime
import matplotlib.pyplot as plt
import cv2

def data_path(run_name):
    # Directory holding a run's generated data; the name embeds the current date and hour.
    return f'imagenet/data/{datetime.now().strftime("%Y_%m_%d_%H_")}{run_name}_trunc_1.0/'
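
Because `data_path` embeds the current hour, a path computed after the hour rolls over no longer matches a directory created earlier. One way around this, sketched under the assumption that run directories follow the `<YYYY_MM_DD_HH>_<run_name>_trunc_1.0` pattern used above, is to look up the newest matching directory on disk. The helper `latest_run_dir` is hypothetical and not part of the repository:

```python
import re
from pathlib import Path


def latest_run_dir(run_name, root="imagenet/data"):
    """Return the newest '<timestamp>_<run_name>_trunc_1.0' directory, or None."""
    pattern = re.compile(
        rf"\d{{4}}_\d{{2}}_\d{{2}}_\d{{2}}_{re.escape(run_name)}_trunc_1\.0"
    )
    root = Path(root)
    if not root.is_dir():
        return None
    candidates = [p for p in root.iterdir() if p.is_dir() and pattern.fullmatch(p.name)]
    # The timestamp prefix is zero-padded, so name order is chronological order.
    return max(candidates, key=lambda p: p.name, default=None)


print(latest_run_dir("random"))
```

The function returns `None` when no run with that name exists yet, so callers can fall back to `data_path`.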

(Optional) To train your own CGN, remove the `#` from the command below. Note that training takes a long time (20+ hours). Also make sure your GPU has enough memory.

In [None]:
# !{py} imagenet/train_cgn.py --model_name cgn --batch_acc 500 --episodes 200 --batch_sz 5 --log_losses --save_iter 1500

To use a self-trained CGN, change the path below to its location. Otherwise, you don't have to do anything :)

In [None]:
weight_path =  "imagenet/weights/cgn.pth"

### Generating counterfactuals

In [None]:
!{py} -u imagenet/generate_data.py --mode random --weights_path {weight_path} --n_data 5 --run_name "random"

In [None]:
img_path = data_path("random") + "ims/" + 'random_0000000_x_gen.jpg'
img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
plt.imshow(img)

### Generating non-counterfactuals

(Optional) Uncomment `n_data = 50000` in the next cell to get accurate training accuracies, Inception Scores, and $\mu_{mask}$ values. Note that this increases the run time of the `generate_data` script to at least 4 hours.

In [None]:
n_data = 100
# n_data = 50000

In [None]:
!{py} -u imagenet/generate_data.py --mode random_same_class --weights_path {weight_path} --n_data {n_data} --run_name "random_same"

In [None]:
random_same = data_path("random_same") + "ims"

The Inception Score (IS) is calculated using TensorFlow because the PyTorch implementation deviates from the results reported in the original paper.
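
For reference, the Inception Score of a set of generated images is $\exp(\mathbb{E}_x[\mathrm{KL}(p(y|x)\,\|\,p(y))])$, where $p(y|x)$ is the Inception classifier's prediction for image $x$ and $p(y)$ is its average over all images. The NumPy sketch below illustrates this formula on a matrix of class probabilities; it is not the repository's `imagenet.inception_score` module, and the function name and `eps` parameter are assumptions:

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception Score for an (N, C) array of per-image class probabilities."""
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0, keepdims=True)  # p(y), averaged over images
    # KL(p(y|x) || p(y)) for every image; eps guards against log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))


# Identical predictions for every image carry no information: the score is about 1.
print(inception_score(np.full((8, 4), 0.25)))
```

The score ranges from 1 (every image gets the same prediction) to the number of classes (confident predictions spread evenly over all classes).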

IS for CGN:

In [None]:
!{py} -u -m imagenet.inception_score --path {random_same} --batch-size 64 --splits 1 --cuda --kind x_gen --tensorflow

IS for BigGAN:

In [None]:
!{py} -u -m imagenet.inception_score --path {random_same} --batch-size 64 --splits 1 --cuda --kind x_gt --tensorflow

$\mu_{mask}$ value for the CGN:

In [None]:
!{py} -u -m imagenet.calculate_mask --path {random_same}
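
As a rough illustration, assuming $\mu_{mask}$ denotes the mask intensity averaged over pixels and then over images (the repository's `imagenet.calculate_mask` module may differ in its details), the statistic can be sketched as follows. The function name is hypothetical:

```python
import numpy as np


def mean_mask_value(masks):
    """Average mask intensity over a batch of masks with values in [0, 1]."""
    masks = np.asarray(masks, dtype=np.float64)
    # Mean over pixels for each image first, then mean over images.
    per_image = masks.reshape(len(masks), -1).mean(axis=1)
    return float(per_image.mean())


# One fully-on mask and one fully-off mask average to 0.5.
example = np.stack([np.ones((4, 4)), np.zeros((4, 4))])
print(mean_mask_value(example))
```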

### Create interpolations

The interpolation method (`--interp`) can be set to "bg", "text", "shape", or "all".

In [None]:
!{py} -u imagenet/generate_data.py --mode best_classes --weights_path {weight_path} --interp shape --n_data 5 --run_name interpolate_test
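
To produce all four interpolation types in one go, the command above can be assembled in a loop. The sketch below only builds and prints the command lines; the `interpolate_<method>` run names are hypothetical, while the flags are the ones used in the cell above:

```python
import shlex
import sys


def interpolation_commands(weight_path, methods=("bg", "text", "shape", "all"), n_data=5):
    """Build one generate_data.py command per interpolation method."""
    commands = []
    for method in methods:
        commands.append([
            sys.executable, "-u", "imagenet/generate_data.py",
            "--mode", "best_classes",
            "--weights_path", weight_path,
            "--interp", method,
            "--n_data", str(n_data),
            "--run_name", f"interpolate_{method}",
        ])
    return commands


for cmd in interpolation_commands("imagenet/weights/cgn.pth"):
    # To execute instead of print, pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```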

Showing the first interpolated image

In [None]:
i_t = data_path("interpolate_test") + "ims/" + "interpolate_test_0000000_x_gen_interp.jpg"
img = cv2.cvtColor(cv2.imread(i_t), cv2.COLOR_BGR2RGB)
plt.imshow(img)

### ImageNet-9 accuracy and Shape vs Texture bias

The following code trains the classifiers for the ImageNet-9 accuracy table. This takes at least 10 hours, including data generation.

The best hyperparameters from our hyperparameter search are included in the commands below.

Imagenet mini only

In [None]:
!{py} imagenet/train_classifier.py -a resnet50 -b 32 --lr 0.0001 -j 0 --mini --epochs 30 -p 100 --pretrained --name classifier_mini_IN_only

Imagenet mini + CGN

In [None]:
%%sh
mkdir -p imagenet/data/cf
mkdir -p imagenet/data/cf/val
mkdir -p imagenet/data/cf/train

In [None]:
train_path = data_path("train")

In [None]:
!{py} -u imagenet/generate_data.py --mode random --weights_path {weight_path} --n_data 100000 --run_name "train"

In [None]:
test_path = data_path("val")

In [None]:
!{py} -u imagenet/generate_data.py --mode random --weights_path {weight_path} --n_data 10000 --run_name "val"

In [None]:
!mv {train_path} imagenet/data/cf/train
!mv {test_path} imagenet/data/cf/val

In [None]:
!{py} imagenet/train_classifier.py -a resnet50 -b 32 --lr 0.0001 -j 6 --mini --cf_ratio 2.0 --epochs 30 -p 100 

## Appendix

**Figure 7**

In [None]:
!{py} experiments/figure_7.py
!{py} experiments/plot_fig7.py

MNIST examples

In [None]:
!{py} experiments/new_images.py