
0. Get Started

Jingkang Yang edited this page Aug 21, 2022 · 1 revision


1. Environment Setup

To set up the environment, we use conda to manage our dependencies and CUDA 10.1 to run our experiments.

Specify the cudatoolkit version appropriate for your machine in the environment.yml file, then run the following to create the conda environment:

conda env create -f environment.yml
conda activate openood

2. Dataset Preparation

Datasets are provided here. Our codebase accesses the datasets from ./data/ by default.

├── ...
├── data
│   ├── images
│   └── imglist
├── openood
├── scripts
├── main.py
├── ...

We use imglists to locate datasets: each imglist is a txt file that records the path and label of every image, so that the dataloader can access the images accordingly.

For example, in ./data/imglist/digits/train_mnist.txt, the text file looks like

mnist/train/1_45221.jpg 1
mnist/train/3_5770.jpg 3
mnist/train/2_1264.jpg 2
mnist/train/3_55893.jpg 3
mnist/train/8_18439.jpg 8
...

Every line is in the form `image_path label`. Note that image_path is a relative path whose default root is ./data/images/.

Users can add custom datasets by preparing their data directories and generating their own imglists accordingly.
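As a minimal sketch of that workflow, the script below builds an imglist for a hypothetical dataset named mydataset whose files follow the `<label>_<id>.jpg` naming used in the MNIST example above (the dataset name and dummy files are illustrative, not part of OpenOOD):

```shell
#!/bin/sh
# Sketch: generate an imglist for a custom dataset laid out under
# ./data/images/, mirroring the MNIST example above. "mydataset" and the
# two dummy images are hypothetical placeholders.
set -e
mkdir -p data/images/mydataset/train data/imglist/mydataset
# Two dummy images, just so the loop below has files to list.
touch data/images/mydataset/train/3_0001.jpg
touch data/images/mydataset/train/7_0002.jpg
(
  cd data/images
  for f in mydataset/train/*.jpg; do
    name=$(basename "$f")
    label=${name%%_*}   # label = text before the first underscore
    printf '%s %s\n' "$f" "$label"
  done
) > data/imglist/mydataset/train_mydataset.txt
cat data/imglist/mydataset/train_mydataset.txt
```

Each emitted line is a path relative to ./data/images/ followed by the label, matching the train_mnist.txt format shown above.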

3. Train MNIST

sh scripts/0_basics/mnist_train.sh

Note that our developers use Slurm to run the codebase; under Slurm, the command looks like:

GPU=1
CPU=1
node=73
jobname=openood

PYTHONPATH='.':$PYTHONPATH \
srun -p dsta --mpi=pmi2 --gres=gpu:${GPU} -n1 \
--cpus-per-task=${CPU} --ntasks-per-node=${GPU} \
--kill-on-bad-exit=1 --job-name=${jobname} \
-w SG-IDC1-10-51-2-${node} \
python main.py \
--config configs/datasets/digits/mnist.yml \
configs/networks/lenet.yml \
configs/pipelines/train/baseline.yml \
--optimizer.num_epochs 100 \
--num_workers 8

If you are not a Slurm user, simply remove everything between srun and python, which yields the following script:

PYTHONPATH='.':$PYTHONPATH \
python main.py \
--config configs/datasets/digits/mnist.yml \
configs/networks/lenet.yml \
configs/pipelines/train/baseline.yml \
--optimizer.num_epochs 100 \
--num_workers 8

By default, our code saves the trained model in a new directory, ./results.

You can get a feel for our code and pipeline from main.py. Notice that we stack several .yml files, referred to as config files, to control the program; later config files add to and override earlier ones. You can also override individual configs (e.g., --optimizer.num_epochs) from the command line.
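For example, reusing the exact command from above, a flag appended at the end overrides any value set in the stacked config files (this is a sketch of the override behavior, not a new script from the repo):

```shell
# The three .yml files are merged left to right, and command-line flags
# override anything set in the files, so this run trains for 10 epochs
# even though the earlier example uses 100.
PYTHONPATH='.':$PYTHONPATH \
python main.py \
--config configs/datasets/digits/mnist.yml \
configs/networks/lenet.yml \
configs/pipelines/train/baseline.yml \
--optimizer.num_epochs 10
```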

4. Test MNIST

After training, you will find a directory ./results/mnist_lenet_base_e100_lr0.1 containing the best and last checkpoints. Use the following script to evaluate the model's accuracy on the test set:

sh scripts/0_basics/mnist_test.sh

One example result looks like:

5. Test on Classic OOD Benchmark

Now we evaluate the trained model on the classic OOD benchmark. A brief introduction to the classic OOD benchmark can be found here. To run the test:

sh scripts/c_ood/0_mnist_test_ood_msp.sh

One example result looks like:

6. Test on FS-OOD Benchmark

We notice that some methods on the classic OOD benchmark may rely heavily on covariate shift --- a type of distribution shift mainly concerned with changes in appearance such as image contrast, lighting, or viewpoint --- to detect OOD samples, rather than focusing on semantic difference. To prevent OOD detectors from exploiting such spurious information, we propose FS-OOD, which introduces a covariate-shifted ID set. A brief introduction to the FS-OOD benchmark can be found here.

To run the code, use the following script:

sh scripts/c_ood/0_mnist_test_fsood_msp.sh

One example result looks like: