How to create, train and evaluate DTD dataset #5

dreamflasher · 2020-10-08T13:51:20Z

Hi, would you be so kind as to explain how to run the DTD training and evaluation?
In the paper you mention that DTD provides the inliers and imagenet30 the outliers. How is the folder structure of "~/data/dtd/" supposed to look?

For training, am I right to assume the unlabeled multi-class setting? I.e. CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py --dataset dtd --model resnet18 --mode simclr_CSI --shift_trans_type rotation --batch_size 32 --one_class_idx None

And for evaluation, how do I specify the out-distribution? I only see the "dataset" flag, but I would need to specify in-distribution and out-of-distribution datasets, right?

Thank you again for your help!


dreamflasher · 2020-10-08T15:31:17Z

And the same question for the steel dataset in the appendix; it looks like this didn't make it into the code?

jihoontack · 2020-10-09T03:01:25Z

Hi! Thank you again for your interest!

Before answering the question: we found a minor error in the values reported in Table 6 (DTD to ImageNet detection). After fixing the bug, our message does not change.

reported: Base 96.4, CSI(Rotation) 65.4
fixed: Base 90.0, CSI(Rotation) 79.9
The mistake was due to the evaluation code. Please be aware of this if you are using an ImageNet-sized in-lier dataset (add an option for it at line 146 in evals/ood_pre.py, e.g., P.dataset == 'dtd').

To use DTD as in-liers, you should first divide the DTD dataset into train/test sets. The following is the code I used to make the split. Run it from the ~/data/dtd folder (and note that you should create the test folder before running it).

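import os
import shutil

# Run from ~/data/dtd. Moves every image listed in labels/test1.txt
# from ./images/<class>/ to ./test/<class>/.
with open('labels/test1.txt') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        test_class, test_sample_name = line.split('/')

        # Create the class folder under ./test if it does not exist yet
        # (this also creates ./test itself when it is missing).
        os.makedirs(f'./test/{test_class}', exist_ok=True)

        shutil.move(f'./images/{line}', f'./test/{line}')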
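
After running the split, it can help to confirm that every file listed in labels/test1.txt actually ended up under test/ and is no longer under images/. The following is a minimal sketch of such a check, not part of the repository; the function name `check_dtd_split` is hypothetical, and it assumes the same folder layout as the script above.

```python
import os


def check_dtd_split(root, label_file="labels/test1.txt"):
    """Compare the folders under `root` against the list in `label_file`.

    Returns a pair (missing, leftover):
      missing  - listed files that are not present under <root>/test
      leftover - listed files that are still present under <root>/images
    Both lists are empty when the split was applied completely.
    """
    missing, leftover = [], []
    with open(os.path.join(root, label_file)) as f:
        for line in f:
            rel = line.strip()  # e.g. "<class>/<file name>"
            if not rel:
                continue
            if not os.path.isfile(os.path.join(root, "test", rel)):
                missing.append(rel)
            if os.path.isfile(os.path.join(root, "images", rel)):
                leftover.append(rel)
    return missing, leftover
```

Calling `check_dtd_split(os.path.expanduser("~/data/dtd"))` should return two empty lists once the move has completed.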

After dividing the set, you can use DTD as a training dataset with some modifications to your dataset.py code (in the same way as loading cifar10 or ImageNet). Also, I believe you will need to modify the code a little, since we restricted the values that the argument parser accepts for --dataset.
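
The two modifications described above could look roughly like the sketch below. It is not the repository's code: `build_parser` and `get_dtd` are hypothetical names, the list of allowed dataset names is only illustrative, and it assumes that the images left under images/ after the split form the training set and that torchvision's `ImageFolder` is an acceptable loader for the class-per-folder layout.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser showing 'dtd' added to the allowed --dataset values."""
    parser = argparse.ArgumentParser()
    # Illustrative choices only; extend the list that the real parser uses.
    parser.add_argument("--dataset", type=str,
                        choices=["cifar10", "imagenet", "dtd"])
    return parser


def get_dtd(root, train_transform=None, test_transform=None):
    """Load the split DTD folders as two ImageFolder datasets.

    Assumes <root>/images holds the remaining (training) images and
    <root>/test holds the images moved there by the split script.
    """
    # Imported here so the parser can be used without torchvision installed.
    from torchvision import datasets

    train_set = datasets.ImageFolder(os.path.join(root, "images"),
                                     transform=train_transform)
    test_set = datasets.ImageFolder(os.path.join(root, "test"),
                                    transform=test_transform)
    return train_set, test_set
```

With this in place, `build_parser().parse_args(["--dataset", "dtd"])` is accepted, and `get_dtd(os.path.expanduser("~/data/dtd"))` returns the train and test sets; the transforms should match whatever the ImageNet branch of the existing loader applies.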

For CSI training, we have used unlabeled multiclass training for DTD.
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py --dataset dtd --model resnet18_imagenet --mode simclr_CSI --shift_trans_type rotation --batch_size 32

For CSI evaluation:
python eval.py --mode ood_pre --dataset dtd --ood_dataset imagenet --model resnet18_imagenet --ood_score CSI --shift_trans_type rotation --print_score --ood_samples 10 --resize_factor 0.54 --resize_fix --load_path <MODEL_PATH>

For the steel dataset, we didn't release the code since it shows similar results to the DTD dataset. Of course, you can download the dataset and run the code yourself: https://www.kaggle.com/c/severstal-steel-defect-detection/data

Thank you again for your interest and feel free to ask if you have any questions!

dreamflasher · 2020-10-09T08:54:20Z

Thank you again for your great support and responsiveness and your great work!

dreamflasher closed this as completed Oct 9, 2020
