ImageNet dataset cannot be loaded #2

Closed
HigasaOR opened this issue Dec 20, 2021 · 5 comments

@HigasaOR

I tested the code (run_attack.sh) and found that I cannot load the ImageNet dataset. I dug into it and found that the cause may be in dataset.py, in class AdvImageNet:
self.image_list is a set loaded from the predefined data/image_list.json, so an element string in it looks like this:
n01820546/ILSVRC2012_val_00027008.JPEG
However, the is_valid_file function passed to the super init keeps only the last 38 characters of the image file path, like
ILSVRC2012_val_00027008.JPEG
, to check whether it is listed in self.image_list. Thus, the function always returns false, as the string contains no class folder, and no image is loaded.
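
One way to check this diagnosis on a particular machine is to print what the fixed-length slice actually yields for a real path and compare it with the entries in the list. The sketch below is only illustrative: `slice_key` and `diagnose` are hypothetical helpers, not part of the repository, and the sample entry and the 38-character length are taken from the description above.

```python
from typing import Iterable


def slice_key(path: str, length: int = 38) -> str:
    """Return the trailing `length` characters of `path` (hypothetical helper)."""
    return path[-length:]


def diagnose(path: str, image_list: Iterable[str], length: int = 38) -> dict:
    """Report how a fixed-length slice of `path` compares with `image_list`."""
    entries = set(image_list)
    key = slice_key(path, length)
    return {
        "key": key,                           # the string that is actually looked up
        "key_has_class_folder": "/" in key,   # True if a "/" survived the slice
        "matches": key in entries,            # result of the membership test
    }


if __name__ == "__main__":
    # Entry copied from the issue text; the "val/" prefix of the path is assumed.
    listed = ["n01820546/ILSVRC2012_val_00027008.JPEG"]
    sample = "val/n01820546/ILSVRC2012_val_00027008.JPEG"
    print(diagnose(sample, listed))
```

The outcome depends on the file-name length and on the path separator (for example `\` on Windows), so the same slice can behave differently from one setup to another.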

A simple workaround works (at least in my tests):

import json
import os

import torchvision

class AdvImageNet(torchvision.datasets.ImageFolder):

    def __init__(self, image_list="data/image_list.json", *args, **kwargs):
        # Entries look like "n01820546/ILSVRC2012_val_00027008.JPEG";
        # keep only the file name that follows the class folder.
        with open(image_list, "r") as f:
            self.image_list = {name.split("/")[1] for name in json.load(f)["images"]}
        super(AdvImageNet, self).__init__(
            is_valid_file=self.is_valid_file, *args, **kwargs)

    def is_valid_file(self, x: str) -> bool:
        # Compare the bare file name, independent of the path length.
        return os.path.basename(x) in self.image_list
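
As a variation on this workaround, the check can be made independent of the path length by comparing the last two path components against the unmodified `class_folder/file_name` entries. This is a minimal sketch under assumptions: `relative_key` and `make_is_valid_file` are hypothetical names, not part of the repository, and the list entries are assumed to use `/` as in the example above.

```python
from pathlib import PurePath
from typing import Callable, Iterable


def relative_key(path: str) -> str:
    """Build "class_folder/file_name" from the last two components of `path`."""
    parts = PurePath(path).parts
    return "/".join(parts[-2:])


def make_is_valid_file(image_list: Iterable[str]) -> Callable[[str], bool]:
    """Return a predicate that accepts only paths listed in `image_list`."""
    entries = frozenset(image_list)  # constant-time membership tests

    def is_valid_file(path: str) -> bool:
        return relative_key(path) in entries

    return is_valid_file
```

The returned function could then be passed as the `is_valid_file` argument of `torchvision.datasets.ImageFolder`, which accepts a callable of this shape.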

Another possibility is that the ImageNet directory structure this repo expects differs from mine:

  val/ <-- designated as DATA_DIR in run_attack.sh
    n01820546/
      ILSVRC2012_val_00027008.JPEG

In this case, could you specify how the dataset should be structured? Thank you!

@Muzammal-Naseer
Owner

Hi HigasaOR,

Thank you for your interest in our work!

Please set --test_dir to 'ImageNet/val' directory, where 'val' contains sub-folders (classes) of ImageNet validation set.

Cheers,
Muzammal

@HigasaOR
Author

Hi,

Thanks for replying! This is exactly what I have done: val/ contains the class folders (e.g. n01820546/), and each class folder contains that class's images (e.g. ILSVRC2012_val_00027008.JPEG). Please see my ImageNet validation set structure above; I believe it matches your description. DATA_DIR in run_attack.sh is passed as the --test_dir argument. It still does not work because of the issue I mentioned.

Thanks

@Muzammal-Naseer
Owner

Hi HigasaOR,

I checked again and the code is working fine. Can you please post what error you are facing while loading?

Muzammal

@HigasaOR
Author

[me@my-machine On-Improving-Adversarial-Transferability-of-Vision-Transformers]$ bash ./scripts/run_attack.sh
Traceback (most recent call last):
  File "/home/me/Programs/On-Improving-Adversarial-Transferability-of-Vision-Transformers/test.py", line 208, in <module>
    main()
  File "/home/me/Programs/On-Improving-Adversarial-Transferability-of-Vision-Transformers/test.py", line 149, in main
    test_loader, test_size = get_data_loader(args)
  File "/home/me/Programs/On-Improving-Adversarial-Transferability-of-Vision-Transformers/test.py", line 106, in get_data_loader
    test_set = AdvImageNet(root=test_dir, transform=data_transform)
  File "/home/me/Programs/On-Improving-Adversarial-Transferability-of-Vision-Transformers/dataset.py", line 13, in __init__
    super(AdvImageNet, self).__init__(is_valid_file=self.is_valid_file, *args, **kwargs)
  File "/home/me/.local/lib/python3.9/site-packages/torchvision/datasets/folder.py", line 310, in __init__
    super(ImageFolder, self).__init__(root, loader, IMG_EXTENSIONS if is_valid_file is None else None,
  File "/home/me/.local/lib/python3.9/site-packages/torchvision/datasets/folder.py", line 146, in __init__
    samples = self.make_dataset(self.root, class_to_idx, extensions, is_valid_file)
  File "/home/me/.local/lib/python3.9/site-packages/torchvision/datasets/folder.py", line 192, in make_dataset
    return make_dataset(directory, class_to_idx, extensions=extensions, is_valid_file=is_valid_file)
  File "/home/me/.local/lib/python3.9/site-packages/torchvision/datasets/folder.py", line 102, in make_dataset
    raise FileNotFoundError(msg)
FileNotFoundError: Found no valid file for the classes n01440764, n01443537, n01484850, n01491361, n01494475, 
... (all other classes), n15075141.

Thanks

@HigasaOR
Author

As this cannot be reproduced on every machine, I'll close this issue. At least in my setup, the code works with the small modification I mentioned. Thanks for your help.
