Facing issues with MMF config for running the VilBERT pretrained model on Hateful Meme dataset. #1

Open
AnjumJ123 opened this issue Apr 20, 2022 · 1 comment

Comments

@AnjumJ123

I am trying to reproduce the code for running VilBERT on the Hateful Memes dataset, but the existing code needs to be modified to point to the new data source for the Hateful Memes Challenge data, and that source then needs to be linked in the code.

Facebook released a webpage (https://hatefulmemeschallenge.com/) where the dataset can be downloaded. Would you be able to change the notebook slightly so that the code can be reproduced? I am having trouble using mmf_convert_hm to convert the Hateful Memes data into the MMF format and to move the images and .jsonl files into the folders the repo expects. I am also unsure how to satisfy the MMF prerequisites after downloading the dataset, and what changes, if any, are needed in the .yaml and config files to make the code work.

@AnjumJ123
Author

I am not sure why it is looking for dev.jsonl, since that is not one of the .jsonl files in the hateful_memes dataset. Any suggestions for addressing this error?

'!mmf_run config="projects/visual_bert/configs/hateful_memes/from_coco.yaml"
model=visual_bert
dataset=hateful_memes
run_type=train_val
training.log_interval=200
training.max_updates=22000
training.batch_size=64
training.evaluation_interval=200
training.tensorboard=True
training.checkpoint_interval=200
checkpoint.resume_pretrained=True
checkpoint.resume_zoo=visual_bert.pretrained.coco
dataset_config.hateful_memes.annotations.train[0]="hateful_memes/defaults/annotations/train.jsonl"
dataset_config.hateful_memes.annotations.val[0]="hateful_memes/defaults/annotations/dev_unseen.jsonl"
dataset_config.hateful_memes.annotations.test[0]="hateful_memes/defaults/annotations/test_unseen.jsonl"'

Error:

You can disable this warning by setting the environment variable OC_DISABLE_DOT_ACCESS_WARNING=1
warnings.warn(message=msg, category=UserWarning)
Overriding option config to projects/visual_bert/configs/hateful_memes/from_coco.yaml
Overriding option model to visual_bert
Overriding option datasets to hateful_memes
Overriding option run_type to train_val
Overriding option training.log_interval to 200
Overriding option training.max_updates to 22000
Overriding option training.batch_size to 64
Overriding option training.evaluation_interval to 200
Overriding option training.tensorboard to True
Overriding option training.checkpoint_interval to 200
Overriding option checkpoint.resume_pretrained to True
Overriding option checkpoint.resume_zoo to visual_bert.pretrained.coco
Using seed 30549397
Logging to: ./save/logs/train_2022-04-20T16:33:30.log
Downloading features.tar.gz: 100% 8.44G/8.44G [06:24<00:00, 22.0MB/s]
Traceback (most recent call last):
File "/usr/local/bin/mmf_run", line 8, in <module>
sys.exit(run())
File "/usr/local/lib/python3.7/dist-packages/mmf_cli/run.py", line 111, in run
main(configuration, predict=predict)
File "/usr/local/lib/python3.7/dist-packages/mmf_cli/run.py", line 40, in main
trainer.load()
File "/usr/local/lib/python3.7/dist-packages/mmf/trainers/base_trainer.py", line 59, in load
self.load_datasets()
File "/usr/local/lib/python3.7/dist-packages/mmf/trainers/base_trainer.py", line 83, in load_datasets
self.dataset_loader.load_datasets()
File "/usr/local/lib/python3.7/dist-packages/mmf/common/dataset_loader.py", line 18, in load_datasets
self.val_dataset.load(self.config)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/multi_dataset_loader.py", line 114, in load
self.build_datasets(config)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/multi_dataset_loader.py", line 131, in build_datasets
dataset_instance = build_dataset(dataset, dataset_config, self.dataset_type)
File "/usr/local/lib/python3.7/dist-packages/mmf/utils/build.py", line 106, in build_dataset
dataset = builder_instance.load_dataset(config, dataset_type)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/base_dataset_builder.py", line 96, in load_dataset
dataset = self.load(config, dataset_type, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/builders/hateful_memes/builder.py", line 39, in load
self.dataset = super().load(config, dataset_type, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/mmf_dataset_builder.py", line 141, in load
dataset = dataset_class(config, dataset_type, imdb_idx)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/builders/hateful_memes/dataset.py", line 19, in __init__
super().__init__(dataset_name, config, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/mmf_dataset.py", line 25, in __init__
self.annotation_db = self._build_annotation_db()
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/mmf_dataset.py", line 39, in _build_annotation_db
return AnnotationDatabase(self.config, annotation_path)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/databases/annotation_database.py", line 24, in __init__
self._load_annotation_db(path)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/databases/annotation_database.py", line 32, in _load_annotation_db
self._load_jsonl(path)
File "/usr/local/lib/python3.7/dist-packages/mmf/datasets/databases/annotation_database.py", line 39, in _load_jsonl
with PathManager.open(path, "r") as f:
File "/usr/local/lib/python3.7/dist-packages/mmf/utils/file_io.py", line 45, in open
newline=newline,
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/mmf/data/datasets/hateful_memes/defaults/annotations/dev.jsonl'
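
To narrow this down, it may help to check which annotation files are actually present in the directory named in the error. The following is a minimal sketch, not part of MMF: the helper name missing_annotations is hypothetical, the directory is copied from the FileNotFoundError above, and the file list combines dev.jsonl (the file the loader asked for) with the three files passed on the command line.

```python
from pathlib import Path


def missing_annotations(annotation_dir, expected):
    """Return the names in `expected` that are not regular files in `annotation_dir`."""
    directory = Path(annotation_dir)
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # Directory copied from the FileNotFoundError message in the traceback.
    annotation_dir = (
        "/root/.cache/torch/mmf/data/datasets/hateful_memes/defaults/annotations"
    )
    # dev.jsonl is what the loader tried to open; the other three are the
    # files given in the dataset_config overrides of the mmf_run command.
    expected = ["train.jsonl", "dev.jsonl", "dev_unseen.jsonl", "test_unseen.jsonl"]
    for name in missing_annotations(annotation_dir, expected):
        print("missing:", name)
```

If dev_unseen.jsonl is present but dev.jsonl is reported missing, the loader is probably still using a default annotation path rather than the override. The log above is consistent with that, since it prints no "Overriding option" line for any of the three dataset_config overrides, although this is only an observation from the log and not a confirmed cause.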
