Pretraining with my own dataset #44

Closed
alskdjfasdfsadf opened this issue May 25, 2023 · 3 comments

Comments

@alskdjfasdfsadf

Hi, thanks for your work. Whenever I try to pretrain with my own dataset, the following error occurs:

torchrun --data_path=/home/user/augdata --exp_name=ptaugdata --exp_dir=/home/user/models
usage: torchrun [-h] [--nnodes NNODES] [--nproc-per-node NPROC_PER_NODE] [--rdzv-backend RDZV_BACKEND] [--rdzv-endpoint RDZV_ENDPOINT] [--rdzv-id RDZV_ID] [--rdzv-conf RDZV_CONF] [--standalone]
[--max-restarts MAX_RESTARTS] [--monitor-interval MONITOR_INTERVAL] [--start-method {spawn,fork,forkserver}] [--role ROLE] [-m] [--no-python] [--run-path] [--log-dir LOG_DIR]
[-r REDIRECTS] [-t TEE] [--node-rank NODE_RANK] [--master-addr MASTER_ADDR] [--master-port MASTER_PORT] [--local-addr LOCAL_ADDR]
training_script ...
torchrun: error: the following arguments are required: training_script, training_script_args

Do I need to specify training_script and training_script_args?

@keyu-tian
Owner

Yes. See /pretrain/README.md for the complete command, which is:

$ cd /path/to/SparK/pretrain
$ torchrun --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=<some_port> main.py \
  --data_path=/path/to/imagenet --exp_name=<your_exp_name> --exp_dir=/path/to/logdir \
  --model=resnet50 --bs=512

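Your command is missing the first line, i.e. the torchrun options followed by main.py, which is the training_script. The flags after main.py (--data_path, --exp_name, and so on) are the training_script_args.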

@alskdjfasdfsadf
Author

Thank you very much. The paper says that 'All models are pre-trained with 1.28 million unlabeled images from ImageNet-1K (Deng et al., 2009) training set for 1600 epochs.', which suggests that SparK can be pretrained on an unlabeled dataset. However, when I tried to use an unlabeled image dataset, the following error occurred:
  File "/home/user/SparK/pretrain/utils/imagenet.py", line 39, in __init__
    super(ImageNetDataset, self).__init__(
  File "/home/user/.local/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 144, in __init__
    classes, class_to_idx = self.find_classes(self.root)
  File "/home/user/.local/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 218, in find_classes
    return find_classes(directory)
  File "/home/user/.local/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 42, in find_classes
    raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")

@keyu-tian
Owner

keyu-tian commented May 26, 2023

You need to define a new Python class for your dataset to replace our ImageNetDataset at https://github.com/keyu-tian/SparK/blob/main/pretrain/utils/imagenet.py#L30. Just define a class that implements __len__(self) and __getitem__(self, index: int). __getitem__ should return the index-th image in your dataset, processed by a transformation like trans_train in /pretrain/utils/imagenet.py.
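
A minimal sketch of such a class, assuming all images sit somewhere under one root directory with no class subfolders. The names UnlabeledImageDataset, IMG_EXTENSIONS, and pil_loader are hypothetical and not part of SparK; the transform argument is meant to receive something like trans_train. It has not been checked against the training loop in main.py, so confirm that returning a single image (and no label) matches what the loop expects.

```python
import os

# Assumption: file extensions treated as images; adjust for your data.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".webp")


def pil_loader(path):
    # Imported lazily so the class itself only needs the standard library.
    from PIL import Image
    with open(path, "rb") as f:
        return Image.open(f).convert("RGB")


class UnlabeledImageDataset:
    """Hypothetical map-style dataset over every image file under `root`."""

    def __init__(self, root, transform=None, loader=pil_loader):
        self.transform = transform
        self.loader = loader
        # Walk the tree recursively; sorting keeps the index order deterministic.
        self.paths = sorted(
            os.path.join(dirpath, name)
            for dirpath, _, names in os.walk(root)
            for name in names
            if name.lower().endswith(IMG_EXTENSIONS)
        )
        if not self.paths:
            raise FileNotFoundError(f"No image files found under {root}.")

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index: int):
        # Return only the (transformed) image: there are no labels.
        img = self.loader(self.paths[index])
        if self.transform is not None:
            img = self.transform(img)
        return img
```

An instance would then be constructed wherever the repository currently builds ImageNetDataset, e.g. UnlabeledImageDataset("/home/user/augdata", transform=trans_train). PyTorch's DataLoader accepts any object with __len__ and __getitem__, though subclassing torch.utils.data.Dataset is the more conventional choice.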

PS: I recommend trying your pretraining both with and without --init_weight=/path/to/res50_withdecoder_1kpretrained_spark_style.pth. With this argument, pretraining starts from our pretrained model rather than from scratch, which could work better.
