Which GPU did you use? #14

Open
nessessence opened this issue Jan 26, 2021 · 5 comments

Comments

@nessessence

Sorry to bother you. Your experiments show a training time. Which GPU did you use, and how many of them?

@GalSang17

I am also using this implementation. Is it not possible to run distributed training with it?

@jeonsworld
Owner

Transfer learning was performed on a V100. For the training time, check the relative time in TensorBoard.

@jeonsworld
Owner

Distributed training on a single node can be executed as follows.

python3 -m torch.distributed.launch --nproc_per_node=NUM_OF_GPU train.py --train_batch_size BATCH_SIZE_PER_GPU --name cifar10-100_500 --dataset cifar10 --model_type ViT-B_16 --pretrained_dir checkpoint/ViT-B_16.npz 
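
For anyone filling in the two placeholders, here is a minimal sketch of how they relate. It assumes one process per GPU and ignores any gradient accumulation, so the number of samples per optimizer step is simply the product of the two values. The helper names `build_launch_command` and `global_batch_size` are hypothetical and not part of this repository.

```python
import shlex


def build_launch_command(num_gpus, batch_size_per_gpu):
    """Hypothetical helper: fill in NUM_OF_GPU and BATCH_SIZE_PER_GPU."""
    if num_gpus < 1 or batch_size_per_gpu < 1:
        raise ValueError("num_gpus and batch_size_per_gpu must be positive")
    args = [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",          # one process per GPU
        "train.py",
        "--train_batch_size", str(batch_size_per_gpu),
        "--name", "cifar10-100_500",
        "--dataset", "cifar10",
        "--model_type", "ViT-B_16",
        "--pretrained_dir", "checkpoint/ViT-B_16.npz",
    ]
    return shlex.join(args)


def global_batch_size(num_gpus, batch_size_per_gpu):
    # Assumption: each process consumes its own batch every step,
    # with no gradient accumulation.
    return num_gpus * batch_size_per_gpu


if __name__ == "__main__":
    print(build_launch_command(4, 64))
    print(global_batch_size(4, 64))
```

With 4 GPUs and 64 samples per GPU, each step would process 256 samples under these assumptions, which matters when comparing results against a single-GPU run.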

@TitaniumOne

@jeonsworld Hello, how can I use multi-GPUs?

@jeonsworld
Owner

> @jeonsworld Hello, how can I use multi-GPUs?

PyTorch offers two ways to use multiple GPUs: DataParallel and DistributedDataParallel.
The current code supports distributed training, which is launched with the following command.

python3 -m torch.distributed.launch --nproc_per_node=NUM_OF_GPU train.py --train_batch_size BATCH_SIZE_PER_GPU --name cifar10-100_500 --dataset cifar10 --model_type ViT-B_16 --pretrained_dir checkpoint/ViT-B_16.npz 
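
As a rough illustration of what happens on the script side: `torch.distributed.launch` starts `--nproc_per_node` copies of the script and passes each one a `--local_rank=<n>` argument. The sketch below shows only the argument handling, using the standard library. The default of `-1` for a non-distributed run and the helper names are assumptions made for this example, not taken from `train.py`.

```python
import argparse


def parse_local_rank(argv):
    """Read the --local_rank value that the launcher adds for each process."""
    parser = argparse.ArgumentParser()
    # Assumption: -1 means the script was started directly, without the launcher.
    parser.add_argument("--local_rank", type=int, default=-1)
    # Other training flags are ignored here; a real script would declare them.
    args, _unknown = parser.parse_known_args(argv)
    return args.local_rank


def describe_mode(local_rank):
    # A real script would select its GPU and initialize the process group
    # at this point; this sketch only reports which branch would be taken.
    if local_rank == -1:
        return "single process"
    return f"distributed worker on GPU {local_rank}"


if __name__ == "__main__":
    import sys
    print(describe_mode(parse_local_rank(sys.argv[1:])))
```

Each launched process sees a different rank, which it would typically use to pick its own GPU.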
