What are the differences between a single machine and two machines? #9

Closed
Gironote opened this issue Jul 24, 2020 · 2 comments

Gironote commented Jul 24, 2020

I am currently working on this.
Single machine (node), multiple GPUs
$ python train.py --dist-url 'tcp://127.0.0.1:29500' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
I am using a Titan RTX. Is this the right command to use? What are the differences between a single machine and two machines?
[image]

[image]
This error occurs.
Do I have to use a single GPU?

@maudzung
Owner

Hi @Jihun-Kim-kmu,
You have a single GPU, so you should follow the instructions for training on a single machine with a single GPU (2.4.4.1).
You can read more here
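
To illustrate the difference between the two setups: in the common PyTorch multiprocessing pattern (as used in the official ImageNet example), `--world-size` is the number of machines and `--rank` is the index of the current machine. The script then starts one process per GPU and derives a global rank for each process. The sketch below assumes train.py follows that convention; `global_layout` is a hypothetical helper for illustration, not part of this repository.

```python
def global_layout(num_nodes, node_rank, gpus_per_node):
    """Return (total process count, global ranks of the processes on this node).

    Assumes the convention where --world-size counts machines and --rank is
    the machine index, with one process spawned per GPU.
    """
    total_processes = num_nodes * gpus_per_node
    ranks_on_this_node = [node_rank * gpus_per_node + gpu
                          for gpu in range(gpus_per_node)]
    return total_processes, ranks_on_this_node


# One machine with one GPU: a single process with global rank 0.
print(global_layout(num_nodes=1, node_rank=0, gpus_per_node=1))

# Two machines with 4 GPUs each, seen from the second machine (--rank 1).
print(global_layout(num_nodes=2, node_rank=1, gpus_per_node=4))
```

With two machines, each machine would run the same command with `--world-size 2`, its own `--rank` (0 or 1), and a `--dist-url` that points at the first machine rather than 127.0.0.1. With only one GPU there is nothing to distribute, so the single-GPU command is the simpler choice.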

@Gironote
Author

Gironote commented Jul 25, 2020

In this step, what number should I put in?
I tried
$ python train.py --gpu_idx 0 --multiscale_training --batch_size 128 --num_workers 0...
$ python train.py --gpu_idx 0 --multiscale_training --batch_size 128 --num_workers 1...
$ python train.py --gpu_idx 0 --multiscale_training --batch_size 128 --num_workers 16...

but these errors occur:
train.py: error: argument --num_workers: invalid int value: '0...'
train.py: error: argument --num_workers: invalid int value: '1...'
train.py: error: argument --num_workers: invalid int value: '16...'
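
These errors come from the literal `...` being typed as part of the value: argparse receives the string `0...` and cannot convert it to an integer. A minimal sketch of that behaviour, assuming `--num_workers` is declared with `type=int` (the error message suggests so). The parser below is a hypothetical stand-in, not the repository's actual train.py, and the default of 4 is made up for illustration.

```python
import argparse


def parse_num_workers(argv):
    """Return the parsed --num_workers value, or None if argparse rejects it."""
    # Hypothetical stand-in for the parser in train.py.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--num_workers", type=int, default=4)
    try:
        return parser.parse_args(argv).num_workers
    except SystemExit:
        # argparse prints "invalid int value" to stderr and exits with status 2.
        return None


print(parse_num_workers(["--num_workers", "4"]))     # a plain integer is accepted
print(parse_num_workers(["--num_workers", "0..."]))  # the trailing dots are rejected
```

The `...` in the instructions presumably stands for any further options, so it should be left out or replaced by real arguments, for example `--num_workers 4`.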
