
KeyError: 'RANK' #215

Closed
Totoro-wen opened this issue Oct 23, 2019 · 9 comments
@Totoro-wen

No description provided.

@Totoro-wen
Author

When I run `python tools/train.py --cfg siamrpn_r50_l234_dwxcorr_otb/config.yaml` on a single GPU, I get the following error:
Traceback (most recent call last):
File "tools/train.py", line 319, in <module>
main()
File "tools/train.py", line 258, in main
rank, world_size = dist_init()
File "/home/aibc/Wen/Siamese/pysot/pysot/utils/distributed.py", line 104, in dist_init
rank, world_size = _dist_init()
File "/home/aibc/Wen/Siamese/pysot/pysot/utils/distributed.py", line 83, in _dist_init
rank = int(os.environ['RANK'])
File "/home/aibc/anaconda2/envs/py36_pytorch1_2/lib/python3.6/os.py", line 669, in __getitem__
raise KeyError(key) from None
KeyError: 'RANK'
How can I fix this?
Thanks!

@lb1100
Contributor

lb1100 commented Oct 23, 2019

This code can't run on a single node.

@Totoro-wen
Author

> This code can't run on a single node.

Can multiple nodes work with just a single GPU?

@lb1100
Contributor

lb1100 commented Oct 23, 2019

No. Two GPUs at least.

@Totoro-wen
Author

> No. Two GPUs at least.

Thanks!

@carmete

carmete commented Oct 23, 2019

When you are in experiments/<desired experiment>/, running `python -m torch.distributed.launch --nproc_per_node=1 --master_port=2333 ../../tools/train.py --cfg config.yaml` should work with a single GPU.

Your error is caused by not using torch.distributed.launch as described in https://github.com/STVIR/pysot/blob/master/TRAIN.md#single-node-multiple-gpus, so the environment variable RANK never gets set.

I don't know if there are any disadvantages to this (ab)use of the distributed setting, though.
You can also specify --nproc_per_node=X with X>1.
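
For context (this is just a sketch of what the launcher does, not an official pysot recipe): torch.distributed.launch starts each worker process with RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR and MASTER_PORT already exported in its environment, and pysot's dist_init() reads RANK and WORLD_SIZE from there. Running tools/train.py directly skips that step, hence the KeyError. The function name and defaults below are hypothetical; it only mimics roughly what a single-GPU launch sets up:

```python
# Minimal sketch (hypothetical, not pysot code): mimic the environment
# variables that torch.distributed.launch exports for a single worker,
# so that a call like pysot's dist_init() can find them.
import os
import torch.distributed as dist

def init_single_process():
    # torch.distributed.launch sets these for every spawned worker;
    # with --nproc_per_node=1 there is exactly one worker, rank 0.
    os.environ.setdefault('MASTER_ADDR', '127.0.0.1')
    os.environ.setdefault('MASTER_PORT', '2333')
    os.environ.setdefault('RANK', '0')
    os.environ.setdefault('WORLD_SIZE', '1')
    os.environ.setdefault('LOCAL_RANK', '0')

    rank = int(os.environ['RANK'])            # the line that raised KeyError
    world_size = int(os.environ['WORLD_SIZE'])
    # init_method defaults to 'env://', which reads MASTER_ADDR/MASTER_PORT
    dist.init_process_group(backend='nccl', rank=rank, world_size=world_size)
    return rank, world_size
```

Again, this only imitates what the launcher would export; the supported way is still the torch.distributed.launch command above.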

@smile-hahah

> No. Two GPUs at least.

Thanks for your good work! I reproduced this project with two GPUs and got EAO = 0.41. The second time, I set nproc_per_node=1 and found that the project can run with a single GPU. I want to know: would a single GPU affect the results?

@ZhiyuanChen
Collaborator


> No. Two GPUs at least.

> Thanks for your good work! I reproduced this project with two GPUs and got EAO = 0.41. The second time, I set nproc_per_node=1 and found that the project can run with a single GPU. I want to know: would a single GPU affect the results?

Glad to hear you have reached an EAO of 0.41! Would you mind sharing your experience with others at #94?
The result is easily influenced by randomness, so I would say yes and no.

@ZhiyuanChen
Collaborator

Since no further questions have been asked, I'm closing this for now.
Please reopen it or create a new issue if you have further questions.
