Can not start without fix train.py #7

Closed
gertelrina opened this issue Apr 4, 2022 · 4 comments

Comments

@gertelrina

I hit a bug, described below:
File "../trivialaugment/TrivialAugment/train.py", line 353, in spawn_process
assert worldsize == C.get()['gpus'], f"Did not specify the number of GPUs in Config with which it was started: {worldsize} vs {C.get()['gpus']}"

This happens because "args.config" is unpacked incorrectly. You need to use "args.config[0]" instead of "args.config" (lines 343-344).
After this change it works correctly :)
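
To illustrate the kind of mismatch described above, here is a minimal sketch. It assumes the config option is declared so that argparse stores a list (for example with nargs="+"); the parser definition and the path "confs/example.yaml" are hypothetical and are not taken from train.py.

```python
import argparse

# Hypothetical parser: the real train.py may declare this option differently.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--config", nargs="+")  # nargs="+" always stores a list

# Hypothetical config path, used only for illustration.
args = parser.parse_args(["-c", "confs/example.yaml"])

config_list = args.config      # a list containing one path string
config_path = args.config[0]   # the path string itself

print(type(config_list).__name__, type(config_path).__name__)
```

If code downstream expects a single path string, passing the whole list would need the "[0]" index, which matches the change proposed here.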

@SamuelGabriel
Contributor

Hi @gertelrina

Thanks for the shoutout. Could you please let me know the exact diff of your change? If you still have it lying around, I would also be very happy to see a stack trace. :)

@ADAMS12345678

I also encountered this bug.
File ".../trivialaugment-master/TrivialAugment/train.py", line 348, in spawn_process
assert worldsize == C.get()['gpus'], f"Did not specify the number of GPUs in Config with which it was started: {worldsize} vs {C.get()['gpus']}"
File "/root/ENTER/envs/trivial/lib/python3.8/site-packages/theconf-0.1.7-py3.8.egg/theconf/config.py", line 126, in __getitem__
return self.conf[key]
KeyError: 'gpus'

I haven't solved it yet, looking forward to the author's answer. Thanks!

@SamuelGabriel
Contributor

Just as a quick follow-up: is there a gpus key in your config file? Just to be sure. I am trying to reproduce this right now on my end.
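
One way to tell a missing gpus key apart from a mismatched value is sketched below. It works on a plain dict rather than the theconf object, and check_gpus is a hypothetical helper that is not part of this repo.

```python
def check_gpus(conf: dict, worldsize: int) -> None:
    """Hypothetical helper: validate the 'gpus' entry of a loaded config."""
    if "gpus" not in conf:
        # The key is absent, e.g. because the config file was not loaded as expected.
        raise KeyError("config has no 'gpus' key; keys found: " + ", ".join(sorted(conf)))
    if conf["gpus"] != worldsize:
        # The key exists but disagrees with the number of started processes.
        raise ValueError(f"started with {worldsize} processes, but config says gpus={conf['gpus']}")


# Example values are made up for illustration.
check_gpus({"gpus": 2, "batch": 128}, worldsize=2)  # passes without raising
```

A KeyError here would point at the config contents or how they were loaded, while a ValueError would point at the launch command.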

@SamuelGabriel
Contributor

I just gave it a try with a fresh clone of this repo and all dependencies installed anew, and the training command from the README did not throw the error you had. So maybe try running the command exactly as it is in the README, and re-install the dependencies. :)

3 participants