torch.distributed.elastic.multiprocessing.api:failed #44
Comments
This issue may help you: open-mmlab/mmcv#1969
Maybe you can reduce your batch_size. When I set my batch size to half the original size, this error disappeared. I hope this helps.
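For readers unsure where the batch size lives: a minimal sketch of the override, assuming an mmdetection 2.x-style config in which the per-GPU batch size is `data.samples_per_gpu`. The values below are hypothetical and are not taken from the Co-DETR config.

```python
# Sketch of a config override (assumption: mmdetection 2.x-style config,
# where `samples_per_gpu` is the per-GPU batch size).
original_samples_per_gpu = 2  # hypothetical starting value

data = dict(
    # Halve the per-GPU batch size, but never go below 1.
    samples_per_gpu=max(1, original_samples_per_gpu // 2),
)
```

A smaller batch mainly lowers GPU memory use, so this is most likely to help when the failed worker was killed by an out-of-memory error.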
---Original---
Date: Mon, Sep 18, 2023 22:09 PM
Subject: Re: [Sense-X/Co-DETR] torch.distributed.elastic.multiprocessing.api:failed (Issue #44)
Hello, I also encountered the same issue. Did you manage to resolve it in the end?
Thank you for your reply. I have already solved this error by adding the "classes" information in the "train" section of the data dictionary in the config file.
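A minimal sketch of what that change might look like, assuming an mmdetection 2.x-style config where the dataset accepts a `classes` argument. The class names here are hypothetical placeholders, not the ones used in this thread.

```python
# Hypothetical category names; replace them with the categories that
# actually appear in your annotation file.
classes = ("cat", "dog")

data = dict(
    train=dict(
        # Tell the training dataset which categories to load.
        classes=classes,
    ),
)
```

If the config inherits from a base file, only the keys written here are overridden; the rest of the `train` settings are kept from the base config.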
When I ran it, I got this error.
I only changed the config, then ran: bash tools/dist_train.sh projects/configs/co_dino/co_dino_5scale_swin_large_16e_o365tococo.py 4 run