Stuck in search.py #66

Open
MaDaJie2706 opened this issue Aug 31, 2021 · 5 comments

Comments

@MaDaJie2706

First of all, thank you very much for generously sharing your code publicly.
My problem appeared after I changed @ray.remote(num_gpus=4, max_calls=1) and ray.init(redis_address=args.redis) to @ray.remote(num_gpus=1, max_calls=1) and ray.init() in search.py. The running state of the code stays as shown in the image below. The code ran for two hours without making any progress, and it did not return any error messages.
I hope you can answer this question soon. Thank you very much!
P.S. I use "!python3 FastAutoAugment/search.py -c confs/wresnet40x2_cifar.yaml" to run this code.
[image: screenshot of the running state]
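
The change described above alters how many GPUs each Ray task requests. One thing that may be worth ruling out (an assumption, not something confirmed in this thread) is that the tasks are waiting for resources the local Ray instance does not report, since Ray tends to keep such tasks pending rather than fail. A minimal sketch of that check follows. `can_schedule` is a hypothetical helper, and the example dicts only mimic the shape of what `ray.cluster_resources()` returns in recent Ray versions.

```python
def can_schedule(available, requested):
    """Return True if every requested resource amount fits in what is available.

    `available` is a dict such as {"CPU": 8.0, "GPU": 1.0}; a missing key
    is treated as zero of that resource.
    """
    return all(
        available.get(name, 0.0) >= amount
        for name, amount in requested.items()
    )


# Illustrative values only. On a real machine, `available` would come from
# ray.cluster_resources() after calling ray.init().
machine_without_gpu = {"CPU": 8.0}
machine_with_one_gpu = {"CPU": 8.0, "GPU": 1.0}

print(can_schedule(machine_without_gpu, {"GPU": 1}))   # request from num_gpus=1
print(can_schedule(machine_with_one_gpu, {"GPU": 1}))  # request from num_gpus=1
print(can_schedule(machine_with_one_gpu, {"GPU": 4}))  # request from num_gpus=4
```

If the check returns False for the request in use, a task decorated that way would have nothing to run on, which would look like a silent stall.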

@Linker-Stars

I also have the same problem. Does anyone have a solution?

@licrane

licrane commented Dec 9, 2021

me too

@BaronWang0130

Same problem here, without using the Ray cluster framework.

@kolingv

kolingv commented Apr 28, 2022

Any solution for this issue?

@Deimos-Apollon

Deimos-Apollon commented Jul 26, 2023

I found that it gets stuck in an infinite loop in the 'while True' loop in search.py.
When I raised KeyboardInterrupt, it showed:
[image]

I removed the Ray cluster lines, so line numbering may differ, but the hang occurs here:

[image]

Interestingly, the hang occurs at different stages: the first time while training model 2 of 5, the second time after training of the first 5 of 5 models had finished.
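
The loop itself is not shown in this thread. As a general illustration of why an unbounded 'while True' poll can hang without any error, and how a deadline makes the stall visible, here is a minimal sketch. `wait_until` and `is_ready` are hypothetical names standing in for whatever condition the real loop waits on; this is not the code from search.py.

```python
import time


def wait_until(is_ready, timeout_s, poll_s=0.01):
    """Poll is_ready() until it returns True, or raise after timeout_s seconds.

    Unlike a bare `while True` loop, this fails loudly when the awaited
    condition never becomes true.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no progress after {timeout_s} s")
        time.sleep(poll_s)


# Usage: a condition that becomes true on the third check.
checks = {"count": 0}


def ready_on_third_check():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(ready_on_third_check, timeout_s=1.0, poll_s=0.001)
```

With a condition that never becomes true, the same call raises TimeoutError instead of running indefinitely, which gives a traceback pointing at the wait.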
