Seemingly non-deterministic "No such file or dir" for async log file #586
Comments
What filesystem are you using?
Hi @aniezurawski, I'm on Linux with an NFS4 file system.
First of all, don't worry about your data. When you see this message ( Now back to the issue. Does it occur often? Is it easily reproducible or is it random?
I'd say it occurs about every second or third time I start an experiment, at random.
I failed to reproduce the issue on NFS4. Does it work for you on non-network filesystems like ext4? Could you share your code? Does it still occur on the newest version of neptune-client (0.10.0)? EDIT: Could you share the content of
I think I know now what's happening. The underlying framework that I'm using in my project changes the working directory away from the project directory. This is likely a race condition, where the CWD is changed between the creation of the neptune log file and the first access to it. I've found the run directory declared as a constant in neptune-client/neptune/new/constants.py (line 24 at commit ccf7029).
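The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not neptune-client code: the `run_data/async.log` path and the function name `demo_cwd_race` are hypothetical stand-ins, and the second temporary directory only simulates a framework that switches to its own log directory. It shows that a relative path stops resolving after the working directory changes, while a path resolved to an absolute one beforehand keeps working.

```python
import os
import tempfile
from pathlib import Path


def demo_cwd_race() -> dict:
    """Show how a relative log path breaks once the working directory changes."""
    original_cwd = os.getcwd()
    results = {}
    with tempfile.TemporaryDirectory() as project_dir, \
            tempfile.TemporaryDirectory() as framework_log_dir:
        try:
            os.chdir(project_dir)

            # Hypothetical relative log location (an assumption for illustration,
            # not the actual layout used by neptune-client).
            relative_log = Path("run_data") / "async.log"
            relative_log.parent.mkdir(parents=True)
            relative_log.write_text("first entry\n")

            # Resolving once, before anything changes the CWD, pins the location.
            absolute_log = relative_log.resolve()

            # Simulate a framework switching to its own log directory.
            os.chdir(framework_log_dir)

            # The relative path is now looked up under the new CWD and is missing.
            results["relative_exists"] = relative_log.exists()
            results["absolute_exists"] = absolute_log.exists()
            try:
                relative_log.read_text()
                results["relative_error"] = None
            except FileNotFoundError as exc:
                results["relative_error"] = type(exc).__name__
        finally:
            # Restore the CWD before the temporary directories are removed.
            os.chdir(original_cwd)
    return results


if __name__ == "__main__":
    print(demo_cwd_race())
```

If the directory change happens on another thread, whether the relative lookup succeeds depends on timing, which would be consistent with the error showing up only on some runs.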
Unfortunately, it is not possible for now. Thanks for your suggestion.
@emanuel-metzenthin could you say more? What framework was it?
It is RLlib by Ray. Unfortunately, they also have a log directory and change the working directory to it. There also seems to be no way to switch off that behavior.
Try version 0.10.2 of neptune-client.
Thank you very much! This seems to work.
Had the same issue when using a distributed system. Tried
We still do not know exactly what the root cause of this issue is. But we have implemented a workaround and it's already merged to master. It will be released soon.
Please try version 0.10.8. Since it's only a workaround for the underlying issue, let me know if any further problems occur.
Hi!
I have some issues with running and logging experiments.
Sometimes (I couldn't figure out any reason for it) I get the following error when executing experiments. When it appears, I don't get charts on the Neptune dashboard because the async thread gets killed.
The file does exist though.
Running:
Python 3.8.5
neptune-client 0.9.7 (using neptune.new)
Thanks for any suggestions!