[BUG]: num_workers>0 hangs on Windows and sometimes MacOS. #740

Closed
JacksonBurns opened this issue Mar 22, 2024 · 3 comments
Labels
bug Something isn't working

@JacksonBurns
Member

This was originally discovered in #714: setting num_workers above 0 causes Chemprop to hang on Windows (both on GitHub Actions and locally) and on macOS (on GitHub Actions).

See these two replies to a highly relevant issue on the PyTorch forum - we may need to refactor our calls to train, or just disallow parallel dataloading based on platform:

@davidegraff
Contributor

Seems like the default of 8 is a remnant of v1. I don't think it's a bad change to use the torch default (num_workers=0), and if we reintroduce caching (#697) then there's really no need to parallelize dataloading. FWIW I think this is due to differences in parallelism implementations across platforms because of the Python GIL; Linux uses fork by default, which is significantly faster to spin up and wind down than the spawn start method used by default on Windows and macOS.
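
For anyone who wants to check the platform difference on their own machine, here is a minimal sketch using only the standard library. The helper name `describe_start_method` is made up for illustration. As far as I know, recent Python versions default to spawn on Windows and macOS, while Linux has historically defaulted to fork, but the exact default depends on the Python version, so treat the output as the source of truth.

```python
import multiprocessing as mp
import sys


def describe_start_method() -> dict:
    """Report how this interpreter starts worker processes by default.

    Hypothetical helper for illustration; not part of Chemprop or PyTorch.
    """
    return {
        "platform": sys.platform,                  # e.g. "linux", "darwin", "win32"
        "default": mp.get_start_method(),          # start method used when none is chosen
        "available": mp.get_all_start_methods(),   # start methods supported on this platform
    }


if __name__ == "__main__":
    print(describe_start_method())
```

With spawn, every worker starts a fresh interpreter and re-imports the main module, which is slower than fork and is also why unguarded top-level code can misbehave.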

@JacksonBurns
Member Author

Here's the relevant pytorch docs page as well https://pytorch.org/docs/stable/notes/windows.html#usage-multiprocessing
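
As I read it, the main advice on that page is to keep the code that launches workers behind an `if __name__ == "__main__":` guard, because spawned workers re-import the main module. Below is a rough, standard-library-only sketch of that pattern. `load_batch` and `main` are hypothetical stand-ins, not Chemprop or PyTorch functions, and a real script would build a DataLoader inside `main` instead.

```python
import multiprocessing as mp


def load_batch(index: int, size: int = 3) -> list[int]:
    """Stand-in for a data-loading worker; defined at module level so it can be pickled."""
    batch = list(range(index * size, (index + 1) * size))
    print(f"worker produced batch {index}: {batch}")
    return batch


def main() -> None:
    # Force "spawn" so the Windows/macOS behaviour can be reproduced on any OS.
    ctx = mp.get_context("spawn")
    worker = ctx.Process(target=load_batch, args=(0,))
    worker.start()
    worker.join()
    print("worker exit code:", worker.exitcode)


# Without this guard, each spawned worker would re-run the launch code
# when it re-imports this module.
if __name__ == "__main__":
    main()
```

Saving this as a script and removing the guard (calling `main()` at top level) is a quick way to see the failure mode that spawn-based platforms are prone to.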

@kevingreenman added this to the v2.0.0 milestone Mar 28, 2024
@donerancl
Contributor

This has been addressed:

  • message that num_workers > 0 results in hanging on Windows and macOS
  • default num_workers = 0
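
For readers landing here later, the two points above could look roughly like the sketch below. This is not the actual Chemprop code: `resolve_num_workers` and the warning text are assumptions made up for illustration. The only facts relied on are that `sys.platform` is "win32" on Windows and "darwin" on macOS.

```python
import sys
import warnings

# Platforms where this issue reports hangs with num_workers > 0.
_HANG_PRONE_PLATFORMS = ("win32", "darwin")


def resolve_num_workers(requested: int = 0, platform: str = sys.platform) -> int:
    """Return the requested worker count, warning on hang-prone platforms.

    Hypothetical helper; the default of 0 mirrors the fix described above.
    """
    if requested < 0:
        raise ValueError("num_workers must be non-negative")
    if requested > 0 and platform in _HANG_PRONE_PLATFORMS:
        warnings.warn(
            f"num_workers={requested} may hang on {platform}; consider num_workers=0.",
            RuntimeWarning,
            stacklevel=2,
        )
    return requested
```

The sketch warns rather than overriding the value, so users who know their setup works can still opt in to parallel dataloading.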
