use a queue for reading files in parallel #6330

Open
jbweston opened this issue May 1, 2019 · 2 comments


jbweston commented May 1, 2019

The problem

Some colleagues and I are developing a Sphinx extension (a doctree transform) that executes code blocks embedded in RST documents and inserts the code's output into the final document.

Sometimes the code blocks take a long time to run (several seconds), so it is useful to use Sphinx's parallel build feature. However, when reading in parallel, Sphinx hands the input files to the worker processes in lexicographic order and in contiguous chunks: the first process gets the first N files, the second process the next N files, and so on. If the files that take a long time to execute happen to have lexicographically close names, there is a good chance that they will be assigned to the same worker process, which loses the advantage of parallelism.
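To make the imbalance concrete, here is a simplified model of the chunked assignment described above. It is not Sphinx's actual code: `contiguous_chunks`, `makespan`, the file names, and the per-file costs are all hypothetical, chosen only to illustrate the effect.

```python
def contiguous_chunks(items, nproc):
    """Split a sorted list into at most nproc contiguous chunks (hypothetical helper)."""
    size = -(-len(items) // nproc)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def makespan(chunks, cost):
    """Wall-clock time if each chunk is processed by its own worker."""
    return max(sum(cost[name] for name in chunk) for chunk in chunks)


# Assumed per-file read times in seconds; the two slow files sort next to each other.
cost = {"a1.rst": 5, "a2.rst": 5, "b1.rst": 1, "b2.rst": 1}
files = sorted(cost)

chunks = contiguous_chunks(files, 2)
print(chunks)                  # [['a1.rst', 'a2.rst'], ['b1.rst', 'b2.rst']]
print(makespan(chunks, cost))  # 10
```

With these numbers both slow files land on the first worker, so the build takes 10 seconds, whereas pairing each slow file with a fast one would take 6.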

A possible solution

Have the main Sphinx process maintain a queue of files to build, and have the worker processes pop files from this shared queue as needed.
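A minimal sketch of the shared-queue pattern, assuming nothing about Sphinx internals. To keep it short and self-contained it uses threads and `queue.Queue` from the standard library rather than worker processes; a process-based version would follow the same shape with `multiprocessing.Queue`. The names `read_all` and `read_doc` are hypothetical.

```python
import queue
import threading


def read_all(files, nworkers, read_doc):
    """Process every file once; each worker pulls its next file from a shared queue."""
    pending = queue.Queue()
    for name in files:
        pending.put(name)  # the queue is fully populated before any worker starts

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to read, so this worker exits
            output = read_doc(name)
            with lock:
                results[name] = output

    threads = [threading.Thread(target=worker) for _ in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# str.upper stands in for the real (slow) per-document work.
print(sorted(read_all(["a.rst", "b.rst", "c.rst"], 2, str.upper)))
# ['a.rst', 'b.rst', 'c.rst']
```

Because a worker only takes a new file once it has finished its previous one, a run of slow files that sort next to each other is spread across workers instead of being handed to a single one.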

I am willing to have a go at implementing this, provided it is, in principle, a feature that would be accepted if implemented correctly.

@jbweston jbweston added the type:enhancement enhance or introduce a new feature label May 1, 2019
@tk0miya
Member

tk0miya commented May 4, 2019

+1; Absolutely! I agree with your suggestion.

@AA-Turner
Member

A good idea, makes sense.
