use a queue for reading files in parallel #6330
Labels
builder
good first issue
help wanted
internals:parallel
type:enhancement
enhance or introduce a new feature
The problem
Some colleagues and I are developing a Sphinx extension (a doctree transform) that executes code blocks embedded in RST documents and inserts the code's output into the final document.
Sometimes the code blocks take a long time to run (several seconds), so it is useful to use Sphinx's parallel build feature. However, when reading in parallel, Sphinx hands the input files to the worker processes in lexicographic order, in contiguous chunks: the first process gets the first N files, the second process gets the next N files, and so on. If the slow files happen to have names that are lexicographically close, there is a good chance that they will all be assigned to the same worker process, and the advantage of parallelism is lost.
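To make the imbalance concrete, here is a small simulation. It is only a sketch: the per-file timings are invented, and `chunked_makespan` and `queued_makespan` are hypothetical helpers that model the behaviour described above (one contiguous chunk per worker), not Sphinx's actual scheduling code.

```python
import heapq


def chunked_makespan(costs, workers):
    """Wall time when each worker gets one contiguous chunk of files."""
    size = -(-len(costs) // workers)  # ceiling division
    return max(sum(costs[i:i + size]) for i in range(0, len(costs), size))


def queued_makespan(costs, workers):
    """Wall time when each idle worker takes the next file from a shared queue."""
    finish_times = [0.0] * workers  # min-heap of per-worker finish times
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Invented read times in seconds, listed in lexicographic file order.
# The two slow files happen to be neighbours.
costs = [5.0, 5.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

print(chunked_makespan(costs, workers=4))  # 10.0: both slow files land on one worker
print(queued_makespan(costs, workers=4))   # 5.0: the slow files run side by side
```

With these made-up numbers, static chunking takes twice as long as pulling from a queue, because one worker processes both slow files while the other three sit idle.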
A possible solution
Have the main Sphinx process maintain a queue of files to build, and have the worker processes pop files from this shared queue as needed, so that a worker that finishes early simply picks up the next file.
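A minimal sketch of the idea using only the standard library's `multiprocessing` module. Everything named here (`read_file`, `worker_loop`, `read_in_parallel`, `SENTINEL`) is hypothetical and chosen for illustration; none of it is existing Sphinx API, and a real implementation would also have to merge each worker's environment back into the main process.

```python
import multiprocessing as mp

SENTINEL = None  # tells one worker that the queue is exhausted


def read_file(docname):
    """Hypothetical stand-in for reading and transforming one source file."""
    return docname.upper()


def worker_loop(tasks, results, read=read_file):
    """Pop docnames from the shared queue until a sentinel arrives."""
    while True:
        docname = tasks.get()
        if docname is SENTINEL:
            break
        results.put((docname, read(docname)))


def read_in_parallel(docnames, nproc):
    """Main-process side: fill the queue, start workers, collect results."""
    docnames = list(docnames)
    tasks, results = mp.Queue(), mp.Queue()
    for docname in docnames:
        tasks.put(docname)
    for _ in range(nproc):
        tasks.put(SENTINEL)  # one sentinel per worker
    workers = [
        mp.Process(target=worker_loop, args=(tasks, results))
        for _ in range(nproc)
    ]
    for worker in workers:
        worker.start()
    # Drain the results before joining so no worker blocks on a full pipe.
    collected = dict(results.get() for _ in docnames)
    for worker in workers:
        worker.join()
    return collected


if __name__ == "__main__":
    print(read_in_parallel(["a", "b", "c", "d"], nproc=2))
```

Because each worker takes one file at a time, a slow file occupies only the worker that picked it up. The sketch does no error handling: if a worker dies mid-file, the main process would wait forever on `results.get()`, so a real version would need timeouts or liveness checks.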
I am willing to have a go at implementing this if, in principle, it is a feature that would be accepted once implemented correctly.