Redesign DotRunner as a single global queue with per-format batching #12037

Closed
vtjnash wants to merge 20 commits into doxygen:master from vtjnash:jn/dot-batch-all

@vtjnash
Contributor

@vtjnash vtjnash commented Mar 9, 2026

Replace the per-.dot-file DotRunner/ThreadPool architecture with a single DotRunner that holds a global queue of all jobs across all source files. Running dot is now batched as one invocation per output format using -O (auto-naming) instead of -o: dot -Tpng -O file1.dot file2.dot ...
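The per-format batching described above can be sketched as follows. This is only an illustration in Python, not the C++ code in this PR; the `build_batches` helper and the `(dot_file, fmt)` job tuples are hypothetical names.

```python
from collections import defaultdict


def build_batches(jobs):
    """Group (dot_file, fmt) jobs by format and build one dot argv per format."""
    by_format = defaultdict(list)
    for dot_file, fmt in jobs:
        by_format[fmt].append(dot_file)
    # One invocation per output format, using -O so dot names the outputs itself.
    return [["dot", f"-T{fmt}", "-O", *files] for fmt, files in by_format.items()]
```

Each returned list is one `dot` command line, so the number of processes spawned depends on the number of formats rather than the number of graphs.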

The DOT_MULTI_TARGETS setting is now ignored (old default OFF, now assumed ON) since the corresponding feature in graphviz is now over 20 years old (1.8.10).

Alternative to #12028. For llvm, this generates all plots in about 20 seconds, which seemed not worth the additional effort to make it threaded (though it could still do so if other projects are making substantially more complicated or more numerous graphs than llvm).

As before, a disclaimer that this was written almost entirely by Claude (though you can potentially guess at all my requested fixes to the initial prompt from the list of commits). I'm happy to clean this up more if this looks generally acceptable (and I recommend squashing on merge).

vtjnash and others added 14 commits March 8, 2026 18:41
Replace the per-.dot-file DotRunner/ThreadPool architecture with a single
DotRunner that holds a global queue of all jobs across all source files.
Running dot is now batched as one invocation per output format using -O
(auto-naming) instead of -o: `dot -Tpng -O file1.dot file2.dot ...`

Remove ThreadPool and DOT_NUM_THREADS usage from DotManager. The cmapx
map output extension changes from .map to .cmapx to match -O naming.

The DOT_NUM_THREADS could be restored by splitting up `system` into a
loop for `spawn` and a loop for `waitpid` each taking a fixed proportion
of the graphs, but that isn't currently expected to provide significant
difference in performance.

The DOT_MULTI_TARGETS setting is now ignored (old default OFF, now
assumed ON) since the corresponding feature in graphviz is now over 20
years old (1.8.10).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces: doxygen#12028
Group jobs by format+directory, cd into each directory before invoking
dot, and pass only the file basename. Cleanup and .md5 writes happen
after all formats are generated, using the saved absolute paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rather than saving/restoring the process cwd around Portable::system
calls, pass the desired working directory directly. On Unix (fork path)
the child calls chdir() before exec; on Solaris (vfork) and Windows
console paths it prepends cd to the shell command; on Windows
non-console it passes the directory to CreateProcessW lpCurrentDirectory.

DotRunner::run() now passes the dot file directory to Portable::system
and uses absolute paths for post-run checks, removing the Dir::setCurrent
save/restore.
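As a rough analogue of passing the working directory to the child instead of changing the parent's cwd, here is a minimal Python sketch. It uses the standard `subprocess` module rather than `Portable::system`, and `run_in_dir` is a hypothetical helper name.

```python
import subprocess


def run_in_dir(argv, workdir):
    """Run argv with workdir as the child's working directory.

    The parent's own working directory is never changed, so there is
    nothing to save or restore around the call.
    """
    return subprocess.run(argv, cwd=workdir, capture_output=True,
                          text=True, check=True)
```

The point of the design is that the directory change happens only in the child, which avoids racing with other code in the parent that relies on the process-wide cwd.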

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chdir() is async-signal-safe so it can safely be called in a vfork child.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Also replace exit() with _exit() after execve to avoid flushing stdio
buffers in the fork child.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…imits

Error messages now include the working directory (chdir target) alongside
the command and arguments.

Dot invocations are split into batches so the command line stays under
32767 characters on Windows and 1MB on Linux/other Unix.
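A minimal sketch of this kind of length-limited splitting, in Python for illustration only. `split_by_length` is a hypothetical helper; it counts one space between arguments and ignores any shell quoting overhead, which a real implementation would have to include.

```python
def split_by_length(prefix, files, max_len):
    """Split files into batches whose joined command line fits in max_len chars."""
    batches, current = [], []
    base_len = len(" ".join(prefix))
    cur_len = base_len
    for name in files:
        extra = 1 + len(name)  # one separating space plus the argument itself
        if current and cur_len + extra > max_len:
            batches.append(prefix + current)
            current, cur_len = [], base_len
        current.append(name)
        cur_len += extra
    if current:
        batches.append(prefix + current)
    return batches
```

With the limits quoted above, `max_len` would be 32767 on Windows and about 1 MB elsewhere. A single file name longer than the limit is still emitted as its own batch here, so the caller would need to handle that case separately.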

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MAX_ARG_STRLEN is the per-argument length limit enforced by the Linux
kernel. Fall back to 131072 on systems that don't define it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
graphviz -O appends the format extension to the full input filename
(including .dot) on some versions, producing graph.dot.svg instead of
graph.svg. Rename the output if needed so the rest of doxygen finds
the expected path. Applied after both the batch run and the PDF re-run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ehavior

dot -O always appends the format suffix to the full input filename
(including .dot), per the graphviz docs. Remove the conditional
FileInfo::exists() guard since the rename is unconditionally required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dot -O appends the format suffix to the full input filename, so update
all output path computations to use job->dotFile + "." + format directly:

- dotgraph.cpp: imgName() includes .dot. before the format extension
- dotgraph.h: absMapName() uses .dot.cmapx
- dotrunner.cpp: post-processing uses job->dotFile + "." + format;
  rename blocks removed
- dot.cpp: writeDotGraphFromFile and writeDotImageMapFromFile use
  Portable::system with explicit -o since the input and output paths
  are unrelated (user-provided file vs. generated output location)
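To make the naming difference concrete, here is a small Python sketch of the two path computations mentioned in these commit messages. Both function names are hypothetical and only mirror the behaviour described above.

```python
import os


def auto_output_path(dot_file, fmt):
    """Name dot -O is described as writing: the full input name plus '.' + format."""
    return f"{dot_file}.{fmt}"


def legacy_output_path(dot_file, fmt):
    """Name used before this change: the input name with .dot replaced by the format."""
    base, _ext = os.path.splitext(dot_file)
    return f"{base}.{fmt}"
```

For `graph.dot` and `svg`, the first gives `graph.dot.svg` and the second gives `graph.svg`, which is the mismatch the earlier rename commits were working around.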

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dot -O writes file.dot.pdf/eps; passing absDotName() as figureName means
figureName+".pdf"/".eps" resolves to the correct file without any rename.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@doxygen
Owner

doxygen commented Mar 9, 2026

Also an interesting idea. Seems there is still an issue with generating the PDF version of the manual.

How much is this slower or faster than the other approach for llvm on a clean dir?

I find the use of the maximum argument length a bit fragile/artificial. Maybe we could convince the GraphViz developers to add a -f filelist.txt option to dot, so it can read the list of files from an input file instead of argv. Seems relatively easy to add when looking at input.c. We could then still make batches, but the size would no longer be limited by the platform-dependent maximum argument length.

dot -O produces file.dot.pdf; LaTeX appends its own extension list (.pdf
etc.) to the includegraphics argument, so {baseName.dot} resolves to
baseName.dot.pdf correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vtjnash
Contributor Author

vtjnash commented Mar 10, 2026

Fixed the remaining name issues with generating the pdf.

Basically the same speed. The tradeoff here isn't speed, but the ability to do dynamic load balancing with multiple processes, since this version only accepts a static list. (This does about 6 execv, instead of 15k). But it is unclear to me that dynamic (or static) load balancing is even needed after this PR.

Adding -f would be nice, but it also means it takes that much longer before downstream projects can enable the option. If we were to request an option, I'd suggest a delimiter instead, like cut -f this_string_is_hopefully_unique, to reliably and robustly detect end-of-output. Or the ability to specify new command line arguments between each graph (such as the output file). But I'd slightly rather work with the existing graphviz, to avoid another configuration option.

The maximum argument length is an annoying limitation of the unix kernels (apparently added to linux to avoid performance issues with their allocator), but was otherwise fairly trivial to implement, and is also related to logic that would be used if this did support multiple threads again. The limit is per-argument, so it could be higher if this called directly into dot instead of using sh, but that

@doxygen
Owner

doxygen commented Mar 11, 2026

Doesn't look quite right when running Doxygen on its own code.

Running doxygen -d time -d extcmd from build/doc_internal on a clean build/doxygen_docs dir I see

Spent 714.212632 seconds in Running dot...

Running doxygen -d time -d extcmd with stock version 1.16.1 I get

Spent 76.094098 seconds in Running dot...

So that is more than 9x slower! (on an 8 core system running macOS)

Looks like it only does one batch command with multiple dot files and then starts running them one at a time.

@vtjnash
Contributor Author

vtjnash commented Mar 11, 2026

Looks like it is accidentally making a (crude) estimate of the quality of CREATE_SUBDIRS. I assumed that was legacy code for FAT, and not relevant anymore (so I had asked Claude to ignore that in ba9e348). I pushed a fix--it even cleans up my code a bit more, in my opinion! On my machine, this PR now runs the dot step in 30 seconds on one core (vs. 46 seconds on this machine on master even with all 127 cores).

@doxygen
Owner

doxygen commented Mar 12, 2026

I still see some blocking issues. Most images do not show up (they link to name.svg instead of name.dot.svg). If I manually correct some, it still doesn't work as rendering the SVG triggers a syntax error. Looking at the svg file it has two </svg> tags at the end of the file for some strange reason. Seems to only happen for svg files in sub directories.

Sandbox User and others added 3 commits March 13, 2026 01:33
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bering bug

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vtjnash
Contributor Author

vtjnash commented Mar 13, 2026

The duplicate thing sounds quite odd. It looks like this is intentional in doxygen for a couple years (ab35a71) because the patcher makes an svg inside an svg inside an svg. Looking into this further, I also seem to have files where doxygen erased the whole file after dot generated it:

$ cat build/doxygen_docs/html/d0/d65/clangparser_8h__dep__incl.dot.svg
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
 "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.43.0 (0)
 -->
<!-- Title: src/clangparser.h Pages: 1 -->

It looks like these issues were caused by the primary graph id field being populated as "page0,1_graph0" instead of the expected "graph0" for files after the first one. This breaks the simple text replacement code.
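The workaround code itself is not shown in this thread. As a hedged illustration only, one way to normalize the unexpected id before any text replacement might look like this in Python; the regular expression and the `normalize_graph_id` name are assumptions, based solely on the two id values quoted above.

```python
import re

# Matches ids such as "page0,1_graph0", the form reported above for files
# after the first one in a batch.
_PAGED_GRAPH_ID = re.compile(r'id="page\d+,\d+_graph0"')


def normalize_graph_id(svg_text):
    """Rewrite a paged graph id back to the plain "graph0" the patcher expects."""
    return _PAGED_GRAPH_ID.sub('id="graph0"', svg_text)
```

SVG files that already carry `id="graph0"` pass through unchanged, so the same code path could be used regardless of the installed Graphviz version.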

Filed a bug as https://gitlab.com/graphviz/graphviz/-/work_items/2827, though I made a workaround fix here to avoid waiting for a release and dealing with version checks.

@doxygen
Owner

doxygen commented Mar 16, 2026

I think performance should be further improved by combining worker threads and batch operation.

The user should be able to specify the number of worker threads (via the existing DOT_NUM_THREADS option) and also specify a maximum batch size (new DOT_BATCH_SIZE option, which then may need to be further limited by command line restrictions).

Also, specifying multiple targets seems to be possible using multiple -T options, and by leaving out the .dot extension, more images fit on the command line and the generated files do not get the double extension (also better for backward compatibility, i.e. existing links to pictures stay valid after rerunning Doxygen).

Example

dot -Tsvg -Tcmapx -O file1 file2

will generate

file1.svg
file1.cmapx
file2.svg
file2.cmapx

@vtjnash
Contributor Author

vtjnash commented Mar 16, 2026

I think performance should be further improved by combining worker threads and batch operation.

Seems reasonable, though note the PR already seems to give a 50% performance enhancement over using all possible threads, which seemed worthwhile on its own already–before complicating review further with bringing back threads?

The user should be able to specify the number of worker threads (via the existing DOT_NUM_THREADS option) and also specify a maximum batch size (new DOT_BATCH_SIZE option, which then may need to be further limited by command line restrictions).

DOT_NUM_THREADS is still the config option for the maximum; the PR just currently uses 1, with the option in the future to spawn more processes. I don't think we should add a configuration for DOT_BATCH_SIZE, since we can observe the limit exactly here and there are no other command line restrictions to be concerned with. We could also just set it to the minimum POSIX standard value (4096) if we want to be robust here. That should still be plenty of batching to maximize performance. (e.g. https://www.in-ulm.de/~mascheck/various/argmax/)

Also, specifying multiple targets seems to be possible using multiple -T options, and by leaving out the .dot extension, more images fit on the command line and the generated files do not get the double extension (also better for backward compatibility, i.e. existing links to pictures stay valid after rerunning Doxygen).

Interesting ideas, though I don't think it is worth it. The additional savings from a couple more command line arguments is negligible. The additional complexity to batch jobs by their total set of extensions seems unpleasant (esp. in C++) and not justified for the bugs it may cause. This PR already does the rename to ensure existing links work.

@Smattr

Smattr commented Mar 28, 2026

Maybe we could convince the GraphViz developers to add a -f filelist.txt option to dot

One of the Graphviz maintainers here. I’m surprised you’re hitting these OS limits, but I’m supportive of adding something like this to Graphviz if it helps you.

@doxygen
Owner

doxygen commented Mar 29, 2026

One of the Graphviz maintainers here. I’m surprised you’re hitting these OS limits, but I’m supportive of adding something like this to Graphviz if it helps you.

@Smattr Thanks for your interest in helping out. Let me give some context and explain what I think would be the best way to improve Graphviz for use with Doxygen.

For large projects, when you enable all dot related options, Doxygen can produce a huge amount of dot files (the number can easily run in the tens of thousands). Currently, for each image dot is invoked. To speed things up Doxygen can already use multiple worker threads to launch multiple instances of dot in parallel.

What this PR has demonstrated is that the overhead of launching dot many times is very significant. If dot is instructed to produce a whole set of files in one go it is much faster (like 10x faster!).

Currently, the only way to get dot to process multiple images in one go, is with the -O option. But this means that all files need to be passed in via the command line, and that there is no way to tell which output file to generate for each .dot file, as this is automatically done by dot in this case. The size of the command line is platform dependent and can be quite low (like 32kb on Windows).

Ideally, I would like to be able to pass a file, where each line represents an invocation of the dot tool, and dot will then act as if it is called multiple times with different parameters. Here is an example of such file batch.txt

"/tmp/output/html/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tpng -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.png"
"/tmp/output/html/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tcmapx -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.map"
"/tmp/output/latex/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tpdf -o "/tmp/output/latex/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.pdf"

When running e.g. dot -B batch.txt it should behave the same as running

dot "/tmp/output/html/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tpng -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.png"
dot "/tmp/output/html/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tcmapx -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.map"
dot "/tmp/output/latex/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tpdf -o "/tmp/output/latex/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.pdf"

If you could help with such an option, then I can use it in Doxygen, and it will bring a huge speed-up without having to force things to fit with the limitations of the -O option. I envision that the batch size and the number of worker threads can then be further tuned by the user to get the best performance. Please also bump the version of dot, so I can check if dot supports this option and fall back to the old method to make the transition for users go smoothly. Let me know if I can help in any way.
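For illustration, a caller could generate the batch.txt format proposed above roughly like this. This is a Python sketch with hypothetical helper names; the -B option is only a proposal at this point, and the escaping rules for quotes inside paths are not defined in this thread, so the sketch simply rejects such paths.

```python
def batch_line(dot_file, fmt, out_file):
    """Format one job as a line: "input" -Tfmt -o "output"."""
    for path in (dot_file, out_file):
        if '"' in path or "\n" in path:
            # Escaping is unspecified in the proposal, so refuse rather than guess.
            raise ValueError(f"unsupported character in path: {path!r}")
    return f'"{dot_file}" -T{fmt} -o "{out_file}"'


def format_batch(jobs):
    """Join (dot_file, fmt, out_file) jobs into the text of a batch file."""
    return "".join(batch_line(*job) + "\n" for job in jobs)
```

Unlike the -O approach, every job carries its own output path, so no renaming is needed afterwards.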

@doxygen
Owner

doxygen commented Mar 29, 2026

@Smattr I've created a simple proof-of-concept patch that adds the new option. It diffs against Graphviz main (added the -B option only for dot, it doesn't appear in dot -h yet):
graphviz_dot_batch_option.patch

@vtjnash
Contributor Author

vtjnash commented Mar 29, 2026

Ideally, I would like to be able to pass a file, where each line represents an invocation of the dot tool, and dot will then act as if it is called multiple times with different parameters. Here is an example of such file batch.txt

My optimal request above was that this should read from stdin, not a file. Currently it only expects sequential graph {} and other similar dot files concatenated together. In between those, it would be helpful to be able to set all the options for the next graph. The options would be initialized from the command line itself and carry over until the next setting, such as:

dot -Tpng -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.png"
graph {}
dot "/tmp/output/html/test.dot_da268e64a51e9a0e393f9cefd6bb4b81.dot" -Tcmapx -o "/tmp/output/html/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.map"
dot -Tpdf -o "/tmp/output/latex/dot_test.dot_da268e64a51e9a0e393f9cefd6bb4b81.pdf"
graph {}

@doxygen
Owner

doxygen commented Mar 31, 2026

I think it is easy to combine both requests, by allowing e.g. -B- or -Bstdin to mean reading from stdin. Actual reading can then be done in the same way.

@vtjnash
Contributor Author

vtjnash commented Mar 31, 2026

I forgot one other detail, which is that the streaming mode needs a way to indicate when each is finished and dot is ready to handle the next job. A simple byte marker (NUL) between outputs would suffice.

@Smattr

Smattr commented Apr 1, 2026

Seems this feature request has already expanded in scope :)

Please also bump the version of dot

You mean major version? We would usually only bump minor version for an additive feature like this.

I've created a simple proof-of-concept patch that adds the new option…

I would favour the format being NUL-byte separated so the Graphviz parsing logic can be simplified. The caller (Doxygen) can also more easily produce such a thing with a programmatic equivalent of find … -print0 without having to worry about escaping rules.

In between those, it would be helpful to be able to set all the options for the next graph

This will unfortunately be significantly more complicated. Graphviz uses a lot of globals, both in obvious places and in less obvious places. Tracking these down and resetting/updating them in between graph runs is going to be a heavy lift. It was just never designed with this execution model in mind.

Taking a batch file that is simply a list of further argv elements is not too hard. This more ambitious “multi-job” list is going to be trickier.

I forgot one other detail, which is that the streaming mode needs a way to indicate when each is finished and dot is ready to handle the next job. A simple byte marker (NUL) between outputs would suffice.

TBH I don’t know why Graphviz doesn’t do this already. How does anyone delimit multiple outputs on stdout today? Though admittedly this kind of separation logic would only work for text-based output formats.

@Smattr

Smattr commented Apr 1, 2026

BTW why is this a Graphviz-specific issue? I realise for Doxygen’s concrete case it is. But these Windows limits must plague others too. My naive thought is that one could implement a generic run-this binary that reads an argv list from stdin and then execve-s that. You CreateProcess this run-this, pass in your actual desired command on stdin, then onwards to great profit. Thus you bypass the CreateProcess limits, unless this limit also applies to execve? I tried searching for something authoritative but all I discovered was that Microsoft themselves even implement per-program specialised workarounds to this problem.

@vtjnash
Contributor Author

vtjnash commented Apr 1, 2026

Windows doesn't have execve, it only has CreateProcess. This limit is mandated in the posix standard, since the memory needs to be copied out of the old process before replacing it with the new one, it doesn't like to do that too much during execve.

If all that mattered was posix performance, I'd say just use posix_spawn and eat the extra bit of overhead. Unfortunately, CreateProcess, despite being much more limited than posix_spawn, is also very much slower due to virus scanners and other similar design decisions at Microsoft.

@doxygen
Owner

doxygen commented Apr 1, 2026

Seems this feature request has already expanded in scope :)

Please also bump the version of dot

You mean major version? We would usually only bump minor version for an additive feature like this.

No, only the minor version, just something we can check against to tell if the feature works for the installed version of dot or not (it can take a long time before distros update their tool versions).

I've created a simple proof-of-concept patch that adds the new option…

I would favour the format being NUL-byte separated so the Graphviz parsing logic can be simplified. The caller (Doxygen) can also more easily produce such a thing with a programmatic equivalent of find … -print0 without having to worry about escaping rules.

That's fine for Doxygen's purpose, but if an end user wants to create such a file for some reason it may be a bit more involved to create in an editor and then use e.g. tr '\n' '\0' < batch.txt > batch0.txt to get it in the correct format.
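For a program generating the list, NUL-terminated arguments are straightforward. The following Python sketch shows the encoding and its inverse; the function names are hypothetical and the exact format dot would accept has not been decided in this thread.

```python
def encode_nul_args(args):
    """Encode arguments as UTF-8, each terminated by a NUL byte (like find -print0)."""
    for arg in args:
        if "\0" in arg:
            raise ValueError("arguments cannot contain NUL")
    return b"".join(arg.encode("utf-8") + b"\0" for arg in args)


def decode_nul_args(data):
    """Inverse of encode_nul_args: split on NUL and drop the final empty piece."""
    parts = data.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode("utf-8") for part in parts]
```

Because NUL cannot appear in a path or argument, spaces, quotes and newlines need no escaping, which is the simplification being argued for here.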

In between those, it would be helpful to be able to set all the options for the next graph

This will unfortunately be significantly more complicated. Graphviz uses a lot of globals, both in obvious places and in less obvious places. Tracking these down and resetting/updating them in between graph runs is going to be a heavy lift. It was just never designed with this execution model in mind.

With the proof of concept I made, it just loops at the highest level. I hope this is sufficient to (re)set the globals used. Seemed to work fine.

Taking a batch file that is simply a list of further argv elements is not too hard. This more ambitious “multi-job” list is going to be trickier.

I prefer a simple solution over no solution of course, but ideally it should support multiple jobs. Otherwise, we still need to launch dot for each set of options and use the -O option which complicates things, as there is no control over the name of the output files.

I forgot one other detail, which is that the streaming mode needs a way to indicate when each is finished and dot is ready to handle the next job. A simple byte marker (NUL) between outputs would suffice.

TBH I don’t know why Graphviz doesn’t do this already. How does anyone delimit multiple outputs on stdout today? Though admittedly this kind of separation logic would only work for text-based output formats.

I think this is a different operating model, where inputs can be streamed into dot's stdin via some format (with support for options and dot files) and output is then also streamed to stdout in a structured way (with file names, possible error messages, and output binary files). I would consider this a different request. At least not needed for what I had in mind. This could also be created as a new/separate "dot-server" tool that uses graphviz as a library. I'd like to avoid shipping this kind of tool with Doxygen though, because it introduces a build time Graphviz package dependency, and the licenses are not compatible.

@Smattr

Smattr commented Apr 2, 2026

I hope this is sufficient to (re)set the globals used. Seemed to work fine.

Sure, it’ll probably work for some narrow set of cases. Though even the history of this PR tripped over a scenario where this doesn’t/didn’t work. My point is that this is not a well-used/-tested path, so there are likely a bunch of known and unknown bugs.

Having said that, my vague gestures towards bogeymen aren’t particularly satisfying. I don’t want to stymie progress here. If you’re willing to accept we’re driving off road here and might hit a few trees, we can use Doxygen as a pilot to drive this Graphviz feature’s development. I suggest this should involve some seat belts on your side in the form of (a) a way to run batch and single runs, then diff the results and (b) a way for users to opt out of the batch mode if/when bugs are hit.

And just to set expectations, this is all just my personal stance. I have not polled the other Graphviz maintainers for their opinion on this topic. If we really want to do this, the next step would be filing a Graphviz issue to get buy-in from the others.

facsimiles-push Bot pushed a commit to facsimiles/graphviz that referenced this pull request Apr 3, 2026
When rendering multiple jobs, one after the other, paging would be reinitialized
but this logic omitted initializing the current page number. As far as I can
tell, this is simply because `init_job_pagination` never anticipated that the
current page number could be non-zero when it was called.

This is described as a regression in 2.40.0 because the visible side effects of
this were inadvertently introduced by 6fec15b.
But the change introducing this was actually fixing a bug, so it is debatable
whether this bug should really be considered as being introduced then.

Github: doxygen/doxygen#12037 (comment)
Gitlab: fixes #2827
Reported-by: Jameson Nash
@doxygen
Owner

doxygen commented Apr 3, 2026

@Smattr I think it is a good idea to go ahead and file an issue for this request, so we know if there is buy-in.
If not we probably will have to continue with a suboptimal -O based solution. Can you do this or should I?

I'd expect that similar issues will already be present when using the existing -O option (as found for the issue you referenced), so using that as an alternative is also not necessarily much safer.

In any case, I will make sure there is a fallback for the users in case there are unforeseen issues.

@Smattr

Smattr commented Apr 4, 2026

Done: https://gitlab.com/graphviz/graphviz/-/work_items/2831

Also for reference for anyone writing version gates, the fix to the page count issue went into Graphviz commit b620cbbb8e40dacebc19ce494115f64e17b87804 which, if all goes to plan, should land in Graphviz 14.1.5.
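A version gate along these lines could be sketched as below, in Python for illustration. It assumes the banner printed by `dot -V` contains the text `version X.Y.Z` (as the SVG header quoted earlier in this thread does), and it treats 14.1.5 as the first fixed release, which per the comment above is only the plan.

```python
import re

# First release expected to contain the page-number fix, per the comment above.
FIXED_IN = (14, 1, 5)


def parse_dot_version(banner):
    """Extract (major, minor, patch) from a banner such as 'graphviz version 2.43.0 (0)'."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(group) for group in match.groups()) if match else None


def has_page_id_fix(banner):
    """True if the reported version is at least FIXED_IN; False if unknown."""
    version = parse_dot_version(banner)
    return version is not None and version >= FIXED_IN
```

Returning False for an unparseable banner keeps the workaround enabled whenever the version cannot be determined.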

@Smattr

Smattr commented Apr 8, 2026

Does this new batch mode have to be implemented as a new flag to dot? I ask because dot is a pretty trivial program, ~50 non-comment lines of C. All the heavy lifting lives in shared libraries. One could implement a new dot_batch_mode binary in a similar number of lines that reads its args from stdin and runs in a loop, expecting multiple jobs.

@doxygen
Owner

doxygen commented Apr 8, 2026

Does this new batch mode have to be implemented as a new flag to dot? I ask because dot is a pretty trivial program, ~50 non-comment lines of C. All the heavy lifting lives in shared libraries. One could implement a new dot_batch_mode binary in a similar number of lines that reads its args from stdin and runs in a loop, expecting multiple jobs.

That is certainly also an option as far as I'm concerned. Knowing a little bit how things are structured in GraphViz, I can imagine this makes sense. For Doxygen, one can already configure the path to the dot executable, so if this is pointing to a differently named executable, it can be used as a way to know that batch mode can/should be used.

vtjnash added a commit to vtjnash/llvm-project that referenced this pull request Apr 10, 2026
…ygen

Previously this was likely 1.9.8, with the Ubuntu 24.04 worker. Now this
is 1.17.0-dev.

Fixes 3 significant issues for LLVM:

- `dot` execution performance is very slow (cuts this half hour step
  down to mere seconds).
  doxygen/doxygen#12037
- multi-thread performance is very slow (worse than single threading),
  and now uses all cores for ncpu times speedup (when using version with
  fix, autodetected by cmake).
  doxygen/doxygen#12027
- file links for IR.cpp and similar files were wrong
  doxygen/doxygen#11944

Assisted-by: Claude Code
@Smattr

Smattr commented Apr 19, 2026

I’m having a little trouble rearticulating the motivation for this Graphviz batch mode. In the era of pervasive multicore machines, it seems preferable to me to parallelise individual dot invocations instead of cramming everything into a single serial run. Even if you want to parallelise but batch within each thread, it seems like you can’t accurately predict the runtime of individual renders so you’ll end up waiting on a long tail of one thread running jobs that could otherwise have been parallelised. Thoughts?

@vtjnash
Contributor Author

vtjnash commented Apr 20, 2026

In the traditional unix design with fork, process-spawning performance is roughly inversely proportional to core count (since the work to spawn a process increases with the thread count, while the cost of doing that work increases with core count). So doing this work from multiple threads gets slower as more threads get added. One-shotting all the graphs in one dot process on one core is actually measurably a lot faster than using all cores because of the overhead of each dot process, compounded by the overhead of each graphviz thread.

@Smattr

Smattr commented Apr 20, 2026

Sorry, I think I’m being dense and still not quite getting this. That reply unfortunately left me even more confused. Specifically:

  • Why does the cost of the work increase with core count? I follow your remark about the cost of forking increasing with thread count, but I don’t follow why the cost of the work the threads are doing increases with core count.
  • Why are we talking about fork? I thought Windows was the problem child here and Windows AIUI does not have fork.
  • Why does running multiple subprocesses imply fork and/or multithreading in the parent process? AFAIK you can CreateProcess/posix_spawn multiple subprocesses from a single thread. Monitoring them requires some kind of polling/select, but you’re not forced into creating multiple parent threads.
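The last point can be illustrated with a short Python sketch that launches several children from one thread. `run_all` is a hypothetical helper; for simplicity it waits on the oldest child first instead of polling for whichever finishes first, which a real scheduler would do.

```python
import subprocess


def run_all(commands, max_parallel):
    """Run commands from a single thread with at most max_parallel live children.

    Returns the exit codes in launch order.
    """
    pending = list(commands)
    running = []
    exit_codes = []
    while pending or running:
        # Top up the pool of running children without creating any threads.
        while pending and len(running) < max_parallel:
            running.append(subprocess.Popen(pending.pop(0)))
        # Block on the oldest child; the others keep running meanwhile.
        exit_codes.append(running.pop(0).wait())
    return exit_codes
```

No parent-side threads are involved: the parallelism comes entirely from the child processes.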

@vtjnash
Contributor Author

vtjnash commented Apr 20, 2026

Ah, sorry, I was answering the wrong question.

When forking, the processor must enumerate all the other threads to copy their memory. This involves taking a lock, and lock performance scales negatively with the number of threads, and locks are more costly with more cores because of the coherency protocol.

The problem here is actually unix, because fork is so slow it greatly dominates the time of generating all the plots. But if it was only unix at fault here, this could use posix_spawn and do much better. But Windows is also slow here, so the idea of batch mode is to fix all platforms.

It doesn't and shouldn't, but that is how it was implemented in doxygen presently. That could be fixed too. I've proposed that could be done as a followup, but thought it wouldn't be necessary to do in this PR.

Lastly, to your other point of waiting for unbalanced work in batch mode: my proposed enhancement for load-balancing was originally to add a streaming mode to dot/doxygen (ala #12028), not a batch mode. But the work being done by doxygen appears to be a very large amount of very small work typically, so statically-scheduled batch mode (and even single-thread mode) is pretty close to equivalent.

@Smattr

Smattr commented Apr 20, 2026

Ah OK. So the problem here really is Windows, because it sounds like using posix_spawn on other platforms is the right solution/optimisation independent of any of the other concerns.

Since you understand the requirements better than me/us and we have not encountered other Graphviz users with this use case, do you want to develop this batcher/streamer within Doxygen? We could then look at upstreaming it into Graphviz.

Re comments on #12028:

I also tried to do this in-process, but dot turns out to be not safe to use with threads.

Yes, this is a long standing problem. We have https://gitlab.com/graphviz/graphviz/-/work_items/2558 but we’re literally years away from resolving that.

Doxygen could make an almost-like-dot tool that it uses, though that may be an increasingly legal gray area, as far as the compatibility of the copyleft licenses for each project.

From the perspective of the Graphviz maintainers, I don’t think we have any qualms with this. AT&T are the copyright holders, so it would be them if anyone taking issue with this. I don’t think they have any active interest in Graphviz these days. (with the proviso IANAL)

@doxygen
Owner

doxygen commented Apr 27, 2026

@vtjnash I've merged this PR (see 08e95ca).

I've added threading support back in and made the batch size a configuration option (DOT_BATCH_SIZE).
When running it on Doxygen's own source code with call graphs enabled (15000+ dot files), the difference of using multiple threads is huge. And the performance gain by using batch mode rather than single graphs is still about 50%, so also significant.
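The combination of worker threads and a fixed batch size can be sketched as follows. This is a Python illustration only, not the merged C++ code; `chunk`, `run_batches` and the `run_batch` callback are hypothetical names, and the two numeric parameters stand in for DOT_BATCH_SIZE and DOT_NUM_THREADS.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, batch_size):
    """Split items into consecutive batches of at most batch_size elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def run_batches(files, batch_size, num_threads, run_batch):
    """Hand each batch to run_batch on a pool of num_threads workers.

    Results are returned in batch order. run_batch would typically launch
    one dot process for the whole batch.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(run_batch, chunk(files, batch_size)))
```

Smaller batches give the pool more units to balance across threads, while larger batches amortize more process start-up cost, which is presumably why the batch size is worth exposing as a setting.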

@Smattr For now I limit the path size to a safe value.

Somehow Github doesn't recognize it as being merged.

@albert-github albert-github added the fixed but not released Bug is fixed in github, but still needs to make its way to an official release label Apr 27, 2026
@albert-github
Collaborator

Might be that GitHub doesn't want to close as it still has some conflicts?

As it is merged I think this PR can be closed.

@vtjnash
Contributor Author

vtjnash commented Apr 30, 2026

Awesome, thanks!

@doxygen doxygen removed the fixed but not released Bug is fixed in github, but still needs to make its way to an official release label Apr 30, 2026

4 participants