-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix puml plugin spawning zombie jvm processes #1245
Conversation
cb42470
to
a4d7995
Compare
@ang-zeyu does this fix work? |
@damithc it should be good to go. A caveat though: Originally I changed the puml process execution to synchronous but this slows build times tremendously (~20%) since it's blocking. This 1 line solution fixes it as well and maintains build times, but essentially what it does is postpone all puml diagram generation to the end. Meaning you may still experience a 1-2s freeze at the end (instead of freezing throughout the build). Not too sure which is better 😮 |
Have you pushed this 1-line solution? The diff I see is bigger than one line. |
BTW, I suspect the original version spawned DxR number of JVMs where D is the number of diagrams and R is the number of times they are reused. 10x3=30. Imagine I add 20 more diagrams; that's going to be 90 JVMs! I wonder if it is possible to limit the number of threads i.e., set a max thread pool size. |
@alyip98 did you come across this problem when you were implementing this feature? |
yup its the current ^; it's the
Noticed this as well, it happens as we are generating "one png per file location it is included at" instead of "one png per puml definition". Now that you mention it, I'll try including it here as well since it's rather related. |
a4d7995
to
7c9f2ff
Compare
@damithc could you test this out? It should
I'm not too sure if there was a reason for generating diagrams per use instead of per definition as well though @alyip98 |
Tried it out. There is a huge improvement. No more than 4 JVMs were active at any time and near-100% CPU usage lasted only 1-2 seconds. The total time seems to have improved too (it's now about 250 seconds, used to take 300-600 depending on the amount of thrashing), possibly due to less thrashing. Nice fix @ang-zeyu 💯 |
7c9f2ff
to
2b142a3
Compare
realised this should fix #1237 as well, modified the tests a little to make sure of it. |
2b142a3
to
9fbc1bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little hesitant about using the event loop and setImmediate
to defer the puml image generation to the end of the build process. Depending on how other callbacks are handled in the poll queue, the behaviour might not be as we expect (please correct me if I'm wrong here, it's been a while since I had to look at the node event loop!).
Do you think we can make deferring the image generation more explicit by moving it to postRender
? Hopefully it achieves the same effect, since postRender
callbacks are executed after the expensive render step is done (although not at the end of the build process, as the current implementation seems to do).
processedDiagrams.add(outputFilePath); | ||
function generateDiagram(imageOutputPath, content) { | ||
// Avoid generating twice | ||
if (processedDiagrams.has(imageOutputPath)) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, but shall we have braces around the body of the if
statement for consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little hesitant about using the event loop and setImmediate to defer the puml image generation to the end of the build process. Depending on how other callbacks are handled in the poll queue, the behaviour might not be as we expect (please correct me if I'm wrong here, it's been a while since I had to look at the node event loop!).
I think we should be good here 😬, https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/, setImmediate
is executed in the check
phase, only after all callbacks (ie. all of our page.generate()
s) have resolved.
Moving part of it to postRender
works a well, but to a lesser extent as you mentioned. It also involves some extra parsing work as we can only move part and not the whole plugin to postRender
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ideal solution would to be find some way to let prioritise the resource cleanup callbacks (im guessing that's how node does it) after the puml processes have finished. There dosen't seem to be an interface for this after hours of searching though, which makes sense since node's event loop wasn't really built for continuous execution of a massive number of callbacks.
An alternative solution, is that if and when we manage to parallelise page generation, we could change the execution back to synchronous. This way we have a good amount of thread usage still 🎉
e2f153e
to
26324d9
Compare
Resolved this, and an import path conflict with the directory structure pr that was causing ci to fail:
|
26324d9
to
52f49f7
Compare
Gave it some more thought (thanks for the comment on This should also ensure, as opposed to the previous 2 solutions:
Realised that this wasn't really true either (notice the amount of jvms popping up and going away in the gif), it only appeared so as the previous solution generated everything at once (almost).
|
I tried this version. Definitely better than the |
It's rather expected as the I pushed a simple hotfix, that is to basically check whether the output file already exists. Works fine on my machine, could you give it a try?
Thanks! Fixed. Was simply an issue with trying to render |
Much better. Only one CPU peak (very brief) and finished in 260s (previous version took 350!). A very noticeable improvement. |
reverted it, realised using a file check breaks live reload unfortunately, since files from previous runs aren't deleted. @marvinchin let me know if there are any other issues here! |
Just now I was wondering why the diagram is not getting updated in live preview :-) |
🙃 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me! Suggested adding a comment to make it easier to understand how the throttling works.
reject(new Error(`Error while generating ${page.sourcePath}`)); | ||
})); | ||
|
||
pageGenerationQueue.splice(0, MAX_CONCURRENT_PAGE_GENERATION_PROMISES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this comment would make how we handle the max concurrency clearer?
// Take the first MAX_CONCURRENT_PAGE_GENERATION_PROMISES callbacks and execute them. When a page generation callback completes, it pops the next unprocessed callback off pageGenerationQueue and executes it.
Asynchronously executing the puml child process results in zombie jvm processes as there are multiple callbacks attached to it and page generation is heavily callback based. Let's break page generation into chunks, which allows node to clean up the zombie processes in a timely fashion. Let's also change the generated image's path to resolve from the location of it's definition (ie. the file where the <puml> tag is used), to reduce the amount of jvms spawned through including the same file multiple times.
698af19
to
da28a31
Compare
updated and squashed, thanks again for looking at this @marvinchin! 🙂 |
Hi @ang-zeyu, I am working on fixing something related to plantUML checking errors before image generation (to prevent generating images that basically contain the error message). I realized that I could possibly spin up only one JVM process after collecting all diagram codes and then pipe it through that java process to generate the images before the site build is done (might need to handle live preview differently), wonder if you still have the context on this PR and if this approach sounds promising? I suspect if this works then it will reduce the overhead of spinning up and shutting down JVM. |
Sounds interesting! @tlylt
If I still remember the implementation details this should be fine (if spinning up the JVM at the right place). Just note that we might prefer to keep the implementation strictly isolated as a plugin, but this wouldn't be possible right now with what's supported here. https://markbind.org/devdocs/devGuide/writingPlugins.html
(should be irrelevant to your solution though) We were previously throwing N page generation callbacks into the event loop. This isn't an issue most of the time but for a large site like the 2103 one (700+ pages), the many JVMs spawned midway don't get to close due to way We've since then "staggered" the page generation callbacks (throw a fixed K into the loop at a time until all N pages are done), so the JVMs can close even midway. |
@tlylt I think one issue might be the memory cost (which was the main issue here with zombie processes) of generating all images at once with one jvm. |
What is the purpose of this pull request? (put "X" next to an item, remove the rest)
• [x] Bug fix
Fixes #1243(partially)Fixes #1237
What is the rationale for this request?
Fix puml plugin spawning many zombie jvm processes
Proposed commit message: (wrap lines at 72 characters)
Fix puml plugin spawning zombie jvm processes
Asynchronously executing the puml child process results in zombie jvm
processes as there are multiple callbacks attached to it and page
generation is heavily callback based.
Let's wrap the plantUML diagram generation in a setImmediate call,
delaying said generation to after page generation has completed.
This resolves the zombie processes present during page generation,
which may cause persistent freezes on a user's machine, while
preserving the non-blocking behaviour of puml diagram generation.