address RSS growth when there are excessive annotation events #6115
Conversation
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## master #6115 +/- ##
=======================================
Coverage 83.35% 83.35%
=======================================
Files 521 521
Lines 84458 84464 +6
=======================================
+ Hits 70401 70407 +6
Misses 14057 14057
FYI - I ran 1024 jobs on this branch with heap profiling enabled. This was without the repro for #6114, so just one annotation per job. I captured the diff between before running any jobs and after all jobs were inactive. Most of this seems to make sense, but I'm posting it here for your information: according to the heap profile, 1024 jobs result in ~47M of heap, mostly from storing the job information in memory, it appears.
Problem: schedutil_alloc_respond_pack() has a va_start() without a corresponding va_end(). Add the va_end().
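For reference, a minimal sketch of the pattern the fix restores (illustrative standard C, not the actual libschedutil code): every va_start() must be balanced by a va_end() before the function returns.

```c
#include <stdarg.h>
#include <stdio.h>

/* Illustrative variadic wrapper: the only point is that va_start()
 * is paired with va_end() on the return path.
 */
static int respond_pack_example (const char *fmt, ...)
{
    va_list ap;
    int rc;

    va_start (ap, fmt);
    rc = vprintf (fmt, ap);  /* stand-in for the real pack/respond call */
    va_end (ap);             /* the call that was missing */
    return rc;
}
```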
Problem: events posted with the EVENT_NO_COMMIT flag are kept in job->eventlog, contributing to memory growth. These events are not important and need not be kept. This does mean that journal consumers like job-list will not see those events when they are reloaded and the journal is replayed, but since the idea is that these events are ephemeral, that is probably OK. Nothing is lost when the instance restarts since these events were never stored in the KVS. Fix up one test that was expecting job list to recreate annotations after a reload.
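A hedged sketch of the idea, using hypothetical names (the real job-manager data structures and append path differ): an event carrying the no-commit flag is simply never added to the in-memory eventlog that feeds journal replay.

```c
#include <jansson.h>

/* Flag value and struct layout are illustrative, not flux-core's. */
#define EVENT_NO_COMMIT 0x1

struct job {
    json_t *eventlog;   /* events retained for journal replay */
};

static int event_append (struct job *job, json_t *event, int flags)
{
    if (flags & EVENT_NO_COMMIT)
        return 0;       /* ephemeral event: do not retain in memory */
    return json_array_append (job->eventlog, event);
}
```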
Problem: when the scheduler was malfunctioning and spamming the job manager with annotations, we had no tools for diagnosis. Count events that are posted to the journal and include this count in the 'flux module stats job-manager' output. A rapidly increasing event count would have been a helpful clue in the above case.
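As a rough sketch of how such a counter might be surfaced (the field name and handler wiring are assumptions, not the actual job-manager code), the module bumps a counter each time an event is posted to the journal and reports it from its stats callback:

```c
#include <flux/core.h>

static int journal_event_count;     /* events posted since module load */

/* Called wherever an event is appended to the journal. */
static void journal_event_posted (void)
{
    journal_event_count++;
}

/* Stats callback behind 'flux module stats job-manager'
 * (handler registration omitted; "journal"/"events" is a made-up field).
 */
static void stats_cb (flux_t *h, flux_msg_handler_t *mh,
                      const flux_msg_t *msg, void *arg)
{
    if (flux_respond_pack (h, msg,
                           "{s:{s:i}}",
                           "journal",
                             "events", journal_event_count) < 0)
        flux_log_error (h, "error responding to stats request");
}
```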
Problem: when handling a successful alloc callback from the scheduler, if the KVS commit of R fails (unlikely), the rank 0 broker may crash with a double free. In the KVS commit continuation, take the future off the global cleanup list before checking for KVS error, not after, because the future is destroyed on all of this function's return paths. Since the alloc context is in the future aux container with alloc_destroy() registered as a destructor, and the future is destroyed on the function's error path, drop the additional call to alloc_destroy() on the error path.
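A sketch of the corrected ordering in the commit continuation; cleanup_list_remove() is a hypothetical stand-in for the global-list bookkeeping, and only the ordering and the single destroy are the point:

```c
#include <flux/core.h>

void cleanup_list_remove (flux_future_t *f);   /* hypothetical helper */

static void commit_continuation (flux_future_t *f, void *arg)
{
    flux_t *h = flux_future_get_flux (f);

    /* Remove the future from the global cleanup list before any error
     * handling, since it is destroyed on every return path below.
     */
    cleanup_list_remove (f);

    if (flux_future_get (f, NULL) < 0) {
        /* The alloc context lives in the future's aux container with
         * alloc_destroy() as its destructor, so no explicit call here.
         */
        flux_log_error (h, "KVS commit of R failed");
    }
    flux_future_destroy (f);    /* sole destruction point */
}
```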
I tacked on a commit that fixes a bug on an unlikely error path in libschedutil, but haven't had any other insights into the current heap behavior. I'm dropping the WIP because I think this addresses the major problem and could go in without further ado.
Great work, LGTM!
Thanks! Setting MWP.
Progress towards making flux core more robust in situations like #6114. Marking WIP because there's still some unexplained RSS growth with "annotation spam" to run down.