Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libsubprocess: reduce remote output prep/check #5932

Merged

Conversation

chu11
Copy link
Member

@chu11 chu11 commented May 3, 2024

Problem: Profiling shows that a significant amount of time can be spent in the prep/check of remote subprocess output. This is even in the case when the output buffer is empty.

Enable the remote output prep/check only when the buffer is non-empty.


Side note, after trying and flailing on the "input" side of this, I realized that a bunch of code in libsubprocess's remote code simply assumes the prep/check for input is always running. So that work requires more thought. Posting this so we can atleast get this in for the next release if I can't figure out the input side by then.

@garlick
Copy link
Member

garlick commented May 3, 2024

Is there a reason remote subprocesses do not use fbuf watchers? We use them for local subprocess I/O. Maybe all this is already solved there? (I didn't look)

@chu11
Copy link
Member Author

chu11 commented May 3, 2024

Is there a reason remote subprocesses do not use fbuf watchers? We use them for local subprocess I/O. Maybe all this is already solved there? (I didn't look)

I can't remember why.

Is your idea that we use fbuf watchers and have those "drive" the callbacks to the user vs the prep/check we have setup? My only thought is if it's a huge win, since there's a prep/check in the watcher too. Although, I suppose the watcher already has its "is buffer empty" logic taken care of?

Lemme try and figure out the "input" side of this as well, b/c that's trickier. Perhaps the truth of why it's non-watcher based is in there.

@garlick
Copy link
Member

garlick commented May 3, 2024

Is your idea that we use fbuf watchers and have those "drive" the callbacks to the user vs the prep/check we have setup?

Yes.

@chu11
Copy link
Member Author

chu11 commented May 3, 2024

I was finally able to have the remote input stream (i.e. "write_buffer" in the code) work by enabling prep/check only when needed. There were basically some racy bits with the remote server regarding sending data (i.e. like stdin) before the remote process started and some racy bits with when the remote process has exited.

It's possible this is why the prep/check were enabled all the time, just to make this logic simpler.

given this info, will continue conversation in #5919 regarding potential paths forward

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this and seems to be working!

@chu11
Copy link
Member Author

chu11 commented May 7, 2024

oh whoops, sorry you reviewed. Given discussion in #5919 I thought we were going to table this approach, go with the FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF approach.

@grondo
Copy link
Contributor

grondo commented May 7, 2024

I probably lost track of the discussion, but it would be nice to reduce the high load on the broker due to the current situation. Would this work as an interim mitigation? (guess we could have input from @garlick)

@chu11
Copy link
Member Author

chu11 commented May 7, 2024

Would this work as an interim mitigation?

Yes, I do believe it would work as a short term mitigation.

@grondo
Copy link
Contributor

grondo commented May 7, 2024

Let's go ahead and merge then if @garlick approves?

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

chu11 added 2 commits May 7, 2024 16:23
Problem: A number of indentations in are incorrect, likely due to a
search and replace that changed a function name.

Fix the invalid indentations.  Note that in one case the proper indentation
would exceed 80 chars, necessitating special formatting.
Problem: Profiling shows that a significant amount of time
can be spent in the prep/check of remote subprocess output.
This is even in the case when the output buffer is empty.

Enable the remote output prep/check only when the buffer is
non-empty.
@grondo grondo force-pushed the issue5919_subprocess_remote_output_check branch from 531c1f9 to 51f7b8f Compare May 7, 2024 23:23
@grondo
Copy link
Contributor

grondo commented May 7, 2024

I set MWP here.

Copy link

codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.30%. Comparing base (6733a10) to head (51f7b8f).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5932      +/-   ##
==========================================
- Coverage   83.32%   83.30%   -0.02%     
==========================================
  Files         515      515              
  Lines       83351    83356       +5     
==========================================
- Hits        69452    69440      -12     
- Misses      13899    13916      +17     
Files Coverage Δ
src/common/libsubprocess/remote.c 77.56% <100.00%> (+0.36%) ⬆️
src/common/libsubprocess/subprocess.c 88.09% <100.00%> (ø)

... and 9 files with indirect coverage changes

@mergify mergify bot merged commit 4014964 into flux-framework:master May 7, 2024
35 checks passed
@chu11 chu11 deleted the issue5919_subprocess_remote_output_check branch May 8, 2024 02:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants