Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix handling of Ctrl-SPACE (NUL) over job ptys (e.g. under flux job attach) #5833

Merged
merged 3 commits into from Mar 26, 2024

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Mar 26, 2024

This PR fixes handling of NULs in pty data handling. Jansson by default will not unpack NUL unless JSON_ALLOW_NUL is used, so the read side of the pty was just dropping them.

As part of developing a test, an issue with echo something | flux run -o pty.interactive ... was discovered, which is also fixed here.

Fixes #5832

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@chu11 chu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just a few minor comments

Comment on lines 658 to 668
if (flux_msg_unpack (msg, "o", &o) < 0
|| json_unpack_ex (o,
NULL,
JSON_ALLOW_NUL,
"{s:s% s?s}",
"data", &data, &len,
"encoding", &encoding) < 0) {
errno = EPROTO;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible flux_msg_unpack() could return errno != EPROTO, like ENOSYS or something? Dunno if want to split the if checks up.

Copy link
Contributor Author

@grondo grondo Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the server side, so ENOSYS not possible.

Edit: Actually you're correct, this could be called on either side. I'll split this one up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I split those up. Good suggestion.

Comment on lines 131 to 132
# Note: test below uses printf(1) to send [Ctrl-SPACE (NUL), newline, Ctrl-D]
# over the pty connection and expects ^@ to appear in output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the uninitiated, mention "^@" (NUL)?

Comment on lines +274 to 276
if (write (STDOUT_FILENO, data, len) < 0) {
llog_error (c, "write %zu bytes: %s", len, strerror (errno));
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know this code that well and this change looks odd in that there isn't a change on the other side of this. I suppose the other side is listening to all stdio and that internal to "data" it has encoded if this is stdin/out/err or whatever?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean about the other side of this? This is a tty, there is no stdin stdout stderr, or I should say all of them point to the same file descriptor. STDIN_FILENO was just used as a convenience before I think. Probably it should have been stdout all along, I'm not sure.

I'm not actually sure I fully understand how STDIN_FILENO gets closed in the reproducer, but this seemed like a valid change given the above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean about the other side of this?

I was meaning the server side.

This is a tty, there is no stdin stdout stderr, or I should say all of them point to the same file descriptor.

That's what I assumed, otherwise this wouldn't make sense :-) Not sure if a comment is worthwhile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, on the server side there's no possibility of redirection like in the client

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe a comment would have been a good idea

Copy link

codecov bot commented Mar 26, 2024

Codecov Report

Merging #5833 (f4fe981) into master (1cc4f17) will decrease coverage by 0.02%.
The diff coverage is 87.50%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5833      +/-   ##
==========================================
- Coverage   83.31%   83.30%   -0.02%     
==========================================
  Files         510      510              
  Lines       82828    82830       +2     
==========================================
- Hits        69012    69002      -10     
- Misses      13816    13828      +12     
Files Coverage Δ
src/common/libterminus/pty.c 87.80% <100.00%> (+0.05%) ⬆️
src/common/libterminus/client.c 82.81% <50.00%> (ø)

... and 11 files with indirect coverage changes

Problem: The NUL character, generated by Ctrl-SPACE, Ctrl-2, etc, is
dropped by the libterminus pty server, making this sequence impossible
to use with terminal programs running under Flux establish ptys such
as under flux-alloc(1).

Use json_unpack_ex(3) instead of flux_msg_unpack(3) so that the
JSON_ALLOW_NUL flag can be used to avoid ignoring a NUL character
encoded by the client. Also simplify the code by using size_t for
the unpacked length so the variable doesn't have to be reassigned.

Fixes flux-framework#5832
Problem: libterminus writes client pty input to STDIN_FILENO, but this
can fail if stdin becomes closed, e.g. if a user does something like

  echo foo | flux run -o pty.interactive ...

Write to STDOUT_FILENO instead, which is unlikely to be closed.

Also, update a misleading error message in this case.
Problem: There's no test that ensures NUL can be transmitted over
a pty allocated for a job.

Add a test to t2612-job-shell-pty.t that sends a NUL as input to a
job with a pty and ensures it is received by the job task.
@grondo
Copy link
Contributor Author

grondo commented Mar 26, 2024

Thanks! Setting MWP.

@mergify mergify bot merged commit 93a6dfc into flux-framework:master Mar 26, 2024
33 checks passed
@grondo grondo deleted the issue#5832 branch March 26, 2024 22:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pty: flux pty code drops ctrl-<space>
3 participants