Prevent potential hang with scripts (take 3) #1893

Merged
bapt merged 2 commits into freebsd:master on Oct 30, 2020


Contributor

@cgull cgull commented Oct 29, 2020

@juikim last tried to fix this in #1843. However, we're still having problems with hangs from pkg when background processes remain and hold PKG_MSGFD open.

When I looked at the code in pkg_run_scripts(), I found several problems:

  • getline() is used to read input; this will hang if the input is not terminated with LF.
  • The code tests process termination and input availability in the wrong order. If a script starts a daemon that holds PKG_MSGFD open, and the script then exits, the code will fall into getline() and wait forever.
  • The line/linecap variables allocate the input line buffer only once, at the start of the loop; longer input lines arriving after that would not have been completely read by getline().

I reworked the code to check waitpid(), then check for input, then read any input, then either continue or terminate the loop. While the child is running, the loop waits for input and reads it; after the child exits, it reads whatever input is immediately available but no longer waits. Once waitpid() reports that the child has terminated, any leftover output from the child is already buffered in the kernel and immediately available; once we have read that, we can exit the loop. The rework also removes the use of stdio and line-oriented blocking for the input reads.
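
For readers who want the shape of that loop without opening the diff, here is a minimal sketch of the ordering described above. It is not the code from this PR: the helper name wait_and_drain(), the 100 ms poll timeout, and the fixed-size output buffer are all assumptions made for illustration, and it uses plain poll()/read() on a descriptor standing in for PKG_MSGFD.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not pkg's actual function): collect output from
 * fd until the child pid has exited and no more input is immediately
 * available.  Returns the number of bytes stored in out, or -1 on error.
 * Output beyond cap bytes is read and discarded.
 */
static ssize_t
wait_and_drain(pid_t pid, int fd, char *out, size_t cap, int *status)
{
	struct pollfd pfd;
	char buf[256];
	size_t used = 0, room, take;
	ssize_t got;
	pid_t r;
	int exited = 0, n;

	assert(fd >= 0 && out != NULL && status != NULL);
	for (;;) {
		/* 1. Check for child termination without blocking. */
		if (!exited) {
			r = waitpid(pid, status, WNOHANG);
			if (r == pid)
				exited = 1;
			else if (r == -1 && errno != EINTR)
				return (-1);
		}
		/* 2. Check for input: wait only while the child is alive. */
		pfd.fd = fd;
		pfd.events = POLLIN;
		pfd.revents = 0;
		n = poll(&pfd, 1, exited ? 0 : 100);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0) {
			/* Nothing pending; stop once the child is gone. */
			if (exited)
				break;
			continue;
		}
		/* 3. Read whatever is there; no line framing is assumed. */
		got = read(fd, buf, sizeof(buf));
		if (got == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (got == 0) {
			/* EOF: every writer closed fd; just reap the child. */
			while (!exited && waitpid(pid, status, 0) == -1)
				if (errno != EINTR)
					return (-1);
			break;
		}
		room = cap - used;
		take = (size_t)got < room ? (size_t)got : room;
		memcpy(out + used, buf, take);
		used += take;
	}
	return ((ssize_t)used);
}
```

The point of the ordering is that a background process holding the descriptor open can no longer stall the caller: once waitpid() has reported the exit, poll() is called with a zero timeout, so the loop ends as soon as the kernel buffer is empty, whether or not the last chunk ended in LF.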

One minor point: I looked at pkg_emit_msg() and the event code and saw no obvious requirement that the message be terminated by an LF. Is this correct?

I also added a test to verify this issue and the fix.

Rewrite core wait loop to consume all immediately available output,
and handle non-LF-terminated output.

Break wait loop out to separate function.  Use it in
pkg_script_run_lua() too to eliminate copypasta.

Fix some error handling in pkg_script_run().
@bapt bapt merged commit d40dbc6 into freebsd:master Oct 30, 2020
Member

bapt commented Oct 30, 2020

note there was a small style issue, but I'll fix it after the merge, thank you very much!

Contributor Author

cgull commented Nov 5, 2020

Thanks for taking this! When do you expect to branch/release 1.16?

Member

bapt commented Nov 5, 2020

asap, I am working on more test cases for triggers
