Prevent potential hang with scripts (take 3) #1893
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
@juikim last tried to fix this in #1843. However, we're still having problems with hangs from pkg when background processes remain and hold
PKG_MSGFDopen.When I looked at the code in
pkg_run_scripts(), I found several problems:getline()is used to read input, this will hang if input is not terminated with LF.PKG_MSGFDopen and exits, the code will fall intogetline()and wait forever.line/linecapvariables allocated the input line buffer only once at the start of the loop; longer lines in input after that wouldn't have been completely read bygetline().I reworked the code to check
waitpid(), then check for input, then read any input, then continue or terminate the loop. Now, it will wait and read any input until the child exits, then it will read any available input but not wait. Once the child has terminated and we know that fromwaitpid(), any leftover output from the child will be buffered in kernel and immediately available; once we have read that we can exit the loop. It also removes use of stdio and line blocking for the input reads.One minor point: I looked at
pkg_emit_msg()and the event code and I saw no obvious requirement that the message needs to be terminated by an LF. Is this correct?I also added a test to verify this issue and the fix.