Executing asynchronous command #92

Closed
9 of 10 tasks
Tracked by #51
magicant opened this issue Oct 6, 2021 · 8 comments
Labels
tracker List of subtasks

Comments

@magicant
Owner

magicant commented Oct 6, 2021

  • Basic execution of the asynchronous command (Basic semantics of asynchronous item #93)
    • In this implementation, the current shell environment will always have a single child process for each asynchronous command. Asynchronous multi-command pipelines are run by grandchild processes.
    • In this implementation, child process IDs are not saved in Env because we don't need to.
  • Nullifying the standard input of asynchronous commands when job control is off (02b34e7)
  • Ignoring SIGINT and SIGQUIT in asynchronous commands when job control is off (Ignore SIGINT and SIGQUIT in asynchronous command #247)
  • Remembering the child process ID of the last asynchronous command in Env (Expand special parameters $? and $! #95)
    • This is needed to support $!.
  • Remembering the child process ID of all jobs in Env (Manage jobs in JobSet #153)
    • This is needed to support jobs -l.
  • Forgetting old jobs when there are too many jobs to remember

Questions:

  • Should the shell perform waitpid for any child process immediately after the child's state changes, or only when the user explicitly invokes the wait or fg built-in?
    • Performing waitpid immediately reduces the number of zombie processes that occupy the process ID space, but risks the child's process ID being reused before the user waits for it.
    • Can we waitpid selectively for child processes that cannot be waited for by the user?
      • waitid with WNOWAIT?
    • An interactive job-controlling shell must report job status before each prompt. To do so, the shell needs to waitid for child processes with the following combinations of options:
      • WSTOPPED|WCONTINUED|WNOHANG without WNOWAIT: to report the latest job status as to whether it is running or stopped. With WNOWAIT, we would keep receiving the old status even after a new status becomes available.
      • WEXITED|WNOHANG|WNOWAIT: to report exited or signaled jobs. Without WNOWAIT, we would consume the process ID of an exited job and allow another process to reuse the same process ID, thereby preventing the user from specifying which job to wait for.
  • Does the shell perform waitpid(-1, ...) or waitid(P_ALL, ...) occasionally? If not, how does it reap an unknown child that was forked before the shell started?
    • Hint: the shell can do such a wait if there are no processes known in the current shell execution environment.
  • How do we handle jobs whose process ID has not been expanded by $!?
    • Hint: when such a job exits, the shell can immediately reap it by waitpid to allow the OS to reuse the process ID.
    • However, we can keep the job in the job table so it can be listed by the jobs built-in.
  • What is the strategy for removing excessive jobs from a job set? Do we need a priority queue or any other data structure to decide which job to remove?
    • We should prefer removing terminated jobs rather than removing jobs that are still alive.
    • We should prefer removing old jobs.
    • We might prefer removing jobs whose process ID has not been made known to the user by the $! expansion.

Optional todos:

  • Running each command of an asynchronous multi-command pipeline as a direct child process.
  • Remembering all the child process IDs in a multi-command pipeline.

These are needed to support the PIPESTATUS special variable. They also allow extending jobs -l to print the process IDs of all pipeline components. [1]

@magicant magicant changed the title Asynchronous command Executing asynchronous command Oct 6, 2021
@magicant
Owner Author

magicant commented Oct 6, 2021

POSIX employs the notion of process IDs known in the current shell execution environment. The relevant specifications are as follows:

Asynchronous Lists

When an element of an asynchronous list (the portion of the list ended by an <ampersand>, such as command1, above) is started by the shell, the process ID of the last command in the asynchronous list element shall become known in the current shell execution environment; [...]. This process ID shall remain known until:

  1. The command terminates and the application waits for the process ID.
  2. Another asynchronous list is invoked before "$!" (corresponding to the previous asynchronous list) is expanded in the current execution environment.

The implementation need not retain more than the {CHILD_MAX} most recent entries in its list of known process IDs in the current shell execution environment.

Description for set -b

When the shell notifies the user a job has been completed, it may remove the job's process ID from the list of those known in the current shell execution environment.

Description for bg

Using bg to place a job into the background shall cause its process ID to become "known in the current shell execution environment", as if it had been started as an asynchronous list; [...].

Description for fg

Using fg to place a job into the foreground shall remove its process ID from the list of those "known in the current shell execution environment"; [...].

Description for jobs

When jobs reports the termination status of a job, the shell shall remove its process ID from the list of those "known in the current shell execution environment"; [...].

Description for wait

If the wait utility is invoked with no operands, it shall wait until all process IDs known to the invoking shell have terminated [...].

If one or more pid operands are specified that represent known process IDs, the wait utility shall wait until all of them have terminated. If one or more pid operands are specified that represent unknown process IDs, wait shall treat them as if they were known process IDs that exited with exit status 127. [...]

The known process IDs are applicable only for invocations of wait in the current shell execution environment.


To sum up, whether a job's process ID is known in the current shell execution environment or not affects whether wait can operate on the job and return its exit status.

Note that, according to POSIX, the process ID remains unknown when a foreground job is stopped. That means you cannot wait for such a process by specifying the process ID as an operand to wait. However, none of bash, dash, ksh, mksh, and yash treats such a process as unknown.
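
As a rough illustration of the wait rules quoted above, the following script waits once for a known process ID and once for an unknown one. Using $$ (the shell's own process ID) as the unknown operand is an assumption made for this sketch: a shell is never its own child, so that ID cannot be in the list of known process IDs. The variable names are made up for the example.

```shell
# Known process ID: wait returns the exit status of the asynchronous command.
(exit 3) &
known_pid=$!
known_status=0
wait "$known_pid" || known_status=$?

# Unknown process ID: POSIX says wait treats it as a process that exited
# with status 127. $$ is used here only because it cannot be a child.
unknown_status=0
wait "$$" 2>/dev/null || unknown_status=$?

echo "known: $known_status, unknown: $unknown_status"
```

A shell that follows the quoted rules reports 3 for the known process ID and 127 for the unknown one.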

@magicant magicant added the tracker List of subtasks label Oct 6, 2021
@magicant
Owner Author

magicant commented Oct 9, 2021

To understand the process ID reuse problem, consider the following script:

exit 1&
pid1=$!
exit 2&
pid2=$!
wait $pid1
echo $?
wait $pid2
echo $?

This should print 1 followed by 2, but what if $pid1 and $pid2 happen to be the same? The two wait built-in invocations cannot tell which process to wait for. This failure may occur in the following scenario:

  1. The exit 1& command forks a child process that returns the exit status of 1.
  2. The child process exits so soon that the shell performs wait(p)id for it before executing the exit 2& command.
  3. Now, the child process ID has been returned to the operating system kernel. The kernel is free to reuse the process ID.
  4. The exit 2& command forks another child process that returns the exit status of 2.
  5. The second child process may have the same process ID as the first, depending on the kernel's arbitrary choice of the ID.

The only known reliable way to prevent such reuse is to keep the child process pending by not performing wait(p)id for the child after it terminates.

@magicant
Owner Author

The current plan is that the shell waitpids for all child processes after each command execution to update the status in Env and reap any exited processes. That means the shell will still suffer the small possibility of process ID conflicts described in the previous comment.

Rationale:

  • Zombie processes left unreaped may occupy the system's process space and prevent the shell and other processes from creating more processes, which is not desirable.
  • waitpiding only for reapable processes would make the shell implementation complicated.
  • Other shells (including yash 2) behave this way and users are not complaining about that.
  • If we really need to resolve the process ID conflict issue, it will be possible to add a shell option that allows extending the value of $! with characters that disambiguate the processes sharing the same process ID.
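
The effect of recording a child's status when it is reaped can be observed from a script. This is a sketch of observable behavior only, assuming the shell reaps the child while running the intervening foreground command (the sleep is just a hypothetical way to give it that opportunity). It says nothing about how Env stores the status.

```shell
# Start an asynchronous command that exits immediately with status 5.
(exit 5) &
pid=$!

# Run a foreground command so the child has certainly terminated, and the
# shell has had a chance to reap it, before wait is invoked.
sleep 1

# The shell answers from its own record of the job.
status=0
wait "$pid" || status=$?
echo "status of $pid: $status"
```

If the shell has already reaped the child by the time wait runs, the process ID has been returned to the kernel, which is the window in which the conflict described in the previous comment can arise.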

@magicant
Owner Author

magicant commented Apr 19, 2022

POSIX allows the shell to forget jobs whose process ID is not known to the user as a result of expanding $!. However, no existing shells seem to forget such jobs. So, the current plan for yash-rs is to remember jobs regardless of whether they are known by the user.

(Update: To support this, the shell has to update a flag when expanding $!, which requires mutability of the job.)

@magicant
Owner Author

POSIX only requires the shell to include the process IDs of live asynchronous tasks in the list of known process IDs in the current shell execution environment. This implies a process may be removed from the list when put in the foreground and then be listed again when moved to the background. The shell must retain such processes in the list because they are recently added entries. When we have more than CHILD_MAX jobs, it is not necessarily the earliest started job that we can drop from the list.

@magicant
Owner Author

POSIX is silent on how many jobs the shell must retain. Maybe we should interpret this to mean that the shell is required to retain every job it spawns, without limit. If so, we effectively have no chance to prune asynchronous tasks from the internal list.

Even if the shell remembers tasks without limit, process IDs will eventually wrap around and get reused, so the shell's internal list would not overflow in practice.

@magicant
Owner Author

If an asynchronous task is not job-controlled and its process ID is not expanded with $!, then the shell may disown the task when another asynchronous task is started. Job-controlled tasks cannot be disowned as they may become known in the current execution environment when they are resumed with bg later.

@magicant
Owner Author

If the shell immediately removes an asynchronous task from its internal list just because its process ID has not been expanded before another asynchronous task is started, the wait built-in without operands cannot await the removed task. Such behavior seems to be allowed in POSIX, but it would affect portability of user scripts.

Conclusion: The shell should not remove any tasks from the job list unless jobs are properly waited for or disowned.
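
The portability concern can be seen in a script like the following, which never expands $! for its first asynchronous task and still relies on wait without operands to await it. The temporary file and the variable names are made up for this sketch.

```shell
marker=$(mktemp)

# First asynchronous task; $! is never expanded for it.
{ sleep 1; echo finished > "$marker"; } &

# A second asynchronous task starts before $! is expanded. POSIX would let
# the shell forget the first task's process ID at this point.
: &

# wait without operands is expected to block until both tasks are done.
wait

result=$(cat "$marker")
rm -f "$marker"
echo "first task: ${result:-not awaited}"
```

A shell that keeps every task in its job list prints "first task: finished". A shell that dropped the first task early could return from wait before the marker file is written.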
