Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel subshells (JEP91) implementation #1249

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

ianthomas23
Copy link
Collaborator

@ianthomas23 ianthomas23 commented Jun 13, 2024

This is the implementation of the kernel subshells JEP (jupyter/enhancement-proposals#91). It follows the latest commit (1f1ad3d) with the addition of a %subshell magic command that is useful for debugging. To try this out I have a JupyterLab branch that talks to this branch and is most easily tried out using https://mybinder.org/v2/gh/ianthomas23/jupyterlab/jep91_demo?urlpath=lab; once the mybinder instance has started, open the subshell_demo_notebook.ipynb and follow the instructions therein.

The idea is that this is mergeable as it is now, it is backward compatible in that it does not break any existing use of ipykernel (subject to CI confirmation). There are some ramifications of the protocol additions (outlined below) that will need addressing eventually, but I consider these future work that can be in separate PRs.

Outline of changes

  1. The parent subshell (i.e. the main shell) runs in the main thread.
  2. Each new subshell runs in a separate thread.
  3. There is a new thread that deals with all communication on the shell channel, previously this was performed in the main thread.
  4. Communication between the shell channel thread and other threads is performed using ZMQ inproc pair sockets, which are essentially shared memory and avoid the use of thread synchronisation primitives.
  5. Incoming shell messages are handled by the shell channel thread which extracts the subshell_id from the message and passes it on to the correct subshell.
  6. Subshells are created and deleted via messages sent on the control channel. These are passed to the shell channel thread via inproc pair sockets so that the SubshellManager in the shell channel thread is responsible for subshell lifetimes.

Example scenario

Here is an example of the communication between threads when running a long task in the parent subshell (main thread) and whilst this is running a child subshell is created, used, and deleted.

sequenceDiagram
    participant client as Client
    participant control as Control thread
    participant shell as Shell channel thread
    participant main as Main thread

    client->>+shell: Execute request (main shell)
    shell->>-main: Execute request (inproc)
    activate main

    client->>+control: Create subshell request
    control->>-shell: Create subshell request (inproc)
    activate shell
    create participant subshell as Subshell thread
    shell-->>subshell: Create subshell thread

    shell->>control: Create subshell reply (inproc)
    deactivate shell
    activate control
    control->>-client: Create subshell reply

    client->>+shell: Execute request (subshell)
    shell->>-subshell: Execute request (inproc)
    activate subshell

    subshell->>shell: Execute reply (inproc)
    deactivate subshell
    activate shell
    shell->>-client: Execute reply (subshell)

    client->>+control: Delete subshell request
    control->>-shell: Delete subshell request (inproc)
    activate shell
    destroy subshell
    shell-->>subshell: Delete subshell thread

    shell->>control: Delete subshell reply (inproc)
    deactivate shell
    activate control
    control->>-client: Delete subshell reply

    main->>shell: Execute reply (inproc)
    deactivate main
    activate shell
    shell->>-client: Execute reply (main shell)
Loading

Future work

ipykernel

  1. Shell channel thread deserialises the whole some of the message to get the subshell_id. Ideally it would only deserialise the header. May need changes in Jupyter Client.
  2. Signalling a subshell to stop uses a threading.Event following the existing anyio implementation which requires an extra thread per Event. It would be nice if this could be changed so a subshell is a single thread not two.
  3. Execution count. Should either be a separate count per subshell or a single count for a kernel. Needs a decision and changes in IPython as is currently not atomic.
  4. History. Related to item 2 above.
  5. input() on more than one subshell at the same time run but do not store correctly.
  6. Debugger use needs investigating.
  7. Busy/idle status needs investigating. Should there, as now, be separate status for each subshell, or the concept of kernel (i.e. any subshell) busy status? This issue is much wider than subshells as it includes status of the control channel, and how Jupyter Server should track status (Improve the busy/idle execution state tracking for kernels. jupyter-server/jupyter_server#1429).
  8. Use of display hooks for e.g. Matplotlib. Should these be on the parent subshell, or child subshells too?

JupyterLab

The JupyterLab branch I am using to demo this isn't really intended to be merged. But if it was, it needs:

  1. Check kernel_info to see if subshells are supported.
  2. Delete subshell when close a subshell's ConsolePanel.
  3. Report subshell IDs in tree view?
  4. Display of subshell busy/idle status.

(Edited for clarity)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant