
Intellisense for notebooks (old way)


Deprecated

This page describes the old way that notebook intellisense worked. It's used if the python.pylanceLspNotebooksEnabled setting is false.

Once this experiment is pushed out to 100%, the code described on this page can be eliminated.


Table of Contents

Intellisense Overview

Intellisense for notebooks

Concatenation of the notebook cells

Finding modules

Document Selectors and Documents

Middleware

Future changes

Intellisense Overview

Intellisense in VS Code works by sending LSP requests to a separate language server process (in most cases; see this for more info).

Something like so:

[diagram: VS Code sending LSP requests to a separate language server process]

Intellisense for notebooks

Intellisense for notebooks works pretty much the same way, but with each cell of a notebook being its own text document:

[diagram: each notebook cell acting as its own text document when sending LSP requests]
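Each cell shows up with its own URI under the vscode-notebook-cell scheme. A minimal sketch (using the standard VS Code API, not the extension's actual code) that lists those per-cell documents:

```typescript
import * as vscode from 'vscode';

// Every cell of every open notebook is its own TextDocument with a
// 'vscode-notebook-cell' URI, so LSP requests can target cells individually.
for (const notebook of vscode.workspace.notebookDocuments) {
    for (const cell of notebook.getCells()) {
        console.log(cell.document.uri.toString(), cell.document.languageId);
    }
}
```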

Concatenation of the notebook cells

This poses a problem for the language server (Pylance) because code from one cell can be referenced in another.

Example:

[example notebook: pandas imported in one cell and used in a later cell]

In that example, the pandas import crosses the cell boundary.

This means Pylance cannot just analyze each cell individually.

The solution was to concatenate the cells in order.

This changes the original architecture to something like so:

[diagram: notebook cells concatenated into a single document before being sent to Pylance]

How does concatenation actually work?

Concatenation is mostly a raw concatenation of each cell's contents, stacked in order. The concat document then provides functions to map positions back and forth between the original cells and the concatenated contents.
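A minimal sketch of the idea, not the actual implementation (which is linked below): remember each cell's starting line in the concatenated text and use that offset to translate positions in both directions.

```typescript
import * as vscode from 'vscode';

// Sketch only: stack cell contents in order and remember where each cell starts,
// so positions can be translated between cell space and concat space.
class ConcatSketch {
    private startLines = new Map<string, number>(); // cell uri -> first line in the concat doc
    private text = '';

    constructor(cells: vscode.TextDocument[]) {
        let line = 0;
        for (const cell of cells) {
            this.startLines.set(cell.uri.toString(), line);
            const cellText = cell.getText().endsWith('\n') ? cell.getText() : cell.getText() + '\n';
            this.text += cellText;
            line += cellText.split('\n').length - 1; // number of lines this cell occupies
        }
    }

    // Cell position -> position in the concatenated document.
    toConcatPosition(cellUri: vscode.Uri, pos: vscode.Position): vscode.Position {
        const offset = this.startLines.get(cellUri.toString()) ?? 0;
        return new vscode.Position(pos.line + offset, pos.character);
    }

    // Concatenated position -> owning cell uri and position inside that cell.
    toCellPosition(pos: vscode.Position): { cellUri: string; position: vscode.Position } | undefined {
        let best: { cellUri: string; start: number } | undefined;
        for (const [cellUri, start] of this.startLines) {
            if (start <= pos.line && (!best || start > best.start)) {
                best = { cellUri, start };
            }
        }
        return best && { cellUri: best.cellUri, position: new vscode.Position(pos.line - best.start, pos.character) };
    }
}
```

The real converter is considerably more involved; this only illustrates the offset bookkeeping.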

Code for this can be found here

Here's an example of using it:

    public async provideReferences(
        document: vscode.TextDocument,
        position: vscode.Position,
        options: {
            includeDeclaration: boolean;
        },
        token: vscode.CancellationToken,
        _next: protocol.ProvideReferencesSignature
    ) {
        const client = this.getClient();
        if (this.shouldProvideIntellisense(document.uri) && client) {
            // Translate the incoming cell uri and position into the concat document.
            const documentId = this.asTextDocumentIdentifier(document);
            const newDoc = this.converter.toConcatDocument(documentId);
            const newPos = this.converter.toConcatPosition(documentId, position);
            const params: protocol.ReferenceParams = {
                textDocument: newDoc,
                position: newPos,
                context: {
                    includeDeclaration: options.includeDeclaration
                }
            };
            // Send the request against the concatenated document.
            const result = await client.sendRequest(protocolNode.ReferencesRequest.type, params, token);
            // Translate the results back into cell uris and positions.
            const notebookResults = this.converter.toNotebookLocations(result);
            return client.protocol2CodeConverter.asReferences(notebookResults);
        }
    }

That is the handler for the references LSP request.

It is:

  • translating the incoming cell uri into a concat document uri
  • translating the incoming cell position into a concat document position
  • sending the request using the concat data
  • translating the results back into cell uris and positions

Finding modules

When Pylance starts up, it is passed an interpreter that defines what modules are installed. In this example, Pylance is running with a Python 3.10 environment that is missing scikit-learn:

[screenshot: Pylance running against a Python 3.10 environment that is missing scikit-learn]

For Python files, this interpreter is set at the bottom left of VS Code:

[screenshot: the interpreter picker in the VS Code status bar]

That interpreter is used by Pylance to determine where to find all of the modules it checks. So in this example, the window's 3.10 64-bit environment does not have the module 'scikit-learn'.

Notebooks complication

Notebooks don't have a 'global' interpreter, but rather a 'kernel' that is used to run the code. This kernel is almost always associated with a Python interpreter.

[screenshot: the kernel picker for a notebook]

This interpreter is what we need to pass to Pylance so it can find the correct modules.

This complicates how Pylance is started.

For a normal python file, this is how things are started:

[diagram: a single Pylance server started with the globally selected interpreter]

For a notebook, we can't use the global interpreter; instead we start a Pylance server per kernel in use:

[diagram: one Pylance server started per kernel in use]

This is necessary because each Pylance instance needs a separate interpreter to use when searching for modules.

This means there are now 4 Pylance servers running:

  • 1 for the Python extension to handle Python files
  • 3 more, one for each notebook opened with a different kernel
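A rough sketch of what starting one client per kernel could look like, using the vscode-languageclient API. The server module path and the shape of the initializationOptions are assumptions for illustration, not the actual Python/Pylance contract:

```typescript
import { LanguageClient, LanguageClientOptions, ServerOptions, TransportKind } from 'vscode-languageclient/node';

const clients = new Map<string, LanguageClient>();

function startClientForKernel(kernelId: string, interpreterPath: string, serverModule: string): LanguageClient {
    const serverOptions: ServerOptions = {
        run: { module: serverModule, transport: TransportKind.ipc },
        debug: { module: serverModule, transport: TransportKind.ipc }
    };
    const clientOptions: LanguageClientOptions = {
        // Only notebook cells should be routed to this client.
        documentSelector: [{ scheme: 'vscode-notebook-cell', language: 'python' }],
        initializationOptions: {
            // Assumed shape: tell the server which interpreter to resolve modules against.
            pythonPath: interpreterPath
        }
    };
    const client = new LanguageClient(`pylance-${kernelId}`, `Pylance (${kernelId})`, serverOptions, clientOptions);
    client.start();
    clients.set(kernelId, client);
    return client;
}
```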

Document Selectors and Documents

Having multiple language servers running would usually mean each server was assigned to a specific document selector; otherwise you'd end up with duplicate results for, say, hover or completion.

However, that's not the case here, because of limitations in how selectors are specified:

  • They can specify a scheme, a language, or a pattern match
  • They cannot run logic (they're static)
  • They cannot exclude things

If the Python extension's selector is basically "language": "python" and the Jupyter extension's selector is basically "scheme": "vscode-notebook-cell", how do we resolve the duplicates?
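Roughly, the two selectors look like this (a simplified sketch; note that a notebook cell matches both, which is exactly the duplication problem):

```typescript
import { DocumentSelector } from 'vscode-languageclient';

// The Python extension matches every python document, including notebook cells.
const pythonSelector: DocumentSelector = [{ language: 'python' }];

// The Jupyter extension matches only notebook cells.
const jupyterSelector: DocumentSelector = [{ scheme: 'vscode-notebook-cell' }];
```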

Both extensions use something called middleware.

Middleware

The VS Code language client npm module is a library for talking to LSP-enabled language servers. Both the Python extension and the Jupyter extension use it to send messages to Pylance. The library allows for the creation of a 'Middleware' object that can intercept any LSP request before it is sent to the server.

This provides an opportunity to filter messages based on the outbound document URI, meaning we can eliminate the duplicates in the example above:

  • The Python extension lets all non-notebook requests go through normally and swallows notebook requests (handling the negative case that selectors can't express)
  • The Jupyter extension has one middleware per kernel. Each middleware swallows all non-notebook requests and only lets through requests for cells that belong to its kernel's notebook (handling the 'function' check that a selector can't do)
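A hedged sketch of that filtering, shown for the hover request only (the real middleware applies the same check to every LSP feature, and its details differ):

```typescript
import * as vscode from 'vscode';
import { Middleware } from 'vscode-languageclient';

// Python extension side: never answer for notebook cells.
const pythonMiddleware: Middleware = {
    provideHover: (document, position, token, next) => {
        if (document.uri.scheme === 'vscode-notebook-cell') {
            return undefined; // a Jupyter-owned server will answer instead
        }
        return next(document, position, token);
    }
};

// Jupyter extension side: one middleware per kernel, answering only for its own cells.
// 'ownsCell' is a hypothetical callback standing in for the real kernel/notebook lookup.
function middlewareForKernel(ownsCell: (uri: vscode.Uri) => boolean): Middleware {
    return {
        provideHover: (document, position, token, next) => {
            if (document.uri.scheme !== 'vscode-notebook-cell' || !ownsCell(document.uri)) {
                return undefined; // not a cell, or a cell belonging to a different kernel
            }
            return next(document, position, token);
        }
    };
}
```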

This diagram shows a request for a specific notebook cell:

[diagram: a request for a specific notebook cell being routed through the middleware to the matching Pylance server]

Actual implementation

The middleware that makes these decisions can be found here.

Jupyter's multiplexing code for picking which Pylance server to run can be found here.

Future changes

Having 4 Pylance servers running at the same time is rather redundant and a waste of CPU, so we'd like to eliminate this need. In order to do that, Pylance would have to support a custom message indicating that certain URIs have different interpreters.

If that were to happen, we wouldn't need any middleware layers at all. Pylance would just handle all requests for all Python files, and the Jupyter extension would just need to pass a message indicating that certain cells use a different interpreter.
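Purely as an illustration of the shape such a message might take (no such notification exists today; the method name and params are invented):

```typescript
import { NotificationType } from 'vscode-languageclient';

// Hypothetical params: which cell URIs should resolve modules against which interpreter.
interface NotebookInterpreterParams {
    cellUris: string[];
    interpreterPath: string;
}

// Hypothetical method name; the real protocol addition would be defined by Pylance.
const NotebookInterpreterNotification = new NotificationType<NotebookInterpreterParams>('notebook/setInterpreter');

// The Jupyter extension could then notify the single shared client, e.g.:
// client.sendNotification(NotebookInterpreterNotification, { cellUris, interpreterPath });
```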
