Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API for getting a text document without opening #15723

Closed
felixfbecker opened this issue Nov 18, 2016 · 32 comments
Closed

API for getting a text document without opening #15723

felixfbecker opened this issue Nov 18, 2016 · 32 comments
Assignees
Labels
api bug Issue identified by VS Code Team member as probable bug verified Verification succeeded
Milestone

Comments

@felixfbecker
Copy link
Contributor

felixfbecker commented Nov 18, 2016

As an extension developer, I sometimes want to get the content of a textDocument identified by a URI from VS Code, but without opening it and triggering onDidOpen events.

My use case is extending the LanguageClient to support the textDocument/xfiles extension. Language servers in isolated environments need a way to request file contents without accessing the file system directly, and there should be a way to extend the LanguageClient to support these requests as well. I cannot read from disk because the URI could also be the URI of an in-memory document.

There are probably other use cases where an extension needs to inspect file contents without wanting VS Code to open them.

Current API:

		/**
		 * All text documents currently known to the system.
		 *
		 * @readonly
		 */
		export let textDocuments: TextDocument[];

		/**
		 * Opens the denoted document from disk. Will return early if the
		 * document is already open, otherwise the document is loaded and the
		 * [open document](#workspace.onDidOpenTextDocument)-event fires.
		 * The document to open is denoted by the [uri](#Uri). Two schemes are supported:
		 *
		 * file: A file on disk, will be rejected if the file does not exist or cannot be loaded, e.g. `file:///Users/frodo/r.ini`.
		 * untitled: A new file that should be saved on disk, e.g. `untitled:c:\frodo\new.js`. The language will be derived from the file name.
		 *
		 * Uris with other schemes will make this method return a rejected promise.
		 *
		 * @param uri Identifies the resource to open.
		 * @return A promise that resolves to a [document](#TextDocument).
		 */
		export function openTextDocument(uri: Uri): Thenable<TextDocument>;

Proposal:

		/**
		 * Like openTextDocument(), but does not fire an didOpen event and does not add the document to the textDocuments array.
                 * 
		 * @param uri Identifies the resource to open.
		 * @return A promise that resolves to a [document](#TextDocument).
		 */
                export function getTextDocument(uri: Uri): Thenable<TextDocument>;
@jrieken jrieken added the info-needed Issue requires more information from poster label Nov 21, 2016
@jrieken
Copy link
Member

jrieken commented Nov 21, 2016

but without opening it and triggering onDidOpen events.

Why? I don't see that as a benefit but as additional complexity

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 21, 2016

Many language servers need to index all files in the project. The linked protocol extensions are an effort to allow them to do that over LSP, without accessing the file system, because servers often run in an isolated environment like a Docker container. They can, for example, operate solely on the git:// protocol and provide IntelliSense directly for GitHub repos - without even checking the commit out, they can get file contents at any commit much more efficiently from git directly. Another example is working over FTP, the IDE doesn't have to download the whole project to work on it, it only needs to download the files the LS needs.

Now, to implement these requests in VS Code the workspace/files request can use the findFiles() API to return a list of all files. So far so good. Now the LS will iterate over the result and start indexing, for every file it will request the contents of that file, parse it, add the symbols to the index and then discard the content again. This textDocument/content request should only return a TextDocumentItem, but not cause the file to get "opened" in the editor. If VS Code triggered onDidOpen events, these events would get send to the LS to - which means the content will be transferred twice, one time for the request, and then with the event. All the LS wants to do is index the symbols - if onDidOpen events are triggered, it will cause VS Code and the LS to keep all these files in memory. For a project with 26k files, you would have 26k files opened and loaded in memory, twice.

I don't see why there would be added complexity - openTextDocument would just call getTextDocument, and then additionally trigger the events and add it to the opened files. I can do a PR if that would be appreciated?

@jrieken
Copy link
Member

jrieken commented Nov 21, 2016

which means the content will be transferred twice, one time for the request, and then with the event.

That is incorrect. The document data is transferred only once and inside the extension host references are passed around.

if onDidOpen events are triggered, it will cause VS Code and the LS to keep all these files in memory.

No, the editor can decide at any time to close a document again. This usually happens when the last editor showing a document closes but also in other cases.

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 21, 2016

Okay, so maybe I misunderstand the openTextDocument function then?

  • Does it cause VS Code to "open" a text document inside the editor (eg show it to the user)?
  • Does it cause VS Code to send an onDidOpen notification to language servers (that includes the content)?

@jrieken
Copy link
Member

jrieken commented Nov 21, 2016

So, openTextDocument just loads the document into memory (think of bytes to text document). To show that document in an editor you must invoke showTextDocument.

@aeschli or @dbaeumer know best but to my knowledge onDidOpen events on the language server side are only send for documents you have expressed interest for using the documentSelector or synchronize-config

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 21, 2016

Thanks for clearing that up @jrieken! I think I can work with that.
I will assume onDidOpen is getting sent for every document that has been showed in the editor then, to use the right terminology. It is definitely not only through synchronize?

@dbaeumer
Copy link
Member

The language server client filters open events based on a document selector. However if the server expressed interest in lets say *.php files and the openTextDocument for a PHP file triggers an open event then that event is forwarded to the server as well.

@jrieken I assume that openTextDocument does trigger on open event.

@jrieken
Copy link
Member

jrieken commented Nov 21, 2016

@dbaeumer Yes it does - unless it already is open.

@felixfbecker Objects closing this after clarifying things?

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 21, 2016

Well then my assumption was correct after all

<- workspace/xfiles
-> here are all files in the workspace
<- textDocument/xcontent for 1
-> here is the content (TextDocumentItem) of text document 1
-> textDocument/didOpen here, I just opened text document 1, have the content // this should not happen
...
proceed for the next 26k text documents

I need to prevent VS Code from sending a didOpen notification to the server - I just want the text document content (TextDocumentItem)

@jrieken
Copy link
Member

jrieken commented Nov 22, 2016

Where do you want to access the document content? In the extension host or in a language server?

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 22, 2016

In the language client, I want to register a custom handler that uses the vscode API to get the document content, all to allow the server to get the document content.

So to the correct answer for your question would be: both. Once I have it in the former, I can enable it in the latter.

@dbaeumer
Copy link
Member

@felixfbecker I can understand that you don't want to read the file from disk on the server side but why can't the client simply load the content of the file.

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 22, 2016

@dbaeumer Because there is no guarantee that the URI is a file:// URI. It could be an inmemory:// URI that VS Code passes for new files that are not saved to disk yet. Or ftp:// if VS Code ever decides to implement that. So the content retrieval would best go through VS Code.

Is there anything that speaks against adding this API?

@jrieken jrieken added this to the March 2017 milestone Mar 6, 2017
@kieferrm kieferrm mentioned this issue Mar 6, 2017
58 tasks
@octref
Copy link
Contributor

octref commented Mar 7, 2017

Glad to see it on March. Want this for vue LS too.

@dbaeumer
Copy link
Member

dbaeumer commented Mar 7, 2017

@jrieken this is the content provider API we talked about. Right.

@jrieken
Copy link
Member

jrieken commented Mar 8, 2017

Yes, idea is that we will something like loadTextDocumentContent(uri:Uri): Promise<string>

@jrieken jrieken added feature-request Request for new features or functionality and removed info-needed Issue requires more information from poster labels Mar 21, 2017
@jrieken
Copy link
Member

jrieken commented Mar 21, 2017

fyi @joaomoreno you wanna use this for the git-extension

@felixfbecker
Copy link
Contributor Author

@jrieken can we make it return a full TextDocument, not just the string? That would be more consistent with the load API, return additional metadata (that could be extended in the future without breaking changes) and makes is in line with how our LSP extension was designed

@jrieken
Copy link
Member

jrieken commented Mar 21, 2017

Which then just means don't fire an event?

@felixfbecker
Copy link
Contributor Author

Probably, yeah. In pseudo code:

loadTextDocument(uri) {
  const doc = getTextDocument(uri);
  fireEvents(doc);
  return doc;
}

getTextDocument(uri) {
  // All the logic
}

@jrieken
Copy link
Member

jrieken commented Mar 21, 2017

re #15723 (comment) @joaomoreno actually not.

So, I am back to may old point of view that this is confusing and unhelpful. @felixfbecker your main motivation seems to come from the concern of transferring/opening a document twice. The sequence-diagram in #15723 (comment) isn't correct, the flow is:

<- openTextDocument for `foo://bar/file.txt`
-> onDidOpenTextDocument with *all* document state, lines etc // will store doc locally by its uri
-> didOpen `foo://bar/file.txt` // will lookup doc locally with uri
...
proceed for the next 26k text documents

Take a look at the $tryOpenDocument-method and see that it doesn't return anything. We simply rely on the fact that $acceptDocumentsAndEditorsDelta has been called before $tryOpenDocument returned. That means the extension host doesn't know if a document has been opened because of an extension or user-interaction.

@felixfbecker
Copy link
Contributor Author

@jrieken I'm confused by your diagram, what do these calls represent? Is that the current or proposed state? My diagram were supposed to be LSP calls, I am not too familiar with the VS Code internals. Could you explain how you would solve the use case of a textDocument/content LSP request from server to client, as described in the linked spec?

@jrieken
Copy link
Member

jrieken commented Mar 21, 2017

Is that the current or proposed state?

That is today and the truth on the extension host which is also the source of truth for the LSP because the LSP-client is implemented as an extension. The existence of documents in the extension host and the LSP server is driven by events, not open request which just trigger those events.

@felixfbecker
Copy link
Contributor Author

What do you mean with "open request"?

Is it possible with the current or proposed API to achieve something like this for language servers?

<- workspace/xfiles
-> all files in the workspace
<- textDocument/xcontent for 1
-> the content (TextDocumentItem) of text document 1
NO textDocument/didOpen notification, VS Code does NOT keep the content in memory
...
proceed for the next 26k text documents

@jrieken
Copy link
Member

jrieken commented Mar 22, 2017

VS Code does NOT keep the content in memory

Yes, we will not keep it in memory for long. The extension host function openTextDocument is slightly misleadingly named because open sets expectations for a close-counterpart. That doesn't exist because the main side tries to protect itself from greedy extensions that consume memory of the main side by opening document after document. The deal is that the main side closes documents and that extensions must accept that.

Documents requested by extensions get closed after a timeout and/or when many document have been requested. Because we recently migrated the underlying infrastructure not everything is there yet but in essence extensions cannot make the main side hold on to arbitrary amounts of documents.

@felixfbecker
Copy link
Contributor Author

Ah, gotcha. Of course it needs to be loaded in memory for a short time, but it should just not require an explicit close and not trigger any didOpen events.

@jrieken
Copy link
Member

jrieken commented Mar 22, 2017

Yes, it will not require an explicit close because the editor owns the lifecycle of all documents. There will onDidOpen and onDidClose events because that's how the world works.

What we will add is a new property TextDocument#isAttached which allows you to quickly figure out if a document is showing in one or more editors and we are thinking about an event ala onDidAttachTextDocument. That will help you work only with visible documents and should be easier to use than todays slightly bulky onDidChangeVisibleTextEditors-event.

@felixfbecker
Copy link
Contributor Author

@jrieken will it be possible in any way to prevent the textDocument/didOpen event from getting sent to a language server?

@jrieken
Copy link
Member

jrieken commented Mar 22, 2017

I believe @dbaeumer's plan is to use the isAttached information and only sync those by default

@dbaeumer
Copy link
Member

For the LSP protocol and VS Code library implementation the plan is as follows:

  • adapt the attach event story
  • define messages to open a text document and to retrieve a content without receiving an open.

jrieken added a commit that referenced this issue Mar 24, 2017
@jrieken
Copy link
Member

jrieken commented Mar 24, 2017

Closing because the open-editor situation has been clarified and improved. Extensions won't be able to open document that we hold on to for long. Also when an extension holds onto a closed document it won't explode anymore but has a boolean that reflects that state.

Today, knowing what document is open (as loaded from disk into memory) and knowing what document is showing is easy because a TextEditor is always bound to a document. What's not possible today is to know what the current 'tab-state' is. We track that issue in #15178.

@jrieken jrieken closed this as completed Mar 24, 2017
jrieken added a commit that referenced this issue Mar 24, 2017
@jrieken jrieken added bug Issue identified by VS Code Team member as probable bug and removed feature-request Request for new features or functionality labels Mar 27, 2017
@dbaeumer
Copy link
Member

Marking as verified. @jrieken and I discussed this lengthily and the API is fine as it is. For the language server protocol however we need the tab-state as discussed in #15178.

@dbaeumer dbaeumer added the verified Verification succeeded label Mar 31, 2017
@vscodebot vscodebot bot locked and limited conversation to collaborators Nov 18, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api bug Issue identified by VS Code Team member as probable bug verified Verification succeeded
Projects
None yet
Development

No branches or pull requests

5 participants