General discussion about support for other languages than just Python #1536

Closed
clouds56 opened this issue Apr 2, 2019 · 91 comments
Labels
language-any Area covering general issues geared to supporting any language (not just Python)

Comments

@clouds56

clouds56 commented Apr 2, 2019

Is it possible to make Jupyter support other languages like Julia and R?
Or at least provide some interface for other extensions to integrate with?

@rchiodo
Contributor

rchiodo commented Apr 2, 2019

@clouds56, at the moment no.

@rchiodo
Contributor

rchiodo commented Jul 2, 2019

This would be a good spot to have people vote on other languages. Which ones do you want besides python?

@rchiodo
Contributor

rchiodo commented Jul 2, 2019

Julia?

@rchiodo
Contributor

rchiodo commented Jul 2, 2019

R?

@rchiodo
Contributor

rchiodo commented Jul 2, 2019

Haskell?

@rchiodo
Contributor

rchiodo commented Jul 2, 2019

C#/F#?

@rchiodo
Contributor

rchiodo commented Jul 9, 2019

Scala?

@raghavgautam

Bash ?

@chrishales709

SAS?

@dr-br

dr-br commented Oct 9, 2019

cling?

@stevengj

stevengj commented Oct 11, 2019

I don't see any point in voting on languages here — supporting two languages should be the same amount of work as supporting all languages. The whole point of Jupyter is that it defines a language-independent protocol … if there is a general vscode extension for notebooks, it should work with any notebook in any language (i.e. it should be decoupled from vscode-python, analogous to nteract/hydrogen).

That is, when you open a Jupyter notebook file (.ipynb), the kernelspec.name metadata should tell you the name foo of the kernel to run. You then look for a kernelspec file foo/kernel.json in the standard locations. This file looks something like

{
  "display_name": "Julia 1.2.0",
  "argv": [
    "/Applications/Julia-1.2.app/Contents/Resources/julia/bin/julia",
    "-i",
    "--startup-file=yes",
    "--color=yes",
    "--project=@.",
    "/Users/me/.julia/packages/IJulia/src/kernel.jl",
    "{connection_file}"
  ],
  "language": "julia",
  "env": {},
  "interrupt_mode": "signal"
}

and tells you exactly how to launch the kernel (via the argv). You create a connection file, pass this for the {connection_file} argument when starting the kernel, and then talk to the kernel over ZMQ in exactly the same way that you talk to the Python kernel.
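
A minimal sketch of the argv step described above, assuming the kernelspec has already been located and read. The shortened kernelspec and the connection file path are made up for illustration; writing the connection file and opening the ZMQ sockets are left out.

```python
import json


def build_kernel_argv(kernelspec_text, connection_file):
    """Return the launch command from a kernel.json, with {connection_file} filled in."""
    spec = json.loads(kernelspec_text)
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]


# Hypothetical, shortened kernelspec (not a real installation's file).
EXAMPLE_SPEC = """
{
  "display_name": "Julia 1.2.0",
  "argv": ["julia", "-i", "kernel.jl", "{connection_file}"],
  "language": "julia"
}
"""

# Hypothetical connection file path; a front end would create this file itself.
argv = build_kernel_argv(EXAMPLE_SPEC, "/tmp/kernel-1234.json")
language = json.loads(EXAMPLE_SPEC)["language"]
print(argv, language)
```

The same parsed file also carries the language field, which a front end could use to pick a syntax mode.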

@rchiodo
Contributor

rchiodo commented Oct 11, 2019

Unfortunately supporting a kernel is not the same as supporting a language. We run a bunch of stuff that is specific to python in order to get things like theming, variables, startup directories, etc correct. None of those work without figuring out a language neutral or per language way to do them.

We will be supporting other kernels long before we support other languages. This item here:
https://github.com/microsoft/vscode-python/issues/3123
has the beginnings of how this will work.

@hochshi has implemented picking a kernel in remote situations here (which will hopefully go in soon)
microsoft/vscode-python#7790

@al6x

al6x commented Nov 27, 2019

Supporting just the script itself, as an interactive notebook, without the Jupyter dependency would be even better.

@rchiodo
Contributor

rchiodo commented Nov 27, 2019

Supporting just the script itself, as an interactive notebook, without the Jupyter dependency would be even better.

@alexeyPetrushin can you elaborate more? Jupyter is just a framework for hosting a process that's running your code. I don't think we'd reinvent this ourselves. Is there something you don't like about having jupyter installed?

@stevengj

stevengj commented Nov 27, 2019

You don't need a dependency on the Jupyter software stack if you simply speak the Jupyter messaging protocol to the kernel. e.g. nteract does this.

(You obviously need the software installed for any kernel you want to use, e.g. you need the ipython kernel installed if you want to talk to Python, which pulls in components from Jupyter. But you wouldn't need e.g. the jupyterlab component. And other languages might not need Jupyter software at all. For example, the Jupyter kernel for Julia has no dependency on the Jupyter software stack — it simply speaks the Jupyter protocol over ZMQ to anything that launches it.)

Unfortunately supporting a kernel is not the same as supporting a language. We run a bunch of stuff that is specific to python in order to get things like theming, variables, startup directories, etc correct. None of those work without figuring out a language neutral or per language way to do them.

By "theming" I guess you mean syntax highlighting etcetera? Why can't you simply read the language field of the kernelspec file and activate the syntax mode for the corresponding language?

Not sure what you mean by "startup directories" … the kernelspec file also tells you exactly how to launch a kernel, including any relevant environment variables.

I'm not speaking theoretically — JupyterLab and nteract do this in a language-neutral way already.

@rchiodo
Contributor

rchiodo commented Nov 27, 2019

There's code we run (python code) that does the following:

  • Tell jupyter to create SVGs as well as PNGs
  • Change the working directory of the kernel
  • Query for variables

This is the stuff that is not platform neutral at the moment, with variables being the toughest to implement. This is really our problem, and not a jupyter specific issue.

We could talk directly to the kernel through stdin/stdout, but then we'd be reinventing the server code that jupyter has for remote situations and it wouldn't work for already running jupyter servers. There's no benefit for us, other than perhaps performance.

@rchiodo
Contributor

rchiodo commented Nov 27, 2019

Given all of that, why do you care if we have a dependency on the jupyter server? Is this causing bugs? Is the download too large? (We might make something simpler if we were to implement this ourselves).

@stevengj

stevengj commented Nov 27, 2019

Tell jupyter to create SVGs as well as PNGs

The Jupyter protocol allows the kernel to send data in multiple formats, including both SVGs and PNGs, so I'm not sure what you're referring to here — there is no need to tell it you want SVGs or any specific format in general, because objects supporting rich display will typically send you data in multiple supported formats and you can pick which one to display.

Maybe you are referring to Matplotlib, which by default does not send SVGs for inline display in order to speed up the display of complicated plots? But in that case there is a standard way for the user to request SVG inline plots, with no need for intervention by the front-end.

Query for variables

The Jupyter protocol already has a language-neutral introspection API.

If there is some other form of introspection that you need, I'm sure @minrk and the other Jupyter developers would be open to discussing adding it to the protocol.

Change the working directory of the kernel

Normally in Jupyter the working directory is the directory where the notebook file is located.

Why do you think you need to change directories? The kernelspec file tells you how to launch the kernel, including any path information.

We could talk directly to the kernel through stdin/stdout.

The Jupyter protocol uses zeromq, not stdio. Have you looked at the messaging spec?

why do you care if we have a dependency on the jupyter server?

Because you are hooking into it in a language-dependent way. If you simply spoke the Jupyter protocol the way it was intended to be used, I personally don't care too much what your software dependencies are, but it's not clear to me why a Jupyter-server dependency would remain.

@rchiodo
Contributor

rchiodo commented Nov 27, 2019

why do you care if we have a dependency on the jupyter server?

Because you are hooking into it in a language-dependent way. If you simply spoke the Jupyter protocol the way it was intended to be used, I personally don't care too much what your software dependencies are, but it's not clear to me why a Jupyter-server dependency would remain.

There is no current JMP for querying variable values. The %who line magic might be used, but it certainly won't page in MBs of dataframe data. We use python code to do this now.

We also switch directories in the kernel based on where the user ran code from. It would be simpler to add a message to the protocol for this, and the %cd line magic can do it too, although I'm not sure line magics are supported in all kernels.

SVG formatting is similar. We use a line magic to do this now. Not sure it will work in all kernels.

Our usage of jupyter-server is orthogonal to language independence. It's our python code that we use that's the crux of the problem.

In fact it sounds like you don't care that we use jupyter at all. You just want us to eliminate our python code and push that same functionality into jupyter itself.

This might be what we do to solve multiple languages. Or we might solve different problems with other solutions. Variable enumeration and data fetching might be accomplished by talking to an attached debugger in the kernel. At least for Python, attaching a debugger has negligible overhead.

@stevengj

stevengj commented Nov 27, 2019

There is no current JMP for querying variable values. The %who line magic might be used, but it certainly won't page in MBs of dataframe data. We use python code to do this now.

Why do you need to query variable values in the front end?

You can always send an execute_request to the kernel to ask it to evaluate a variable, given the name, and send you back a mimebundle that tells you how to display the value. That's about as much as you can hope to do in a language-independent front-end.
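
For reference, a rough sketch of what such an execute_request looks like as a message dictionary, following the field names in the Jupyter messaging spec. The helper name, the "frontend" username, and the session id are placeholders, and signing and sending the message over ZMQ are omitted.

```python
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id):
    """Build an execute_request message dict asking the kernel to evaluate `code`."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session_id,
        "username": "frontend",  # placeholder value
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",
    }
    content = {
        "code": code,             # e.g. just the variable name
        "silent": False,
        "store_history": False,   # keep front-end queries out of the history
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    return {"header": header, "parent_header": {}, "metadata": {}, "content": content}


msg = make_execute_request("a", session_id="example-session")
print(msg["header"]["msg_type"], msg["content"]["code"])
```

The kernel's reply (an execute_result or display_data message) carries the mimebundle in its content's data field.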

We also switch directories in the kernel based on where the user ran code from.

Isn't this just the directory of the notebook file? Normally Jupyter front ends do this by simply running the "launch kernel" kernelspec command from the directory of the notebook, or whatever working directory you want to use.
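
A small sketch of that idea, assuming the launch command has already been built from the kernelspec. Here a child Python process stands in for a real kernel, and the notebook path is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def launch_in_notebook_dir(argv, notebook_path):
    """Start a process (e.g. a kernel) with the notebook's directory as its working directory."""
    notebook_dir = os.path.dirname(os.path.abspath(notebook_path))
    return subprocess.Popen(argv, cwd=notebook_dir, stdout=subprocess.PIPE, text=True)


# Stand-in for a kernel: a child process that reports its working directory.
with tempfile.TemporaryDirectory() as notebook_dir:
    notebook_path = os.path.join(notebook_dir, "example.ipynb")  # hypothetical, never created
    child = launch_in_notebook_dir(
        [sys.executable, "-c", "import os; print(os.getcwd())"], notebook_path
    )
    reported_cwd, _ = child.communicate()
    same_dir = os.path.realpath(reported_cwd.strip()) == os.path.realpath(notebook_dir)

print(same_dir)
```

Nothing language-specific runs inside the kernel here; the working directory is set by whoever spawns the process.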

"Magics" like %cd are specific to the IPython kernel — they aren't part of the Jupyter spec. Nor are they needed to set the working directory of the kernel, as explained above.

SVG formatting is similar. We use a line magic to do this now. Not sure it will work in all kernels.

SVG rich object display is possible for any kernel that sends image/svg+xml data as part of a mimebundle (e.g. in a display_data message).
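
As an illustration, a front end could choose among the formats in a mimebundle with a simple preference list. The ordering and the sample bundle below are assumptions for the sketch, not what any particular front end does.

```python
# Preference order is an assumption for illustration: richest format first.
PREFERRED_MIME_TYPES = ["image/svg+xml", "image/png", "text/html", "text/plain"]


def pick_representation(mimebundle, preferred=PREFERRED_MIME_TYPES):
    """Return (mime_type, data) for the first preferred type present, or None."""
    for mime in preferred:
        if mime in mimebundle:
            return mime, mimebundle[mime]
    return None


# Hypothetical "data" field of a display_data message.
bundle = {
    "text/plain": "<Figure>",
    "image/png": "iVBORw0KGgo...",
    "image/svg+xml": "<svg></svg>",
}
chosen = pick_representation(bundle)
print(chosen[0])
```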

If the user wants to tell a particular software package, e.g. matplotlib, to use a particular output format, they should do that themselves (e.g. by executing a notebook cell with the relevant line magic in the matplotlib case). I don't understand why you need to get involved.

@rchiodo
Contributor

rchiodo commented Nov 27, 2019

We query variables here in the front end:

[image: screenshot of the variable explorer]

That's our variable explorer. It lists out the variables active in the current kernel and allows you to open a data viewer for anything that supports it. Opening the data viewer executes a bunch of python code to page in the data.

Switching directories in the kernel is done when more than one file is run in the same kernel. We might remove this (and instead have one kernel per file), which would mean the startup directory could be used.

SVG formatting is for our plot viewer support. We use it so we can open high res plots in a separate window.

[image: screenshot of a plot with the expand button]

Internally we ask Jupyter to output both SVG and PNG (with a %config line magic). When the user clicks on that little expand button, we open the SVG. We're likely going to get rid of this too, as it slows down Jupyter a bunch on really complicated plots. Instead, we could potentially inject something into matplotlib or the ipython kernel that gives us the necessary data to draw a high-res plot.

Variable support is really the biggest blocker right now. The other stuff can probably just be dropped.

(We also used to enforce a styling on matplotlib, but we got rid of that code as it caused more trouble than it was worth).

@stevengj

stevengj commented Nov 27, 2019

For the variable explorer, in the short term you could simply not support it for non-Python kernels. Other notebook interfaces don't have this, so I think most people could live without it.

To do this in a language-independent way in the long term, you would need:

  • a message to request a list of variables — the Jupyter folks might be willing to add this to the messaging spec.

  • for each variable, you could simply send an execute_request to get a "static" view of the variable (e.g. a mimebundle) as needed

  • for more dynamic views of the variable (e.g. live updates, a zoomable plot, or whatever), the language-independent way to do this would be to use Jupyter's widget framework to allow the kernel to communicate with a Javascript widget dynamically.

@al6x

al6x commented Nov 28, 2019

Is there something you don't like about having jupyter installed?

A little bit off-topic, feel free to ignore, maybe I'm missing something. I don't care about Jupyter, I don't want to see it or know about it, and I don't want to have it as an extra layer to debug in my code. Basically the only thing I need is a REPL-like script with support for images (graphs) and VSCode syntax and auto-corrections. The simpler it is and the fewer dependencies it has, the better. It seems like the straightforward way would be to just start a Python (or Julia or Node.js) process, send it code, get back the result and/or image, and show it in VSCode, with no need for Jupyter. Yes, with that approach you need to have an adapter for each language (I guess that's how you are currently using Jupyter, but it seems like overkill for that).

@aolney

aolney commented Dec 7, 2019

It might be possible to leverage SoS for this, particularly SoS Notebook, which uses a superkernel in Python to access other kernels in a single notebook. It's pretty mature and actively maintained.

I've routinely used it for the past year with Python, R, F#, Torch, and Java. AFAIK you can "drop in" any Jupyter kernel. Special extensions are only needed for features like sending data between kernels (which I never use because you can always hit the disk).

Tagging @BoPeng for his thoughts.

@davidanthoff

I think the medium-term plan here should really be to move the Jupyter Notebook support out of the Python extension and revitalize the Jupyter extension, so that it provides notebook support in a language-neutral form. I think anything else is really at odds with the Jupyter goals/philosophy, to be honest.

@BoPeng

BoPeng commented Dec 11, 2019

I agree with @davidanthoff that Jupyter support should be removed from the Python extension, and be enhanced to support more languages.

@andycraig
Contributor

@rchiodo Thank you for all your work on this extension! I want to highlight that most people are voting directly on the languages rather than upvoting the main issue post, so while the issue itself has only 10 votes, Julia now has 73 votes and R has 35. It seems like there is a fair amount of interest in broader language support.

I’m a contributor to one of the VSCode R extensions and I would love to start adding features to take advantage of R Notebooks in VSCode. If there is something that I can do to help this process along please let me know.

Thank you!

@rchiodo
Contributor

rchiodo commented Jan 14, 2020

Hi @andycraig. This is something we're actively pursuing, but right now it is waiting on VS Code doing some work to support notebooks better. We're waiting for VS Code to finalize their designs and then we'll be able to describe how other languages will/can work.

@andycraig
Contributor

@rchiodo I appreciate the update and great to hear that there’s still a plan for broader language support for Notebooks in VSCode.

@dcuccia

dcuccia commented Feb 19, 2020

This is something we're actively pursuing

@rchiodo that's great news. Where is the best place to follow this progress? FWIW, I just posted on the dotnet-interactive group about .Net integration to the VS Code Jupyter integration:

dotnet/interactive#179

Would this be appropriate as a separate feature request, or is it already being effectively tracked here (or elsewhere)?

@DonJayamanne
Contributor

Yes, over the next few days, we'll be moving these issues over to that repo.
Please have a look at the native notebook support in the new extension (we do support more languages in that notebook) & also #273 for support of other languages in interactive window

@DonJayamanne DonJayamanne transferred this issue from microsoft/vscode-python Nov 13, 2020
@jlperla

jlperla commented Jan 12, 2021

@rchiodo I had a chance to try out the variable listing to replace the current whos (which looks like it was deprecated in Julia in 0.6).

It could be that the first part of the answer is:

"query": "String.(Base.names(Main))",

Note the following:

julia> a = 5
5

julia> b = "test"
"test"

julia> String.(Base.names(Main))
7-element Array{String,1}:
 "Base"
 "Core"
 "InteractiveUtils"
 "Main"
 "a"
 "ans"
 "b"

But I tried to put the following in my settings.json and it didn't seem to help?

    "jupyter.variableQueries": [
        {
            "language": "julia",
            "query": "String.(Base.names(Main))",
            "parseExpr": "'(\\w+)'"
        }
    ]   

So something else is missing. I also am not sure I understand how things would parse after it has the names. It looks to me like

private async getVariableValueFromKernel(
    targetVariable: IJupyterVariable,
    notebook: INotebook,
    token?: CancellationToken
): Promise<IJupyterVariable> {
    let result = { ...targetVariable };
    if (notebook) {
        const output = await notebook.inspect(targetVariable.name, 0, token);
        // Should be a text/plain inside of it (at least IPython does this)
        if (output && output.hasOwnProperty('text/plain')) {
            // tslint:disable-next-line: no-any
            const text = (output as any)['text/plain'].toString();
            // Parse into bits
            const type = TypeRegex.exec(text);
            const value = ValueRegex.exec(text);
            const stringForm = StringFormRegex.exec(text);
            const docString = DocStringRegex.exec(text);
            const count = CountRegex.exec(text);
            const shape = ShapeRegex.exec(text);
            if (type) {
                result.type = type[1];
            }
            if (value) {
                result.value = value[1];
            } else if (stringForm) {
                result.value = stringForm[1];
            } else if (docString) {
                result.value = docString[1];
            } else {
                result.value = '';
            }
            if (count) {
                result.count = parseInt(count[1], 10);
            }
            if (shape) {
                result.shape = `(${shape[1]}, ${shape[2]})`;
            }
        }
        // Otherwise look for the appropriate entries
        if (output.type) {
            result.type = output.type.toString();
        }
        if (output.value) {
            result.value = output.value.toString();
        }
        // Determine if supports viewing based on type
        if (DataViewableTypes.has(result.type)) {
            result.supportsDataExplorer = true;
        }
    }
    // For a python kernel, we might be able to get a better shape. It seems the 'inspect' request doesn't always return it.
    // Do this only when necessary as this is a LOT slower than an inspect request. Like 4 or 5 times as slow
    if (
        result.type &&
        result.count &&
        !result.shape &&
        isPythonKernelConnection(notebook.getKernelConnection()) &&
        result.supportsDataExplorer &&
        result.type !== 'list' // List count is good enough
    ) {
        result = await this.getFullVariable(result, notebook);
    }
    return result;
}
would need to be language-specific as well, unless I am missing something? I don't know TypeScript, though, so it is hard for me to tell what is going on.

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

Thanks @jlperla, I'll try the same thing with a julia kernel and debug bits of our stuff. Might be the parseExpr is not working.

The spot you linked to in kernelVariables.ts is where we parse the results of a Jupyter 'inspect' request (this request is what we use: https://jupyter-client.readthedocs.io/en/stable/messaging.html#introspection). We had hoped that other kernels would return data in the same format as IPython, but that's probably not happening.

I'll debug it from our side and see what's happening.

@davidanthoff

I think

"[" * join(map(i->"'$i'", String.(Base.names(Main))), ", ") * "]" |> println

would return the list in a format that is similar to the Python version, so using that as the query might just work?

@jlperla

jlperla commented Jan 12, 2021

Thanks @rchiodo . I suspect @davidanthoff has the best approach for that string, and then the parsing shouldn't be different.

For the inspection, that makes sense. I am not sure how to call it manually to see what it is returning, but I think this is the IJulia code: https://github.com/JuliaLang/IJulia.jl/blob/9b10fa9b879574bbf720f5285029e07758e50a5e/src/handlers.jl#L229-L248

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

I used this as the query:

{
    "language": "julia",
    "query": "String.(Base.names(Main))",
    "parseExpr": " \"(\\w+)\""
}

Which does give us the list of variables.
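
To see why the quote style in parseExpr matters, here is a small sketch that applies both patterns to the Julia output shown earlier in the thread. It uses Python's re module purely for illustration; the extension itself applies the pattern with JavaScript regular expressions, and the leading space in the pattern above is dropped here for simplicity.

```python
import re

# Output of String.(Base.names(Main)) as printed by Julia (copied from the thread).
julia_output = '''7-element Array{String,1}:
 "Base"
 "Core"
 "InteractiveUtils"
 "Main"
 "a"
 "ans"
 "b"'''

# Python-style list text, as the println-based query is meant to produce (shortened here).
python_like_output = "['Base', 'Core', 'Main', 'a']"

single_quoted = re.compile(r"'(\w+)'")
double_quoted = re.compile(r'"(\w+)"')

print(single_quoted.findall(julia_output))        # no single quotes in Julia's output
print(double_quoted.findall(julia_output))
print(single_quoted.findall(python_like_output))
```

The single-quote pattern finds nothing in Julia's default array printout, which is one plausible reason the first settings.json attempt produced no variables.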

However after that, the inspect request fails with this message from the julia kernel:

[image: screenshot of the error message from the Julia kernel]

@jlperla

jlperla commented Jan 12, 2021

Thanks @rchiodo Try the suggestion from @davidanthoff . There aren't enough adjectives in the world to describe how much more knowledgeable and competent he is on these things than me.

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

It's not the query that is failing. I can get the list. It's the inspect request.

For example, with this code here:

a = 5

The regex I was using returns me:

"Base",
"Core",
"Main",
"a"

All 4 of those give the same inspect result.

Trying with @davidanthoff's query and regex gives the same result.

Should an inspect request for "a" throw a bounds error? Do you have a way for me to try the same inspect request in a cell? (IPython does this if you prefix code with a ?, e.g. ?x will inspect x.)

@jlperla

jlperla commented Jan 12, 2021

I see. I am not sure if it is relevant, but these guys told me the following:

julia> a = 5
5

julia> varinfo(Main, r"a")
  name                    size summary
  –––––––––––––––– ––––––––––– –––––––
  Base                         Module
  InteractiveUtils 198.160 KiB Module
  Main                         Module
  a                    8 bytes Int64
  ans                  8 bytes Int64

julia> varinfo(Main, r"^a$")
  name    size summary
  –––– ––––––– –––––––
  a    8 bytes Int64

So if varinfo is used under the hood, maybe you need to pass in the ^ and $ around the string?

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

That gives the same bounds error on the inspect request. Inspecting "^a$" instead of "a" gives the same problem.

@davidanthoff

Is the inspect request just sending some Julia code to eval in the kernel? What exactly is being sent? Or is there a special inspect Jupyter message?

@jlperla

jlperla commented Jan 12, 2021

Wait, that may not be it. See that https://github.com/JuliaLang/IJulia.jl/blob/9b10fa9b879574bbf720f5285029e07758e50a5e/src/handlers.jl#L192 is called from the inspector implementation: https://github.com/JuliaLang/IJulia.jl/blob/9b10fa9b879574bbf720f5285029e07758e50a5e/src/handlers.jl#L236

Note that

julia> a = 5
5

julia> using IJulia

julia> IJulia.docdict("a")
Dict{String,Union{String, JSON.Writer.JSONText}} with 3 entries:
  "text/plain"    => "  No documentation found.\n\n  \e[36ma\e[39m is of type \e[36mInt64\e[39m.\n\n\e[1m  Summary\e[22…
  "text/markdown" => "No documentation found.\n\n`a` is of type `Int64`.\n\n# Summary\n\n```\nprimitive type Int64 <: S…
  "text/latex"    => "No documentation found.\n\n\\texttt{a} is of type \\texttt{Int64}.\n\n\\section{Summary}\n\\begin…

Is that the text that would be parsed by the

const type = TypeRegex.exec(text);
const value = ValueRegex.exec(text);
const stringForm = StringFormRegex.exec(text);
const docString = DocStringRegex.exec(text);
const count = CountRegex.exec(text);
const shape = ShapeRegex.exec(text);

But I think we are now in the guts where @stevengj might be able to point some of this stuff out.

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

Yes this text here:

\e[36ma\e[39m is of type \e[36mInt64\e[39m.\n\n\e[1m  Summary\e[

is the type of response we're expecting.
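
A rough sketch of how text like this could be reduced to a type name: strip the ANSI color codes, then match the "is of type" phrase. Both regular expressions and the function name are hypothetical, written for this illustration; they are not the TypeRegex and friends used by the extension, and the sample string is a shortened version of the IJulia output above.

```python
import re

# Matches ANSI SGR sequences such as ESC[36m (Julia prints ESC as \e).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")
# Hypothetical pattern for the phrasing Julia's documentation lookup uses.
JULIA_TYPE = re.compile(r"is of type (\w+)")


def extract_julia_type(text):
    """Return the type name mentioned in Julia-style inspect text, or None."""
    plain = ANSI_ESCAPE.sub("", text)
    match = JULIA_TYPE.search(plain)
    return match.group(1) if match else None


# Shortened sample based on the text/plain entry shown earlier in the thread.
sample = "  No documentation found.\n\n  \x1b[36ma\x1b[39m is of type \x1b[36mInt64\x1b[39m.\n"
found_type = extract_julia_type(sample)
print(found_type)
```

A pattern like this would have to be maintained per kernel, which is the language-specific parsing concern raised above.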

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

Is the inspect request just sending some Julia code to eval in the kernel? What exactly is being sent? Or is there a special inspect Jupyter message?

The inspect request is a special jupyter kernel message.

The docs for it are here:
https://jupyter-client.readthedocs.io/en/stable/messaging.html#introspection

We send the inspect_request with the 'code' value set to the name of the variable.
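
For comparison, a sketch of the content of an inspect_request as the messaging spec describes it: the code, a cursor position within that code, and a detail level. The helper name is made up, and defaulting the cursor to the end of the code is an assumption of this sketch, not what the extension sends.

```python
def make_inspect_content(code, cursor_pos=None, detail_level=0):
    """Build the content dict of an inspect_request message."""
    if cursor_pos is None:
        cursor_pos = len(code)  # assumption: place the cursor at the end of the name
    if not 0 <= cursor_pos <= len(code):
        raise ValueError("cursor_pos must lie within the code string")
    if detail_level not in (0, 1):
        raise ValueError("detail_level must be 0 or 1")
    return {"code": code, "cursor_pos": cursor_pos, "detail_level": detail_level}


content = make_inspect_content("a")
print(content)
```

How a kernel interprets cursor_pos is up to that kernel, so different positions may resolve to different tokens in different kernels.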

@davidanthoff

What are you sending as the cursor_pos in the inspect message? To me, the stack trace in your screenshot looks like a string indexing error inside get_token (https://github.com/JuliaLang/IJulia.jl/blob/9b10fa9b879574bbf720f5285029e07758e50a5e/src/handlers.jl#L195).

Is there any chance you could let me know the line number of the stack trace item that I marked with a red error from your screenshot?

[image: screenshot of the stack trace with one item marked]

Namely the line number of the stack trace in get_token.

@rchiodo
Contributor

rchiodo commented Jan 12, 2021

We set cursor_pos to zero.

Trace for the get_token is at line 217:

" [4] get_token(::String, ::Int64) at C:\Users\Rich\.julia\packages\IJulia\IDNmS\src\handlers.jl:217"

@greazer greazer changed the title DS: Support other languages General discussion about support for other languages than just Python Apr 1, 2021
@greazer greazer added the language-any Area covering general issues geared to supporting any language (not just Python) label Aug 4, 2021
@rchiodo
Contributor

rchiodo commented Aug 30, 2021

Closing this as other languages are supported in the new UI (with other extensions supporting their own kernels and language servers)

@rchiodo rchiodo closed this as completed Aug 30, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 7, 2021