New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to restore script tags in trusted HTML output #3118

Closed
minrk opened this Issue Oct 18, 2017 · 126 comments

Comments

Projects
None yet
@minrk
Contributor

minrk commented Oct 18, 2017

I've encountered a few notebooks while testing out JupyterLab where their output is silently omitted due to the sanitization of script tags even in trusted HTML. How can we restore the evaluation of script tags in trusted HTML output?

@blink1073

This comment has been minimized.

Member

blink1073 commented Oct 18, 2017

We removed execution of scripts in #2595.

This was done for improved security and because the public JavaScript APIs have changed from the classic notebook. For instance: no global require or $.

The intent is to use custom mimetypes that are handled by JupyterLab extensions.

We can re-address if needed, but would prefer to not allow arbitrary script execution when there is a chance to correct the behavior.

cc @sccolbert @ellisonbg @jasongrout

@minrk

This comment has been minimized.

Contributor

minrk commented Oct 19, 2017

Thanks! I hope this decision is reconsidered, at least for text/html. I don't think the security issue is relevant, since there's already an explicit trust of the output being produced in this case.

Custom extensions make sense for big, complicated outputs like bokeh/vega/etc., where the investment in writing, installing, and maintaining an extension is worthwhile. I'm also sympathetic to disabling the javascript output, because there's an implicit API regarding what's available in the js context in order to put things on the page. But I don't think that applies to script tags in text/html. It seems like wild overkill to need to write and install extension packages to hook up one click event to an HTML element. These are cases I wouldn't ever consider defining custom mimetypes and writing custom packages for, so it's strictly a lost feature relative to to the existing notebook.

If need be, I can write a general purpose mime-type handler for text/html that respects the html output that users ask Jupyter to display, but this seems like it should be the default behavior to me.

@ellisonbg

This comment has been minimized.

Contributor

ellisonbg commented Oct 20, 2017

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Oct 20, 2017

@rgbkrk

This comment has been minimized.

Member

rgbkrk commented Oct 20, 2017

Thanks for adding me to the thread @ellisonbg!

Script tags in HTML output are equivalent to JS output in that they have access to the same global scope. Several packages use script tags in HTML output and of those, they pretty much all used the classic notebook's available require and define without checking to see if they existed. This includes both Bokeh and Plotly, which also have jupyterlab extensions.

nteract (desktop)'s approach

  • We create a document fragment for the HTML, which does run the JS in the expected way
  • 🔜 We're likely to put HTML under a shadow root (shadow dom) for CSS scoping

What I wish for

Someday I'd like to be able to window large notebooks. The only way this is possible is if there are clean semantics for when an output is displayed. If it's always global, it means we can't load from the middle of the document. As cells are loaded and unloaded, we'd want the state to remain idempotent while viewing. That idempotency would also be a nice invariant for realtime, as we want to ensure a consistent view amongst users.

@minrk

This comment has been minimized.

Contributor

minrk commented Oct 23, 2017

Could it be handled as an IFrame with srcdoc, or similar? i.e. a 'full' HTML document, but enforce that the scope is local per output and has no access to the surrounding context, but can do anything HTML can do within the context of a single output? That could allow us to more strictly and clearly define the scope to which output scripts have access. Losing cross-output context is probably more reasonable than losing access to javascript in HTML as a whole.

I suspect there will be plenty of cases where require is tough to beat as the simplest way to accomplish a basic output with a javascript library in a generic and portable way:

<div id="uuid"/>
// maybe even force require itself to be loaded via a script tag
<script type="text/javascript">
require(['jslib/from/cdn'], function (lib) {
  lib.doStuffAt('uuid');
})
</script>

Defining a mimetype and writing/installing a separate extension for every existing javascript library for each of nteract, jupyterlab, and nbconvert/nbviewer is a pretty big hurdle, especially when what works right now in the existing notebook + nbconvert is so simple and general.

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Oct 23, 2017

I mean, there really only needs to be one JLab extension that a user can install, which would enable arbitrary script execution. The question is whether we want to enable it by default, given that a) it has security huge implications and b) the DOM structure of JLab is explicitly not public API.

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Oct 23, 2017

Using <iframe> around an output is an option, but it's really expensive and won't scale to large notebooks.

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Oct 23, 2017

Another options would be to have a user setting that allows script execution, and make it be opt-in.

@minrk

This comment has been minimized.

Contributor

minrk commented Oct 25, 2017

Another options would be to have a user setting that allows script execution, and make it be opt-in.

That sounds good. I think opt-in at the rendermime library level sounds good, though I also think the default JupyterLab application should opt-in to this by default.

@dopplershift

This comment has been minimized.

dopplershift commented Nov 3, 2017

I just want to add that this is a huge UX issue for me. Matplotlib 2.1 just added support for rendering animations in javascript (based on JSAnimation). I was really confused by the fact that my animations were broken in Jupyterlab. This confusion tripled when I found they worked perfectly in the regular notebook.

I don't have much of an opinion on security and whatnot, just a strong opinion that:

  1. It should be clear what happened. Maybe a warning when a <script> is discarded or something, but silent failure is a support nightmare
  2. It needs to be really clear how to make it work.
@scottdraves

This comment has been minimized.

Contributor

scottdraves commented Dec 5, 2017

Hi we really need JS execution. Currently it sounds like we would implement a custom mime type, application/javascript?

@jasongrout

This comment has been minimized.

Contributor

jasongrout commented Dec 5, 2017

You could do something like that, or you could implement a custom mimetype that executes the js you want to execute (if you know the js you want to execute in advance).

@scottdraves

This comment has been minimized.

Contributor

scottdraves commented Dec 5, 2017

Thanks. For the %%javascript magic it's not known in advance...

@JoaoFelipe

This comment has been minimized.

Contributor

JoaoFelipe commented Dec 5, 2017

In my opinion, disabling the javascript makes it much harder to develop new interactions for the notebook.
Since JupyterCon, I managed to port most of my existing code that uses javascript to mime-renderers, but I think that adding features to these mime-renderers is much more painful than working inside the notebook and moving the code to an external script afterwards.

A recent example: I have a tool that creates and displays citation graphs in Jupyter: https://github.com/JoaoFelipe/snowballing/blob/master/example/notebooks/CitationGraph.ipynb
By default, this tool creates the graph in svg, so we don't have to worry too much about the Javascript.
However, a friend wanted to hover a graph node and show all the nodes connected to it. To make it work, we had to use display(Javascript(...)) to implement it.
We probably wouldn't implement a mime-renderer in this case. The graph doesn't change too much after configuring it with widgets and it would be easier just to load the svg in a separate page and play with the browser js.

@FalseProtagonist

This comment has been minimized.

FalseProtagonist commented Jan 29, 2018

I don't have the right to a strong opinion really, but I am very against this.

Seems like part of the reason why this ecosystem is currently big, and for the success of the original notebooks was the "just do it", openness to being hacked around and extended. I know it's not only python but python is driving a lot of this, and in the core of the language it opposes handcuffing people. No type restrictions, no real private fields etc. I don't think it's coincidence, I think that this philosophy fits with data science.

If you allow in-app javascript, you effectively have a hackable editor in the vein of emacs, which leads to incredible power. Also some mess, but ask emacs users if they're rather lock up customization.

@timkpaine

This comment has been minimized.

Contributor

timkpaine commented Jan 29, 2018

If you deploy in a trusted environment, it's really really easy to reenable the JavaScript, you uncomment a single function and it's two uses in the compiler assets. I personally am growing to like the use of extensions, it makes it easier to control what's happening and prevent my users from hacking up JavaScript I can't control. Also 95% of what I've wanted to do has been trivial within the extensions architecture. With more and better examples (many more people building these now than initially, it was much harder on 0.3.0!) I think the structure imposed by the extensions will be a good thing.

@FalseProtagonist

This comment has been minimized.

FalseProtagonist commented Jan 29, 2018

My last post on this:

I don't think saying that you can mess with the compiled assets is a constructive response. If you're in any devops environment it has to be reproducible and you also want to have an automated process to nuke and redownload, so currently you'd have to either put a hacky script in to do the modification or fork the whole thing?

I'm absolutely certain that as you say, if you're using this often, and administrating users, then the pain of learning to make extensions with no raw js somewhat goes away. The target audience of notebooks though is as casual users as you can get, and closing off a load of incredibly useful and powerful options to them until they get an intimate knowledge of the codebase, custom js framework and extension structure seems obviously wrong.

@ellisonbg

This comment has been minimized.

Contributor

ellisonbg commented Jan 29, 2018

@timkpaine

This comment has been minimized.

Contributor

timkpaine commented Jan 29, 2018

@FalseProtagonist not sure if you meant that to be as hostile as it was, but I run a production deployment of this and have a reproducible build (including patches, custom extensions, other integration changes). So I don't have "hacky" anything. If this is something you or anyone might be interested in, I'm always more than willing to share whatever knowledge i've gained on the gitter.

@ellisonbg there was some talk of a generic js renderer. One issue with this though is other extension developers getting lazy and just marking that as a dependency. Even via configs, you don't want to allow extension developers to just say "and flip this config variable to enable this package" or there's no reason to even have it in the first place.

@dopplershift

This comment has been minimized.

dopplershift commented Jan 29, 2018

@ellisonbg We were using this to get Matplotlib animations rendered in the notebook. To reiterate, here's what I want:

  1. It should be clear what happened. Maybe a warning when a <script> is discarded or something, but silent failure is a support nightmare and terrible UX.
  2. It needs to be really clear how to make it work.
@timkpaine

This comment has been minimized.

Contributor

timkpaine commented Jan 29, 2018

@dopplershift there's a console warning, but something user facing might be nice

@dopplershift

This comment has been minimized.

dopplershift commented Jan 29, 2018

Given that we’re talking about users opening notebooks that worked fine before in the regular notebook, I think that’s a necessity.

@timkpaine

This comment has been minimized.

Contributor

timkpaine commented Jan 29, 2018

I think the purging happens in JS only, not on the python side, but it would be nice to include something in the output cell. Maybe we can put some type of error flag/text on the output cell when JS gets purged in addition to the warning

@ellisonbg

This comment has been minimized.

Contributor

ellisonbg commented Jan 30, 2018

Thanks everyone for the feedback, I think it would be a good idea to discuss options for this at the JupyterLab team meeting this Friday at 9am PST. Folks should be on the gitter channel then if someone wants to join and needs the link.

@fm75

This comment has been minimized.

fm75 commented Apr 30, 2018

@ellisonbg
"the problem with warnings is that many users will live with them for years, without ever telling us that it is a problem. We wanted something initially that would force users to come and give us feedback (it is working!). We realize that is frustrating you all, but from our perspective getting solid feedback like this is very non-trivial and helps us to understand the right balance of security and functionality."

I appreciate your position on this. That said, I spent several hours this weekend working quite happily in jupyter lab until I tried to inject some "animation" into my notebook. Having little experience with this I tracked down some examples on matplotlib which failed to work. After spending about 4 fruitless hours with no success in jupyterlab, I finally decided to just start jupyter notebook with a fairly simple example, where I found that the example worked as expected. #4492

Had I just been moving an existing notebook project into lab, I would have discovered this much quicker. Since jupyerlab beta has been described as ready for users, but beta for extension builders, I had a higher expectation.

If possible, detection in jupyterlab, and reporting it to the user with a request to file a bug report would be much better, IMO.

@jasongrout

This comment has been minimized.

Contributor

jasongrout commented Apr 30, 2018

It would be interesting to see if the matplotlib animation js works in JupyterLab even with the script execution enabled. Browsing that code, I can see there are problems in that code when having multiple versions of that code on the page at the same time, since it is defining global variables (which works in the classic notebook case where you only have one notebook on the page, but cannot be assumed to work in JupyterLab where you can have multiple versions of the code through different notebooks talking to different kernels).

Unfortunately, much of the js code that libraries are pushing to the classic notebook in script tags is very specific to the classic notebook, and as such should be considered a lightweight javascript extension of sorts of the classic notebook that needs to be ported over. Even if such js doesn't use classic notebook-specific functions, many libraries (like matplotlib animations) make the implicit assumption that there is only one notebook on the page. Once a library makes assumptions like this about being run in a specific frontend (such as the classic notebook), it's not guaranteed to work in other frontends such as jupyterlab, nteract, etc.

In short, enabling script tags in html does not solve the real problems of libraries writing javascript code assuming it is running in the classic notebook, or running in the constraints of the classic notebook (such as only having a single notebook on the page).

I agree that this discussion and these reasons should be much easier to find for users. It should not have taken 4 hours for you to realize that this problem is not your code, but deeper issues with porting libraries to work in JupyterLab.

@dopplershift

This comment has been minimized.

dopplershift commented Apr 30, 2018

Which is why my original argument was:

  1. It should be clear what happened. Maybe a warning when a <script> is discarded or something, but silent failure is a support nightmare and terrible UX.
  2. It needs to be really clear how to make it work.

@jasongrout makes good arguments why matplotlib's current support is fundamentally broken in jupyter lab, and that's fine. But doing this silently was...misguided at best, and I find it really surprising that this silent breakage has persisted this long.

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Apr 30, 2018

Maybe a warning when a <script> is discarded or something,

Warnings are logged to the browser console.

https://github.com/jupyterlab/jupyterlab/blob/master/packages/rendermime/src/renderers.ts#L64-L66

@dopplershift

This comment has been minimized.

dopplershift commented Apr 30, 2018

So the user experience here, one targeted largely at people who are scientists first and not developers, is that they should be figuring out how to open their browser's developer tools and check the javascript console? That's standard workflow for people like you and I, but it's not remotely trivial for many of the scientists I work with.

@sccolbert

This comment has been minimized.

Contributor

sccolbert commented Apr 30, 2018

That comes across as a bit passive aggressive and disingenuous. Of course the current state is not the end state. Several core team members (me included) have acknowledged the issue, stated the reasoning behind it, and also stated that the situation can be improved.

@dopplershift

This comment has been minimized.

dopplershift commented Apr 30, 2018

It was completely passive aggressive--and that's because I'm struggling to formulate something more constructive.

<rant>
While acknowledging the problem and stating that the situation can be improved is nice, I don't see any notion of "we should really address the error handling now so more of our users don't waste time on this." @jasongrout's recent comment is a notable outlier. Given that this issue has persisted for 6 months with no action, I get a frustrating feeling that no one actually thinks the silent error behavior (javascript console notwithstanding) is all that much of a problem.

To be clear, I'm not upset about having my stuff broken; I'm upset that several person-hours burned by poor error communication is seemingly met with a <shrug>.

Is this really the behavior that should be shipping if Jupyter Lab is "ready for users"?
</rant>

I'd submit a PR, but I'm not remotely functional enough in JavaScript/TypeScript. Would it work to replace offending tags with a visible note (e.g. "This output contains javascript which will not work in Jupyter Lab.")? Or maybe add it if <script> is detected? Is there something I'm missing as to why this won't work?

@jasongrout

This comment has been minimized.

Contributor

jasongrout commented Apr 30, 2018

Would it work to replace offending tags with a visible note (e.g. "This output contains javascript which will not work in Jupyter Lab.")? Or maybe add it if <script> is detected? Is there something I'm missing as to why this won't work?

Thanks for the constructive concrete suggestion about how things can be improved!

Perhaps your idea can be implemented in a fairly straightforward way:

  1. Uncomment
    // TODO - arbitrary script execution is disabled for now.
    // Eval any script tags contained in the HTML. This is not done
    // automatically by the browser when script tags are created by
    // setting `innerHTML`. The santizer should have removed all of
    // the script tags for untrusted source, but this extra trusted
    // check is just extra insurance.
    // if (trusted) {
    // // TODO do we really want to run scripts? Because if so, there
    // // is really no difference between this and a JS mime renderer.
    // Private.evalInnerHTMLScriptTags(host);
    // }
    , and change it to call a new "logScriptExecutionError" function (name can definitely be improved! :).
  2. Make this new function, based on
    // This is disabled for now until we decide we actually really
    // truly want to allow arbitrary script execution.
    /**
    * Eval the script tags contained in a host populated by `innerHTML`.
    *
    * When script tags are created via `innerHTML`, the browser does not
    * evaluate them when they are added to the page. This function works
    * around that by creating new equivalent script nodes manually, and
    * replacing the originals.
    */
    // export
    // function evalInnerHTMLScriptTags(host: HTMLElement): void {
    // // Create a snapshot of the current script nodes.
    // let scripts = toArray(host.getElementsByTagName('script'));
    // // Loop over each script node.
    // for (let script of scripts) {
    // // Skip any scripts which no longer have a parent.
    // if (!script.parentNode) {
    // continue;
    // }
    // // Create a new script node which will be clone.
    // let clone = document.createElement('script');
    // // Copy the attributes into the clone.
    // let attrs = script.attributes;
    // for (let i = 0, n = attrs.length; i < n; ++i) {
    // let { name, value } = attrs[i];
    // clone.setAttribute(name, value);
    // }
    // // Copy the text content into the clone.
    // clone.textContent = script.textContent;
    // // Replace the old script in the parent.
    // script.parentNode.replaceChild(clone, script);
    // }
    // }
    , but instead of adding the script tag to the document to execute, add some content explaining the error and maybe a link to this discussion?
@jasongrout

This comment has been minimized.

Contributor

jasongrout commented Apr 30, 2018

or maybe way more simply: insert some code at

console.warn('JupyterLab does not execute inline JavaScript in HTML output');
, where the current error is triggered, that adds the error message to the rendering.

@dopplershift

This comment has been minimized.

dopplershift commented Apr 30, 2018

I think I've just been suckered. I suppose I can bang javascript rocks together until I get something resembling a tool. 😁

@jasongrout

This comment has been minimized.

Contributor

jasongrout commented Apr 30, 2018

I would go with the second simpler suggestion. Perhaps just doing something like creating a div with an error message, and doing host.appendChild(errorDiv) would be enough. Or maybe it's better to prepend the error message.

Thanks!

@bensteers

This comment has been minimized.

bensteers commented May 2, 2018

I absolutely love JupyterLab .....except for this. It's been frustrating me for a while and it throws a wrench into my workflow because whenever I need javascript in my notebook, I have to close my current server and spin up a notebook server, and after getting used to JupyterLab (again, it's great <3), I'd really rather not have to go back to regular notebooks.

I just don't get, how is it any more secure for me to run my notebook on an old notebook server as opposed to lab? I'm going to run it either way, this is just making it more difficult and a less enjoyable experience. I get that you have security concerns, and I respect that you want to be cautious, and I am fine with your decision. But you should really have a IDGAF option to allow for JS execution (instead of just commenting it out), because I want to be able to migrate away from old notebooks as soon as possible and this is literally my final hurdle. Didn't mean to rant, I just feel strongly about this lol.

Since you're asking what ppl are using js for in notebooks, I'd say primarily ipywidgets and d3, and occasionally matplotlib interactive.

@DavidPowell

This comment has been minimized.

DavidPowell commented May 2, 2018

I just wanted to chime in with my use case. I'm creating custom javascript to animate SVG elements (which were also within in the notebook via the tikz-magic). I'm essentially using the Jupyter Notebook as an authoring tool for html output, which is generated via nbsphinx. Python code is used to generate some figures, but is ultimately hidden in the final output. Here is an example of the kind of output I'd like to achieve (obviously in practice animations would be more meaningful and useful).

For this purpose the ability to inject arbitrary javascript into the output is essential, and the need for a custom plugin is stopping me from moving over from jupyter notebook to jupyterlab. I think an opt-in ability to render arbitrary javascript would be sufficient for my use case.

While this kind of document authoring may not be a core use-case of Jupyter, I've yet to find anything else which gives me such nice integration of the outputs with markdown text, mathjax equations and figures generated from python code.

@florian-rabe

This comment has been minimized.

florian-rabe commented May 2, 2018

@ellisonbg

For all who are struggling with this - it would be super helpful if you can describe your usage case.

I want to write a kernel that provides HTML output with Javascript event handlers for those HTML elements.
The HTML output would have no interaction with the rest of the page (except for possibly multiple of my output cells interacting with each other).

By the way, I'd also like to share Javascript between all my output cells, i.e., serve it only once (e.g., the first time my kernel is used on the page) and then use it in every output cell.
I understand that may be difficult for other reasons but I think it's worth mentioning in this context.

@scottdraves

This comment has been minimized.

Contributor

scottdraves commented May 3, 2018

Hey, I just released an experimental JS mime type handler (a fix for %%javascript). You can install it with:

jupyter labextension install beakerx-jupyterlab-js

This is definitely not a general solution but we have found it useful.

This is a prerelease of a component (first of several) which we are breaking out of BeakerX to make separately installable.

Here's an example:
screen shot 2018-05-02 at 10 24 30 pm

@gnestor

This comment has been minimized.

Contributor

gnestor commented May 3, 2018

Looks like @scottdraves beat me to it! I just submitted a PR to jupyterlab that adds a javascript-extension to the core. It does exactly what beakerx-jupyterlab-js does, but it exposes a couple globals such as element, document, and window. This will allow notebook users to do things like plot data using d3.js or other JS libraries (using the element global which references the output DOM node). Screen capture and demo notebook at #4515.

I hope that this satisfies at least 60% of the use cases that require running inline Javascript in a notebook (in JupyterLab). We can iterate on this PR and try to satisfy the remaining use cases that don't involve interacting with APIs from the classic notebook.

@akhmerov

This comment has been minimized.

Member

akhmerov commented May 3, 2018

Are extensions executed on untrusted output? If so, the javascript extensions seem to render the trust mechanism ineffective and make the installation vulnerable.

@vidartf

This comment has been minimized.

Member

vidartf commented May 3, 2018

It should be up to the extension to correctly handle trust, i.e. mark itself as unsafe, and check model.trust before executing any code (if not, render an error message).

@alimanfoo

This comment has been minimized.

alimanfoo commented May 8, 2018

Hi, apologies I haven't had time to read the full thread here, but in case it helps here is another use case. In the library zarr we have a function which renders some HTML with an embedded script tag to generate an interactive tree. We pull in the jstree library to do this. The current solution works in notebooks and also nbviewer, but not in jupyterlab because of the script tag. It assumes require is present but requires jquery if not already present. I'd be happy to do something different if there is a solution for all three platforms, would appreciate any advice.

@scottdraves

This comment has been minimized.

Contributor

scottdraves commented May 9, 2018

@gnestor that looks awesome, thank you! 🐬

@fm75

This comment has been minimized.

fm75 commented May 10, 2018

FWIW. I found a work-around for what I needed to do that works in jupyter lab.

from IPython.display import clear_output

for image in images:
    clear_output(wait=True)
    plt.imshow(image, cmap=plt.cm.bone)
    plt.pause(0.03)

It is not interactive, but I probably could use some standard ipywidgets if I need to control the image display more than just run a "video".

@evertheylen

This comment has been minimized.

evertheylen commented May 28, 2018

For those that, like me, just wanted to insert some quick javascript into their HTML, remember that stuff like onclick and onerror still works. I used the following:

<img src=x onerror="javascript:(function(){ 
    alert('hello world'); 
})()" style="display:none"/>

Use at your own disgust ;)

@vidartf

This comment has been minimized.

Member

vidartf commented May 29, 2018

Is this closed by #4515 ?

@gnestor

This comment has been minimized.

Contributor

gnestor commented May 29, 2018

It is 👍

@gnestor gnestor closed this May 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment