Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Html viewer #5855

Merged
merged 24 commits into from Jan 26, 2019
Merged

Html viewer #5855

merged 24 commits into from Jan 26, 2019

Conversation

@ian-r-rose
Copy link
Member

@ian-r-rose ian-r-rose commented Jan 11, 2019

Fixes #2369.

This PR is in two parts which are fairly separable if we want to go that way.

HTML Document Renderer

The first part builds on the jupyterlab_html mimerenderer that @mflevine wrote to make a general HTML file viewer. It opens local HTML files in an iframe as a MainAreaWidget. There are a few things that make this a bit tricky:

  • We would like to still live-edit the HTML file, so we need to work with the live text model.
  • On the other hand, we would like the HTML file to be able to properly resolve local CSS and image assets on the server.
  • Finally, we want to make sure we don't introduce a security hole by allowing arbitrary HTML.

The approach I took to satisfy these constraints was to

  1. Sandbox the iframes. This disallows most unsafe interactions, including embedding JS that executes code on the server via the Jupyer REST APIs. I have allowed for one exception to the sandboxing: we allow same-origin-requests so that the HTML can resolve CSS and images on the server.
  2. In order to resolve local resources while still having the live data model, I am manually inserting an HTML <base> element which indicates the file's download URL on the server. This seems to work pretty well!

html-editor

IFrame Extension

The second part of the PR adds a new "IFrame" main area widget, which allows the user to choose an external URL to embed in an iframe. This can be used, e.g., to embed external documentation in a JupyterLab pane. As above, this iframe is sandboxed for security reasons, so not everything will work.

I have found that many external sites disallow iframing, so overall this portion of the PR feels a bit less useful than the first. If we like it, it can go in here, or it can be broken into a separate PR.

iframe

to suppress pointer events while resize drags are occuring. There may be a
better solution to this problem.
*/
body.p-mod-override-cursor .jp-HTMLViewer {
Copy link
Member

@timkpaine timkpaine Jan 11, 2019

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 11, 2019

I am starting to think that the arbitrary-url iframe functionality is better left to an extension like @timkpaine's. As it stands, there can be a lot of CORS, http/s, and disallowed iframe-ing issues without a server-side component (which https://github.com/timkpaine/jupyterlab_iframe has).

@timkpaine, do you think any of the content in e7037ba would be useful to migrate to your extension?

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 11, 2019

Some discussion of trusting HTML files so that JS may be executed here. @mflevine brings up the good point that there are instances where not being able to run JS is not good enough, e.g.e a D3 dashboard. We may be able to follow the same security model as the notebook, where a user explicitly "trusts" an HTML file.

* to the server, thereby executing arbitrary Javascript.
*
* Here, we sandbox the iframe so that it can't execute Javsacript
* or launch any popups. We allow one exception: 'allow-same-origin'
Copy link

@blois blois Jan 11, 2019

allow-same-origin allows document traversal between pages with the same origins- it does not (AFAIK) impact resource loading within the iframe. If it is indeed not strictly needed then it's probably better omitted as it becomes a more significant issue if combined with allow-scripts, or maybe clarifying.

Copy link
Member Author

@ian-r-rose ian-r-rose Jan 11, 2019

Thanks for the information @blois! i'd certainly like to make this as strict as possible, so if resource loading is not affected here, I'd just as soon remove the exception.

I'm currently not allowing scripts, but am trying to think through what the circumstances might need to be to allow them. I am currently thinking something around explicit user trusting of the file may be okay. It's hard to be completely safe, since the whole point of the application is running code on the server. Do you have any insights there?

Copy link

@blois blois Jan 11, 2019

Since you're just assigning it via a Blob then using a srcdoc iframe along with the sandbox attribute is probably the best, as it's the easiest way to give your iframe a separate origin.

Copy link
Member Author

@ian-r-rose ian-r-rose Jan 11, 2019

In my testing there were a number of fetches of local resources that failed when I didn't include 'allow-same-orign', so I'm concerned that's too strict of a sandboxing.

@timkpaine
Copy link
Member

@timkpaine timkpaine commented Jan 11, 2019

@ian-r-rose I'll poach the phosphor hover stuff, otherwise i agree that it's best to keep it as a local html viewer for now not a generic site viewer

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 11, 2019

Okay, I have removed the arbitrary URL functionality for the time being.

I have also added the concept of "trust" to the HTML file, which allows users to enable script execution if they choose.

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 12, 2019

@mflevine, if you get the chance, could you check whether the "trust" setup works for your use-case?

@@ -36,6 +96,7 @@ namespace Private {
export function createNode(): HTMLElement {
let node = document.createElement('div');
let iframe = document.createElement('iframe');
iframe.setAttribute('sandbox', '');
Copy link
Contributor

@bollwyvl bollwyvl Jan 19, 2019

Cool. Empty is almost brokenly "secure," but we want developers to opt their users into this thing only with care.

Let's keep going with the security model.

Setting the referrerPolicy to the "annoyingly secure by default," no-referrer and again offer an enumeration of other possible values seems like a good way forward.

Doesn't work on IE/Edge, but meh.

Looks like csp isn't ready for primetime.

Copy link
Member Author

@ian-r-rose ian-r-rose Jan 19, 2019

Sure, that makes sense to me. I'll follow basically the same model.

/**
* Exceptions to the iframe sandboxing policies.
* These are specified here:
* https://www.w3schools.com/tags/att_iframe_sandbox.asp
Copy link
Contributor

@bollwyvl bollwyvl Jan 19, 2019

@bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 19, 2019

Great stuff here. Very important for integrating with existing UIs.

Specifically on the iframe extension: yerp, we're basically building a browser in lab. It was going to happen.

i'd vote for:

  • a security dropdown of checkables for enabling/disabling security/privacy measures
  • eschewing the URL modal for an inline text box in the tool bar which updates when the url changes (not always possible with sandbox)
    • knowing which URL you end up at is pretty key
  • adding a forward/back button (again, won't always work)

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 19, 2019

Thanks for the review @bollwyvl.

I've dropped the ability to input arbitrary URLs for now, for a lot of the complicating reasons you note. So for now, I'd like to defer building out a more general UI for iframes. Do you think the current status is okay for just editing local HTML files?

@bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 20, 2019

@ian-r-rose i'm checking this out locally right now!

@bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 20, 2019

So the local editing mode is interesting as it is. I am still a little concerned by the default 'trusted' stance (with nothing in between) of allow-same-origin and allow-scripts, which is highlighted by:

<html>
<head>
    <!-- i guess we want always want these to work -->
    <link rel="stylesheet" href="/static/style/style.min.css" />
</head>
<body>
    <h1>pwnt?</h1>
    <h2>parent</h2>
    <div class="well">
        <!-- i don't like these much -->
        <script>
            try {
                document.write(window.parent.document.head.innerHTML.match(/token.*\n/));
            } catch(err) {
                document.write(err);
            }
        </script>
    </div>
    <h2>referrer</h2>
    <div class="well">
        <!-- this doesn't seem to work much -->
        <script>document.write(window.document.referrer);</script>
    </div>
    <h2>cookies</h2>
    <div class="well">
        <!-- fair amount of leakage -->
        <script>document.write(window.document.cookie);</script>  
        <script>document.write(window.parent.document.cookie);</script>       
    </div>
    <h2>alert</h2>
    <script>
        alert("try")
        
    </script>
    <h2>alert harder</h2>
    <script>
        window.parent.document.querySelector("iframe").sandbox = (
        window.parent.document.querySelector("iframe").sandbox + ' allow-modals' )
        alert("try")
    </script>
</body>
</html>

On the up-side: this is a whole app deployment environment that can do anything as you.

On the down-side: this is a whole exploit deployment mechanism that can do anything as your victim.

This is the kind of thing I put big nasty labels on Jyve for, because it's basically a few clicks to defeat all the other attempts at security that lab has.

Basically, allow-same-origin and allow-scripts means you can reach out and change your security model, which means the rest of the security measures can be defeated.

Talking back and forth "securely" is possible with postMessage, i guess...

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 20, 2019

Thanks for the example @bollwyvl.

Ultimately you and @blois are probably right that allow-same-origin is probably best to remove. My goal had been to make it so that I could locally render the built jlab docs without anything failing to load, and I needed to include that to make certain contents like fonts get on the page.

So a more secure, but still useful model would be

  1. For untrusted content, use complete sandboxing.
  2. For trusted content, use allow-scripts, and no other exceptions.

Regarding your example: these are definitely security concerns, but can I not do the exact same things in an HTML output of a notebook cell? It's my understanding that the colab team sandboxes their outputs for this reason.

@bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 21, 2019

that to make certain contents like fonts get on the page.

Yeah, they, as a sigificant vector in their own right (right up there with gpu shaders), are a ready-made delivery mechanism for exploits. I should do some more exploring.

but can I not do the exact same things in an HTML output of a notebook cell?

Oh, I thought those were still being well and clobbered. Turns out...

<script>alert(document.cookie)</script>

...does what it says on the tin.

For the purposes of live-hacking, it's still of course to important to have iframes, but only for implementation concerns like document scope isolation (e.g. css). We're not going to find any security here.

So yeah: I'd say the DX of being able to live edit HTML is very positive, and if that's all we're really talking about on this one, it's fine to move forward.

On the security side, we're stuck with the old excel "are you sure you want to give this worksheet read/write access to your registry... or not see anything at all?" Starting pessimistic, and letting the user adjust the settings (quickly) is probably the best we can do at this point!

@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 21, 2019

Yeah, I think we should table the implementation of opening arbitrary URLs for now and focus this around live-editing local HTML.

For external URLs, I'd point people to @timkpaine's extension.

@jasongrout jasongrout self-requested a review Jan 26, 2019
tooltip={`Whether the HTML file is trusted.
Trusting the file allows scripts to run in it,
which may result in security risks.
Only only enable for files you trust.`}
Copy link
Contributor

@jasongrout jasongrout Jan 26, 2019

Delete one "only"

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jan 26, 2019

@ian-r-rose removed the allow-same-origin constraint from the defaults. It appears that it is still possible to load images from the local server (i.e., you can have an <img src="file.jpg">, but not possible to load fonts from the same origin (which breaks loading font-awesome). We're doing this to be conservative. If this causes problems for people, we can revisit this.

The reason to disallow allow-same-origin is to prevent requests to the jupyter server rest api, launching kernels, etc.

and as such are more secure than trusted notebooks.

Fix typo.

Fix logic.
@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jan 26, 2019

Thanks!

@jasongrout jasongrout merged commit cd0664a into jupyterlab:master Jan 26, 2019
2 of 3 checks passed
@ian-r-rose
Copy link
Member Author

@ian-r-rose ian-r-rose commented Jan 26, 2019

Thanks to @blois, @bollwyvl, and @jasongrout for discussions about user experience and security.

@ian-r-rose ian-r-rose deleted the html-viewer branch Jan 26, 2019
timkpaine added a commit to timkpaine/jupyterlab_iframe that referenced this issue Feb 20, 2019
@timkpaine
Copy link
Member

@timkpaine timkpaine commented Feb 20, 2019

@ian-r-rose just FYI, your CSS worked for drag/drop but it didn't work when adjusting the relative sizes of the docked widgets. So I added this which seems be fixing that issue.

@timkpaine
Copy link
Member

@timkpaine timkpaine commented Apr 24, 2019

@ian-r-rose fwiw i've finished the proxy
#6235 (comment)

@lock lock bot locked as resolved and limited conversation to collaborators Aug 7, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

7 participants