Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nbconvert HTTP service #4656

Merged
merged 18 commits into from Dec 20, 2013
Merged

Nbconvert HTTP service #4656

merged 18 commits into from Dec 20, 2013

Conversation

takluyver
Copy link
Member

This brings back print preview and 'download as' (currently Python, HTML and reST are in the menu).

New URL patterns:

  • GET /nbconvert/<format>/path/to/Notebook.ipynb
  • POST /nbconvert/<format> with the JSON notebook model as the body data.

nbconvert exporters gain a mime_type traitlet, used in the HTTP response headers so the browser knows what type of file it's getting.

<!-- <li id="download_py"><a href="#">Python (.py)</a></li> -->
<li id="download_py"><a href="#">Python (.py)</a></li>
<li id="download_html"><a href="#">HTML (.html)</a></li>
<li id="download_rst"><a href="#">reST (.rst)</a></li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I would make it a submenu.
  2. It would be nice to have it dynamic based on a list somewhere on server side.
  3. 1 and 2 could be done in subsequent PRs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupid me it is already a submenu, would change it's nate to Save and download as...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submenu name: the model we're pushing (according to @minrk) is that the file on disk is always up to date, and user interaction is only for checkpoints. So I think calling it 'save and download as' would be confusing (e.g. is that '(save and download) as'?)

I'm not convinced about building it dynamically: I think there will be exporters exposed that you don't necessarily want in the menu, in which case it's an extra level of complexity to configure which ones are exposed. We may later want to dynamically list some options depending on whether the tools are there to support it (e.g. list PDF if LaTeX is installed), but I think that's best done in a separate PR.

@Carreau
Copy link
Member

Carreau commented Dec 7, 2013

I suppose one could look at the key in exporter_map and make exporter_map a configurable also.
We should also think that people might just want to open the link and not download for html ones.

@Carreau
Copy link
Member

Carreau commented Dec 7, 2013

Need a rebase also.

@takluyver
Copy link
Member Author

The 'print preview' option, just above the submenu, will open the HTML view in a new window/tab, without downloading it.

@takluyver
Copy link
Member Author

@minrk, the conflict here is with your PR to add raw_mimetype metadata to raw cells, and pass them through the relevant exporters. We've both add a mime type traitlet to exporters, with a few differences:

  • Yours are raw_mimetype (a single string) and raw_mimetypes (a list), mine is just called mime_type.
  • Yours are for selecting cells from the input, mine is just metadata for the output.
  • Yours are on TemplateExporter, mine are on the Exporter parent class..

Are there any situations where raw_mimetype won't be the same as the output mime type? I guess if we had a PDF exporter class it wouldn't. So we presumably need both traitlets, but one should be set to the other by default. And I should make the name of mine more explicit, something like output_mimetype. Does that sound sensible?

@minrk
Copy link
Member

minrk commented Dec 7, 2013

Let's look at the specific use cases:

  • output_mimetype describes the final output format
  • raw_mimetype is used as the default value for raw cells with no mimetype specified
  • raw_mimetypes is a list of supported mimetypes for raw cell inclusion
  • raw_mimetype is always raw_mimetypes[0]

Currently, I don't think there are any cases where output_mimetype and raw_mimetype would differ. The PDF Exporter we discussed yesterday would probably be one, though.

I think we can replace raw_mimetype with use of raw_mimetypes[0], so we don't need three traits with exactly the same information in almost every case, and your output_mimetype makes sense, or even just mimetype wouldn't bother me. And keep the same raw_mimetypes default behavior, which is already constructed from raw_mimetype (which would become output_mimetype).

@minrk
Copy link
Member

minrk commented Dec 7, 2013

It would be nice to have it dynamic

Even if you want to save it for another PR, I think the appropriate request for this would be

GET /nbconvert/

to list available formats.

@damianavila
Copy link
Member

Even if you want to save it for another PR, I think the appropriate request for this would be
GET /nbconvert/
to list available formats

Certainly, I would like to have available "all" the options we currently support in nbconvert...
I know that probably having latex/pdf and reveal slideshow make the UI more complex than now... and more if we support a slideshow preview as the print preview for the HTML... but it would be very useful for the users... OK, maybe we can put it there later, but at least I think you have to add Markdown md to the current proposed menu...

<li class="dropdown-submenu"><a href="#">Download as</a>
<ul class="dropdown-menu">
<li id="download_ipynb"><a href="#">IPython Notebook (.ipynb)</a></li>
<!-- <li id="download_py"><a href="#">Python (.py)</a></li> -->
<li id="download_py"><a href="#">Python (.py)</a></li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is is really necessary to add the extensions for each available options?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but I like it - it makes it clear what you're getting.

@takluyver
Copy link
Member Author

Rebased, condensed the mime type information on nbconvert exporters as Min suggested, and added GET /nbconvert - the result is a JSON dict mapping exporter short names (e.g. python) to more dicts of information - currently just the output mimetype.

@takluyver
Copy link
Member Author

Added a whatsnew entry.

@Carreau
Copy link
Member

Carreau commented Dec 11, 2013

Would (partially?) close #3402 (linking both issues otherwise people tend to submit equivalent PRs like #4674 )

@westurner
Copy link

What does this do if I have not installed pandoc?

@minrk
Copy link
Member

minrk commented Dec 11, 2013

What does this do if I have not installed pandoc?

Fail :)

@westurner
Copy link

refs #4655

@damianavila
Copy link
Member

Fail :)

We need to alert to the user in some way... maybe an alert or something like that...

@takluyver
Copy link
Member Author

I talked with Min about what we can do to fail more informatively. He reckons we should copy across the error pages that nbviewer uses, which include details about the failure. To make room for that (which will happen in a separate PR), I've made the 'download as' links open in a new tab, so that we don't navigate the user away from the interactive notebook view if the conversion fails. The downside of that is that if it succeeds, the user sees a brief flash of a new tab opening and closing before the browser offers the download.

@ellisonbg
Copy link
Member

I am stuck grading today but will try to review this tomorrow. I have some
ideas on how we can do this in a possibly different way...

On Wed, Dec 11, 2013 at 3:39 PM, Thomas Kluyver notifications@github.comwrote:

I talked with Min about what we can do to fail more informatively. He
reckons we should copy across the error pages that nbviewer uses, which
include details about the failure. To make room for that (which will happen
in a separate PR), I've made the 'download as' links open in a new tab, so
that we don't navigate the user away from the interactive notebook view if
the conversion fails. The downside of that is that if it succeeds, the user
sees a brief flash of a new tab opening and closing before the browser
offers the download.


Reply to this email directly or view it on GitHubhttps://github.com//pull/4656#issuecomment-30376208
.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@takluyver
Copy link
Member Author

We discussed doing it all through a JSON API, but base64 encoding massive blobs of content to stick them in a JSON structure seems distinctly icky.

One possibility that I advanced is having the big blob as the body of the HTTP response, with JSON metadata in an HTTP header (X-IPython-Metadata or similar). This would also mean that the same URL could be used for an API request or directly in a browser.

@jdfreder
Copy link
Member

I haven't looked the code yet, but does this create a separate HTTP server for nbconvert or just extend the RESTful API of the notebook server? -Just wondering if one can start a standalone nbconvert service without the notebook...

@takluyver
Copy link
Member Author

This just adds handlers to the regular notebook server. AIUI, you could quite easily write a server that would only expose some of the handlers of the main notebook server.

@ellisonbg
Copy link
Member

OK, quick grading break...here is a summary of my idea - largely inspired
by my thinking about these issue in the background over the last bit of
days...

  • Notebook list/tree view should list all files of all types (maybe sorted
    by type...).
  • The tree view should provide "Download" link for all flies listed.
  • The nbconvert web service should 1) take JSON configuration, 2) run
    nbconvert, 3) return log. It should not return the actual output file. That
    will be handled using the regular downloads.
  • The UI for all of this should be in the tree view. When it supports
    multiple selection that will make it easy to select a couple of notebook
    and run them.
  • When we allow users to edit regular text files, this approach will allow
    them to edit the local config file for nbconvert as well.

Thoughts?

On Wed, Dec 11, 2013 at 3:50 PM, Jonathan Frederic <notifications@github.com

wrote:

I haven't looked the code yet, but does this create a separate HTTP server
for nbconvert or just extend the RESTful API of the notebook server? -Just
wondering if one can start a standalone nbconvert service without the
notebook...


Reply to this email directly or view it on GitHubhttps://github.com//pull/4656#issuecomment-30376874
.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@takluyver
Copy link
Member Author

Providing a UI to convert from the dashboard should be easy enough in a later PR. I'm -1 on saving the output to a file and reading it back for the response - that seems somewhat backwards. Except of course when it's necessary, as it will be for PDF.

@Carreau
Copy link
Member

Carreau commented Dec 12, 2013

Agreed with thomas.
I think that user want the current notebook as rst now and when they are editing it. Saving to disk using dashboard will induce confusion as to wether or not converted files are up to date.

To take the example of Office, you do "file > export as...". You are not asked to go to the explorer and right click > convert to.

Also opening a new page when downloading is ok IMHO, as the conversion is never instantaneous, would it even be possible to show "your notebook is being processed"
That when ready trigger the download itself?

Envoyé de mon iPhone

Le 12 déc. 2013 à 01:55, Thomas Kluyver notifications@github.com a écrit :

Providing a UI to convert from the dashboard should be easy enough in a later PR. I'm -1 on saving the output to a file and reading it back for the response - that seems somewhat backwards. Except of course when it's necessary, as it will be for PDF.


Reply to this email directly or view it on GitHub.

@westurner
Copy link

We discussed doing it all through a JSON API, but base64 encoding massive blobs of content to stick them in a JSON structure seems distinctly icky.

One possibility that I advanced is having the big blob as the body of the HTTP response, with JSON metadata in an HTTP header (X-IPython-Metadata or similar). This would also mean that the same URL could be used for an API request or directly in a browser.

@ellisonbg
Copy link
Member

My feeling about that model is that it is a leaky abstraction that will
lead to frustration. There are a couple of leaks in the model:

  • The configuration of nbconvert in many cases will be done by users at the
    directory level in a .ipy config file. That is out-of-band relative to an
    nbconvert web service.
  • The result of running nbconvert is not a single file. There can be
    images, and any file referred to in the "files" URL. Downloading the rst
    file alone in many cases will be useless.
  • There are many failure modes of nbconvert. Because of this we really need
    to bring the full log back to the user. That is very difficult to do in a
    sane manner when the result of clicking on "Download As" is to grab the
    file.
  • In many cases, users will want to open the converted file to look at the
    contents. Downloading to someones "Desktop" or "Downloads" folder and then
    forcing the user to 1) move the file back into the notebook directories or
    2) open the file in an external text editor are both very
    counter-productive. They should see the file in the tree view and be able
    to click the edit button. Of course we need to offer downloads, but that
    shouldn't replace the "file on the file system view".
  • The web service model makes it extremely painful if you want to manage
    the output of nbconvert alongside the original notebooks in a version
    control system.
  • We need to be very mindful of the file system abstraction that we are
    exposing. One of the main ideas of the notebook servers is that "notebooks
    are just files on the file system". I think the corollaries are
    "PDF/rst/markdown files are just files on the files system".

On Thu, Dec 12, 2013 at 12:13 AM, Matthias Bussonnier <
notifications@github.com> wrote:

Agreed with thomas.
I think that user want the current notebook as rst now and when they are
editing it. Saving to disk using dashboard will induce confusion as to
wether or not converted files are up to date.

To take the example of Office, you do "file > export as...". You are not
asked to go to the explorer and right click > convert to.

Also opening a new page when downloading is ok IMHO, as the conversion is
never instantaneous, would it even be possible to show "your notebook is
being processed"
That when ready trigger the download itself?

Envoyé de mon iPhone

Le 12 déc. 2013 à 01:55, Thomas Kluyver notifications@github.com a
écrit :

Providing a UI to convert from the dashboard should be easy enough in a
later PR. I'm -1 on saving the output to a file and reading it back for the
response - that seems somewhat backwards. Except of course when it's
necessary, as it will be for PDF.


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHubhttps://github.com//pull/4656#issuecomment-30395959
.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

This is a slightly worse user experience if it succeeds, because the new
tab flashes up before closing again, but it will let us display an
informative error page if it fails, without navigating the user away
from the interactive notebook view.
@takluyver
Copy link
Member Author

Rebased to fix the tests with the changes to the subprocess stream capturing API in #4690.

@mrmagooey
Copy link
Contributor

I would provide a counter viewpoint to @ellisonbg's view of the exporter/download process, I'm actually hoping that the download process will provide the abstraction necessary to not scare off new users of an IPython system. I plan to give my co-workers their first access to Python via a remote IPython notebook server and being able to provide the ability to download the pdf's without knowing the details of the underlying filesystem (or nbconvert config) would be perfect. Clicking download and then opening the file from a downloads folder feels like a well understood mechanism.

@westurner
Copy link

If I might help to clarify @takluyver 's patch:

https://github.com/ipython/ipython/pull/4656/files#diff-5b7a0ba3686e1c37c317732cd6e771fdR82

+    (r"/nbconvert/%s%s" % (_format_regex, _notebook_path_regex),
+         NbconvertFileHandler),
+    (r"/nbconvert/%s" % _format_regex, NbconvertPostHandler),

It looks like the url to download a PDF of a notebook from the notebook server will be something like:

/nbconvert/pdf/<path_to_notebook.ipynb>

@takluyver
Copy link
Member Author

The URL would indeed look like that. However, at present, PDF export is done by a post-processor after Latex export, so that wouldn't work. We've been discussing restructuring it as a separate exporter in order to make that work.


@web.authenticated
def get(self, format, path='', name=None):
exporter = exporter_map[format]()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a constructor, pass config=self.config?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@minrk
Copy link
Member

minrk commented Dec 19, 2013

A few comments inline, but this looks really good.

One more little bit: we should check that the right tests are skipped when pandoc is unavailable.

As discussed in today's dev meeting, we may want a richer, more configurable way to use nbconvert, but this will be added to /api/nbconvert, not /nbconvert which produces for-humans pages / downloads. So I don't think adding that functionality would necessarily change how any of this works.

@rgbkrk
Copy link
Member

rgbkrk commented Dec 19, 2013

When people run their own notebook servers (or are using multi-user in the future), would we want to expose a read-only view that doesn't require authentication (without the @web.authenticated decorator)? Perhaps either of the notebook itself or a rendered one?

@minrk
Copy link
Member

minrk commented Dec 19, 2013

I think we had indended the html render as a read-only view. I think working out the read-only auth needs to be more carefully done than it was before (which was a proof-of-concept never intended to be merged), and that can be a separate PR.

@minrk
Copy link
Member

minrk commented Dec 19, 2013

Also relevant here that we discussed in person, but I haven't written down is better error pages, like we use in nbviewer. This would allow failed renders to include useful information about the failure. This is also a job for another PR.

@rgbkrk
Copy link
Member

rgbkrk commented Dec 19, 2013

Sorry, yeah, was thinking about it but wasn't sure where to bring it up (in this PR, another issue, stored in my brain until I make my own PR).

@takluyver
Copy link
Member Author

I've had a stab at serving multi-file output as zip files. It currently just checks for the presence of any extra files, so both the rst and latex exports will become a zip file if any cells have any output - even text output is saved as output_1_0.text, though it's not then used in the files.

That seems suboptimal. Is there a good heuristic for determining what files are needed? Or should the extract_output preprocessor be refined to not extract every output?

@takluyver
Copy link
Member Author

After discussion with @minrk, I've made the extract output preprocessor only extract types that don't get embedded in the main output file. This is configurable, but the default is to extract png, jpg, svg and pdf.

@minrk
Copy link
Member

minrk commented Dec 20, 2013

Excellent. Working great here. Now I'll get to work on better error pages as another PR.

minrk added a commit that referenced this pull request Dec 20, 2013
@minrk minrk merged commit 0d046b4 into ipython:master Dec 20, 2013
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants