Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement markdown cell attachments. Allow drag’n’drop of images into… #621

Merged
merged 24 commits into from
Mar 8, 2016

Conversation

julienr
Copy link
Contributor

@julienr julienr commented Oct 19, 2015

This adds a way for users to insert inline images in markdown cells using drag and drop

TODO:

  • Add ability to copy-paste images
  • Add a menu item to insert images
  • Test the "Insert Image" menu item. Seems like casperjs doesn't support drag'n'drop/paste testing for now.
  • Add Edit menu items to copy/paste cell attachments
    • Edit : Cut Cell Attachments
    • Edit : Copy Cell Attachments
    • Edit : Paste Cell Attachments
  • "Garbage collect" attachments that are not referenced from markdown ? For example when the notebook is saved.
  • There is a large delay when pasting large images from the clipboard until they appear in the markdown. Should show some visual indicator that something is going on
  • Large attachments warning ?

Ideas :

  • Could add a ipython magic so you can access attachments from python code. Could be nice to directly attach some data to the notebook for small datasets. (won't happen)
  • Could add auto-completion for attachments when editing ?

This is an experimental implementation of what was discussed in #613 by @ellisonbg and @stevengj .
When a user drag and drops an image into a markdown cell, the image data is stored in the cell metadata and can be used as the source for an image by using a special 'nbdata:filename' syntax.

Here are some things to discuss to move forward :

  • It should be possible to insert inline images without drag'n'drop. Add a menu item ?
  • There should be a UI so the user can easily see which cells have 'attachments' and a way to manage them.

Screenshots

Just after the drop of an image :
1_source

The metadata :
2_metadata

The rendered cell
3_rendered

@stevengj
Copy link

Great, thanks so much for working on this!

  • Shouldn't it use markdown image syntax rather than <img ../>?
  • Maybe it should use the notebook's generic "mime bundle" format, i.e. "test.png" : { "image/png" : ["base64-encoded-png-data"] }
  • Would be good to support pasted images as well as drag-and-drop, and a menu item to insert an image from a file-picker menu.
  • Might be nice to use attachment:test.png rathe than nbdata (or in any case the same keyword that is used for the cell data item).

@jdfreder
Copy link
Contributor

This is awesome! Great work!

Shouldn't it use markdown image syntax rather than <img ../>?

+1

Would be good to support pasted images as well as drag-and-drop, and a menu item to insert an image from a file-picker menu.

+2

Maybe it should use the notebook's generic "mime bundle" format, i.e. "test.png" : { "image/png" : ["base64-encoded-png-data"] }

It's not clear to me how this would work- where would the mime bundle be stored in the document and how would it be referenced inline in the markdown?

@stevengj
Copy link

@jdfreder, the mime bundle would be stored in the cell data, exactly as above. i.e.

{
   "collapsed": true,
    ....
    "attachments": {
        "test.png" : { "image/png": [...base64-encoded-data...] }
    }
}

The point is, the notebook format has already defined a way to store arbitrary MIME data; you should re-use it here rather than inventing a new method.

Re-using the existing mime-bundle structure has lots of advantages, e.g. it simplifies processing by external tools. (MIME bundles also open the door to attachments that are stored in multiple formats, which might be useful in the future.)

@jdfreder
Copy link
Contributor

Ah I didn't realize this was already being stored in the notebook metadata, I thought he was inlining the b64 content in the markdown. I don't think we should be storing this data in notebook metadata, which is meant for users and extensions. Instead I think it should have it's own key in nbformat.

@stevengj
Copy link

@jdfreder, it's not being stored in the notebook metadata (i.e. the metadata for the whole file), it's being stored as an extra key/value pair in the markdown cell data. See the discussion in #613 — neither storing the data inline in the markdown nor storing it in a separate file are really acceptable long-term solutions, and @ellisonbg does not want to add a new cell type.

@jdfreder
Copy link
Contributor

it's being stored as an extra key/value pair in the markdown cell data

It's being stored in that.metadata.attachments which is metadata - yes you're right, not notebook level. All I'm trying to say that I think this needs another PR to https://github.com/jupyter/nbformat to change nbformat. Not to add another cell type, but to add an attachments key to the cell.

Metadata is not something we want to use inside Jupyter/notebook in an attempt to avoid making changes to nbformat. If we are going to add keys to the notebook file, they need to be documented, and nbformat is how we document them.

@julienr
Copy link
Contributor Author

julienr commented Oct 20, 2015

Thanks for the comments !

I added a nbformat PR for a new 'attachments' cell property.

I also added a todo list to my first comment on this issue to track progress.

@ellisonbg
Copy link
Contributor

A few points:

  • +1 on using mime bundles for this
  • +1 on adding the attachment field to the nbformat rather than metadata
  • +1 on getting this to work both in img tags and markdown image syntax.
    While the markdown image syntax is nice, I often need more control and use
    igm tags.

On Tue, Oct 20, 2015 at 5:21 AM, Julien Rebetez notifications@github.com
wrote:

Thanks for the comments !

I added a nbformat PR jupyter/nbformat#21 for a
new 'attachments' cell property.

I also added a todo list to my first comment on this issue to track
progress.


Reply to this email directly or view it on GitHub
#621 (comment).

Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@stevengj
Copy link

@jdfreder, I see. Yes, I completely agree; attachments should be a documented cell field.

// cell, we do the following :
// - We insert the base64-encoded image into the cell attachments
// directory, keyed by the filename.
// - We insert an img tag with a 'nbdata' src that refers to the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now an 'attachment:filename' source

@julienr
Copy link
Contributor Author

julienr commented Oct 21, 2015

As suggested by @stevengj, I think it would be great to have an "insert image" menu item. This would be active only when the user is editing a markdown cell. I am not sure where to put it as it seems the current menu items apply to all cells and do not require one to be in edit mode. Do you have UI guidelines for that ?

@ellisonbg
Copy link
Contributor

I would find a suitable place in the "Edit" menu.

On Wed, Oct 21, 2015 at 3:09 AM, Julien Rebetez notifications@github.com
wrote:

As suggested by @stevengj https://github.com/stevengj, I think it would
be great to have an "insert image" menu item. This would be active only
when the user is editing a markdown cell. I am not sure where to put it as
it seems the current menu items apply to all cells and do not require one
to be in edit mode. Do you have UI guidelines for that ?


Reply to this email directly or view it on GitHub
#621 (comment).

Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@minrk
Copy link
Member

minrk commented Oct 21, 2015

This wouldn't need any custom markdown processing at all, nor any change to the notebook format if drag&drop onto a markdown cell uploaded the file via contents API and then wrote a normal, relative URL in the markdown.

@minrk
Copy link
Member

minrk commented Oct 21, 2015

The more I think about it, the more I think this should be uploading to contents, rather than writing binary data to the notebook. Doing so has many advantages:

  • faster notebook load
  • smaller storage size
  • use existing tools to interact with images/data
  • better version control behavior
  • avoid need to parse special URLs in nbconvert
  • avoid custom markdown processing
  • easier to make attachments available to kernel by not embedding them in the notebook

I can't think of any benefits to the data being in the JSON in particular, just making it easier to put images into the notebook - the storage details of putting it into the JSON don't seem specifically valuable, or am I missing something?

@julienr
Copy link
Contributor Author

julienr commented Oct 21, 2015

@minrk The advantage of having attachments in the JSON is that you have a single .ipynb file to deal with, for example when sharing your notebook via email. And you cannot "break" your notebook by moving files around.

That is one advantage against the many advantages of the upload approach you listed though :-)

@minrk
Copy link
Member

minrk commented Oct 21, 2015

That's a good point, it is a trade-off. We just have to decide which side we want to fall on. I suspect, given the consistent pressure we get that we already put too much in the JSON, that making it easier to work with external files (e.g. completion of filenames in markdown URLs) will be the preferable to yet-more-files in JSON. The drag&drop UI shouldn't be significantly different between the two storage schemes, though.

@stevengj
Copy link

@minrk, having a single file to deal with is the whole point here. I don't think we need a UI to help with linking external files — that is easy already: you just drag and drop the file into the same directory and add an image link to markdown.

Whereas right now it is insanely hard to embed an image in the notebook file itself unless the image is the result of code execution, and this makes it extremely unreliable to share (email, upload, copy to dropbox...) notebooks with markdown images without breaking them.

Moreover, when someone drags-and-drops or pastes an image into a notebook, what they will expect is that it is stored in the notebook. This is the behavior people are familiar with from other contexts like word processors. It is especially true for copy/paste or drag-and-drop of images that don't come from a file (e.g. pasting/dragging a selection from an image-editor program or from another web page).

@stevengj
Copy link

@minrk, note also that 5 out of the 7 items in your list of "advantages of separate files" would also apply to images produce by code execution, i.e. to storing display_data messages. And yet I think everyone can agree that the disadvantage of having to carry around separate image files for every graphical output would almost always vastly outweigh the advantages. The same applies to images in Markdown cells.

(Right now, in my experience, people simply avoid using images in markdown cells as much as possible because of this inconvenience. If you make it easier to embed images, I suspect that you'll start seeing many more notebooks filled with technical illustrations and other graphical aids.)

@juhasch
Copy link
Contributor

juhasch commented Oct 21, 2015

The drag&drop notebook extension in IPython-contrib actually already uploads images to the server using the contents service. My experience is that for small images it sometimes would be nice to have them in the notebook, however I think it is better to store most images separately.
I don't think JSON files make a good file system. The same is true for any local data that is being processed by a notebook.

So the discussions should probably be where to put local data that belongs to a notebook.
For example, I have a lot of notebooks containing code, markdown, images. plots, external data.
Therefore I have only one or a few notebooks in a directory, together with graphics and data.

@stevengj
Copy link

@juhasch, uploading things to the server does not help one bit with the problem of sharing the notebook with someone who does not share the same server. And again, the same arguments would apply to storing plots and other code-generated images in the notebook vs. separate files.

@juhasch
Copy link
Contributor

juhasch commented Oct 21, 2015

Yes, I understand what you want and I would sometimes find that capability nice, too.
The question is if JSON is a good format to be used as a container to store binary data, and what kind of data belongs into that container. I would argue we need a better container and not extend what is stored in the JSON file.

@minrk
Copy link
Member

minrk commented Oct 21, 2015

I think everyone can agree that the disadvantage of having to carry around separate image files for every graphical output would vastly outweigh the advantages

I certainly agree, but it's not everyone. We get regular complaints about putting output into the JSON.

Right now, in my experience, people simply avoid using images in markdown cells as much as possible because of this inconvenience

And my experience is the opposite—since it's so convenient to use regular files in markdown and the behavior is so much better, I use relatively linked images in markdown all the time. For this reason, I thought that the convenience of upload was the only aspect this was meant to address, e.g. matching what GitHub image upload does.

the same argument would apply to storing plots and other code-generated images in the notebook vs. separate files.

I don't agree, or at least I think the tradeoffs end up being different. Generated outputs and uploaded files are different in many ways. Outputs change on every execution of the notebook, and are reasonably discarded by users who do not want to track non-input content.

@stevengj
Copy link

We're already using JSON as a container to store binary data. I just want to be able to paste a few kB more in when I need a simple figure, without losing the self-contained nature of the file.

@minrk, I agree that the tradeoffs are even worse for code-generated images, but I wanted to point out that the arguments that you and @juhasch are making would largely apply to them as well. That should give you pause — we have an existence proof that these arguments can be outweighed by the convenience of having a self-contained file.

@ellisonbg
Copy link
Contributor

I can see both sides here.

I see Min's point that almost everything is accomplished by uploading to
the server next to the notebook using the Content API.

I also see that we are already putting binary data in the notebook and
adding more won't causes too many problems.

From a usability perspective, I do think the main point is is allowing easy
uploading.

But the secondary point is to have a self contained file.

On Wed, Oct 21, 2015 at 12:42 PM, Steven G. Johnson <
notifications@github.com> wrote:

We're already using JSON as a container to store binary data. I just want
to be able to paste a few kB more in when I need a simple figure, without
losing the self-contained nature of the file.

@minrk https://github.com/minrk, I agree that the tradeoffs are even
worse for code-generated images, but I wanted to point out that the
arguments that you and @juhasch https://github.com/juhasch are making
would largely apply to them as well. That should give you pause — we have
an existence proof that these arguments can be outweighed by the
convenience of having a self-contained file.


Reply to this email directly or view it on GitHub
#621 (comment).

Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@ellisonbg
Copy link
Contributor

@jasongrout is probably in favor of embedding (next to me) - "I think the
call has already been made by how we handle output"

On Wed, Oct 21, 2015 at 12:59 PM, Brian Granger ellisonbg@gmail.com wrote:

I can see both sides here.

I see Min's point that almost everything is accomplished by uploading to
the server next to the notebook using the Content API.

I also see that we are already putting binary data in the notebook and
adding more won't causes too many problems.

From a usability perspective, I do think the main point is is allowing
easy uploading.

But the secondary point is to have a self contained file.

On Wed, Oct 21, 2015 at 12:42 PM, Steven G. Johnson <
notifications@github.com> wrote:

We're already using JSON as a container to store binary data. I just want
to be able to paste a few kB more in when I need a simple figure, without
losing the self-contained nature of the file.

@minrk https://github.com/minrk, I agree that the tradeoffs are even
worse for code-generated images, but I wanted to point out that the
arguments that you and @juhasch https://github.com/juhasch are making
would largely apply to them as well. That should give you pause — we have
an existence proof that these arguments can be outweighed by the
convenience of having a self-contained file.


Reply to this email directly or view it on GitHub
#621 (comment).

Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@jasongrout
Copy link
Member

I would say that there are usecases for both approaches, and right now only one approach is (easily) possible - the uploading and linking approach. This makes it very easy to do the other approach too.

@minrk
Copy link
Member

minrk commented Oct 21, 2015

If there is consensus that the trade offs are in favor of embedding in the JSON, that's fine by me. Sorry for the diversion.

@Carreau Carreau added this to the 5.0 milestone Oct 21, 2015
@julienr
Copy link
Contributor Author

julienr commented Feb 25, 2016

@jdfreder Thanks for your help. Removing the jquery dependencies seem to have solved the travis failure. The "script error" I got locally was due to my build directory containing some old js code.

@julienr
Copy link
Contributor Author

julienr commented Feb 25, 2016

I have a travis failure on js/services which I think is just some weird timeout. Can someone restart this test ?

@julienr
Copy link
Contributor Author

julienr commented Feb 25, 2016

Thanks to whoever relaunched the test :) I've added the two tests (attachments and garbage collection) I wanted, I think this can be considered ready.

@minrk
Copy link
Member

minrk commented Feb 26, 2016

Awesome, thanks @julienr!

@stevengj
Copy link

I'm really excited about this!

minrk added a commit that referenced this pull request Mar 8, 2016
Implement markdown cell attachments. Allow drag’n’drop of images into…
@minrk minrk merged commit 233f044 into jupyter:master Mar 8, 2016
@julienr
Copy link
Contributor Author

julienr commented Mar 8, 2016

Thanks to everybody involved in this discussion and thanks for merging this !

@jdfreder
Copy link
Contributor

jdfreder commented Mar 8, 2016

🎉 🍰 This is awesome!!

@Carreau
Copy link
Member

Carreau commented Mar 8, 2016

Ha ! Found a bug already ! But awesome !

@platise
Copy link

platise commented Jan 13, 2018

Save as HTML or export to slides does not include attached images, but it prints a filename instead in jupyter 5.2.2. Is this a bug or any hint how to export a report?

On save as .tex I would suggest images are exported under some temp dir, like .ipynb_tmp to be able to generate a PDF.

@takluyver
Copy link
Member

The export functionality is part of nbconvert, and I'm not sure that it supports 'attached' images yet. See jupyter/nbconvert#699 .

@platise
Copy link

platise commented Feb 1, 2018

Looks like not many people are using attached images in markdown, how to trigger more interest to the jupyter/nbconvert#699, any idea?

@platise
Copy link

platise commented Feb 16, 2018

If you merge two cells where each has one image inserted as attachment then the first (above) cell will loose the image.

@stevengj
Copy link

@platise, that sounds like a separate issue that should be filed and linked to this one.

@platise
Copy link

platise commented Feb 17, 2018

Please @stevengj if you can advise or /copy/ the bug report to a proper place; as the starting /body/ of the ticket says about so many features, thought it could belong here.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet