Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow spaces in names of files attached to markdown cells #8095

Merged
merged 13 commits into from
Mar 30, 2020

Conversation

ianhi
Copy link
Contributor

@ianhi ianhi commented Mar 27, 2020

References

partially fixes: #8067:
This will fix the dragging and dropping behavior, but as noted #8062 (comment) the other issue raised there of typing in the name will remain as that is not supported by markdown.
partially addresses: #8062:
the spaces in names also came up there.

Code changes

call encodeURI when setting the attachment to a markdown cell. This allows the attachment to be found when the markdown is being rendered. I was able to confirm that this fixes the issues for nativeDrop and paste events. However, I don't know how to trigger the lm-drop events so I wasn't able to directly verify that this fixes the issue for those events. If these are meant to be triggered by dragging a file from the jupyterlab filebrowser then I was unable to do that and have a new issue to report.

User-facing changes

An image with spaces in the name that is drag and dropped or copied into a markdown cell will now render:
wokring_spaces

Backwards-incompatible changes

N/A

@jupyterlab-dev-mode
Copy link

Thanks for making a pull request to JupyterLab!

To try out this branch on binder, follow this link: Binder

@jasongrout jasongrout added this to the 2.1 milestone Mar 27, 2020
@jasongrout
Copy link
Contributor

I put this in the 2.1 milestone, assuming we can finish it up by this weekend.

ianhi and others added 3 commits March 27, 2020 10:52
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
@ianhi
Copy link
Contributor Author

ianhi commented Mar 27, 2020

Sounds good. The one other thing maybe worth considering is what the expected behavior is for non-image files? The ![]() notation which is always used is only valid for images. So if you drag a markdown file into a markdown cell it will paste the contents of the file and then add: ![markdown_file.md](attachment:markddown_file.md) which doesn't render.

@jasongrout
Copy link
Contributor

The one other thing maybe worth considering is what the expected behavior is for non-image files?

Great question. Is it easy to determine if the file is an image? Do we have the mimetype in the clipboard data? I suppose we can guess based on extension as a last resort?

@jasongrout
Copy link
Contributor

Reading up on the commonmark spec, apparently you can have a space in the filename, you just have surround the link with <>: https://spec.commonmark.org/0.29/#example-486

@jasongrout
Copy link
Contributor

To quote the commonmark spec: https://spec.commonmark.org/0.29/#links

A link destination consists of ... a sequence of zero or more characters between an opening < and a closing > that contains no line breaks or unescaped < or > characters,

@ianhi
Copy link
Contributor Author

ianhi commented Mar 27, 2020

Should be possible to determine if a file is an image. The clipboardData has a .type attribute that should be confined to one of these "Mandatory" data types (https://w3c.github.io/clipboard-apis/#reading-from-clipboard):

text/plain

text/uri-list

text/csv

text/css

text/html

application/xhtml+xml

image/png

image/jpg, image/jpeg

image/gif

image/svg+xml

application/xml, text/xml

application/javascript

application/json

application/octet-stream

Seems to me that the most reasonable to support are:

text/plain
text/csv
image/*

@jasongrout
Copy link
Contributor

So what if we just take the attachmentName, do a manual regex for newlines and < and > to replace those with the HTML entities, and enclose the link in pointy brackets? It should be much more readable then. We can even determine if we should put pointy brackets around it based on if it has a space if we want the resulting markdown to look nicer.

@ianhi
Copy link
Contributor Author

ianhi commented Mar 27, 2020

When I just type in a reference to a file using < and > doesn't fix the issues with spaces for me.
Renders: ![image](<image.png>)
Doesn't render: ![with spaces](<name with space.png>)
Renders: ![name with space.png](<name%20with%20space.png>)

maybe that's an issue with markedjs?

@jasongrout
Copy link
Contributor

maybe that's an issue with markedjs?

Yes, it seems so

@ianhi
Copy link
Contributor Author

ianhi commented Mar 27, 2020

Also, while the types are well defined for clipboardEvents they are not defined for drop events:
https://html.spec.whatwg.org/multipage/dnd.html#the-drag-data-item-type-string

The drag data item type string
A Unicode string giving the type or format of the data, generally given by a MIME type. Some values that are not MIME types are special-cased for legacy reasons. The API does not enforce the use of MIME types; other values can be used as well. In all cases, however, the values are all converted to ASCII lowercase by the API.

@jasongrout
Copy link
Contributor

It does seem that markdown-it supports the <> style links: demo

@jasongrout
Copy link
Contributor

c.f. #272

@ianhi
Copy link
Contributor Author

ianhi commented Mar 27, 2020

So is the conclusion to use this solution until markedjs implements < or until jupyterlab switches to something like markdown-it?

1

Video could also be supported using a video tag. For example:

diff --git a/packages/cells/src/widget.ts b/packages/cells/src/widget.ts
index c013e1ffe..94761638c 100644
--- a/packages/cells/src/widget.ts
+++ b/packages/cells/src/widget.ts
@@ -1327,8 +1327,14 @@ export abstract class AttachmentsCell extends Cell {
       const encodedData = matches[3];
       const bundle: nbformat.IMimeBundle = { [mimeType]: encodedData };
       const URI = encodeURI(blob.name)
-      this.model.attachments.set(URI, bundle);
-      this.updateCellSourceWithAttachment(blob.name, URI);
+      if (mimeType.includes('image')){
+        this.model.attachments.set(URI, bundle);
+        this.updateCellSourceWithAttachment(blob.name, URI);
+      } else if(mimeType.includes('video')){
+        this.model.attachments.set(URI, bundle);
+        const textToBeAppended = `<video controls src='attachment:${URI}'></video>`
+        this.model.value.insert(this.model.value.text.length, textToBeAppended);
+      }
     };
     reader.onerror = evt => {
       console.error(`Failed to attach ${blob.name}` + evt);

although I'm struggling with attachment:video.mp4 not being a valid URI for the video tag. Though the equivalent works fine for img tags.

2

Because attachments.set is called without checking whether that URI is already being used you can end up overwriting already embedded images. This is primarily an issue when pasting a screenshot, as the paste is always named image.png. So it's currently impossible to paste two distinct images into the same markdown cell. Does a fix for that belong in this PR, or perhaps a different one?
overwriting_screenshots

Checks if the cell already has an attachment using that name. If yes then start adding numbers such that image.png will become image_1.png etc. The regex is used for this splitting in order to only split on the final . (i.e. allow filenames such as image.image.png) and to include the . for reconstructing the filename.
@jasongrout
Copy link
Contributor

jasongrout commented Mar 27, 2020

So is the conclusion to use this solution until markedjs implements < or until jupyterlab switches to something like markdown-it?

Yes, I think so.

Video could also be supported using a video tag.

Awesome!

although I'm struggling with attachment:video.mp4 not being a valid URI for the video tag. Though the equivalent works fine for img tags.

This is because the attachment resolver, what actually converts this attachment: syntax to a data url for the html on the page, filters for only images:

if (
mimeType === undefined ||
imageRendererFactory.mimeTypes.indexOf(mimeType) === -1
) {
return Promise.reject(
`Cannot render unknown image mime type "${mimeType}".`
);
}

Copy link
Contributor

@jasongrout jasongrout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more review comments.

Another way to approach the attachment conflicts is to generate a UUID, which becomes the URI, so you have unique UUIDs for every attachment. The bad thing about a UUID is that there is no intrinsic meaning in the names of entries in the attachments list, but maybe that's okay? The association to a file name is in the markdown text if a person wants to cross reference.

packages/cells/src/widget.ts Outdated Show resolved Hide resolved
packages/cells/src/widget.ts Outdated Show resolved Hide resolved
packages/cells/src/widget.ts Outdated Show resolved Hide resolved
packages/cells/src/widget.ts Show resolved Hide resolved
@ianhi
Copy link
Contributor Author

ianhi commented Mar 28, 2020

This is because the attachment resolver, what actually converts this attachment: syntax to a data url for the html on the page, filters for only images:

Yeah. When I modify that line I can easily embed videos by dragging them in. The question is then where to keep the list of valid mimetypes. It seems to me that both AttachmentsCell and AttachmentsResolver ought to share a ReadonlyArray of their supported mimetypes. This would also help solve the issues of spurious embeddings when you drag over a non image or video filetype. I'm pretty lost on what the correct place to put such a thing is suggestions welcome on that.

Currently I've changed the AttachmentsResolver like so:

diff --git a/packages/attachments/src/model.ts b/packages/attachments/src/model.ts
index 973e4d0d6..3dacd3224 100644
--- a/packages/attachments/src/model.ts
+++ b/packages/attachments/src/model.ts
@@ -14,7 +14,6 @@ import {
 import {
   IAttachmentModel,
   AttachmentModel,
-  imageRendererFactory
 } from '@jupyterlab/rendermime';
 
 import { IRenderMime } from '@jupyterlab/rendermime-interfaces';
@@ -378,6 +377,7 @@ export class AttachmentsResolver implements IRenderMime.IResolver {
   constructor(options: AttachmentsResolver.IOptions) {
     this._parent = options.parent || null;
     this._model = options.model;
+    this.supportedTypes = ['video/mp4','video/webm','video/ogg','image/bmp', 'image/png', 'image/jpeg', 'image/gif']; 
   }
   /**
    * Resolve a relative url to a correct server path.
@@ -411,7 +411,7 @@ export class AttachmentsResolver implements IRenderMime.IResolver {
     // Only support known safe types:
     if (
       mimeType === undefined ||
-      imageRendererFactory.mimeTypes.indexOf(mimeType) === -1
+      this.supportedTypes.indexOf(mimeType) === -1
     ) {
       return Promise.reject(
         `Cannot render unknown image mime type "${mimeType}".`
@@ -434,6 +434,7 @@ export class AttachmentsResolver implements IRenderMime.IResolver {
 
   private _model: IAttachmentsModel;
   private _parent: IRenderMime.IResolver | null;
+  readonly supportedTypes: ReadonlyArray<string>;
 }
 
 /**

@ianhi
Copy link
Contributor Author

ianhi commented Mar 28, 2020

The uuid approach also removes the need to encode URIs, and helps negate possibility someone seeing the filename in the embedding and thinking that if they change the file on disk then the markdown cell will update.

@jasongrout
Copy link
Contributor

The uuid approach also removes the need to encode URIs, and helps negate possibility someone seeing the filename in the embedding and thinking that if they change the file on disk then the markdown cell will update.

Oh yeah, I hadn't thought of that. Switching to UUID makes it clearer that this is a snapshot of the file, and it is different than actually linking to the file on disk. Yeah, I would suggest moving to UUID for that reason, but maybe keep the extension?

ianhi and others added 2 commits March 28, 2020 23:28
Use uuid4 to generate the URI in all instances, if the file has an extension preserve that information. Also add a check if the file type is an image which is currently the only valid markdown embedding.
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
@ianhi
Copy link
Contributor Author

ianhi commented Mar 29, 2020

I switched to using UUIDs and keep the extension when one exists. I also check if the file is an image type because that's the only file type supported by the ![]() syntax.

If possible it'd be nice to add support for videos as well. Though as noted, that would require changes to the attachments package. If these are only used for attaching to cells this seems pretty easy. But if they are used elsewhere I'm not sure how best to change them to allow the attachment resolver to resolve videos in addition to images. My concern is that it might require adding a RenderedVideo class to https://github.com/jupyterlab/jupyterlab/blob/master/packages/rendermime/src/widgets.ts ?

Copy link
Contributor

@jasongrout jasongrout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new UUID support! I have a few more suggestions for it. I think we are converging to something really nice here!

packages/cells/src/widget.ts Outdated Show resolved Hide resolved
packages/cells/src/widget.ts Outdated Show resolved Hide resolved
@@ -1272,20 +1273,21 @@ export abstract class AttachmentsCell extends Cell {
CONTENTS_MIME_RICH
) as DirListing.IContentsThunk;
if (model.type === 'file') {
this.updateCellSourceWithAttachment(model.name);
const URI = this._generateURI(model.name);
this.updateCellSourceWithAttachment(model.name, URI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below, we now check to see if the data is an image mimetype. Should we do the same here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already some mimetype filtering happening in this function when if makes the array of supportedMimetypes

const supportedMimeTypes = toArray(
filter(event.mimeData.types(), mimeType => {
if (mimeType === CONTENTS_MIME_RICH) {

I hoped that this was doing enough filtering. I wasn't able to test this though, because I don't seem to be able to trigger this event. Should this be triggerable by dragging an image from the jupyterlab filebrowser?

@jasongrout
Copy link
Contributor

If possible it'd be nice to add support for videos as well. Though as noted, that would require changes to the attachments package. If these are only used for attaching to cells this seems pretty easy. But if they are used elsewhere I'm not sure how best to change them to allow the attachment resolver to resolve videos in addition to images. My concern is that it might require adding a RenderedVideo class to https://github.com/jupyterlab/jupyterlab/blob/master/packages/rendermime/src/widgets.ts ?

I don't think it requires anything added to the rendermime package - that's completely separate from what we are doing here. I think how it would work is that we construct an HTML video element with a url of attachment:..., and then the attachment resolver needs to just convert the video to a data URI. This line in the resolver:

// Only support known safe types:
if (
mimeType === undefined ||
imageRendererFactory.mimeTypes.indexOf(mimeType) === -1
) {
return Promise.reject(
`Cannot render unknown image mime type "${mimeType}".`
);
}
I think just uses the rendermime to check to see if it is a "safe" data format. I think it's probably a pretty poor check, actually. Perhaps instead we could whitelist some "safe" formats like image/png as well as some video ones, and not refer to the rendermime?

On the other hand, supporting video would be a nice place to draw the line for this PR, and put in a new PR to support that.

ianhi and others added 3 commits March 29, 2020 01:15
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
Co-Authored-By: Jason Grout <jasongrout@users.noreply.github.com>
@ianhi
Copy link
Contributor Author

ianhi commented Mar 29, 2020

Perhaps instead we could whitelist some "safe" formats like image/png as well as some video ones, and not refer to the rendermime?

Yeah. I would like to add a readonly array supportedTypes to both attachmentsResolver and attachments. My one worry is that this might mess with extensions that also attach data?

On the other hand, supporting video would be a nice place to draw the line for this PR, and put in a new PR to support that.

That seems reasonable. A side effect of this PR is that I'm now working towards adding a videoRenderer, so a separate PR could include a mimerenderer and support for video and audio in markdown cells.

This reverts a suggested change that breaks backwards compatibility
Copy link
Member

@saulshanabrook saulshanabrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall!

Just one little typo.

packages/cells/src/widget.ts Outdated Show resolved Hide resolved
@saulshanabrook saulshanabrook merged commit 82dd404 into jupyterlab:master Mar 30, 2020
@lock lock bot added the status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pkg:cells status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Images that has spaces in its filename/path can't be displayed using ![](filename.png) method.
3 participants