Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apply encodeURI to imports #1058

Closed
wants to merge 2 commits into from
Closed

apply encodeURI to imports #1058

wants to merge 2 commits into from

Conversation

Fil
Copy link
Contributor

@Fil Fil commented Mar 14, 2024

closes #1055

@Fil Fil requested a review from mbostock March 14, 2024 20:17
@@ -13,7 +13,7 @@
<link rel="modulepreload" href="../_npm/d3@7.8.5/_esm.js">
<link rel="modulepreload" href="../_import/bar/bar.13bb8056.js">
<link rel="modulepreload" href="../_import/top.160847a6.js">
<link rel="modulepreload" href="../_import/foo/foo bar.b173d3de.js">
<link rel="modulepreload" href="../_import/foo/foo%20bar.b173d3de.js">
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a welcome change

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t help with #1055 because encodeURI doesn’t escape + because it “may part of the URI syntax.” 😅

encodeURI("/_npm/d3@7.9.0/+esm.js") // "/_npm/d3@7.9.0/+esm.js"

encodeURIComponent does encode the +, but then it also encodes slashes, so it’s not what we want either.

encodeURI("/_npm/d3@7.9.0/+esm.js") // "%2F_npm%2Fd3%407.9.0%2F%2Besm.js"

I don’t actually know what the right solution is here. I think we should investigate whether other people have run into similar issues with S3; this might be an S3-specific problem.

@mbostock
Copy link
Member

This suggests that it’s an S3-specific bug. So we could specifically accommodate + in the file name if we wanted to, but we’d have to parse the URL and only rewrite + in the file name (not the query!).

https://stackoverflow.com/questions/38282932/amazon-s3-url-being-encoded-to-2

Maybe we do nothing and just avoid this bug in S3?

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there are lots of other places where we render src and href attributes that might need this type of escaping, right? Like everything that gets passed through normalizeLink and all the asset attributes in rewriteHtml? I don’t know how we could do this universally.

@Fil
Copy link
Contributor Author

Fil commented Mar 14, 2024

Apparently linkedom and jsdom don't care about this either:

import {parseHTML} from "linkedom";
const {document} = parseHTML("<a>");
document.querySelector("a").setAttribute("href", "a b+c\"d'e$f#g&h");
console.log(`${document}`); // <a href="a b+c&quot;d'e$f#g&h"></a>

const document2 = await import("jsdom").then(({JSDOM}) => new JSDOM("").window.document);
const a = document2.createElement("a");
a.setAttribute("href", "a b+c\"d'e$f#g&h");
console.log(`${a.outerHTML}`); // <a href="a b+c&quot;d'e$f#g&amp;h"></a>

so yeah, let's just be careful. There are a few other characters that won't work, like : for file names, emojis (#783 ), long filenames on some OS… I don't think we can fix all of that.

Base automatically changed from mbostock/underscore-esm to main March 14, 2024 23:28
@Fil Fil closed this Mar 15, 2024
@Fil Fil deleted the fil/encodeURI branch March 15, 2024 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A + in a URL is sometimes decoded as a space
2 participants