prevent URLs in attributes being escaped #9820

Merged

alexnguyennz
Contributor

@alexnguyennz alexnguyennz commented Jan 25, 2024

Changes

Issue

Adds a check for server-rendered URL attributes containing an & so that they are not escaped. This covers edge cases where multiple query params are needed, e.g. for an API that generates OG images.

Before:
<meta name="og:image" content="https://example.com/api/og?title=hello&#38;description=somedescription">

After:
<meta name="og:image" content="https://example.com/api/og?title=hello&description=somedescription">

I think this change is safe - let me know if there's another way this should be done.
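
For illustration, the "Before" output can be reproduced with a minimal sketch. `escapeAttribute` is a hypothetical stand-in, not Astro's actual escaping helper; it assumes only the numeric references `&#38;` and `&#34;` mentioned in this thread.

```typescript
// Hypothetical stand-in for an attribute escaper (assumption, not Astro's code).
// It replaces & first so that the & inside "&#34;" is not escaped a second time.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, '&#38;').replace(/"/g, '&#34;');
}

const url = 'https://example.com/api/og?title=hello&description=somedescription';
const escaped = escapeAttribute(url);

// Prints the "Before" markup, with the query-string separator turned into &#38;
console.log(`<meta name="og:image" content="${escaped}">`);
```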

Testing

pnpm run test

And manually, with a minimal demo in both dev and build, using:

<head>
  <meta
    name="og:image"
    content={'https://example.com/api/og?title=hello&description=somedescription'}
  />
</head>

Docs

N/A

changeset-bot bot commented Jan 25, 2024

🦋 Changeset detected

Latest commit: 3cfc4c5

The changes in this PR will be included in the next version bump.

@github-actions github-actions bot added the pkg: astro label Jan 25, 2024
.changeset/quick-islands-ring.md (outdated)
@@ -104,6 +104,11 @@ Make sure to use the static attribute syntax (\`${key}={value}\`) instead of the
return markHTMLString(` class="${toAttributeString(value, shouldEscape)}"`);
}

// prevent URLs in `content` attributes being escaped for multiple params to work
if (key === 'content' && typeof value === 'string' && URL.canParse(value)) {
Member

I'm not familiar with this part of the codebase, but can you do this for href attributes as well?

@bluwy
Member

bluwy commented Jan 25, 2024

Do we need to escape the & in the first place? It should be fine to be used in attribute values 🤔 Seems like it's added since #1789. Perhaps @natemoo-re has some context about it.

@natemoo-re
Member

> Do we need to escape the & in the first place? It should be fine to be used in attribute values 🤔 Seems like it's added since #1789. Perhaps @natemoo-re has some context about it.

Good question! Escaping & is best practice. Seems like Preact and Svelte escape it by default.

I'm not sure that we should hard-code the content attribute... Maybe we could check for the existence of & and then check for a URL? I'm not sure of a simple way we can determine if the & should be escaped or if it's potentially needed for a URL.

@alexnguyennz alexnguyennz changed the title prevent URLs in content attributes being escaped prevent URLs in attributes being escaped Jan 26, 2024
@alexnguyennz
Contributor Author

> I'm not sure that we should hard-code the content attribute... Maybe we could check for the existence of & and then check for a URL? I'm not sure of a simple way we can determine if the & should be escaped or if it's potentially needed for a URL.

Not sure either - it's now just a general URL check (with any &) for all attributes if that works.

@bluwy
Member

bluwy commented Jan 29, 2024

> Good question! Escaping & is best practice. Seems like Preact and Svelte escape it by default.

Thanks! It seems like they encode & as &amp; and " as &quot;, while we encode as &#38; and &#34; respectively. Perhaps we need to change how we encode this as well? Technically both should be equivalent, but maybe browsers are able to parse &amp; better because the # is not tripping it up (?).
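
To make the "technically both should be equivalent" point concrete, here is a small sketch in which the named and numeric forms decode to the same character. `decodeReferences` and `NAMED` are hypothetical names, and the decoder deliberately handles only decimal references plus `&amp;` and `&quot;`; it is not a full HTML character-reference decoder.

```typescript
// Only the two named references discussed in this thread (assumption: enough for the demo).
const NAMED = new Map<string, string>([
  ['amp', '&'],
  ['quot', '"'],
]);

// Decodes decimal references such as &#38; and the named references listed above.
// Anything else is left untouched.
function decodeReferences(input: string): string {
  return input.replace(
    /&(?:#(\d+)|([a-z]+));/g,
    (match: string, dec?: string, name?: string): string => {
      if (dec !== undefined) return String.fromCodePoint(Number(dec));
      if (name !== undefined) return NAMED.get(name) ?? match;
      return match;
    },
  );
}

const numericForm = decodeReferences('?title=hello&#38;description=somedescription');
const namedForm = decodeReferences('?title=hello&amp;description=somedescription');

// Both forms decode to the same string.
console.log(numericForm === namedForm);
```

A consumer that decodes character references sees the same URL either way, so any difference would come from consumers that read the attribute value without decoding it.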

@ematipico
Member

I also think that using HTML entities is better.

@ematipico
Member

This PR has been sitting for a while without much activity. cc @alexnguyennz and @natemoo-re

@natemoo-re
Member

Thanks for the ping @ematipico! I'm okay with this solution if everyone else is.

We can follow-up with switching to &amp; and &quot; in another PR.

.changeset/quick-islands-ring.md (outdated)
Member

@natemoo-re natemoo-re left a comment

Just blocking until we resolve the URL.canParse discrepancy.

Nothing for you to do @alexnguyennz, the maintainers just need to decide when we're updating our Node version.

@@ -104,6 +104,11 @@ Make sure to use the static attribute syntax (\`${key}={value}\`) instead of the
return markHTMLString(` class="${toAttributeString(value, shouldEscape)}"`);
}

// Prevents URLs in attributes from being escaped in static builds
if (typeof value === 'string' && value.includes('&') && URL.canParse(value)) {
Member

I noticed that URL.canParse was added in Node v18.17.0, but our minimum required version is still v18.14.1.

We should either implement a fallback for URL.canParse (try/catch) or wait on merging this until we bump our engines to v18.17.0 (possibly in the next minor, cc @Princesseuh).

Contributor

URL.canParse uses the same internal method as the URL constructor, so a try-catch should be sufficient to unblock.
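
A minimal sketch of that try/catch fallback, for runtimes where `URL.canParse` is unavailable. `canParseURL` and `isUrlWithAmpersand` are hypothetical helper names used for illustration and are not necessarily what was merged.

```typescript
// Fallback for URL.canParse: the URL constructor throws a TypeError on input
// it cannot parse, so success or failure of construction gives the same answer.
function canParseURL(value: string): boolean {
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Mirrors the condition in the diff above, with the fallback swapped in.
function isUrlWithAmpersand(value: unknown): value is string {
  return typeof value === 'string' && value.includes('&') && canParseURL(value);
}

const ogImage = 'https://example.com/api/og?title=hello&description=somedescription';
console.log(isUrlWithAmpersand(ogImage)); // true: absolute URL containing &
console.log(isUrlWithAmpersand('Tom & Jerry')); // false: contains & but is not a parseable URL
```

Plain text that merely contains an & still goes through the normal escaping path, since it fails the URL check.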

@lilnasy lilnasy force-pushed the &-encoding-head-urls-multiple-params branch from df2e714 to 24bbb62 Compare March 8, 2024 23:49
Contributor

@lilnasy lilnasy left a comment

Looks great to me! It must have taken effort to find the relevant code path, thanks for the clean fix!

third774 added a commit to third774/kevinkipp.com that referenced this pull request Mar 11, 2024
Using patch-package to apply this PR until it is merged & released
withastro/astro#9820
@lilnasy lilnasy force-pushed the &-encoding-head-urls-multiple-params branch from 16ac392 to 24bbb62 Compare March 13, 2024 15:21
Member

@natemoo-re natemoo-re left a comment

Thanks for getting this over the line, @lilnasy!

@lilnasy lilnasy merged commit 8edc42a into withastro:main Mar 13, 2024
13 checks passed
@astrobot-houston astrobot-houston mentioned this pull request Mar 13, 2024
Labels
pkg: astro Related to the core `astro` package (scope)
Development

Successfully merging this pull request may close these issues.

& encoding with server rendered strings affects head URLs with multiple params
7 participants