Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Processing steps for id now removes the fragment #1122

Merged
merged 13 commits into from
Jun 6, 2024

Conversation

mgiuca
Copy link
Collaborator

@mgiuca mgiuca commented May 6, 2024

Closes #1121

This change (choose at least one, delete ones that don't apply):

  • Adds new normative recommendations or optional items

[Note: This technically changes normative behaviour, but does not break implementations. Any implementation that keeps the fragment and strips it out when comparing ids will behave the same as this. The new behaviour just encourages implementations to keep their local data model clean.]

Commit message:

The canonical id no longer includes the fragment. This was already effectively true (as all comparisons would exclude fragments), but now it makes it true in the processed manifest's canonical id.

Removed or made optional the exclude fragments in the comparisons. Added a lengthy note as to why user agents may still wish to exclude fragments when comparing against potentially old database entries.

Person merging, please make sure that commits are squashed with one of the following as a commit message prefix:

  • chore:
  • editorial:
  • BREAKING CHANGE:
  • And use none if it's a normative change

Preview | Diff

mgiuca added 6 commits May 6, 2024 14:13
…ment manifest id.

Since these are both processed in real time (not from a database) there is no need to exclude fragments here.
If it's just about old data, it should be up to the UA to know whether that old data might exist.
@mgiuca mgiuca requested a review from dmurph May 6, 2024 04:47
Copy link
Collaborator

@dmurph dmurph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good change - this has come up repeatedly with confusion from devs & implementers, so just removing the fragment from the beginning really helps to solve this.

@marcoscaceres
Copy link
Member

Hmm... looking at WebKit, it wasn't stripping the fragment etc. I'll need to check what the impact is.

index.html Outdated
|manifest|["id"] is [=url/equal=] with [=URL serializer/exclude
fragment|exclude fragment true=] to the [=identity=] of an
already-installed application, it SHOULD be used as a signal that
|manifest|["id"] is [=url/equal=] (OPTIONAL: with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove OPTIONAL... it's a bit confusing. Let's just assume a single model here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of OPTIONAL here is that some UAs may decide not to do it if they are sure the fragments were already stripped. (e.g. if they always strip fragments before writing them into the database, and there never was a time when they didn't do that, or they ran a migration at some point stripping fragments from the database, then they don't need to exclude fragments here).

Not doing so could actually significantly help those clients, for example, they could then compare the hashes of the app IDs rather than having to perform a URL equals operation (which can't be done using a SQL query). This doesn't only affect major browsers, but also any website that indexes apps -- we'd like it to be optional for them so they can simply assume the fragment-removed app IDs are primary keys in their database and do simple lookups. If we made this mandatory, then every entity that indexes apps would need to have no canonical representation of an app ID, and always do the URL equals excluding fragments comparison.

So I did go back and forth on this a bit, but settled on making it OPTIONAL by way of saying "hey, if you've been storing app IDs with their fragments, you probably should exclude fragments when you compare them, but you don't have to do that if you've treated the data properly per the algorithm, and there are significant simplifications to the data model if you don't do this."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yeah. that seems ok. I wonder if it could be better phrased though the "(OPTIONAL: ...)" looks a bit odd to me. Trying to think of a better way to phrase it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcoscaceres input needed: I rewrote it as "(with exclude fragments OPTIONALLY set to true)". Is this sufficient to capture RFC 2119's keyword "OPTIONAL"? (Note that Respec doesn't italicize the word "OPTIONALLY" nor does this adverbial form appear in RFC 2119)

Otherwise if it has to exactly use those words, I think the best phrasing is the somewhat verbose: "(the user agent MAY set exclude fragments to true)".

Copy link
Member

@marcoscaceres marcoscaceres May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm 🤔 let’s go with MAY as I just know people will complain otherwise.

I’ll try to see if I can think of some other phrasing in the future.

Really sorry about all the churn on this!

It should be good to merge otherwise (and as a bonus 🍨, this helped fix two bugs in WebKit, so massive thank you!!!)

index.html Outdated Show resolved Hide resolved
mgiuca and others added 2 commits May 10, 2024 11:02
Co-authored-by: Marcos Cáceres <marcos@marcosc.com>
index.html Outdated
this manifest is a replacement for the already-installed
application's manifest, and not a distinct application, even if it is
served from a different URL than the one seen previously.
</p>
<p class="note" title="Exclude fragments for good measure">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p class="note" title="Exclude fragments for good measure">
<p class="note" title="Exclude fragments for comparison">

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrased as "is best practice". The "good measure" was supposed to convey the fact that you don't really need to do it, but if you can then just do it anyway to sanitize bad data.

I hope this also conveys that user agents don't have to do this and it can be beneficial not to, because it affords optimizations like hashing.

index.html Outdated
Comment on lines 872 to 879
for a matching application. However, since old versions of this spec
(and, possibly, old user agents) did not remove the [=URL/fragment=]
from the [=URL=] at parse time, and relied only on
[=URL/equals/exclude fragments|excluding fragments=] during
comparisons, there may be historical app data with [=URL/fragments=]
in the <code>[=manifest/id=]</code>. Therefore, it is best practice for user
agents to [=URL/equals/exclude fragments=] even when comparing two
[=URLs=] that should not have any fragments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the use of RFC2119 keywords:

Suggested change
for a matching application. However, since old versions of this spec
(and, possibly, old user agents) did not remove the [=URL/fragment=]
from the [=URL=] at parse time, and relied only on
[=URL/equals/exclude fragments|excluding fragments=] during
comparisons, there may be historical app data with [=URL/fragments=]
in the <code>[=manifest/id=]</code>. Therefore, it is best practice for user
agents to [=URL/equals/exclude fragments=] even when comparing two
[=URLs=] that should not have any fragments.
for a matching application. However, since old versions of this spec
(and, possibly, old user agents) did not remove the [=URL/fragment=]
from the [=URL=] at parse time, and relied only on
[=URL/equals/exclude fragments|excluding fragments=] during
comparisons, historical app data could include [=URL/fragments=]
in the <code>[=manifest/id=]</code>. Therefore, it is best practice for user
agents to [=URL/equals/exclude fragments=] even when comparing two
[=URLs=] that typically lack fragments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch! I rephrased these a bit but got rid of the RFC words.

webkit-commit-queue pushed a commit to marcoscaceres/WebKit that referenced this pull request May 14, 2024
rdar://127864531
https://bugs.webkit.org/show_bug.cgi?id=273991

Reviewed by Sihui Liu.

Strips the hash from the id member when parsing the Web Manifest.
Part of w3c/manifest#1122

* Source/WebCore/Modules/applicationmanifest/ApplicationManifestParser.cpp:
(WebCore::ApplicationManifestParser::parseId):
* Tools/TestWebKitAPI/Tests/WebCore/ApplicationManifestParser.cpp:
(TEST_F):

Canonical link: https://commits.webkit.org/278724@main
webkit-commit-queue pushed a commit to marcoscaceres/WebKit that referenced this pull request May 15, 2024
https://bugs.webkit.org/show_bug.cgi?id=273989
rdar://127863839

Reviewed by Brady Eidson.

Removes the query and fragment components from the scope URL.
Part of w3c/manifest#1122

* Source/WebCore/Modules/applicationmanifest/ApplicationManifestParser.cpp:
(WebCore::ApplicationManifestParser::parseScope):
* Tools/TestWebKitAPI/Tests/WebCore/ApplicationManifestParser.cpp:
(TEST_F):

Canonical link: https://commits.webkit.org/278795@main
Copy link
Member

@marcoscaceres marcoscaceres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of suggestions. I can live with the (OPTINAL…) if it’s a pain to change. Otherwise looks good.

@mgiuca
Copy link
Collaborator Author

mgiuca commented May 22, 2024

@marcoscaceres Thanks, I finally got back on this one. Addressed the comments, PTAL.

index.html Outdated Show resolved Hide resolved
index.html Outdated Show resolved Hide resolved
@marcoscaceres marcoscaceres merged commit bceb2ee into w3c:main Jun 6, 2024
2 checks passed
github-actions bot added a commit that referenced this pull request Jun 6, 2024
SHA: bceb2ee
Reason: push, by marcoscaceres

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Manifest id should strip fragment at parse time
3 participants