Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sec-Browsing-Topics header format seems a bit awkward #183

Closed
domenic opened this issue May 22, 2023 · 18 comments
Closed

Sec-Browsing-Topics header format seems a bit awkward #183

domenic opened this issue May 22, 2023 · 18 comments

Comments

@domenic
Copy link

domenic commented May 22, 2023

In particular, emitting the version, but only sometimes, seems error-prone for consumers. It means consumers need to do extra work to reconstruct the correct version + topic pairs, on the server side, using a sort of stateful parser loop. They can't just use out-of-the-box HTTP structured field parsers.

Another example failure mode would be if something in their sever or proxy stack sorts HTTP header lists, this could be disastrous. E.g. it could turn Sec-Browsing-Topics: 5;v="x", 3, 1;v="y" (which is equivalent to {(1, "y"), (3, "x"), (5, "x")}) into Sec-Browsing-Topics: 1;v="y", 3, 5;v="x" (which is equivalent to {(1, "y"), (3, "y"), (5, "x")}).

@jkarlin
Copy link
Collaborator

jkarlin commented May 30, 2023

The trade-off here is bytes on the wire (~15 bytes per version) vs ease-of-use. Combined with the desire to pad the header to ensure that we're not leaking the number of topics sent, with your proposal we'd always need to send 3 versions (45 bytes for versions per request). The folks that we expect to use the API is pretty small in number (ad-tech) and generally pretty experienced. I agree that parsing it requires slightly more logic than simply reading a list, but it's pretty rudimentary.

Another example failure mode would be if something in their sever or proxy stack sorts HTTP header lists, this could be disastrous.

If a proxy were to reorder the list that seems bad on many levels? Sometimes ordering matters.

In the end I don't feel strongly here and could be persuaded either way, but we'll wind up having large padding buffers on many requests as a result.

@domenic
Copy link
Author

domenic commented May 30, 2023

When you say "many requests", you just mean the requests which have opted in to getting this data, right? As you say, this is a pretty small number of users. Given that those are generally for multi-kilobyte documents, it seems to me like penny-pinching to worry about 45 bytes.

@jkarlin
Copy link
Collaborator

jkarlin commented May 30, 2023

The number of users is small, but their reach is enormous. Let's assume at least a few requests per page on a significant fraction of page loads, ~50%?

@domenic
Copy link
Author

domenic commented May 31, 2023

It still seems not worth worrying about, compared to the transfer sizes of ads themselves.

@domenic
Copy link
Author

domenic commented May 31, 2023

In an offline chat, @jyasskin suggested using inner lists to group:

Sec-Browsing-Topics: (5 3);v="x", (1);v="y"

This is nice! However we then have to extend it with the padding idea from #186, which makes it more awkward. The nice thing about #186 is that consumers know they can just throw away the entire map entry with key p.

I think the following is a reasonable way of getting both benefits:

Sec-Browsing-Topics: (5 3);v="x", (1);v="y", ();v="PPPPPPPPPPPP"

This seems like it would be pretty easy to parse because it's effectively saying "there are no topics with version = PPPPPPPPPPPP." Full analysis below.

Detailed comparison of parsing ease

Assume our software's goal is to create a list of { topic, version } tuples.

Simplest format with padding

Sec-Browsing-Topics: t=(5;v="x" 3;v="x" 1;v="y"), p="PPP"
const parsed = genericStructuredHeaderParser(headerValue);
let result = [];
for (const item of parsed["t"]) {
  result.push(item.value, item.params["v"]);
}

Abbreviated format with padding (removes repeated versions)

Sec-Browsing-Topics: t=(5;v="x" 3 1;v="y"), p="PPP"
const parsed = genericStructuredHeaderParser(headerValue);
let result = [];
let lastVersion;
for (const item of parsed["t"]) {
  if (item.params["v"]) {
    result.push(item.value, item.params["v"]);
    lastVersion = item.params["v"];
  } else {
    result.push(item.value, lastVersion);
  }
}

Inner list with groups with padding

Sec-Browsing-Topics: (5 3);v="x", (1);v="y", ();v="PPPPPPPPPPPP"
const parsed = genericStructuredHeaderParser(headerValue);
let result = [];
for (const innerList of parsed) {
  const version = innerList.params["v"];
  for (const topic of innerList) {
    result.push(topic.value, version);
  }
}

(The inner loop is a no-op for the zero-length padding inner list.)

@jkarlin
Copy link
Collaborator

jkarlin commented May 31, 2023

I also like Jeffrey's suggestion. How about a compromise that keeps the dictionary which makes it a bit more expicit that it's for padding and uses the same number of characters when whitespace is removed? Having a dictionary also allows us to add extra keys in the future if necessary which is a nice-to-have.

Sec-Browsing-Topics: t=((5 3);v="x", (1);v="y"),p=PPPPPPPPPPPP

const parsed = genericStructuredHeaderParser(headerValue);
let result = [];

for (const innerList of parsed["t"]) {
  for (const topic of innerList) {
    result.push(topic.value, innerList.params["v"]);
  }
}

Correction: Not exactly the same number of chars, but quite close.

@domenic
Copy link
Author

domenic commented May 31, 2023

Inner lists can't have inner list members :(

@jkarlin
Copy link
Collaborator

jkarlin commented May 31, 2023

Ah, right. That's annoying.

Okay, I'd prefer to rename the pad param to 'p' in your example, like so:

Sec-Browsing-Topics: (5 3);v="x", (1);v="y", ();p="P00000000000"

And then the code would be:

const parsed = genericStructuredHeaderParser(headerValue);
let result = [];
for (const innerList of parsed) {
  for (const topic of innerList) {
    result.push(topic.value, innerList.params["v"]);
  }
}

There is no longer a guaranteed v param on each topic list. Yes, it's true that code might then explode if they try to access v assuming it's always there, but that's preferable imho to accidentally using the padding as a version.

@jkarlin
Copy link
Collaborator

jkarlin commented May 31, 2023

Yao points out that structured headers do support '.' and ':' in tokens.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 31, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 31, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue May 31, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jun 1, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
aarongable pushed a commit to chromium/chromium that referenced this issue Jun 1, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxia@chromium.org>
Commit-Queue: Josh Karlin <jkarlin@chromium.org>
Reviewed-by: Andrew Paseltiner <apaseltiner@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1151938}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jun 1, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxia@chromium.org>
Commit-Queue: Josh Karlin <jkarlin@chromium.org>
Reviewed-by: Andrew Paseltiner <apaseltiner@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1151938}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jun 1, 2023
Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxia@chromium.org>
Commit-Queue: Josh Karlin <jkarlin@chromium.org>
Reviewed-by: Andrew Paseltiner <apaseltiner@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1151938}
xyaoinum added a commit to xyaoinum/topics that referenced this issue Jun 6, 2023
@dmarti
Copy link
Contributor

dmarti commented Jun 10, 2023

Why does this header have the Sec- prefix? (I don't see anything explaining that under https://github.com/patcg-individual-drafts/topics#privacy-and-security-considerations )

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jun 13, 2023
… read and parse, a=testonly

Automatic update from web-platform-tests
[Topics] Refactor header to be easier to read and parse

Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxia@chromium.org>
Commit-Queue: Josh Karlin <jkarlin@chromium.org>
Reviewed-by: Andrew Paseltiner <apaseltiner@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1151938}

--

wpt-commits: ae92b6b433768fd3b35de1dbff6cc8f03197fdc5
wpt-pr: 40332
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Jun 14, 2023
… read and parse, a=testonly

Automatic update from web-platform-tests
[Topics] Refactor header to be easier to read and parse

Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxia@chromium.org>
Commit-Queue: Josh Karlin <jkarlin@chromium.org>
Reviewed-by: Andrew Paseltiner <apaseltiner@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1151938}

--

wpt-commits: ae92b6b433768fd3b35de1dbff6cc8f03197fdc5
wpt-pr: 40332
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Jun 16, 2023
… read and parse, a=testonly

Automatic update from web-platform-tests
[Topics] Refactor header to be easier to read and parse

Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxiachromium.org>
Commit-Queue: Josh Karlin <jkarlinchromium.org>
Reviewed-by: Andrew Paseltiner <apaseltinerchromium.org>
Cr-Commit-Position: refs/heads/main{#1151938}

--

wpt-commits: ae92b6b433768fd3b35de1dbff6cc8f03197fdc5
wpt-pr: 40332

UltraBlame original commit: 540daebf60a1596746b438bce17e3d57ace5979a
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Jun 16, 2023
… read and parse, a=testonly

Automatic update from web-platform-tests
[Topics] Refactor header to be easier to read and parse

Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxiachromium.org>
Commit-Queue: Josh Karlin <jkarlinchromium.org>
Reviewed-by: Andrew Paseltiner <apaseltinerchromium.org>
Cr-Commit-Position: refs/heads/main{#1151938}

--

wpt-commits: ae92b6b433768fd3b35de1dbff6cc8f03197fdc5
wpt-pr: 40332

UltraBlame original commit: 540daebf60a1596746b438bce17e3d57ace5979a
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Jun 16, 2023
… read and parse, a=testonly

Automatic update from web-platform-tests
[Topics] Refactor header to be easier to read and parse

Changed the format of the topics request header from:
"t=(1;v=chrome.1:1:2, 2), p=P00000" to
"(1 2);v=chrome.1:1:2, ();p=P00000"

As discussed in patcg-individual-drafts/topics#183

Bug: 1443540
Change-Id: I3ea40df9f8732f5eff953110f22f4da2163480fb
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4576009
Reviewed-by: Yao Xiao <yaoxiachromium.org>
Commit-Queue: Josh Karlin <jkarlinchromium.org>
Reviewed-by: Andrew Paseltiner <apaseltinerchromium.org>
Cr-Commit-Position: refs/heads/main{#1151938}

--

wpt-commits: ae92b6b433768fd3b35de1dbff6cc8f03197fdc5
wpt-pr: 40332

UltraBlame original commit: 540daebf60a1596746b438bce17e3d57ace5979a
@jkarlin
Copy link
Collaborator

jkarlin commented Jun 22, 2023

The sec- prefix protects the header from being modified by script.

@dmarti
Copy link
Contributor

dmarti commented Jun 22, 2023

@jkarlin See #25 and discussion at https://github.com/patcg-individual-drafts/topics/blob/55153e6166c4aa574ba20226a83bf7e693c07ed6/meetings/2023-06-05-minutes.md -- trying to figure out how a user is going to be able to use a browser extension to filter Topics API data (as they can already do now with cookie management extensions)

@jkarlin
Copy link
Collaborator

jkarlin commented Jun 22, 2023

If a user doesn't want/like topics, then they should simply disable it, which extensions can do. I don't think it makes sense to provide extension support simply to modify topics.

@martinthomson
Copy link

The sec- prefix protects the header from being modified by script.

Yes, but that is a factual statement. You didn't address the underlying question, which was whether there was a need for the value to be protected from modification. Is there some case where a page gains access to a topic value and makes requests - in a context where adversarial script might also be present - and wants to ensure that the adversarial script can't change the topics?

Or is this just the case that all new header fields gain a sec- prefix because we don't want to think about the consequences if we mistakenly omit sec- and someone discovers an attack?

@michaelkleber
Copy link
Collaborator

Just as Martin says, we've heard several times during the development of the various Privacy Sandbox APIs that potential ad tech users are concerned about the risk that a script on the page might try to maliciously modify headers for some fraud purpose. A Sec- header seems like a prudent safeguard in that context.

@amicus-veritatis
Copy link

amicus-veritatis commented Jun 23, 2023

A Sec- header seems like a prudent safeguard in that context.

Maybe, But I highly doubt that sec- is an appropriate header for browsing-topics. As I understand, The primary purpose of the sec- prefix is to prevent malicious websites from convincing user agents to send forged metadata along with requests. It is a security-related prefix indeed.

If it has to be a forbidden header, I propose a new, industry-specific prefix advertising- or analytics-.

@domenic
Copy link
Author

domenic commented Jun 23, 2023

As the person who originally opened this issue, I consider it closed by 9d8fec1. It seems like a new discussion has started about the Sec- prefix, which does not seem relevant to the OP. Probably that discussion is best moved to another issue?

@dmarti
Copy link
Contributor

dmarti commented Jun 23, 2023

@domenic Done, thank you: #207

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants