Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about section 4.4: Client #121

Closed
skef opened this issue Sep 15, 2022 · 1 comment · Fixed by #132
Closed

Questions about section 4.4: Client #121

skef opened this issue Sep 15, 2022 · 1 comment · Fixed by #132

Comments

@skef
Copy link
Contributor

skef commented Sep 15, 2022

I've been reading this section of the spec in light of the TPAC discussion and I have a few questions relating to client state, as specified in 4.4.1, and also just the protocol. If you want me to break this out into separate issues let me know.

  1. There is an accept_patch_format field in the request but the guidance just says "set to the list of Patch Formats that this client is capable of decoding". Seems like if we have this field it should also be used to ensure the server sends a (genuine) patch that matches the previous subset or patch. So maybe once one patch has been downloaded the content of the field should be reduced to the type sent by the server the first time.
  2. On the other hand, that may be overly constraining, as I can imagine wanting the subset and patch formats to be different. (We have this idea that "a subset is a patch of an empty file" but I'm not sure why we need to enforce that.) It seems like the server can easily distinguish between a subset request and a patch request based on the parameters (at least), so maybe this field should mean "what the client can decode" on the initial request and then "what the server sent last" on subsequent patch requests.
  3. The "font subset" of section 4.4.1 seems to have two roles. First, it is the file that gets "patched", and thus is the file that gets cached. Second, (I think) it is the file the client uses as a font. This strikes me as restrictive and maybe not how things tend to work in existing browsers for other formats.

    For example, when a browser downloads a WOFF2 does it typically cache that file or the file extracted from it? (I would guess the former but I may be wrong.)

    We talked about how you can't really "patch" a WOFF2 but the problems with doing so seem to relate to our particular model of what it means to patch something. If you can deterministically unpack an SFNT from a WOFF2 then you can patch it. Or maybe WOFF2 is unworkable but some similar format wouldn't be.

    Wouldn't it be more general if the "font subset" part of the client state were just one of the valid "Patch Formats", which can then be unpacked into an SFNT (or whatever other end-use format)? Or, less specifically, wouldn't it be more general if what is cached need not be the end-use format?

  4. Alternatively, if the end-use format will always be sole output representation of the patching process, why are we requiring a further checksum when SFNT (which is the only relevant underlying filetype for the foreseeable future) is already thoroughly check-summed? Are we saying that clients are relieved of checking those sums? (Do the clients not do those checks now?) Maybe to the extent that maybe you don't have to patch up head to have the right checkSumAdjustment? Or that the patch checksum and the SFNT checksums play different roles somehow? (Seems like we're proving our work over and over compared with, say, downloading a javascript file. Isn't this all TCP?)

    If the cached file can be something other than the end-use font file then an intermediate checksum makes more sense to me.

@garretrieger
Copy link
Contributor

  1. There is an accept_patch_format field in the request but the guidance just says "set to the list of Patch Formats that this client is capable of decoding". Seems like if we have this field it should also be used to ensure the server sends a (genuine) patch that matches the previous subset or patch. So maybe once one patch has been downloaded the content of the field should be reduced to the type sent by the server the first time.
  2. On the other hand, that may be overly constraining, as I can imagine wanting the subset and patch formats to be different. (We have this idea that "a subset is a patch of an empty file" but I'm not sure why we need to enforce that.) It seems like the server can easily distinguish between a subset request and a patch request based on the parameters (at least), so maybe this field should mean "what the client can decode" on the initial request and then "what the server sent last" on subsequent patch requests.

I think it's perfectly valid to have subsequent requests use different patch formats. Since we are patching against an opaque font subset each patch application is unaffected by the patch format used previously. Consider a case where we had a format that was really good at encoding the initial non-patch response, and a second format that really excelled and encoding patches. Then it may be desirable to use a different patch format on the initial and follow up responses. If for some reason the client desires to continue using the same patch format across all requests it is allowed to set the accept patch format to a single value. The only must level requirement in the text currently is that the field contains at least one format.

  1. The "font subset" of section 4.4.1 seems to have two roles. First, it is the file that gets "patched", and thus is the file that gets cached. Second, (I think) it is the file the client uses as a font. This strikes me as restrictive and maybe not how things tend to work in existing browsers for other formats.For example, when a browser downloads a WOFF2 does it typically cache that file or the file extracted from it? (I would guess the former but I may be wrong.)We talked about how you can't really "patch" a WOFF2 but the problems with doing so seem to relate to our particular model of what it means to patch something. If you can deterministically unpack an SFNT from a WOFF2 then you can patch it. Or maybe WOFF2 is unworkable but some similar format wouldn't be. Wouldn't it be more general if the "font subset" part of the client state were just one of the valid "Patch Formats", which can then be unpacked into an SFNT (or whatever other end-use format)? Or, less specifically, wouldn't it be more general if what is cached need not be the end-use format?

Yes, to me the "font subset" is an opaque binary asset that the browser can use as a font. This theoretically could be anything that the browser can currently handle for a font format (eg. TTF, OTF, WOFF, WOFF2). Now of course some of those formats won't be great candidates for use in IFT (eg. a compressed WOFF2 doesn't patch very efficiently, but it would still fundamentally work).

If I understand correctly the main thing you're asking for here is to have the option to have the patch implementation have the option to store an "intermediate" format in the state that could be different from the format we pass off to the browser. For the two existing patch formats in the spec this isn't necessary since they are both generic binary patchers, but I can see this possibly being useful for a more specialized patch format. That said I'm inclined to leave the spec as is for now since adding this would introduce complexity that isn't currently required, but I'd be happy to reconsider in the future if we are considering the inclusion of additional patch formats which would require this.

  1. Alternatively, if the end-use format will always be sole output representation of the patching process, why are we requiring a further checksum when SFNT (which is the only relevant underlying filetype for the foreseeable future) is already thoroughly check-summed? Are we saying that clients are relieved of checking those sums? (Do the clients not do those checks now?) Maybe to the extent that maybe you don't have to patch up head to have the right checkSumAdjustment? Or that the patch checksum and the SFNT checksums play different roles somehow? (Seems like we're proving our work over and over compared with, say, downloading a javascript file. Isn't this all TCP?)If the cached file can be something other than the end-use font file then an intermediate checksum makes more sense to me.

We discussed this in the last wg call and if I remember correctly the outcome of that discussion was to remove the patched checksum from the response. I'll look into doing that.

garretrieger added a commit that referenced this issue Nov 16, 2022
This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.
garretrieger added a commit that referenced this issue Nov 17, 2022
This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.
svgeesus pushed a commit that referenced this issue Nov 18, 2022
* Remove "ordering_checksum" field on the response.

The client can compute this checksum from the codepoint ordering in the response, no need to explicitly send it.

* Remove patched_checksum from PatchResponse.

This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.

* Spelling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants