Questions about section 4.4: Client #121

skef · 2022-09-15T18:26:37Z

I've been reading this section of the spec in light of the TPAC discussion and I have a few questions relating to client state, as specified in 4.4.1, and also just the protocol. If you want me to break this out into separate issues let me know.

There is an accept_patch_format field in the request but the guidance just says "set to the list of Patch Formats that this client is capable of decoding". Seems like if we have this field it should also be used to ensure the server sends a (genuine) patch that matches the previous subset or patch. So maybe once one patch has been downloaded the content of the field should be reduced to the type sent by the server the first time.
On the other hand, that may be overly constraining, as I can imagine wanting the subset and patch formats to be different. (We have this idea that "a subset is a patch of an empty file" but I'm not sure why we need to enforce that.) It seems like the server can easily distinguish between a subset request and a patch request based on the parameters (at least), so maybe this field should mean "what the client can decode" on the initial request and then "what the server sent last" on subsequent patch requests.
The "font subset" of section 4.4.1 seems to have two roles. First, it is the file that gets "patched", and thus is the file that gets cached. Second, (I think) it is the file the client uses as a font. This strikes me as restrictive and maybe not how things tend to work in existing browsers for other formats.
For example, when a browser downloads a WOFF2 does it typically cache that file or the file extracted from it? (I would guess the former but I may be wrong.)
We talked about how you can't really "patch" a WOFF2 but the problems with doing so seem to relate to our particular model of what it means to patch something. If you can deterministically unpack an SFNT from a WOFF2 then you can patch it. Or maybe WOFF2 is unworkable but some similar format wouldn't be.
Wouldn't it be more general if the "font subset" part of the client state were just one of the valid "Patch Formats", which can then be unpacked into an SFNT (or whatever other end-use format)? Or, less specifically, wouldn't it be more general if what is cached need not be the end-use format?
Alternatively, if the end-use format will always be sole output representation of the patching process, why are we requiring a further checksum when SFNT (which is the only relevant underlying filetype for the foreseeable future) is already thoroughly check-summed? Are we saying that clients are relieved of checking those sums? (Do the clients not do those checks now?) Maybe to the extent that maybe you don't have to patch up head to have the right checkSumAdjustment? Or that the patch checksum and the SFNT checksums play different roles somehow? (Seems like we're proving our work over and over compared with, say, downloading a javascript file. Isn't this all TCP?)
If the cached file can be something other than the end-use font file then an intermediate checksum makes more sense to me.

The text was updated successfully, but these errors were encountered:

garretrieger · 2022-11-15T00:46:49Z

There is an accept_patch_format field in the request but the guidance just says "set to the list of Patch Formats that this client is capable of decoding". Seems like if we have this field it should also be used to ensure the server sends a (genuine) patch that matches the previous subset or patch. So maybe once one patch has been downloaded the content of the field should be reduced to the type sent by the server the first time.

On the other hand, that may be overly constraining, as I can imagine wanting the subset and patch formats to be different. (We have this idea that "a subset is a patch of an empty file" but I'm not sure why we need to enforce that.) It seems like the server can easily distinguish between a subset request and a patch request based on the parameters (at least), so maybe this field should mean "what the client can decode" on the initial request and then "what the server sent last" on subsequent patch requests.

I think it's perfectly valid to have subsequent requests use different patch formats. Since we are patching against an opaque font subset each patch application is unaffected by the patch format used previously. Consider a case where we had a format that was really good at encoding the initial non-patch response, and a second format that really excelled and encoding patches. Then it may be desirable to use a different patch format on the initial and follow up responses. If for some reason the client desires to continue using the same patch format across all requests it is allowed to set the accept patch format to a single value. The only must level requirement in the text currently is that the field contains at least one format.

The "font subset" of section 4.4.1 seems to have two roles. First, it is the file that gets "patched", and thus is the file that gets cached. Second, (I think) it is the file the client uses as a font. This strikes me as restrictive and maybe not how things tend to work in existing browsers for other formats.For example, when a browser downloads a WOFF2 does it typically cache that file or the file extracted from it? (I would guess the former but I may be wrong.)We talked about how you can't really "patch" a WOFF2 but the problems with doing so seem to relate to our particular model of what it means to patch something. If you can deterministically unpack an SFNT from a WOFF2 then you can patch it. Or maybe WOFF2 is unworkable but some similar format wouldn't be. Wouldn't it be more general if the "font subset" part of the client state were just one of the valid "Patch Formats", which can then be unpacked into an SFNT (or whatever other end-use format)? Or, less specifically, wouldn't it be more general if what is cached need not be the end-use format?

Yes, to me the "font subset" is an opaque binary asset that the browser can use as a font. This theoretically could be anything that the browser can currently handle for a font format (eg. TTF, OTF, WOFF, WOFF2). Now of course some of those formats won't be great candidates for use in IFT (eg. a compressed WOFF2 doesn't patch very efficiently, but it would still fundamentally work).

If I understand correctly the main thing you're asking for here is to have the option to have the patch implementation have the option to store an "intermediate" format in the state that could be different from the format we pass off to the browser. For the two existing patch formats in the spec this isn't necessary since they are both generic binary patchers, but I can see this possibly being useful for a more specialized patch format. That said I'm inclined to leave the spec as is for now since adding this would introduce complexity that isn't currently required, but I'd be happy to reconsider in the future if we are considering the inclusion of additional patch formats which would require this.

Alternatively, if the end-use format will always be sole output representation of the patching process, why are we requiring a further checksum when SFNT (which is the only relevant underlying filetype for the foreseeable future) is already thoroughly check-summed? Are we saying that clients are relieved of checking those sums? (Do the clients not do those checks now?) Maybe to the extent that maybe you don't have to patch up head to have the right checkSumAdjustment? Or that the patch checksum and the SFNT checksums play different roles somehow? (Seems like we're proving our work over and over compared with, say, downloading a javascript file. Isn't this all TCP?)If the cached file can be something other than the end-use font file then an intermediate checksum makes more sense to me.

We discussed this in the last wg call and if I remember correctly the outcome of that discussion was to remove the patched checksum from the response. I'll look into doing that.

This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.

* Remove "ordering_checksum" field on the response. The client can compute this checksum from the codepoint ordering in the response, no need to explicitly send it. * Remove patched_checksum from PatchResponse. This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121. * Spelling.

garretrieger added a commit that referenced this issue Nov 16, 2022

Remove patched_checksum from PatchResponse.

2dcb5f7

This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.

garretrieger added a commit that referenced this issue Nov 17, 2022

Remove patched_checksum from PatchResponse.

8ef78e2

This checksum isn't strictly necessary. The font already has internal checksums which can be used to check the correctness of the patched font. Fixes #121.

garretrieger mentioned this issue Nov 17, 2022

Remove ordering and patched checksum from response. #132

Merged

svgeesus closed this as completed in #132 Nov 18, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Questions about section 4.4: Client #121

Questions about section 4.4: Client #121

skef commented Sep 15, 2022 •

edited

Loading

garretrieger commented Nov 15, 2022

Questions about section 4.4: Client #121

Questions about section 4.4: Client #121

Comments

skef commented Sep 15, 2022 • edited Loading

garretrieger commented Nov 15, 2022

skef commented Sep 15, 2022 •

edited

Loading