Overhead in cache digest algorithm #264

Open
sebdeckers opened this Issue Nov 13, 2016 · 1 comment


@sebdeckers

(Originally misfiled at mnot/I-D#204)

I believe the algorithm may be improved to increase speed and reduce digest size.

De-dupe URLs before hashing them

Duplicate URLs waste time in the later stage of sorting and skipping through the list.
They also inflate the value of N, needlessly increasing the size of the entire digest. (A sketch of this follows below.)
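Here is a minimal sketch of what pre-hash de-duplication could look like. The function name, the `probability` parameter, and the take-the-top-bits truncation rule are illustrative assumptions, not the draft's exact algorithm:

```python
import hashlib
import math

def truncated_hashes(urls, probability=128):
    """Hash the cache's URLs, de-duping them up front so that N counts
    distinct URLs rather than raw cache entries."""
    unique_urls = set(urls)          # the suggested pre-hash de-dupe
    n = max(len(unique_urls), 1)
    # Truncate each hash to log2(N * P) bits (rounded up).
    bits = math.ceil(math.log2(n * probability))
    hashes = set()                   # de-dupe again after truncation
    for url in unique_urls:
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        hashes.add(int.from_bytes(digest, "big") >> (256 - bits))
    return sorted(hashes), n
```

Note that the `hashes` set still de-dupes after truncation, for the reason raised in the reply below.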

Trim URL origin

Every URL in the digest has an identical origin, so hashing this repetitive prefix is wasted effort.
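To illustrate, trimming could be as simple as hashing only the path-and-query portion of each URL. This helper is a sketch, not the draft's key construction:

```python
from urllib.parse import urlsplit

def digest_key(url: str) -> str:
    """Reduce a same-origin URL to its path-and-query before hashing,
    since the origin is identical for every entry in the digest."""
    parts = urlsplit(url)
    key = parts.path or "/"
    if parts.query:
        key += "?" + parts.query
    return key

# e.g. digest_key("https://example.com/app.js?v=3") == "/app.js?v=3"
```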

@kazuho
kazuho commented Nov 14, 2016

Thank you for your suggestions.

De-dupe URLs before hashing them

Is it likely that there'd be a lot of freshly-cached (or stale-cached) responses sharing the same URL?

My assumption is that the answer is no, and that we do not need to recommend deduping the URLs before hashing (note: deduplication after hashing and truncating would still be necessary even if you dedupe the URLs beforehand, since distinct URLs can collide once their hashes are truncated).
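To illustrate why post-truncation de-duplication is unavoidable, the following sketch truncates the hashes of distinct URLs to a deliberately small 8 bits (a real digest would use log2(N*P) bits); by the pigeonhole principle, collisions appear even though every input URL is unique:

```python
import hashlib

def truncated(url: str, bits: int) -> int:
    # SHA-256 the URL, keep only the top `bits` bits (illustrative rule).
    h = int.from_bytes(hashlib.sha256(url.encode("utf-8")).digest(), "big")
    return h >> (256 - bits)

# 1000 distinct URLs into 2^8 = 256 possible truncated values must collide.
urls = [f"https://example.com/asset-{i}.js" for i in range(1000)]
values = [truncated(u, bits=8) for u in urls]
print(len(values), "hashes ->", len(set(values)), "distinct after truncation")
```

Duplicate truncated values have to be removed before the sorted list is delta-encoded.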

If the answer is yes, then we should consider including additional keys in the hash input so that the server can more correctly identify what is being cached.
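One possible shape for that, assuming (hypothetically) the ETag as the extra key and a NUL byte as the separator, neither of which comes from the draft:

```python
import hashlib

def hash_key(url: str, etag: str = "") -> bytes:
    # Mix an extra identifying key into the hash input alongside the URL,
    # with an unambiguous separator, so two variants of one URL hash apart.
    return hashlib.sha256((url + "\0" + etag).encode("utf-8")).digest()
```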

Trim URL origin

I agree that we can trim the origin part of the URL, since in the latest draft we have an origin field for every CACHE_DIGEST frame.

@mnot mnot added the cache-digest label Nov 21, 2016