From 99c9b4688f002477e86d587e21bf1b1b901cb870 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Sun, 5 Nov 2017 20:29:57 +0100 Subject: [PATCH 01/19] Change from golomb sets to cuckoo filters --- draft-ietf-httpbis-cache-digest.md | 237 +++++++++++++++++++---------- 1 file changed, 158 insertions(+), 79 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index d8b8b0131..8b22e4cc1 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -26,6 +26,13 @@ author: email: mnot@mnot.net uri: https://www.mnot.net/ + - + ins: Y. Weiss + name: Yoav Weiss + organization: Akamai Technologies Inc. + email: yoav@yoav.ws + uri: https://blog.yoav.ws/ + normative: RFC2119: RFC3986: @@ -69,6 +76,9 @@ informative: Fetch: title: Fetch Standard target: https://fetch.spec.whatwg.org/ + Cuckoo: + title: Cuckoo Filter: Practically Better Than Bloom + target: https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf --- abstract @@ -99,10 +109,9 @@ allows a stream to be cancelled by a client using a RST_STREAM frame in this sit is still at least one round trip of potentially wasted capacity even then. This specification defines a HTTP/2 frame type to allow clients to inform the server of their -cache's contents using a Golomb-Rice Coded Set {{Rice}}. Servers can then use this to inform their +cache's contents using a Cuckoo-fliter {{Cuckoo}} based digest. Servers can then use this to inform their choices of what to push to clients. - ## Notational Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", @@ -139,10 +148,6 @@ The CACHE_DIGEST frame defines the following flags: * **COMPLETE** (0x2): When set, indicates that the currently valid set of cache digests held by the server constitutes a complete representation of the cache's state regarding that origin, for the type of cached response indicated by the `STALE` flag. 
-* **VALIDATORS** (0x4): When set, indicates that the `validators` boolean in {{computing}} is true. - -* **STALE** (0x8): When set, indicates that all cached responses represented in the digest-value are stale {{RFC7234}} at the point in them that the digest was generated; otherwise, all are fresh. - ## Client Behavior A CACHE_DIGEST frame MUST be sent from a client to a server on stream 0, and conveys a digest of @@ -161,75 +166,137 @@ When generating CACHE_DIGEST, a client MUST NOT include cached responses whose U origins {{RFC6454}} with the indicated origin. Clients MUST NOT send CACHE_DIGEST frames on connections that are not authoritative (as defined in {{RFC7540}}, 10.1) for the indicated origin. -CACHE_DIGEST allows the client to indicate whether the set of URLs used to compute the digest -represent fresh or stale stored responses, using the STALE flag. Clients MAY decide whether to only -send CACHE_DIGEST frames representing their fresh stored responses, their stale stored responses, -or both. +Clients can choose to only send a subset of the suitable stored responses. However, when the +CACHE_DIGEST frames sent represent the complete set of stored responses of a given type, the last +such frame SHOULD have a COMPLETE flag set, to indicate to the server that it has all relevant +state of that type. Note that for the purposes of COMPLETE, responses cached since the beginning +of the connection or the last RESET flag on a CACHE_DIGEST frame need not be included. -Clients can choose to only send a subset of the suitable stored responses of each type (fresh or -stale). However, when the CACHE_DIGEST frames sent represent the complete set of stored responses -of a given type, the last such frame SHOULD have a COMPLETE flag set, to indicate to the server -that it has all relevant state of that type. Note that for the purposes of COMPLETE, responses -cached since the beginning of the connection or the last RESET flag on a CACHE_DIGEST frame need -not be included. 
- -CACHE_DIGEST can be computed to include cached responses' ETags, as indicated by the VALIDATORS -flag. This information can be used by servers to decide what kinds of responses to push to clients; -for example, a stale response that hasn't changed could be refreshed with a 304 (Not Modified) -response; one that has changed can be replaced with a 200 (OK) response, whether the cached -response was fresh or stale. +CACHE_DIGEST will also include the cached responses' ETags, if they were present in the response. +This information can be used by servers to decide if a response needs to be pushed to clients; +If a response is cached and was not changed at the origin server, the server calculating its hash +will find it in the digest and therefore will not push it. If a response is cached but was +modified at the origin server, the server calculating its hash will not find it in the digest, so +the response will be pushed. CACHE_DIGEST has no defined meaning when sent from servers, and SHOULD be ignored by clients. -### Computing the Digest-Value {#computing} +### Creating a digest {#creating} +Given the following inputs: +* `P`, an integer smaller than 256, that indicates the probability of a false positive that is acceptable, expressed as `1/2\*\*P`. +* `N`, an integer that represents the number of entries - a prime number smaller than 2\*\*32 + +1. Let `f` be the number of bits per fingerprint, calculated as `P + 3` +2. Let `b` be the bucket size, defined as 4 +3. Let `bytes` be `f`*`N`*`b`/8 rounded up to the nearest integer +4. Add 5 to `bytes` +5. Allocate memory of `bytes` and set it to zero. Assign it to `digest-value`. +6. Set the first byte to `P` +7. Set the second till fifth bytes to `N` in big endian form +8. Return the `digest-value`. 
+ +### Adding a URL to the Digest-Value {#adding} Given the following inputs: -* `validators`, a boolean indicating whether validators ({{RFC7232}}) are to be included in the digest; -* `URLs'`, an array of (string `URL`, string `ETag`) tuples, each corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} and its entity-tag {{RFC7232}} (if `validators` is true and if the ETag is available; otherwise, null); -* `P`, an integer that MUST be a power of 2 smaller than 2\*\*32, that indicates the probability of a false positive that is acceptable, expressed as `1/P`. - -`digest-value` can be computed using the following algorithm: - -1. Let N be the count of `URLs`' members, rounded to the nearest power of 2 smaller than 2\*\*32. -2. Let `hash-values` be an empty array of integers. -3. For each (`URL`, `ETag`) in `URLs`, compute a hash value ({{hash}}) and append the result to `hash-values`. -4. Sort `hash-values` in ascending order. -5. Let `digest-value` be an empty array of bits. -6. Write log base 2 of `N` to `digest-value` using 5 bits. -7. Write log base 2 of `P` to `digest-value` using 5 bits. -8. Let `C` be -1. -9. For each `V` in `hash-values`: - 1. If `V` is equal to `C`, continue to the next `V`. - 2. Let `D` be the result of `V - C - 1`. - 3. Let `Q` be the integer result of `D / P`. - 4. Let `R` be the result of `D modulo P`. - 5. Write `Q` '0' bits to `digest-value`. - 6. Write 1 '1' bit to `digest-value`. - 7. Write `R` to `digest-value` as binary, using log2(`P`) bits. - 8. Let `C` be `V` -10. If the length of `digest-value` is not a multiple of 8, pad it with 0s until it is. +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} +* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `maxcount` - max number of cuckoo hops +* `digest-value` + +1. 
Let `f` be the value of the first byte of `digest-value`. +2. Let `b` be the bucket size, defined as 4. +3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. +4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. +5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. +6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. +8. Let `h` be either `h1` or `h2`, picked in random. +9. Let `position_start` be 40 + `h` * `f`. +10. Let `position_end` be `position_start` + `f` * `b`. +11. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. + 3. Add `f` to `position_start`. +12. Substract `f` from `position_start`. +13. Let `fingerprint` be the `f` bits starting at `position_start`. +14. Let `h1` be `h` +15. Substract 1 from `maxcount`. +16. If `maxcount` is zero, return an error. +17. Go to step 7. + + +### Removing a URL to the Digest-Value {#removing} +Given the following inputs: -### Computing a Hash Value {#hash} +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} +* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `digest-value` + +1. Let `f` be the value of the first byte of `digest-value`. +3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. +4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. +5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. +6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +7. 
Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. +8. Let `h` be `h1`. +9. Let `position_start` be 40 + `h` * `f`. +10. Let `position_end` be `position_start` + `f` * `b`. +11. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. + 3. Add `f` to `position_start`. +12. If `h` is not `h2`, set `h` to `h2` and return to step 9. + +### Computing a fingerprint value {#fingerprint} -Given: +Given the following inputs: -* `URL`, an array of characters -* `ETag`, an array of characters -* `validators`, a boolean +* `key`, an array of characters * `N`, an integer -* `P`, an integer +* `f`, an integer indicating the number of output bits -`hash-value` can be computed using the following algorithm: +1. Let `hash-value` be the return value of {{hash}} with `key` and `N` as inputs. +2. Let `h` be the number of bits in `hash-value` +3. Let `fingerprint-value` be 0 +4. While `fingerprint-value` is 0 and `h` > `f`: + 4.1. Let `fingerprint-value` be the `f` least significant bits of `hash-value`. + 4.2. Let `hash-value` be the the `h`-`f` most significant bits of `hash-value`. + 4.3. `h` -= `f` +5. If `fingerprint-value` is 0, let `fingerprint-value` be 1. +6. Return `fingerprint-value`. + +Note: Step 5 is to handle the extremely unlikely case where a SHA-256 digest of `key` is all zeros. The implications of it means that +there's an infitisimaly larger probability of getting a `fingerprint-value` of 1 compared to all other values. This is not a problem for any +practical purpose. + + + +### Computing the key {#key} + +Given the following inputs: + +* `URL`, an array of characters +* `ETag`, an array of characters 1. Let `key` be `URL` converted to an ASCII string by percent-encoding as appropriate {{RFC3986}}. -2. If `validators` is true and `ETag` is not null: +2. 
If `ETag` is not null: 1. Append `ETag` to `key` as an ASCII string, including both the `weak` indicator (if present) and double quotes, as per {{RFC7232}}, Section 2.3. -3. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, expressed as an integer. -4. Truncate `hash-value` to log2( `N` \* `P` ) bits. +3. Return `key` + +### Computing a Hash Value {#hash} + +Given the following inputs: + +* `key`, an array of characters. +* `N`, an integer +`hash-value` can be computed using the following algorithm: + +1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, expressed as an integer. +2. Return `hash-value` modulo N. ## Server Behavior @@ -237,9 +304,9 @@ Given: In typical use, a server will query (as per {{querying}}) the CACHE_DIGESTs received on a given connection to inform what it pushes to that client; - * If a given URL has a match in a current CACHE_DIGEST with the STALE flag unset, it need not be pushed, because it is fresh in cache; - * If a given URL and ETag combination has a match in a current CACHE_DIGEST with the STALE flag set, the client has a stale copy in cache, and a validating response can be pushed; - * If a given URL has no match in any current CACHE_DIGEST, the client does not have a cached copy, and a complete response can be pushed. + * If a given URL and ETag combination has a match in a current CACHE_DIGEST, a complete response need not be pushed; The server MAY push a + 304 response for that resource, indicating the client that it hasn't changed. + * If a given URL and ETag has no match in any current CACHE_DIGEST, the client does not have a cached copy, and a complete response can be pushed. Servers MAY use all CACHE_DIGESTs received for a given origin as current, as long as they do not have the RESET flag set; a CACHE_DIGEST frame with the RESET flag set MUST clear any @@ -255,25 +322,39 @@ Servers MUST ignore CACHE_DIGEST frames sent on a stream other than 0. 
### Querying the Digest for a Value {#querying} -Given: +Given the following inputs: -* `digest-value`, an array of bits -* `URL`, an array of characters -* `ETag`, an array of characters -* `validators`, a boolean +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}}. +* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null). +* `digest-value`, an array of bits. + +1. Let `f` be the value of the first byte of `digest-value`. +3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. +4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. +5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. +6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. +8. Let `h` be `h1`. +9. Let `position_start` be 40 + `h` * `f`. +10. Let `position_end` be `position_start` + `f` * `b`. +11. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is `fingerprint`, return true + 3. Add `f` to `position_start`. +12. Return false. + +# The SENDING_CACHE_DIGEST SETTINGS Parameter + +A Client SHOULD notify its support for CACHE_DIGEST frames by sending the SENDING_CACHE_DIGEST (0xXXX) SETTINGS parameter. -we can determine whether there is a match in the digest using the following algorithm: +The value of the parameter is a bit-field of which the following bits are defined: + +DIGEST_PENDING (0x1): When set it indicates that the client has a digest to send, and the server may choose to wait for a digest in order to make +server push decisions. -1. Read the first 5 bits of `digest-value` as an integer; let `N` be two raised to the power of that value. -2. 
Read the next 5 bits of `digest-value` as an integer; let `P` be two raised to the power of that value. -3. Let `hash-value` be the result of computing a hash value ({{hash}}). -4. Let `C` be -1. -5. Read '0' bits from `digest-value` until a '1' bit is found; let `Q` be the number of '0' bits. Discard the '1'. -6. Read log2(`P`) bits from `digest-value` after the '1' as an integer; let `R` be its value. -7. Let `D` be `Q` * `P` + `R`. -8. Increment `C` by `D` + 1. -9. If `C` is equal to `hash-value`, return 'true'. -10. Otherwise, return to step 5 and continue processing; if no match is found before `digest-value` is exhausted, return 'false'. +Rest of the bits MUST be ignored and MUST be left unset when sending. + +The initial value of the parameter is zero (0x0) meaning that the client has no digest to send the server. # The ACCEPT_CACHE_DIGEST SETTINGS Parameter @@ -282,9 +363,7 @@ If the server is tempted to making optimizations based on CACHE_DIGEST frames, i The value of the parameter is a bit-field of which the following bits are defined: -FRESH (0x1): When set, it indicates that the server is willing to make use of a digest of freshly-cached responses. - -STALE (0x2): When set, it indicates that the server is willing to make use of a digest of stale-cached responses. +ACCEPT (0x1): When set, it indicates that the server is willing to make use of a digest of cached responses. Rest of the bits MUST be ignored and MUST be left unset when sending. 
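A minimal Python sketch of the "Creating a digest" steps introduced by this patch. The names `create_digest` and `BUCKET_SIZE` are illustrative and do not appear in the draft. The sketch writes `P` to the first byte exactly as step 6 says. The add, remove and query steps read that byte back as `f`, so a consumer would presumably need to derive `f = P + 3`; that reading is an assumption, not something the draft states. The sketch does not verify that `N` is prime.

```python
BUCKET_SIZE = 4  # `b` in the draft: fingerprint slots per bucket


def create_digest(p: int, n: int) -> bytearray:
    """Allocate an empty digest-value with a 5-byte header (P, then N)."""
    if not 0 <= p < 256:
        raise ValueError("P must fit in one byte")
    if not 0 < n < 2**32:
        raise ValueError("N must be positive and smaller than 2**32")
    f = p + 3                                     # bits per fingerprint
    table_bytes = (f * n * BUCKET_SIZE + 7) // 8  # f*N*b/8, rounded up
    digest = bytearray(5 + table_bytes)           # zero-initialised
    digest[0] = p                                 # first byte: P
    digest[1:5] = n.to_bytes(4, "big")            # bytes 2..5: N, big endian
    return digest
```

For example, `create_digest(5, 7)` gives `f = 8`, a table of `8 * 7 * 4 / 8 = 28` bytes, and a 33-byte digest-value in total.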
From 69a8d5634d30c91c0621792476d0548ff883dbc4 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Mon, 6 Nov 2017 02:07:07 +0100 Subject: [PATCH 02/19] Fixed Markdown issue and removed myself as author --- draft-ietf-httpbis-cache-digest.md | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 8b22e4cc1..f88f9cef3 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -26,13 +26,6 @@ author: email: mnot@mnot.net uri: https://www.mnot.net/ - - - ins: Y. Weiss - name: Yoav Weiss - organization: Akamai Technologies Inc. - email: yoav@yoav.ws - uri: https://blog.yoav.ws/ - normative: RFC2119: RFC3986: @@ -77,7 +70,7 @@ informative: title: Fetch Standard target: https://fetch.spec.whatwg.org/ Cuckoo: - title: Cuckoo Filter: Practically Better Than Bloom + title: 'Cuckoo Filter: Practically Better Than Bloom' target: https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf --- abstract From 375600c7439cf57fb6d2baff5f0f97a1d0e701f6 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Mon, 6 Nov 2017 02:13:36 +0100 Subject: [PATCH 03/19] Removed reference to "computing" --- draft-ietf-httpbis-cache-digest.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index f88f9cef3..d6fe3dbf4 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -133,7 +133,7 @@ Origin: : A sequence of characters containing the ASCII serialization of an origin ({{!RFC6454}}, Section 6.2) that the Digest-Value applies to. Digest-Value: -: A sequence of octets containing the digest as computed in {{computing}}. +: A sequence of octets containing the digest as computed in {{creating}} and {{adding}}. 
The CACHE_DIGEST frame defines the following flags: From f4bf0a52b77f30732ee3c19e114671333181e3b2 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Mon, 13 Nov 2017 10:45:10 +0100 Subject: [PATCH 04/19] define b --- draft-ietf-httpbis-cache-digest.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index d6fe3dbf4..64537f930 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -181,7 +181,7 @@ Given the following inputs: * `N`, an integer that represents the number of entries - a prime number smaller than 2\*\*32 1. Let `f` be the number of bits per fingerprint, calculated as `P + 3` -2. Let `b` be the bucket size, defined as 4 +2. Let `b` be the bucket size, defined as 4. 3. Let `bytes` be `f`*`N`*`b`/8 rounded up to the nearest integer 4. Add 5 to `bytes` 5. Allocate memory of `bytes` and set it to zero. Assign it to `digest-value`. @@ -229,6 +229,7 @@ Given the following inputs: * `digest-value` 1. Let `f` be the value of the first byte of `digest-value`. +2. Let `b` be the bucket size, defined as 4. 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. @@ -322,6 +323,7 @@ Given the following inputs: * `digest-value`, an array of bits. 1. Let `f` be the value of the first byte of `digest-value`. +2. Let `b` be the bucket size, defined as 4. 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 
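A minimal Python sketch of the "Computing the key" steps that the add, remove and query algorithms rely on. The function name `compute_key` is illustrative. The sketch assumes the caller passes a URL that is already percent-encoded into ASCII; it does not perform the {{RFC3986}} encoding itself, and it raises `UnicodeEncodeError` on non-ASCII input.

```python
from typing import Optional


def compute_key(url: str, etag: Optional[str]) -> bytes:
    """Concatenate the URL and, when present, the ETag exactly as received."""
    key = url.encode("ascii")  # assumption: already percent-encoded
    if etag is not None:
        # The ETag keeps its weak indicator (W/) and its double quotes.
        key += etag.encode("ascii")
    return key
```

With this sketch, `compute_key("https://example.com/a.css", 'W/"abc"')` yields `b'https://example.com/a.cssW/"abc"'`, and a response without an ETag yields just the URL bytes.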
From afdbca9a7761c0a47077358ec964bedec443057e Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Tue, 14 Nov 2017 05:56:18 +0100 Subject: [PATCH 05/19] Review comments --- draft-ietf-httpbis-cache-digest.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 64537f930..0fa7937b2 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -102,7 +102,7 @@ allows a stream to be cancelled by a client using a RST_STREAM frame in this sit is still at least one round trip of potentially wasted capacity even then. This specification defines a HTTP/2 frame type to allow clients to inform the server of their -cache's contents using a Cuckoo-fliter {{Cuckoo}} based digest. Servers can then use this to inform their +cache's contents using a Cuckoo-filter {{Cuckoo}} based digest. Servers can then use this to inform their choices of what to push to clients. ## Notational Conventions @@ -182,7 +182,7 @@ Given the following inputs: 1. Let `f` be the number of bits per fingerprint, calculated as `P + 3` 2. Let `b` be the bucket size, defined as 4. -3. Let `bytes` be `f`*`N`*`b`/8 rounded up to the nearest integer +3. Let `bytes` be `f`\*`N`\*`b`/8 rounded up to the nearest integer 4. Add 5 to `bytes` 5. Allocate memory of `bytes` and set it to zero. Assign it to `digest-value`. 6. Set the first byte to `P` @@ -207,7 +207,7 @@ Given the following inputs: 7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. 8. Let `h` be either `h1` or `h2`, picked in random. 9. Let `position_start` be 40 + `h` * `f`. -10. Let `position_end` be `position_start` + `f` * `b`. +10. Let `position_end` be `position_start` + `f` \* `b`. 11. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. 
If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. @@ -236,8 +236,8 @@ Given the following inputs: 6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. 7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. 8. Let `h` be `h1`. -9. Let `position_start` be 40 + `h` * `f`. -10. Let `position_end` be `position_start` + `f` * `b`. +9. Let `position_start` be 40 + `h` \* `f`. +10. Let `position_end` be `position_start` + `f` \* `b`. 11. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. @@ -257,8 +257,8 @@ Given the following inputs: 3. Let `fingerprint-value` be 0 4. While `fingerprint-value` is 0 and `h` > `f`: 4.1. Let `fingerprint-value` be the `f` least significant bits of `hash-value`. - 4.2. Let `hash-value` be the the `h`-`f` most significant bits of `hash-value`. - 4.3. `h` -= `f` + 4.2. Let `hash-value` be the `h`-`f` most significant bits of `hash-value`. + 4.3. Substract `f` from `h`. 5. If `fingerprint-value` is 0, let `fingerprint-value` be 1. 6. Return `fingerprint-value`. @@ -289,7 +289,7 @@ Given the following inputs: `hash-value` can be computed using the following algorithm: -1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, expressed as an integer. +1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, truncated to 32 bits, expressed as an integer. 2. Return `hash-value` modulo N. @@ -330,8 +330,8 @@ Given the following inputs: 6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. 7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. 8. Let `h` be `h1`. -9. Let `position_start` be 40 + `h` * `f`. -10. Let `position_end` be `position_start` + `f` * `b`. +9. 
Let `position_start` be 40 + `h` \* `f`. +10. Let `position_end` be `position_start` + `f` \* `b`. 11. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, return true From 01a726a18807332d3752ca2b178b9da70adc85ad Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Tue, 14 Nov 2017 06:27:55 +0100 Subject: [PATCH 06/19] Typo --- draft-ietf-httpbis-cache-digest.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 0fa7937b2..36abaf2e4 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -194,7 +194,7 @@ Given the following inputs: Given the following inputs: * `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} -* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null); * `maxcount` - max number of cuckoo hops * `digest-value` @@ -225,7 +225,7 @@ Given the following inputs: Given the following inputs: * `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} -* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null); * `digest-value` 1. Let `f` be the value of the first byte of `digest-value`. @@ -319,7 +319,7 @@ Servers MUST ignore CACHE_DIGEST frames sent on a stream other than 0. 
Given the following inputs: * `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}}. -* `ETag` a string corresponding to the entity-tag {{RFC7232}} if a cached response {{RFC7234}} (if the ETag is available; otherwise, null). +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null). * `digest-value`, an array of bits. 1. Let `f` be the value of the first byte of `digest-value`. From 10b5b14e4c74ba77240af55ecd770aae3ce066a3 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Tue, 14 Nov 2017 06:41:23 +0100 Subject: [PATCH 07/19] Style issues in `fingerprint` --- draft-ietf-httpbis-cache-digest.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 36abaf2e4..2f316bc49 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -256,9 +256,9 @@ Given the following inputs: 2. Let `h` be the number of bits in `hash-value` 3. Let `fingerprint-value` be 0 4. While `fingerprint-value` is 0 and `h` > `f`: - 4.1. Let `fingerprint-value` be the `f` least significant bits of `hash-value`. - 4.2. Let `hash-value` be the `h`-`f` most significant bits of `hash-value`. - 4.3. Substract `f` from `h`. + 1. Let `fingerprint-value` be the `f` least significant bits of `hash-value`. + 2. Let `hash-value` be the `h`-`f` most significant bits of `hash-value`. + 3. Substract `f` from `h`. 5. If `fingerprint-value` is 0, let `fingerprint-value` be 1. 6. Return `fingerprint-value`. 
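A minimal Python sketch of the hash and fingerprint steps as they stand at this point in the series. The names `compute_hash` and `compute_fingerprint` are illustrative. Two assumptions are made that the draft text does not settle: "truncated to 32 bits" is read as keeping the leading four bytes of the SHA-256 output, and the fingerprint is taken from the full 256-bit SHA-256 digest of `key`, which is what the note about "a SHA-256 digest of `key`" being all zeros suggests.

```python
import hashlib


def compute_hash(key: bytes, n: int) -> int:
    """SHA-256 of key, truncated to 32 bits, modulo N."""
    digest = hashlib.sha256(key).digest()
    truncated = int.from_bytes(digest[:4], "big")  # assumption: leading 4 bytes
    return truncated % n


def compute_fingerprint(key: bytes, f: int) -> int:
    """Take f-bit chunks from the low end of SHA-256(key) until one is non-zero."""
    hash_value = int.from_bytes(hashlib.sha256(key).digest(), "big")
    h = 256                      # number of bits remaining in hash_value
    fingerprint = 0
    mask = (1 << f) - 1
    while fingerprint == 0 and h > f:
        fingerprint = hash_value & mask  # f least significant bits
        hash_value >>= f                 # keep the h - f most significant bits
        h -= f
    if fingerprint == 0:
        fingerprint = 1          # zero is reserved to mean "empty slot"
    return fingerprint
```

The fingerprint is never zero because an all-zero slot marks an empty entry in a bucket, so a zero fingerprint could not be told apart from "nothing stored here".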
From d32c5332e86abfcaac6afaaf667508586f56aa87 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Tue, 14 Nov 2017 07:01:30 +0100 Subject: [PATCH 08/19] Fix wrong hash reference in fingerprint creation --- draft-ietf-httpbis-cache-digest.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 2f316bc49..cd939cf1c 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -252,7 +252,7 @@ Given the following inputs: * `N`, an integer * `f`, an integer indicating the number of output bits -1. Let `hash-value` be the return value of {{hash}} with `key` and `N` as inputs. +1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, expressed as an integer. 2. Let `h` be the number of bits in `hash-value` 3. Let `fingerprint-value` be 0 4. While `fingerprint-value` is 0 and `h` > `f`: From c90f9b4bc014efde6c843de8f9768dfb50d4d21d Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Wed, 15 Nov 2017 08:49:12 +0100 Subject: [PATCH 09/19] Define hashing of fingerprint --- draft-ietf-httpbis-cache-digest.md | 50 ++++++++++++++++-------------- 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index cd939cf1c..47b92d466 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -204,20 +204,21 @@ Given the following inputs: 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. -7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. -8. Let `h` be either `h1` or `h2`, picked in random. -9. Let `position_start` be 40 + `h` * `f`. -10. Let `position_end` be `position_start` + `f` \* `b`. -11. 
While `position_start` < `position_end`: +7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. +8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. +9. Let `h` be either `h1` or `h2`, picked in random. +10. Let `position_start` be 40 + `h` * `f`. +11. Let `position_end` be `position_start` + `f` \* `b`. +12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. 3. Add `f` to `position_start`. -12. Substract `f` from `position_start`. -13. Let `fingerprint` be the `f` bits starting at `position_start`. -14. Let `h1` be `h` -15. Substract 1 from `maxcount`. -16. If `maxcount` is zero, return an error. -17. Go to step 7. +13. Substract `f` from `position_start`. +14. Let `fingerprint` be the `f` bits starting at `position_start`. +15. Let `h1` be `h` +16. Substract 1 from `maxcount`. +17. If `maxcount` is zero, return an error. +18. Go to step 7. ### Removing a URL to the Digest-Value {#removing} @@ -234,22 +235,22 @@ Given the following inputs: 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. -7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. -8. Let `h` be `h1`. -9. Let `position_start` be 40 + `h` \* `f`. -10. Let `position_end` be `position_start` + `f` \* `b`. -11. While `position_start` < `position_end`: +7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. +8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. +9. Let `h` be `h1`. +10. Let `position_start` be 40 + `h` \* `f`. +11. 
Let `position_end` be `position_start` + `f` \* `b`. +12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. 3. Add `f` to `position_start`. -12. If `h` is not `h2`, set `h` to `h2` and return to step 9. +13. If `h` is not `h2`, set `h` to `h2` and return to step 9. ### Computing a fingerprint value {#fingerprint} Given the following inputs: * `key`, an array of characters -* `N`, an integer * `f`, an integer indicating the number of output bits 1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, expressed as an integer. @@ -328,15 +329,16 @@ Given the following inputs: 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. -7. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. -8. Let `h` be `h1`. -9. Let `position_start` be 40 + `h` \* `f`. -10. Let `position_end` be `position_start` + `f` \* `b`. -11. While `position_start` < `position_end`: +7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. +8. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. +9. Let `h` be `h1`. +10. Let `position_start` be 40 + `h` \* `f`. +11. Let `position_end` be `position_start` + `f` \* `b`. +12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, return true 3. Add `f` to `position_start`. -12. Return false. +13. Return false. 
# The SENDING_CACHE_DIGEST SETTINGS Parameter From 7729e54a38f13e1f88e13b4b7c345e4781ba1f33 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Wed, 15 Nov 2017 12:01:09 +0100 Subject: [PATCH 10/19] remove N from fingerprint --- draft-ietf-httpbis-cache-digest.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 47b92d466..1b392748c 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -203,11 +203,11 @@ Given the following inputs: 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. -6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. 7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. 8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. 9. Let `h` be either `h1` or `h2`, picked in random. -10. Let `position_start` be 40 + `h` * `f`. +10. Let `position_start` be 40 + `h` * `f` \* `b`. 11. Let `position_end` be `position_start` + `f` \* `b`. 12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. @@ -234,11 +234,11 @@ Given the following inputs: 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. -6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +6. 
Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. 7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. 8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. 9. Let `h` be `h1`. -10. Let `position_start` be 40 + `h` \* `f`. +10. Let `position_start` be 40 + `h` \* `f` \* `b`. 11. Let `position_end` be `position_start` + `f` \* `b`. 12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. @@ -328,11 +328,11 @@ Given the following inputs: 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. -6. Let `fingerprint` be the return value of {{fingerprint}} with `key`, `N` and `f` as inputs. +6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. 7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. 8. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. 9. Let `h` be `h1`. -10. Let `position_start` be 40 + `h` \* `f`. +10. Let `position_start` be 40 + `h` \* `f` \* `b`. 11. Let `position_end` be `position_start` + `f` \* `b`. 12. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 
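Reviewer note, not part of the patch series: the index arithmetic as it reads after PATCH 10/19 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the draft's reference implementation. The function names `hash_value`, `alternate_hash` and `bucket_bit_range` are hypothetical, and the draft does not say which 32 bits of the SHA-256 digest are kept, so the leading four bytes in big endian order are assumed here.

```python
import hashlib
from typing import Tuple

BUCKET_SIZE = 4   # `b` in the draft
HEADER_BITS = 40  # one byte for P plus four bytes for N


def hash_value(key: str, n: int) -> int:
    """SHA-256 of `key`, truncated to 32 bits, modulo `n`."""
    digest = hashlib.sha256(key.encode("ascii")).digest()
    # Assumption: "truncated to 32 bits" means the first four bytes, big endian.
    return int.from_bytes(digest[:4], "big") % n


def alternate_hash(h1: int, fingerprint: int, n: int) -> int:
    """Hash the base-10 string form of the fingerprint, then XOR with `h1`."""
    fingerprint_string = str(fingerprint)
    return hash_value(fingerprint_string, n) ^ h1


def bucket_bit_range(h: int, f: int, b: int = BUCKET_SIZE) -> Tuple[int, int]:
    """Return (position_start, position_end) in bits for bucket `h`."""
    position_start = HEADER_BITS + h * f * b
    position_end = position_start + f * b
    return position_start, position_end
```

Because the second index is derived by XOR, applying `alternate_hash` to it with the same fingerprint returns the first index. The XOR can also produce an index at or above `N`, so a table sized for exactly `N` buckets may be too small.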
From c4167e6ff7c26d44572587322ed028a1e159b4bf Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Wed, 15 Nov 2017 12:06:15 +0100 Subject: [PATCH 11/19] Fix querying --- draft-ietf-httpbis-cache-digest.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 1b392748c..0c51eaeec 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -244,7 +244,7 @@ Given the following inputs: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. 3. Add `f` to `position_start`. -13. If `h` is not `h2`, set `h` to `h2` and return to step 9. +13. If `h` is not `h2`, set `h` to `h2` and return to step 10. ### Computing a fingerprint value {#fingerprint} @@ -338,6 +338,7 @@ Given the following inputs: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. 2. If `bits` is `fingerprint`, return true 3. Add `f` to `position_start`. +13. If `h` is not `h2`, set `h` to `h2` and return to step 10. 13. Return false. 
# The SENDING_CACHE_DIGEST SETTINGS Parameter From 88859e7881a707005537f6a75e2b4fd7ae6ddeca Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Thu, 16 Nov 2017 10:27:57 +0100 Subject: [PATCH 12/19] Fixed allocation issues and fixed style --- draft-ietf-httpbis-cache-digest.md | 68 ++++++++++++++++++++---------- 1 file changed, 45 insertions(+), 23 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 0c51eaeec..8caf277d4 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -177,24 +177,35 @@ CACHE_DIGEST has no defined meaning when sent from servers, and SHOULD be ignore ### Creating a digest {#creating} Given the following inputs: -* `P`, an integer smaller than 256, that indicates the probability of a false positive that is acceptable, expressed as `1/2\*\*P`. +* `P`, an integer smaller than 256, that indicates the probability of a false positive that is +acceptable, expressed as `1/2\*\*P`. * `N`, an integer that represents the number of entries - a prime number smaller than 2\*\*32 1. Let `f` be the number of bits per fingerprint, calculated as `P + 3` 2. Let `b` be the bucket size, defined as 4. -3. Let `bytes` be `f`\*`N`\*`b`/8 rounded up to the nearest integer -4. Add 5 to `bytes` -5. Allocate memory of `bytes` and set it to zero. Assign it to `digest-value`. -6. Set the first byte to `P` -7. Set the second till fifth bytes to `N` in big endian form -8. Return the `digest-value`. +3. Let `allocated` be the closest power of 2 that is larger than `N`. +4. Let `bytes` be `f`\*`allocated`\*`b`/8 rounded up to the nearest integer +5. Add 5 to `bytes` +6. Allocate memory of `bytes` and set it to zero. Assign it to `digest-value`. +7. Set the first byte to `P` +8. Set the second till fifth bytes to `N` in big endian form +9. Return the `digest-value`. 
+ +Note: `allocated` is necessary due to the nature of the way Cuckoo filters are creating the +secondary hash, by XORing the initial hash and the fingerprint's hash. The XOR operation means +that secondary hash can pick an entry beyond the initial number of entries, up to the next power +of 2. In order to avoid issues there, we allocate the table appropriately. For increased space +efficiency, it is recommended that implementations pick a number of entries that's close to the +next power of 2. ### Adding a URL to the Digest-Value {#adding} Given the following inputs: -* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} -* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached +response {{RFC7234}} +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if +the ETag is available; otherwise, null); * `maxcount` - max number of cuckoo hops * `digest-value` @@ -205,7 +216,8 @@ Given the following inputs: 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. 7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. -8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. +8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with +`h1`. 9. Let `h` be either `h1` or `h2`, picked in random. 10. Let `position_start` be 40 + `h` * `f` \* `b`. 11. Let `position_end` be `position_start` + `f` \* `b`. 
@@ -225,8 +237,10 @@ Given the following inputs: Given the following inputs: -* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}} -* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null); +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached +response {{RFC7234}} +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if +the ETag is available; otherwise, null); * `digest-value` 1. Let `f` be the value of the first byte of `digest-value`. @@ -236,7 +250,8 @@ Given the following inputs: 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. 7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. -8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `h1`. +8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with +`h1`. 9. Let `h` be `h1`. 10. Let `position_start` be 40 + `h` \* `f` \* `b`. 11. Let `position_end` be `position_start` + `f` \* `b`. @@ -263,9 +278,10 @@ Given the following inputs: 5. If `fingerprint-value` is 0, let `fingerprint-value` be 1. 6. Return `fingerprint-value`. -Note: Step 5 is to handle the extremely unlikely case where a SHA-256 digest of `key` is all zeros. The implications of it means that -there's an infitisimaly larger probability of getting a `fingerprint-value` of 1 compared to all other values. This is not a problem for any -practical purpose. +Note: Step 5 is to handle the extremely unlikely case where a SHA-256 digest of `key` is all zeros. 
+The implications of it means that there's an infitisimaly larger probability of getting a +`fingerprint-value` of 1 compared to all other values. This is not a problem for any practical +purpose. @@ -278,7 +294,8 @@ Given the following inputs: 1. Let `key` be `URL` converted to an ASCII string by percent-encoding as appropriate {{RFC3986}}. 2. If `ETag` is not null: - 1. Append `ETag` to `key` as an ASCII string, including both the `weak` indicator (if present) and double quotes, as per {{RFC7232}}, Section 2.3. + 1. Append `ETag` to `key` as an ASCII string, including both the `weak` indicator (if present) + and double quotes, as per {{RFC7232}}, Section 2.3. 3. Return `key` ### Computing a Hash Value {#hash} @@ -290,7 +307,8 @@ Given the following inputs: `hash-value` can be computed using the following algorithm: -1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, truncated to 32 bits, expressed as an integer. +1. Let `hash-value` be the SHA-256 message digest {{RFC6234}} of `key`, truncated to 32 bits, +expressed as an integer. 2. Return `hash-value` modulo N. @@ -299,9 +317,11 @@ Given the following inputs: In typical use, a server will query (as per {{querying}}) the CACHE_DIGESTs received on a given connection to inform what it pushes to that client; - * If a given URL and ETag combination has a match in a current CACHE_DIGEST, a complete response need not be pushed; The server MAY push a - 304 response for that resource, indicating the client that it hasn't changed. - * If a given URL and ETag has no match in any current CACHE_DIGEST, the client does not have a cached copy, and a complete response can be pushed. +* If a given URL and ETag combination has a match in a current CACHE_DIGEST, a complete response +need not be pushed; The server MAY push a 304 response for that resource, indicating the client +that it hasn't changed. 
+* If a given URL and ETag has no match in any current CACHE_DIGEST, the client does not have a +cached copy, and a complete response can be pushed. Servers MAY use all CACHE_DIGESTs received for a given origin as current, as long as they do not have the RESET flag set; a CACHE_DIGEST frame with the RESET flag set MUST clear any @@ -319,8 +339,10 @@ Servers MUST ignore CACHE_DIGEST frames sent on a stream other than 0. Given the following inputs: -* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached response {{RFC7234}}. -* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null). +* `URL` a string corresponding to the Effective Request URI ({{RFC7230}}, Section 5.5) of a cached +response {{RFC7234}}. +* `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if +the ETag is available; otherwise, null). * `digest-value`, an array of bits. 1. Let `f` be the value of the first byte of `digest-value`. From c587bbc6330defe4430bffdbcbd0aa129174abf4 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Mon, 20 Nov 2017 09:59:17 +0100 Subject: [PATCH 13/19] Clean up the algorithm and match it to the ref impl --- draft-ietf-httpbis-cache-digest.md | 79 ++++++++++++++++++------------ 1 file changed, 47 insertions(+), 32 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 8caf277d4..767e3a4bd 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -215,16 +215,22 @@ the ETag is available; otherwise, null); 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. -7. 
Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. -8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with -`h1`. -9. Let `h` be either `h1` or `h2`, picked in random. -10. Let `position_start` be 40 + `h` * `f` \* `b`. -11. Let `position_end` be `position_start` + `f` \* `b`. -12. While `position_start` < `position_end`: - 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. - 2. If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. - 3. Add `f` to `position_start`. +7. Let `h2` be the return value of {{hash2}} with `h1`, `fingerprint` and `N` as inputs. +8. Let `h` be either `h1` or `h2`, picked in random. +9. While `maxcount` is larger than zero: + 1. Let `position_start` be 40 + `h` * `f` \* `b`. + 2. Let `position_end` be `position_start` + `f` \* `b`. + 3. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. + 3. Add `f` to `position_start`. + 4. Let `e` be a random number from 0 to `b`. + 5. Substract `f` * (`b` - `e`) from `position_start`. + 6. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 7. Let `dest_fingerprint` be the value of bits, read as big endian. + 8. Set `bits` to `fingerprint`. + 9. Let `h` be {{hash2}} with `h`, `dest_fingerprint` and `N` as inputs. + 10. Substract 1 from `maxcount`. 13. Substract `f` from `position_start`. 14. Let `fingerprint` be the `f` bits starting at `position_start`. 15. Let `h1` be `h` @@ -249,17 +255,15 @@ the ETag is available; otherwise, null); 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. -7. 
Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. -8. Let `h2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with -`h1`. -9. Let `h` be `h1`. -10. Let `position_start` be 40 + `h` \* `f` \* `b`. -11. Let `position_end` be `position_start` + `f` \* `b`. -12. While `position_start` < `position_end`: - 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. - 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. - 3. Add `f` to `position_start`. -13. If `h` is not `h2`, set `h` to `h2` and return to step 10. +7. Let `h2` be the return value of {{hash2}} with `h1`, `fingerprint` and `N` as inputs. +8. Let `hashes` be an array containing `h1` and `h2`. +9. For each `h` in `hashes`: + 1. Let `position_start` be 40 + `h` \* `f` \* `b`. + 2. Let `position_end` be `position_start` + `f` \* `b`. + 3. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is `fingerprint`, set `bits` to all zeros and terminate these steps. + 3. Add `f` to `position_start`. ### Computing a fingerprint value {#fingerprint} @@ -298,6 +302,8 @@ Given the following inputs: and double quotes, as per {{RFC7232}}, Section 2.3. 3. Return `key` +TODO: Add an example of the ETag and the key calcuations. + ### Computing a Hash Value {#hash} Given the following inputs: @@ -311,6 +317,16 @@ Given the following inputs: expressed as an integer. 2. Return `hash-value` modulo N. +### Computing an Alternative Hash Value {#hash2} +Given the following inputs: + +* `hash1`, an integer indicating the previous hash. +* `fingerprint`, an integer indicating the fingerprint value. +* `N`, an integer indicating the number of entries in the digest. + +1. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. +2. 
Let `hash2` be the return value of {{hash}} with `fingerprint-string` and `N` as inputs, XORed with `hash1`. +3. Return `hash2`. ## Server Behavior @@ -351,17 +367,16 @@ the ETag is available; otherwise, null). 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. 6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. -7. Let `fingerprint-string` be the value of `fingerprint` in base 10, expressed as a string. -8. Let `h2` be the return value of {{hash}} with `fingerprint` and `N` as inputs, XORed with `h1`. -9. Let `h` be `h1`. -10. Let `position_start` be 40 + `h` \* `f` \* `b`. -11. Let `position_end` be `position_start` + `f` \* `b`. -12. While `position_start` < `position_end`: - 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. - 2. If `bits` is `fingerprint`, return true - 3. Add `f` to `position_start`. -13. If `h` is not `h2`, set `h` to `h2` and return to step 10. -13. Return false. +7. Let `h2` be the return value of {{hash2}} with `h1`, `fingerprint` and `N` as inputs. +8. Let `hashes` be an array containing `h1` and `h2`. +9. For each `h` in `hashes`: + 1. Let `position_start` be 40 + `h` \* `f` \* `b`. + 2. Let `position_end` be `position_start` + `f` \* `b`. + 3. While `position_start` < `position_end`: + 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. + 2. If `bits` is `fingerprint`, return true + 3. Add `f` to `position_start`. +10. Return false. 
# The SENDING_CACHE_DIGEST SETTINGS Parameter From d02e5d159e84c1bad677c864e6d9d920d99c99e0 Mon Sep 17 00:00:00 2001 From: Yoav Weiss Date: Mon, 20 Nov 2017 10:23:15 +0100 Subject: [PATCH 14/19] Spec alignment when adding entries --- draft-ietf-httpbis-cache-digest.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 767e3a4bd..65ec4c56a 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -214,23 +214,24 @@ the ETag is available; otherwise, null); 3. Let `N` be the value of the second to fifth bytes of `digest-value` in big endian form. 4. Let `key` be the return value of {{key}} with `URL` and `ETag` as inputs. 5. Let `h1` be the return value of {{hash}} with `key` and `N` as inputs. -6. Let `fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. -7. Let `h2` be the return value of {{hash2}} with `h1`, `fingerprint` and `N` as inputs. +6. Let `dest_fingerprint` be the return value of {{fingerprint}} with `key` and `f` as inputs. +7. Let `h2` be the return value of {{hash2}} with `h1`, `dest_fingerprint` and `N` as inputs. 8. Let `h` be either `h1` or `h2`, picked in random. 9. While `maxcount` is larger than zero: 1. Let `position_start` be 40 + `h` * `f` \* `b`. 2. Let `position_end` be `position_start` + `f` \* `b`. 3. While `position_start` < `position_end`: 1. Let `bits` be `f` bits from `digest_value` starting at `position_start`. - 2. If `bits` is all zeros, set `bits` to `fingerprint` and terminate these steps. + 2. If `bits` is all zeros, set `bits` to `dest_fingerprint` and terminate these steps. 3. Add `f` to `position_start`. 4. Let `e` be a random number from 0 to `b`. 5. Substract `f` * (`b` - `e`) from `position_start`. 6. Let `bits` be `f` bits from `digest_value` starting at `position_start`. - 7. Let `dest_fingerprint` be the value of bits, read as big endian. - 8. 
Set `bits` to `fingerprint`. - 9. Let `h` be {{hash2}} with `h`, `dest_fingerprint` and `N` as inputs. - 10. Substract 1 from `maxcount`. + 7. Let `fingerprint` be the value of bits, read as big endian. + 8. Set `bits` to `dest_fingerprint`. + 9. Set `dest_fingerprint` to `fingerprint`. + 10. Let `h` be {{hash2}} with `h`, `dest_fingerprint` and `N` as inputs. + 11. Substract 1 from `maxcount`. 13. Substract `f` from `position_start`. 14. Let `fingerprint` be the `f` bits starting at `position_start`. 15. Let `h1` be `h` From 4730fff5b9f05a6ef89b1a3e7e58de901a7ed672 Mon Sep 17 00:00:00 2001 From: Kazuho Oku Date: Wed, 28 Feb 2018 22:44:21 +0900 Subject: [PATCH 15/19] retain the fresh-stale distinction --- draft-ietf-httpbis-cache-digest.md | 35 ++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 12 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index 65ec4c56a..dba182e2b 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -141,6 +141,10 @@ The CACHE_DIGEST frame defines the following flags: * **COMPLETE** (0x2): When set, indicates that the currently valid set of cache digests held by the server constitutes a complete representation of the cache's state regarding that origin, for the type of cached response indicated by the `STALE` flag. +* **VALIDATORS** (0x4): When set, indicates that the `validators` boolean in {{key}} is true. + +* **STALE** (0x8): When set, indicates that all cached responses represented in the digest-value are stale {{RFC7234}} at the point in them that the digest was generated; otherwise, all are fresh. + ## Client Behavior A CACHE_DIGEST frame MUST be sent from a client to a server on stream 0, and conveys a digest of @@ -159,18 +163,23 @@ When generating CACHE_DIGEST, a client MUST NOT include cached responses whose U origins {{RFC6454}} with the indicated origin. 
Clients MUST NOT send CACHE_DIGEST frames on connections that are not authoritative (as defined in {{RFC7540}}, 10.1) for the indicated origin. -Clients can choose to only send a subset of the suitable stored responses. However, when the -CACHE_DIGEST frames sent represent the complete set of stored responses of a given type, the last -such frame SHOULD have a COMPLETE flag set, to indicate to the server that it has all relevant -state of that type. Note that for the purposes of COMPLETE, responses cached since the beginning -of the connection or the last RESET flag on a CACHE_DIGEST frame need not be included. +CACHE_DIGEST allows the client to indicate whether the set of URLs used to compute the digest +represent fresh or stale stored responses, using the STALE flag. Clients MAY decide whether to only +send CACHE_DIGEST frames representing their fresh stored responses, their stale stored responses, +or both. + +Clients can choose to only send a subset of the suitable stored responses of each type (fresh or +stale). However, when the CACHE_DIGEST frames sent represent the complete set of stored responses +of a given type, the last such frame SHOULD have a COMPLETE flag set, to indicate to the server +that it has all relevant state of that type. Note that for the purposes of COMPLETE, responses +cached since the beginning of the connection or the last RESET flag on a CACHE_DIGEST frame need +not be included. -CACHE_DIGEST will also include the cached responses' ETags, if they were present in the response. -This information can be used by servers to decide if a response needs to be pushed to clients; -If a response is cached and was not changed at the origin server, the server calculating its hash -will find it in the digest and therefore will not push it. If a response is cached but was -modified at the origin server, the server calculating its hash will not find it in the digest, so -the response will be pushed. 
+CACHE_DIGEST can be computed to include cached responses' ETags, as indicated by the VALIDATORS +flag. This information can be used by servers to decide what kinds of responses to push to clients; +for example, a stale response that hasn't changed could be refreshed with a 304 (Not Modified) +response; one that has changed can be replaced with a 200 (OK) response, whether the cached +response was fresh or stale. CACHE_DIGEST has no defined meaning when sent from servers, and SHOULD be ignored by clients. @@ -296,9 +305,10 @@ Given the following inputs: * `URL`, an array of characters * `ETag`, an array of characters +* `validators`, a boolean indicating whether validators ({{RFC7232}}) are to be included in the digest 1. Let `key` be `URL` converted to an ASCII string by percent-encoding as appropriate {{RFC3986}}. -2. If `ETag` is not null: +2. If `validators` is true and `ETag` is not null: 1. Append `ETag` to `key` as an ASCII string, including both the `weak` indicator (if present) and double quotes, as per {{RFC7232}}, Section 2.3. 3. Return `key` @@ -360,6 +370,7 @@ Given the following inputs: response {{RFC7234}}. * `ETag` a string corresponding to the entity-tag {{RFC7232}} of a cached response {{RFC7234}} (if the ETag is available; otherwise, null). +* `validators`, a boolean * `digest-value`, an array of bits. 1. Let `f` be the value of the first byte of `digest-value`. 
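Reviewer note, not part of the patch series: PATCH 15/19 makes the ETag part of the key only when `validators` is true. The sketch below illustrates that rule in Python. The name `compute_key` is hypothetical, and `urllib.parse.quote` with a broad `safe` set is only an assumed stand-in for the draft's "percent-encoding as appropriate", which the text does not define further.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: reserved URI characters and existing percent-escapes are left alone;
# everything else outside the unreserved set is percent-encoded as UTF-8.
SAFE_CHARS = ":/?#[]@!$&'()*+,;=%"


def compute_key(url: str, etag: Optional[str], validators: bool) -> str:
    """Build the digest key from a URL and, optionally, its entity-tag."""
    key = quote(url, safe=SAFE_CHARS)
    if validators and etag is not None:
        # The ETag is appended verbatim, including any W/ prefix and the quotes.
        key += etag
    return key
```

With `validators` false, two cached responses for the same URL map to the same key regardless of their ETags. With it true, a changed ETag yields a different key, so a server querying with its current ETag will not find a match for an outdated cached copy.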
From 5855ed9ce4962c476ca9a4a783527a21a9734566 Mon Sep 17 00:00:00 2001 From: Kazuho Oku Date: Wed, 28 Feb 2018 22:51:19 +0900 Subject: [PATCH 16/19] remove reference to Rice encoding --- draft-ietf-httpbis-cache-digest.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index dba182e2b..b3baeb048 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -40,20 +40,6 @@ informative: RFC4648: RFC5234: RFC6265: - Rice: - title: Adaptive variable-length coding for efficient compression of spacecraft television data - author: - - - ins: R. F. Rice - name: Robert F. Rice - - - ins: J. Plaunt - name: James Plaunt - date: December 1971 - seriesinfo: - 'IEEE Transactions on Communication Technology': 19.6 - DOI: 10.1109/TCOM.1971.1090789 - ISSN: 0018-9332 I-D.ietf-tls-tls13: Service-Workers: title: Service Workers 1 From 4b189bf82ca95446c7f446096ac702463476afb3 Mon Sep 17 00:00:00 2001 From: Kazuho Oku Date: Wed, 28 Feb 2018 22:51:27 +0900 Subject: [PATCH 17/19] update copyright to 2018 --- draft-ietf-httpbis-cache-digest.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index b3baeb048..b14b5aac6 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -1,7 +1,7 @@ --- title: Cache Digests for HTTP/2 docname: draft-ietf-httpbis-cache-digest-latest -date: 2017 +date: 2018 category: exp ipr: trust200902 From 8d9ba5dbcfc571c2802a55313b7ce643401759af Mon Sep 17 00:00:00 2001 From: Kazuho Oku Date: Wed, 28 Feb 2018 22:55:22 +0900 Subject: [PATCH 18/19] cuckoo filter by Joav! 
--- draft-ietf-httpbis-cache-digest.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index b14b5aac6..a74f34c8e 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -487,10 +487,7 @@ The header may represent the cache state of a client or that of a proxy, dependi # Acknowledgements -Thanks to Adam Langley and Giovanni Bajo for their explorations of Golomb-coded sets. In -particular, see -, which -refers to sample code. +Thanks to Joav Weiss for his idea and text to use Cuckoo Filter. Thanks to Stefan Eissing for his suggestions. From f8e63e273c64717fb6a9cb81a52dc63e5b06371d Mon Sep 17 00:00:00 2001 From: Kazuho Oku Date: Thu, 1 Mar 2018 00:19:54 +0900 Subject: [PATCH 19/19] typo --- draft-ietf-httpbis-cache-digest.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/draft-ietf-httpbis-cache-digest.md b/draft-ietf-httpbis-cache-digest.md index a74f34c8e..676fef9e7 100644 --- a/draft-ietf-httpbis-cache-digest.md +++ b/draft-ietf-httpbis-cache-digest.md @@ -487,7 +487,7 @@ The header may represent the cache state of a client or that of a proxy, dependi # Acknowledgements -Thanks to Joav Weiss for his idea and text to use Cuckoo Filter. +Thanks to Yoav Weiss for his idea and text to use Cuckoo Filter. Thanks to Stefan Eissing for his suggestions.
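Reviewer note, not part of the patch series: for reference, the digest-creation steps as revised in PATCH 12/19 can be sketched in Python as follows. The name `create_digest` is hypothetical, and "the closest power of 2 that is larger than `N`" is read here as the smallest power of two strictly greater than `N`. The sketch does not check that `N` is prime, which the draft requires.

```python
def create_digest(p: int, n: int) -> bytearray:
    """Allocate an empty digest-value for false-positive rate 1/2**p and n entries."""
    if not 0 <= p < 256:
        raise ValueError("P must fit in one byte")
    if not 0 < n < 2**32:
        raise ValueError("N must be positive and smaller than 2**32")

    f = p + 3                         # bits per fingerprint
    b = 4                             # bucket size
    allocated = 1 << n.bit_length()   # smallest power of two strictly larger than n
    size = (f * allocated * b + 7) // 8   # round up to whole bytes
    size += 5                             # header: 1 byte for P, 4 bytes for N

    digest_value = bytearray(size)        # zero-initialised
    digest_value[0] = p
    digest_value[1:5] = n.to_bytes(4, "big")
    return digest_value
```

For example, `P = 5` and `N = 7` give `f = 8` and `allocated = 8`, so the table occupies 8 \* 8 \* 4 / 8 = 32 bytes and the digest-value is 37 bytes long with the header.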