Fix: support for CBOR chunked strings #236

Merged
merged 5 commits into from
Apr 1, 2020

Conversation

bpintea
Collaborator

@bpintea bpintea commented Mar 31, 2020

ES splits large string values into chunks (of up to 3996 bytes), even
though the entire set of chunks is contained within the answer, which
makes the chunking unnecessary. This is similar to ES's use of CBOR
indefinite-length arrays: possibly a programming convenience rather than
a consideration for the transport layers, since the HTTP/TCP layers
won't benefit from this behavior.

Confusingly, the tinycbor library won't distinguish between a container
of chunked string values and a container of a contiguous string value.
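
For illustration, a chunked string can be told apart from a contiguous one by its initial byte alone: CBOR marks an indefinite-length (chunked) string with additional information 31 on major type 2 (byte string) or 3 (text string). The sketch below works on the raw bytes and does not use tinycbor; `cbor_str_is_chunked` is a hypothetical name, not a function of the library or the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of tinycbor or the driver): reports whether
 * the CBOR item starting at `item` is an indefinite-length, i.e. chunked,
 * byte or text string. */
static bool cbor_str_is_chunked(const uint8_t *item, size_t len)
{
    assert(item != NULL);
    if (len == 0) {
        return false;
    }
    uint8_t major = item[0] >> 5;   /* 2: byte string, 3: text string */
    uint8_t info = item[0] & 0x1f;  /* 31: indefinite length */
    return (major == 2 || major == 3) && info == 31;
}
```

For example, the bytes `7f 62 'a' 'b' 61 'c' ff` encode the text "abc" as two chunks and are reported as chunked, while `63 'a' 'b' 'c'` encodes the same text contiguously.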

This PR corrects the driver's handling of chunked string values:

  • it introduces a new CBOR utility function that will fail to get string
    references, if the value is chunked; this is mostly useful for fetching
    object key names, which are part of the protocol and short (should
    never be chunked);
  • it will now allow the cursor string to be chunked, in which case it'll
    be allocated by the tinycbor library;
  • it will use a thread local static buffer to re-assemble the chunks
    into one contiguous value for the row string values (that will later
    require conversion to UTF16).
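
The re-assembly described in the last bullet could look roughly like the sketch below. It is not the driver's code: it parses the raw CBOR bytes directly instead of going through tinycbor, the names `scratch`, `reserve`, `chunk_header` and `join_chunks` are hypothetical, and growing the buffer with `realloc` is an assumption, since the PR text doesn't say how the thread-local buffer is sized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Per-thread scratch buffer, grown on demand and reused across calls
 * (assumed sizing policy; the PR only mentions a thread-local buffer). */
static _Thread_local struct { char *data; size_t cap; } scratch;

/* Makes sure the scratch buffer holds at least `need` bytes. */
static int reserve(size_t need)
{
    if (need <= scratch.cap) return 1;
    char *grown = realloc(scratch.data, need * 2);
    if (grown == NULL) return 0;
    scratch.data = grown;
    scratch.cap = need * 2;
    return 1;
}

/* Decodes the header of a definite-length chunk of type `major`.
 * Returns the header size in bytes, or 0 if the header is malformed. */
static size_t chunk_header(const uint8_t *p, size_t avail, uint8_t major,
                           uint64_t *len)
{
    if (avail == 0 || (p[0] >> 5) != major) return 0;
    uint8_t info = p[0] & 0x1f;
    if (info < 24) { *len = info; return 1; }
    if (info > 27) return 0;              /* reserved, or nested chunking */
    size_t n = (size_t)1 << (info - 24);  /* 1, 2, 4 or 8 length bytes */
    if (avail < 1 + n) return 0;
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) v = (v << 8) | p[1 + i];  /* big-endian */
    *len = v;
    return 1 + n;
}

/* Joins the chunks of the indefinite-length string at `item` into one
 * contiguous, NUL-terminated value. The result lives in the thread-local
 * buffer and stays valid until the next call on the same thread.
 * Returns NULL on malformed input or allocation failure. */
static const char *join_chunks(const uint8_t *item, size_t avail,
                               size_t *out_len)
{
    assert(item != NULL && out_len != NULL);
    if (avail == 0 || (item[0] & 0x1f) != 31) return NULL;  /* not chunked */
    uint8_t major = item[0] >> 5;
    if (major != 2 && major != 3) return NULL;              /* not a string */

    size_t pos = 1, used = 0;
    while (pos < avail && item[pos] != 0xff) {              /* 0xff: break */
        uint64_t len;
        size_t hdr = chunk_header(item + pos, avail - pos, major, &len);
        if (hdr == 0 || len > avail - pos - hdr) return NULL;
        if (!reserve(used + (size_t)len + 1)) return NULL;
        memcpy(scratch.data + used, item + pos + hdr, (size_t)len);
        used += (size_t)len;
        pos += hdr + (size_t)len;
    }
    if (pos >= avail) return NULL;                          /* missing break */
    if (!reserve(used + 1)) return NULL;
    scratch.data[used] = '\0';
    *out_len = used;
    return scratch.data;
}
```

Because the buffer is reused, a caller has to consume the joined value (for instance, convert it to UTF-16) before asking for the next one on the same thread.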

bpintea added 5 commits March 30, 2020 20:18
ES will split the large string values in chunks (of ~4KiB), even though
the entire set of chunks is contained within the answer (which makes the
chunking needless). This commit:
- introduces a new CBOR utility function that will fail to get string
references, if the value is chunked; this is mostly useful for fetching
object key names, which are part of the protocol and short;
- will now allow the cursor string to be chunked, in which case it'll be
allocated by the tinycbor library.
This commit adds support for reading row string values when these are
spread over multiple chunks.
Adds a unit test for processing cbor chunked string values.
- not required by the test, but easier for troubleshooting.
Fix misspelling.
Contributor

@droberts195 droberts195 left a comment

LGTM

@bpintea bpintea merged commit 508c9e7 into elastic:master Apr 1, 2020
@bpintea bpintea deleted the fix/cbor_chunked_strings branch April 1, 2020 10:37
bpintea added a commit that referenced this pull request Apr 1, 2020
* add support for chunked strings

ES will split the large string values in chunks (of ~4KiB), even though
the entire set of chunks is contained within the answer (which makes the
chunking needless). This commit:
- introduces a new CBOR utility function that will fail to get string
references, if the value is chunked; this is mostly useful for fetching
object key names, which are part of the protocol and short;
- will now allow the cursor string to be chunked, in which case it'll be
allocated by the tinycbor library.

* cbor: support row string values received chunked

This commit adds support for reading row string values when these are
spread over multiple chunks.


(cherry picked from commit 508c9e7)