Merge pull request #176 from PJK/streaming-enc
Improving streaming API docs
PJK committed Dec 20, 2020
2 parents 5550925 + ca8144f commit e3a6832
Showing 14 changed files with 159 additions and 62 deletions.
2 changes: 2 additions & 0 deletions doc/source/api.rst
@@ -28,6 +28,8 @@ The API is designed to allow both very tight control & flexibility and general c
api/item_reference_counting
api/decoding
api/encoding
api/streaming_decoding
api/streaming_encoding
api/type_0_1
api/type_2
api/type_3
@@ -1,11 +1,12 @@
Decoding
Streaming Decoding
=============================

Another way to decode data using libcbor is to specify a callbacks that will be invoked when upon finding certain items in the input. This API is provided by
*libcbor* exposes a stateless decoder that reads a stream of input bytes from a buffer and invokes user-provided callbacks as it decodes the input:

.. doxygenfunction:: cbor_stream_decode

Usage example: https://github.com/PJK/libcbor/blob/master/examples/streaming_parser.c
For example, when :func:`cbor_stream_decode` encounters a 1B unsigned integer, it will invoke the function pointer stored in ``cbor_callbacks.uint8``.
Complete usage example: `examples/streaming_parser.c <https://github.com/PJK/libcbor/blob/master/examples/streaming_parser.c>`_

The callbacks are defined by

@@ -16,13 +17,6 @@ When building custom sets of callbacks, feel free to start from

.. doxygenvariable:: cbor_empty_callbacks
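
The callback-driven design described above can be illustrated with a small self-contained sketch. It is not libcbor code: ``sketch_callbacks``, ``sketch_empty_callbacks``, ``sketch_decode_one``, and ``sketch_sum_uints`` are hypothetical names, and the decoder only understands a tiny subset of CBOR (unsigned integers up to one byte, indefinite array start, and "break"). As a simplification, it routes both embedded values and 1B integers to the ``uint8`` callback. It shows the pattern of copying an all-no-op callback set, overriding one entry, and passing user state through a ``context`` pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback bundle; the real struct cbor_callbacks has many more
 * members. */
struct sketch_callbacks {
  void (*uint8)(void *context, uint8_t value);
  void (*indef_array_start)(void *context);
  void (*indef_break)(void *context);
};

static void sketch_noop_uint8(void *context, uint8_t value) {
  (void)context;
  (void)value;
}

static void sketch_noop(void *context) { (void)context; }

/* All-no-op defaults, in the spirit of cbor_empty_callbacks. */
static const struct sketch_callbacks sketch_empty_callbacks = {
    sketch_noop_uint8, sketch_noop, sketch_noop};

/* Stateless step: decodes one item and invokes the matching callback.
 * Returns the number of bytes consumed, or 0 if more data is needed or the
 * item is outside the subset handled by this sketch. */
static size_t sketch_decode_one(const unsigned char *source, size_t source_size,
                                const struct sketch_callbacks *callbacks,
                                void *context) {
  if (source_size == 0) return 0;
  unsigned char head = source[0];
  if (head <= 0x17) { /* value embedded in the initial byte */
    callbacks->uint8(context, head);
    return 1;
  }
  if (head == 0x18) { /* one-byte unsigned integer follows */
    if (source_size < 2) return 0;
    callbacks->uint8(context, source[1]);
    return 2;
  }
  if (head == 0x9f) {
    callbacks->indef_array_start(context);
    return 1;
  }
  if (head == 0xff) {
    callbacks->indef_break(context);
    return 1;
  }
  return 0;
}

/* User-side state and callback: accumulate every integer that is seen. */
struct sketch_sum {
  unsigned long total;
  size_t count;
};

static void sketch_add_uint8(void *context, uint8_t value) {
  struct sketch_sum *sum = context;
  sum->total += value;
  sum->count++;
}

/* Start from the empty callbacks, override one entry, and drive the decoder
 * over the whole buffer. */
static struct sketch_sum sketch_sum_uints(const unsigned char *data,
                                          size_t size) {
  struct sketch_callbacks callbacks = sketch_empty_callbacks;
  callbacks.uint8 = sketch_add_uint8;
  struct sketch_sum sum = {0, 0};
  size_t offset = 0;
  while (offset < size) {
    size_t read = sketch_decode_one(data + offset, size - offset, &callbacks,
                                    &sum);
    if (read == 0) break; /* truncated or unsupported input */
    offset += read;
  }
  return sum;
}
```

Because the decoder keeps no state of its own, the caller owns the loop and the offset, which is what makes it possible to feed input in pieces. For the real API, refer to ``examples/streaming_parser.c``.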

Related structures
~~~~~~~~~~~~~~~~~~~~~

.. doxygenenum:: cbor_decoder_status
.. doxygenstruct:: cbor_decoder_result
:members:


Callback types definition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65 changes: 65 additions & 0 deletions doc/source/api/streaming_encoding.rst
@@ -0,0 +1,65 @@
Streaming Encoding
=============================

`cbor/encoding.h <https://github.com/PJK/libcbor/blob/master/src/cbor/encoding.h>`_
exposes a low-level encoding API to encode CBOR objects on the fly. Unlike
:func:`cbor_serialize`, these functions take logical values (integers, floats,
strings, etc.) instead of :type:`cbor_item_t`. The client is responsible for
constructing the compound types correctly (e.g. terminating arrays).

Streaming encoding is typically used to create streaming (indefinite-length) CBOR :doc:`byte strings <type_2>`, :doc:`strings <type_3>`, :doc:`arrays <type_4>`, and :doc:`maps <type_5>`. Complete example: `examples/streaming_array.c <https://github.com/PJK/libcbor/blob/master/examples/streaming_array.c>`_
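
To make the client's responsibility concrete, here is a self-contained sketch that mirrors the argument shape of the encoders (value, output buffer, buffer size) and the "return bytes written, or 0 if the buffer is too small" convention. The ``sketch_*`` functions are hypothetical stand-ins written for this illustration, not libcbor's implementation, and they only cover unsigned integers that fit in one byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a one-byte unsigned integer encoder.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t sketch_encode_uint8(uint8_t value, unsigned char *buffer,
                                  size_t buffer_size) {
  if (value <= 23) { /* small values fit into the initial byte */
    if (buffer_size < 1) return 0;
    buffer[0] = value;
    return 1;
  }
  if (buffer_size < 2) return 0;
  buffer[0] = 0x18; /* major type 0, one-byte argument follows */
  buffer[1] = value;
  return 2;
}

/* Hypothetical stand-in: writes the indefinite array header (0x9f). */
static size_t sketch_encode_indef_array_start(unsigned char *buffer,
                                              size_t buffer_size) {
  if (buffer_size < 1) return 0;
  buffer[0] = 0x9f;
  return 1;
}

/* Hypothetical stand-in: writes the "break" byte (0xff). */
static size_t sketch_encode_break(unsigned char *buffer, size_t buffer_size) {
  if (buffer_size < 1) return 0;
  buffer[0] = 0xff;
  return 1;
}

/* Encodes [0, ..., n-1] as an indefinite array. The caller-side logic is
 * responsible for opening and terminating the array. Returns the total
 * number of bytes written, or 0 if some item did not fit (the buffer may
 * then hold a partial encoding). */
static size_t sketch_encode_range(uint8_t n, unsigned char *buffer,
                                  size_t buffer_size) {
  size_t offset = 0;
  size_t written = sketch_encode_indef_array_start(buffer, buffer_size);
  if (written == 0) return 0;
  offset += written;
  for (uint8_t i = 0; i < n; i++) {
    written = sketch_encode_uint8(i, buffer + offset, buffer_size - offset);
    if (written == 0) return 0;
    offset += written;
  }
  written = sketch_encode_break(buffer + offset, buffer_size - offset);
  if (written == 0) return 0;
  return offset + written;
}
```

With a sufficiently large buffer, ``sketch_encode_range(3, buffer, size)`` produces the five bytes ``9f 00 01 02 ff``. Forgetting the final break would leave the array unterminated, which is the kind of mistake the low-level API does not guard against.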

.. doxygenfunction:: cbor_encode_uint8

.. doxygenfunction:: cbor_encode_uint16

.. doxygenfunction:: cbor_encode_uint32

.. doxygenfunction:: cbor_encode_uint64

.. doxygenfunction:: cbor_encode_uint

.. doxygenfunction:: cbor_encode_negint8

.. doxygenfunction:: cbor_encode_negint16

.. doxygenfunction:: cbor_encode_negint32

.. doxygenfunction:: cbor_encode_negint64

.. doxygenfunction:: cbor_encode_negint

.. doxygenfunction:: cbor_encode_bytestring_start

.. doxygenfunction:: cbor_encode_indef_bytestring_start

.. doxygenfunction:: cbor_encode_string_start

.. doxygenfunction:: cbor_encode_indef_string_start

.. doxygenfunction:: cbor_encode_array_start

.. doxygenfunction:: cbor_encode_indef_array_start

.. doxygenfunction:: cbor_encode_map_start

.. doxygenfunction:: cbor_encode_indef_map_start

.. doxygenfunction:: cbor_encode_tag

.. doxygenfunction:: cbor_encode_bool

.. doxygenfunction:: cbor_encode_null

.. doxygenfunction:: cbor_encode_undef

.. doxygenfunction:: cbor_encode_half

.. doxygenfunction:: cbor_encode_single

.. doxygenfunction:: cbor_encode_double

.. doxygenfunction:: cbor_encode_break

.. doxygenfunction:: cbor_encode_ctrl

5 changes: 0 additions & 5 deletions doc/source/api/type_2.rst
@@ -27,11 +27,6 @@ Storage requirements (indefinite) ``sizeof(cbor_item_t) * (1 + chunk_count) +
================================== ======================================================


Streaming indefinite byte strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please refer to :doc:`/streaming`.

Getting metadata
~~~~~~~~~~~~~~~~~

6 changes: 1 addition & 5 deletions doc/source/api/type_3.rst
@@ -1,7 +1,7 @@
Type 3 – UTF-8 strings
=============================

CBOR strings work in much the same ways as :doc:`type_2`.
CBOR strings have the same structure as :doc:`type_2`.

================================== ======================================================
Corresponding :type:`cbor_type` ``CBOR_TYPE_STRING``
@@ -12,10 +12,6 @@ Storage requirements (definite) ``sizeof(cbor_item_t) + length(handle)``
Storage requirements (indefinite) ``sizeof(cbor_item_t) * (1 + chunk_count) + chunks``
================================== ======================================================

Streaming indefinite strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please refer to :doc:`/streaming`.

UTF-8 encoding validation
~~~~~~~~~~~~~~~~~~~~~~~~~~~
9 changes: 5 additions & 4 deletions doc/source/api/type_4.rst
@@ -2,6 +2,11 @@ Type 4 – Arrays
=============================

CBOR arrays, just like :doc:`byte strings <type_2>` and :doc:`strings <type_3>`, can be encoded either as definite, or as indefinite.
Definite arrays have a fixed size which is stored in the header, whereas indefinite arrays do not and are terminated by a special "break" byte instead.

Arrays are explicitly created or decoded as definite or indefinite and will be encoded using the corresponding wire representation, regardless of whether the actual size is known at the time of encoding.

.. note:: Indefinite arrays can be conveniently used with streaming :doc:`decoding <streaming_decoding>` and :doc:`encoding <streaming_encoding>`.

================================== =====================================================================================
Corresponding :type:`cbor_type` ``CBOR_TYPE_ARRAY``
@@ -28,10 +33,6 @@ Examples
0x20 Unsigned integer 32
... 32 items follow

Streaming indefinite arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please refer to :doc:`/streaming`.

Getting metadata
~~~~~~~~~~~~~~~~~
26 changes: 22 additions & 4 deletions doc/source/api/type_5.rst
@@ -1,9 +1,18 @@
Type 5 – Maps
=============================

CBOR maps are the plain old associate hash maps known from JSON and many other formats and languages, with one exception: any CBOR data item can be a key, not just strings. This is somewhat unusual and you, as an application developer, should keep that in mind.
CBOR maps are the plain old associative maps similar to JSON objects or Python dictionaries.

Maps can be either definite or indefinite, in much the same way as :doc:`type_4`.
Definite maps have a fixed size which is stored in the header, whereas indefinite maps do not and are terminated by a special "break" byte instead.

Maps are explicitly created or decoded as definite or indefinite and will be encoded using the corresponding wire representation, regardless of whether the actual size is known at the time of encoding.

.. note::

Indefinite maps can be conveniently used with streaming :doc:`decoding <streaming_decoding>` and :doc:`encoding <streaming_encoding>`.
Keys and values can simply be output one by one, alternating keys and values.

.. warning:: Any CBOR data item is a legal map key (not just strings).

================================== =====================================================================================
Corresponding :type:`cbor_type` ``CBOR_TYPE_MAP``
@@ -14,10 +23,19 @@ Storage requirements (definite) ``sizeof(cbor_pair) * size + sizeof(cbor_ite
Storage requirements (indefinite) ``<= sizeof(cbor_item_t) + sizeof(cbor_pair) * size * BUFFER_GROWTH``
================================== =====================================================================================

Streaming maps
Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please refer to :doc:`/streaming`.
::

0xbf Start indefinite map (represents {1: 2})
0x01 Unsigned integer 1 (key)
0x02 Unsigned integer 2 (value)
0xff "Break" control token

::

0xa0 Map of size 0
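
The wire layout of the indefinite map above can be checked with a short self-contained sketch. ``sketch_count_indef_map_pairs`` is a hypothetical helper written for this illustration (it is not part of libcbor), and it assumes every key and value is a single-byte unsigned integer in the range 0 to 23.

```c
#include <stddef.h>

/* Counts the key/value pairs of an indefinite map whose keys and values are
 * all unsigned integers embedded in one byte (0x00..0x17).
 * Returns the number of pairs, or -1 if the input does not start with 0xbf,
 * contains an item outside that subset, ends a map after a key without a
 * value, or lacks the terminating "break" byte. */
static long sketch_count_indef_map_pairs(const unsigned char *data,
                                         size_t size) {
  if (size < 2 || data[0] != 0xbf) return -1;
  size_t items = 0;
  for (size_t i = 1; i < size; i++) {
    if (data[i] == 0xff) {
      /* Keys and values alternate, so a complete map holds an even number
       * of items. */
      return (items % 2 == 0) ? (long)(items / 2) : -1;
    }
    if (data[i] > 0x17) return -1; /* outside the subset of this sketch */
    items++;
  }
  return -1; /* ran out of input before the break byte */
}
```

For the bytes ``bf 01 02 ff`` the helper reports one pair, matching ``{1: 2}``. A break that arrives right after a key is rejected because the pending key has no value.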

Getting metadata
~~~~~~~~~~~~~~~~~
9 changes: 1 addition & 8 deletions doc/source/api/type_7.rst
@@ -65,11 +65,4 @@ Manipulating existing items
Half floats
~~~~~~~~~~~~
CBOR supports two `bytes wide ("half-precision") <https://en.wikipedia.org/wiki/Half-precision_floating-point_format>`_
floats which are not supported by the C language. *libcbor* represents them using `float <https://en.cppreference.com/w/c/language/type>` values throughout the API, which has important implications when manipulating these values.

In particular, if a user uses some of the manipulation APIs
(e.g. :func:`cbor_set_float2`, :func:`cbor_new_float2`)
to introduce a value that doesn't have an exect half-float representation,
the encoding semantics are given by :func:`cbor_encode_half` as follows:

.. doxygenfunction:: cbor_encode_half
floats which are not supported by the C language. *libcbor* represents them using `float <https://en.cppreference.com/w/c/language/type>`_ values throughout the API. Encoding is performed by :func:`cbor_encode_half`, which also handles values that cannot be represented exactly as a half-float.
1 change: 0 additions & 1 deletion doc/source/index.rst
@@ -30,7 +30,6 @@ Contents
getting_started
using
api
streaming
tests
rfc_conformance
internal
13 changes: 0 additions & 13 deletions doc/source/streaming.rst

This file was deleted.

4 changes: 0 additions & 4 deletions doc/source/streaming/encoding.rst

This file was deleted.

3 changes: 3 additions & 0 deletions examples/CMakeLists.txt
@@ -7,6 +7,9 @@ target_link_libraries(create_items cbor)
add_executable(streaming_parser streaming_parser.c)
target_link_libraries(streaming_parser cbor)

add_executable(streaming_array streaming_array.c)
target_link_libraries(streaming_array cbor)

add_executable(sort sort.c)
target_link_libraries(sort cbor)

47 changes: 47 additions & 0 deletions examples/streaming_array.c
@@ -0,0 +1,47 @@
/*
* Copyright (c) 2014-2020 Pavel Kalvoda <me@pavelkalvoda.com>
*
* libcbor is free software; you can redistribute it and/or modify
* it under the terms of the MIT license. See LICENSE for details.
*/

#include <stdio.h>   // printf, FILE, fwrite, fflush, freopen, fclose
#include <stdlib.h>  // exit, strtol
#include "cbor.h"

void usage() {
printf("Usage: streaming_array <N>\n");
printf("Prints out serialized array [0, ..., N-1]\n");
exit(1);
}

#define BUFFER_SIZE 8
unsigned char buffer[BUFFER_SIZE];
FILE* out;

void flush(size_t bytes) {
if (bytes == 0) exit(1); // All items should be successfully encoded
if (fwrite(buffer, sizeof(unsigned char), bytes, out) != bytes) exit(1);
if (fflush(out)) exit(1);
}

/*
* Example of using the streaming encoding API to create an array of integers
* on the fly. Notice that a partial output is produced with every element.
*/
int main(int argc, char* argv[]) {
if (argc != 2) usage();
long n = strtol(argv[1], NULL, 10);
out = freopen(NULL, "wb", stdout);
if (!out) exit(1);

// Start an indefinite-length array
flush(cbor_encode_indef_array_start(buffer, BUFFER_SIZE));
// Write the array items one by one
// A signed counter avoids comparing an unsigned index against a negative N
for (long i = 0; i < n; i++) {
flush(cbor_encode_uint32((uint32_t)i, buffer, BUFFER_SIZE));
}
// Close the array
flush(cbor_encode_break(buffer, BUFFER_SIZE));

if (fclose(out)) exit(1);
}
17 changes: 9 additions & 8 deletions src/cbor/encoding.h
@@ -16,9 +16,15 @@ extern "C" {
#endif

/*
* ============================================================================
* Primitives encoding
* ============================================================================
* All cbor_encode_* methods take 2 or 3 arguments:
* - a logical `value` to encode (except for trivial items such as NULLs)
* - an output `buffer` pointer
* - a `buffer_size` specification
*
* They serialize the `value` into one or more bytes and write the bytes to the
* output `buffer` and return either the number of bytes written, or 0 if the
* `buffer_size` was too small to fit the serialized value (in which
* case it is not modified).
*/

CBOR_EXPORT size_t cbor_encode_uint8(uint8_t, unsigned char *, size_t);
@@ -86,11 +92,6 @@ CBOR_EXPORT size_t cbor_encode_undef(unsigned char *, size_t);
* lost.
* - In all other cases, the sign bit, the exponent, and 10 most significant
* bits of the significand are kept
*
* @param value
* @param buffer Target buffer
* @param buffer_size Available space in the buffer
* @return number of bytes written
*/
CBOR_EXPORT size_t cbor_encode_half(float, unsigned char *, size_t);
