🎨 Completely revamp extension points/detail
- 📝Completely document extension points
- 📝Completely document conversion functions and their basic loops
- 🎨 __detail namespace is now __txt_detail
- ✅ Update tests to match
- 🔖Prepare to cut the next version...
ThePhD committed Feb 28, 2021
1 parent bcdeff5 commit 1edd6a7
Showing 131 changed files with 2,408 additions and 1,497 deletions.
2 changes: 1 addition & 1 deletion documentation/Doxyfile.in
@@ -1009,7 +1009,7 @@ EXCLUDE_PATTERNS = "*/detail/*" "*/vendor/*"
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories use the pattern */test/*

EXCLUDE_SYMBOLS = "ztd::text::__detail*" "_M_*" "std" "*c_string_view"
EXCLUDE_SYMBOLS = "ztd::text::__txt_detail*" "_M_*" "std" "*c_string_view"

# The EXAMPLE_PATH tag can be used to specify one or more files or directories
# that contain example code fragments that are included (see the \include
1 change: 1 addition & 0 deletions documentation/source/api.rst
@@ -131,6 +131,7 @@ Result Types, Status Codes and Quality Aides
api/char8_t
api/endian
api/encoding_error
api/tag
api/make_decode_state
api/make_encode_state
api/unicode_code_point
29 changes: 28 additions & 1 deletion documentation/source/api/conversions/count_code_points.rst
@@ -31,7 +31,34 @@
count_code_points
=================

.. TODO: to be filled out!
``ztd::text::count_code_points`` is a function that takes an input sequence of ``code_point``\ s and attempts to count them, according to the error handler that is given. Because the error handler is part of the function call (a default is provided if one is not passed in), the count operation keeps counting whenever the error handler performs some action but still sets the ``error_code`` member of the result to ``ztd::text::encoding_error::ok``. This is, for example, the case with :doc:`ztd::text::replacement_handler </api/error handlers/replacement_handler>`: output replacement code units or code points are counted as part of the final count, which is returned with ``result.error_code == ztd::text::encoding_error::ok``. You can differentiate error-free text from text that contained errors by checking ``result.errors_were_handled()``, which is true if the error handler was called at all, regardless of whether the error handler "smooths" the problem over by inserting replacement characters, doing nothing, or otherwise.

The overloads of this function increase the level of control you have with each passed argument. The last overload, which takes four arguments, attempts to call the following in order, falling back to the next entry when one is not available:

- The ``text_count_code_points(input, encoding, handler, state)`` extension point, if possible.
- An internal, implementation-defined customization point.
- The ``basic_count_code_points`` base function.

The base function call, ``basic_count_code_points``, simply performs the :doc:`core counting loop </design/converting/count code points>` using the :doc:`Lucky 7 </design/lucky 7>` design.

During the ``basic_count_code_points`` loop, if it detects that there is a preferable ``text_count_code_points_one`` extension point, it calls that function as ``text_count_code_points_one(input, encoding, handler, state)`` inside of the loop rather than performing the core design's own step.

.. note::

   👉 This means that if you implement none of the extension points whatsoever, implementing the basic ``encode_one`` function on your Encoding Object type will guarantee a proper, working implementation.

.. note::

   👉 If you need to call the "basic" form of this function that takes no secret implementation shortcuts or user-defined extension points, then call ``basic_count_code_points`` directly. This can be useful to stop infinite loops when your extension points cannot handle certain inputs and therefore need to "delegate" to the basic case.
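
The fallback order described above can be pictured with a small, self-contained C++17 sketch. It is not the library's implementation: every name in it (``sketch::count_items``, ``sketch::basic_count``, ``text_count``, the ``user`` encodings) is hypothetical, the hook takes only two arguments instead of four, and the internal, implementation-defined step is left out. It only shows one way an ADL-found extension point can be preferred over a base function.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <utility>

namespace sketch {
	// Hypothetical stand-in for the base function: one count per input element.
	template <typename Input, typename Encoding>
	std::size_t basic_count(const Input& input, const Encoding&) {
		std::size_t count = 0;
		for (auto it = input.begin(); it != input.end(); ++it) {
			++count;
		}
		return count;
	}

	// Detects whether an unqualified call text_count(input, encoding)
	// is well-formed, that is, whether a hook can be found through ADL.
	template <typename Input, typename Encoding, typename = void>
	struct has_text_count : std::false_type {};

	template <typename Input, typename Encoding>
	struct has_text_count<Input, Encoding,
		std::void_t<decltype(text_count(
			std::declval<const Input&>(), std::declval<const Encoding&>()))>>
	: std::true_type {};

	// Entry point: prefer the hook when it exists, otherwise use the base function.
	template <typename Input, typename Encoding>
	std::size_t count_items(const Input& input, const Encoding& encoding) {
		if constexpr (has_text_count<Input, Encoding>::value) {
			return text_count(input, encoding);
		}
		else {
			return basic_count(input, encoding);
		}
	}
} // namespace sketch

namespace user {
	struct plain_encoding {}; // has no hook, so the base function is used
	struct fast_encoding {};  // has a hook, declared below

	inline int hook_calls = 0; // only here to make the chosen path observable

	// Hypothetical extension point, found by ADL through fast_encoding.
	inline std::size_t text_count(std::string_view input, const fast_encoding&) {
		++hook_calls;
		return input.size();
	}
} // namespace user

inline void usage_example() {
	assert(sketch::count_items(std::string_view("bark"), user::plain_encoding {}) == 4);
	assert(sketch::count_items(std::string_view("meow"), user::fast_encoding {}) == 4);
}
```

In this sketch the hook is selected purely by overload resolution, so an encoding type without a matching ``text_count`` overload silently falls through to ``basic_count``.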



~~~~~~~~~~~~



Functions
---------

.. doxygengroup:: ztd_text_count_code_points
   :content-only:
29 changes: 28 additions & 1 deletion documentation/source/api/conversions/count_code_units.rst
@@ -31,7 +31,34 @@
count_code_units
================

.. TODO: to be filled out!
``ztd::text::count_code_units`` is a function that takes an input sequence of ``code_unit``\ s and attempts to count them, according to the error handler that is given. Because the error handler is part of the function call (a default is provided if one is not passed in), the count operation keeps counting whenever the error handler performs some action but still sets the ``error_code`` member of the result to ``ztd::text::encoding_error::ok``. This is, for example, the case with :doc:`ztd::text::replacement_handler </api/error handlers/replacement_handler>`: output replacement code units or code points are counted as part of the final count, which is returned with ``result.error_code == ztd::text::encoding_error::ok``. You can differentiate error-free text from text that contained errors by checking ``result.errors_were_handled()``, which is true if the error handler was called at all, regardless of whether the error handler "smooths" the problem over by inserting replacement characters, doing nothing, or otherwise.

The overloads of this function increase the level of control you have with each passed argument. The last overload, which takes four arguments, attempts to call the following in order, falling back to the next entry when one is not available:

- The ``text_count_code_units(input, encoding, handler, state)`` extension point, if possible.
- An internal, implementation-defined customization point.
- The ``basic_count_code_units`` base function.

The base function call, ``basic_count_code_units``, simply performs the :doc:`core counting loop </design/converting/count code units>` using the :doc:`Lucky 7 </design/lucky 7>` design.

During the ``basic_count_code_units`` loop, if it detects that there is a preferable ``text_count_code_units_one`` extension point, it calls that function as ``text_count_code_units_one(input, encoding, handler, state)`` inside of the loop rather than performing the core design's own step.

.. note::

   👉 This means that if you implement none of the extension points whatsoever, implementing the basic ``decode_one`` function on your Encoding Object type will guarantee a proper, working implementation.

.. note::

   👉 If you need to call the "basic" form of this function that takes no secret implementation shortcuts or user-defined extension points, then call ``basic_count_code_units`` directly. This can be useful to stop infinite loops when your extension points cannot handle certain inputs and therefore need to "delegate" to the basic case.
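
To make the interaction between counting and error handlers more concrete, here is a minimal, self-contained sketch of a counting loop. It does not use the library: ``count_ascii``, ``count_result``, ``pass_handler_sketch`` and ``replacement_handler_sketch`` are hypothetical names, the "encoding" is plain ASCII where any byte above ``0x7F`` is treated as an error, and the handler signature is simplified to take only the running count.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {
	enum class error { ok, invalid_sequence };

	struct count_result {
		std::string_view input;        // input that has not been read yet
		std::size_t count = 0;         // items counted so far
		error error_code = error::ok;  // final status of the operation
		bool handled_errors = false;   // set when the handler was invoked at all

		bool errors_were_handled() const {
			return handled_errors;
		}
	};

	// Reports the error unchanged, which stops the loop.
	struct pass_handler_sketch {
		error operator()(std::size_t&) const {
			return error::invalid_sequence;
		}
	};

	// "Smooths over" the error: counts one replacement and reports ok.
	struct replacement_handler_sketch {
		error operator()(std::size_t& count) const {
			++count;
			return error::ok;
		}
	};

	template <typename Handler>
	count_result count_ascii(std::string_view input, Handler handler) {
		count_result result;
		result.input = input;
		while (!result.input.empty()) {
			const unsigned char unit = static_cast<unsigned char>(result.input.front());
			if (unit > 0x7F) {
				result.handled_errors = true;
				result.error_code = handler(result.count);
				if (result.error_code != error::ok) {
					return result; // the offending unit stays in result.input
				}
			}
			else {
				++result.count;
			}
			result.input.remove_prefix(1);
		}
		return result;
	}
} // namespace sketch

inline void usage_example() {
	// "a", one invalid byte, then "b": the replacement is counted too.
	auto result = sketch::count_ascii("a\xFF" "b", sketch::replacement_handler_sketch {});
	assert(result.count == 3);
	assert(result.error_code == sketch::error::ok);
	assert(result.errors_were_handled());
}
```

With the replacement-style handler the result reports ``ok`` even though the input was malformed, which is why the separate ``errors_were_handled()`` check exists in this sketch.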



~~~~~~~~~~~~



Functions
---------

.. doxygengroup:: ztd_text_count_code_units
   :content-only:
85 changes: 84 additions & 1 deletion documentation/source/api/conversions/decode.rst
@@ -31,7 +31,90 @@
decode
======

.. TODO: to be filled out!
The ``decode`` grouping of functions (``decode``, ``decode_to``, and ``decode_into``) performs bulk decoding from an ``input`` of ``code_unit``\ s to the encoding's ``code_point`` type.



Named Groups
------------

There are 3 named functions for this behavior, and each comes with several overloads. Each named function produces more information than the last, letting you opt into how much information and control you'd like over the algorithm and its behavior. The first simply returns a container with the transformation applied, discarding much of the operation's result information. This is useful for quick, one-off conversions where you do not need to inspect errors and would rather let the error handler deal with them. The second, ``_to``-suffixed functions return a container within a ``result`` type that contains additional information. The final, ``_into``-suffixed functions take an output range to write into, letting you explicitly control how much space there is to write into, and return a detailed ``result`` type.

The return type for these function calls is one of:

- the desired output container (highest level);
- :doc:`ztd::text::decode_result </api/decode_result>` or :doc:`ztd::text::stateless_decode_result </api/stateless_decode_result>` with the desired output container embedded as the ``.output`` member (mid level); or,
- :doc:`ztd::text::decode_result </api/decode_result>` or :doc:`ztd::text::stateless_decode_result </api/stateless_decode_result>` returning just the input and output ranges (lowest level).


``decode(...)``
+++++++++++++++

This is the highest level bulk function.

This set of function overloads takes the provided ``input``, ``encoding``, ``handler`` and ``state`` and produces an output container type. The default container will either be a ``std::basic_string`` of the ``code_point`` type, or a ``std::vector`` if it is not a known "character" type.

The container type can be specified by passing it as an explicit template parameter to this function, such as ``ztd::text::decode<std::vector<char32_t>>("bark", ztd::text::ascii{});``. The output container is default constructed.

It will either call ``push_back``/``insert`` directly on the target container to fill it up, or serialize data to a temporary buffer (controlled by :ref:`ZTD_TEXT_INTERMEDIATE_BUFFER_SIZE <config-ZTD_TEXT_INTERMEDIATE_BUFFER_SIZE>`) before then copying it into the desired output container through any available means (bulk ``.insert``, repeated ``.push_back``, or repeated single ``.insert`` with the ``.cend()`` iterator in that order).

This is the "fire and forget" version of the ``decode`` function, returning only the container and not returning any of the result or state information used to construct it.
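
The "temporary buffer, then bulk copy" strategy mentioned above can be sketched without the library. Everything below is an assumption made for illustration: ``decode_ascii`` and ``intermediate_buffer_size`` are hypothetical names (the latter only mimics the role of ``ZTD_TEXT_INTERMEDIATE_BUFFER_SIZE``), the encoding is plain ASCII, and invalid bytes are replaced with U+FFFD instead of going through a configurable error handler.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

namespace sketch {
	// Hypothetical, deliberately tiny so the chunking is easy to follow.
	inline constexpr std::size_t intermediate_buffer_size = 4;

	// Decodes ASCII into a default-constructed container of char32_t.
	// Bytes above 0x7F are replaced with U+FFFD.
	template <typename Container = std::u32string>
	Container decode_ascii(std::string_view input) {
		Container output;
		std::array<char32_t, intermediate_buffer_size> buffer {};
		while (!input.empty()) {
			// Fill the fixed-size buffer as far as the input allows.
			std::size_t written = 0;
			while (written < buffer.size() && !input.empty()) {
				const unsigned char unit = static_cast<unsigned char>(input.front());
				buffer[written] = unit <= 0x7F ? static_cast<char32_t>(unit) : U'\uFFFD';
				++written;
				input.remove_prefix(1);
			}
			// Bulk-insert the filled part of the buffer at the end of the container.
			output.insert(output.end(), buffer.begin(), buffer.begin() + written);
		}
		return output;
	}
} // namespace sketch

inline void usage_example() {
	// Default container, then an explicitly requested one.
	std::u32string text = sketch::decode_ascii("bark!");
	std::vector<char32_t> points = sketch::decode_ascii<std::vector<char32_t>>("bark");
	assert(text.size() == 5);
	assert(points.size() == 4);
}
```

As with the real ``decode``, the caller only gets the container back; anything about errors or leftover input is discarded in this sketch.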


``decode_to(...)``
++++++++++++++++++

This is the mid level bulk function.

This set of function overloads takes the provided ``input``, ``encoding``, ``handler`` and ``state`` and produces an output container type that is embedded within a result. The result is a :doc:`ztd::text::decode_result </api/decode_result>` if you called the overload that takes a :doc:`ztd::text::decode_state_t\<Encoding\> </api/decode_state>` as a parameter, or a :doc:`ztd::text::stateless_decode_result </api/stateless_decode_result>` if the function had to create a state on the stack internally and discard it after the operation finished.

The container type can be specified by passing it as an explicit template parameter to this function, such as ``ztd::text::decode_to<std::u32string>("meow", ztd::text::ascii{});``. The output container is default constructed.

It will either call ``push_back``/``insert`` directly on the target container to fill it up, or serialize data to a temporary buffer (controlled by :ref:`ZTD_TEXT_INTERMEDIATE_BUFFER_SIZE <config-ZTD_TEXT_INTERMEDIATE_BUFFER_SIZE>`) before then copying it into the desired output container through any available means (bulk ``.insert``, repeated ``.push_back``, or repeated single ``.insert`` with the ``.cend()`` iterator in that order).

If nothing goes wrong or the error handler lets the algorithm continue, ``.input`` on the result should be empty.


``decode_into(...)``
++++++++++++++++++++

This is the lowest level bulk function.

This set of function overloads takes the provided ``input``, ``encoding``, ``output``, ``handler``, and ``state`` and writes data into the output range specified by ``output``. The result is a :doc:`ztd::text::decode_result </api/decode_result>` if you called the overload that takes a :doc:`ztd::text::decode_state_t\<Encoding\> </api/decode_state>` as a parameter, or a :doc:`ztd::text::stateless_decode_result </api/stateless_decode_result>` if the function had to create a state on the stack internally and discard it after the operation finished.

It is up to the end-user to provide a suitably-sized output range for ``output``, otherwise this operation may return with :doc:`ztd::text::encoding_error::insufficient_output </api/encoding_error>` for the ``result``\ 's ``error_code`` member. The amount of space consumed can be determined by checking the ``std::distance`` between the ``.begin()`` of the original ``output`` parameter and the ``.begin()`` of the returned ``.output`` member. The result also has error information and an ``.input`` member for checking how much input was consumed.

If nothing goes wrong or the error handler lets the algorithm continue, ``.input`` on the result should be empty.
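
The bookkeeping described for the lowest level (a bounded output range, an ``insufficient_output`` error, and measuring consumption with ``std::distance``) can be illustrated with a self-contained sketch. The names ``decode_ascii_into``, ``output_range`` and ``into_result`` are hypothetical and are not the library's types; the input is assumed to be valid ASCII, so validation and error handlers are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>

namespace sketch {
	enum class error { ok, insufficient_output };

	// Minimal view over writable storage; hypothetical, for illustration only.
	struct output_range {
		char32_t* first;
		char32_t* last;

		char32_t* begin() const {
			return first;
		}
		char32_t* end() const {
			return last;
		}
	};

	struct into_result {
		std::string_view input; // input that was not consumed
		output_range output;    // output space that was not written
		error error_code;
	};

	// Writes one char32_t per input byte. Assumes the input is valid ASCII.
	inline into_result decode_ascii_into(std::string_view input, output_range output) {
		while (!input.empty()) {
			if (output.first == output.last) {
				// Out of room: report it and hand back what is left of both ranges.
				return { input, output, error::insufficient_output };
			}
			*output.first = static_cast<char32_t>(static_cast<unsigned char>(input.front()));
			++output.first;
			input.remove_prefix(1);
		}
		return { input, output, error::ok };
	}
} // namespace sketch

inline void usage_example() {
	char32_t storage[3] = {};
	sketch::output_range out { storage, storage + 3 };
	sketch::into_result result = sketch::decode_ascii_into("meow", out);
	// Three slots were written before the output ran out.
	assert(std::distance(out.begin(), result.output.begin()) == 3);
	assert(result.error_code == sketch::error::insufficient_output);
}
```

Returning the remaining input alongside the remaining output is what lets a caller grow the buffer and resume from where the operation stopped.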



For Everything
--------------

All named functions have 4 overloads. Each of the "higher level" functions, at the end of its overload call chain, calls the lower-level ``decode_into`` to perform the work. The final ``decode_into`` call tries the following in order, ending at the base implementation:

- ``text_decode_into(input, encoding, output, handler, state)``
- An internal, implementation-defined customization point.
- ``basic_decode_into``

The base function call, ``basic_decode_into``, simply performs the :doc:`core decode loop </design/converting/decode>` using the :doc:`Lucky 7 </design/lucky 7>` design. This design also means minimal stack space is used, keeping the core algorithm suitable for resource-constrained devices.

.. note::

   👉 This means that if you implement none of the extension points whatsoever, implementing the basic ``decode_one`` function on your Encoding Object type will guarantee a proper, working implementation.

.. note::

   👉 If you need to call the "basic" form of this function that takes no secret implementation shortcuts or user-defined extension points, then call ``basic_decode_into`` directly. This can be useful to stop infinite loops when your extension points cannot handle certain inputs and therefore need to "delegate" to the basic case.
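
The delegation advice in the note above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ``sketch::decode``, ``sketch::basic_decode``, ``text_decode`` and ``user::latin1_sketch`` are hypothetical, simplified names. The hook has a fast path for pure-ASCII input and hands everything else to the basic function. Calling the dispatching ``sketch::decode`` from inside the hook instead would re-enter the hook and never terminate.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

namespace sketch {
	// Base function: one decode_one call per code unit, no shortcuts, no hooks.
	template <typename Encoding>
	std::u32string basic_decode(std::string_view input, const Encoding& encoding) {
		std::u32string output;
		for (char unit : input) {
			output.push_back(encoding.decode_one(unit));
		}
		return output;
	}

	// Dispatching entry point. For brevity this sketch assumes that a
	// text_decode hook always exists and can be found through ADL.
	template <typename Encoding>
	std::u32string decode(std::string_view input, const Encoding& encoding) {
		return text_decode(input, encoding);
	}
} // namespace sketch

namespace user {
	// Hypothetical Latin-1 style encoding: every byte maps to one code point.
	struct latin1_sketch {
		char32_t decode_one(char unit) const {
			return static_cast<char32_t>(static_cast<unsigned char>(unit));
		}
	};

	// Hypothetical extension point with a fast path for pure-ASCII input.
	inline std::u32string text_decode(std::string_view input, const latin1_sketch& encoding) {
		const bool all_ascii = std::all_of(input.begin(), input.end(),
			[](char unit) { return static_cast<unsigned char>(unit) < 0x80; });
		if (all_ascii) {
			// Fast path: every unit is non-negative, so a direct widening copy works.
			return std::u32string(input.begin(), input.end());
		}
		// Delegate to the basic function, NOT to sketch::decode, which would
		// call this hook again and recurse forever.
		return sketch::basic_decode(input, encoding);
	}
} // namespace user

inline void usage_example() {
	assert(sketch::decode("abc", user::latin1_sketch {}) == U"abc");
	assert(sketch::decode("caf\xE9", user::latin1_sketch {}) == U"caf\u00E9");
}
```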



~~~~~~~~~



Functions
---------

.. doxygengroup:: ztd_text_decode
   :content-only:
