Merge pull request #7 from nicolasfella/patch-1
Fix some typos in documentation
ThePhD committed Mar 1, 2021
2 parents 1edd6a7 + f02e561 commit eb4c26d
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions documentation/source/definitions.rst
@@ -58,7 +58,7 @@ Occasionally, we may need to use precise language to describe what we want. This
A single unicode code point is NOT equivalent to a :term:`character <character>`, and multiple of them can be put together or taken apart and still have their sequence form a :term:`"character" <character>`. For a more holistic, human-like interpretation of code points or other data, see :term:`grapheme clusters <grapheme cluster>`.

unicode scalar value
- A single unit of decoded information for Unicode. It's definition is identical to that of :term:`unicode code points <unicode code point>`, with the additional constraint that every unicode svalar value may not be a "Surrogate Value". Surrogate values are non-characters used exclusively for the purpose of encoding and decoding specific sequences of code units, and therefore carry no useful meaning in general interchange. They may appear in text streams in certain encodings: see :doc:`Wobbly Transformation Format-8 (WTF-8) </api/encodings/wtf8>` for an example.
+ A single unit of decoded information for Unicode. It's definition is identical to that of :term:`unicode code points <unicode code point>`, with the additional constraint that every unicode scalar value may not be a "Surrogate Value". Surrogate values are non-characters used exclusively for the purpose of encoding and decoding specific sequences of code units, and therefore carry no useful meaning in general interchange. They may appear in text streams in certain encodings: see :doc:`Wobbly Transformation Format-8 (WTF-8) </api/encodings/wtf8>` for an example.
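
As an illustration of the constraint described in this definition, here is a minimal sketch in Python. It is not part of this library, and the function names are hypothetical. It assumes the ranges given by the Unicode Standard: code points run from U+0000 to U+10FFFF, and surrogates occupy U+D800 to U+DFFF.

```python
def is_unicode_code_point(value: int) -> bool:
    # Any integer in the Unicode codespace counts as a code point.
    return 0 <= value <= 0x10FFFF


def is_surrogate(value: int) -> bool:
    # Surrogates are reserved for UTF-16 encoding and decoding.
    return 0xD800 <= value <= 0xDFFF


def is_unicode_scalar_value(value: int) -> bool:
    # A scalar value is a code point that is not a surrogate.
    return is_unicode_code_point(value) and not is_surrogate(value)
```

Under these assumptions, U+D800 is a code point but not a scalar value, while U+0041 is both.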

grapheme cluster
The closest the Unicode Standard gets to recognizing a :term:`human-readable and writable character <character>`, grapheme cluster's are arbitrarily sized bundles of :term:`unicode code points <unicode code point>` that compose of a single concept that might match what a :term:`"character" <character>` is in any given human language.
@@ -67,10 +67,10 @@ Occasionally, we may need to use precise language to describe what we want. This
A set of functionality that includes an encode process or a decode process (or both). The encode process takes in a stream of code points and puts out a stream of code units. The decode process takes in a stream of code units and puts out a stream of code points. In a concrete sense, there are a number of additional operations an encoding needs: see the :doc:`Lucky 7 design concept</design/lucky 7>`.

encode
- Converting from a stream of input, typically code points, to a stream of output, typically code units. The output may be less suitable for general interchange or consumption, or is in a specific interchange format for the interoperation. Freqently, this library expects and works with the goal that any decoding process is producing :term:`unicode code points <unicode code point>` or :term:`unicode scalar values <unicode scalar value>` from some set of :term:`code units <code unit>`.
+ Converting from a stream of input, typically code points, to a stream of output, typically code units. The output may be less suitable for general interchange or consumption, or is in a specific interchange format for the interoperation. Frequently, this library expects and works with the goal that any decoding process is producing :term:`unicode code points <unicode code point>` or :term:`unicode scalar values <unicode scalar value>` from some set of :term:`code units <code unit>`.

decode
- Converting from a stream of input, typically code units, to a stream of output, typically code points. The output is generally in a form that is more widely consummable or easier to process than when it started. Freqently, this library expects and works with the goal that any decoding process is producing :term:`unicode code points <unicode code point>` or :term:`unicode scalar values <unicode scalar value>` from some set of :term:`code units <code unit>`.
+ Converting from a stream of input, typically code units, to a stream of output, typically code points. The output is generally in a form that is more widely consummable or easier to process than when it started. Frequently, this library expects and works with the goal that any decoding process is producing :term:`unicode code points <unicode code point>` or :term:`unicode scalar values <unicode scalar value>` from some set of :term:`code units <code unit>`.

transcode
Converting from one form of encoded information to another form of encoded information. In the context of this library, it means going from an input in one :term:`encoding <encoding>`'s code units to an output of another encoding's code units. Typically, this is done by invoking the :term:`decode <decode>` of the original encoding to reach a common interchange format (such as :term:`unicode code points <unicode code point>`) before taking that intermediate output and piping it through the :term:`encode <encode>` step of the other encoding. Different transcode operations may not need to go through a common interchange, and may transcode "directly", as a way to improve space utilization, time spent, or both.
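
The decode-then-encode route described in the transcode definition can be sketched as follows. This is an illustrative Python sketch, not this library's API: the names ``decode``, ``encode`` and ``transcode`` are hypothetical helpers, code units are modelled as ``bytes``, and Python's built-in codecs stand in for the encodings.

```python
from typing import List


def decode(code_units: bytes, encoding: str) -> List[int]:
    # Code units in, code points out.
    return [ord(ch) for ch in code_units.decode(encoding)]


def encode(code_points: List[int], encoding: str) -> bytes:
    # Code points in, code units out.
    return "".join(chr(cp) for cp in code_points).encode(encoding)


def transcode(code_units: bytes, from_encoding: str, to_encoding: str) -> bytes:
    # Pivot through code points as the common interchange format.
    return encode(decode(code_units, from_encoding), to_encoding)
```

For example, ``transcode(data, "utf-8", "utf-16-le")`` decodes UTF-8 bytes to code points and re-encodes them as UTF-16 little-endian bytes. A "direct" transcode would skip the intermediate list of code points.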