About half of carriage return leniency
Tamschi committed Jul 31, 2021
1 parent e577691 commit d2c013b
Showing 6 changed files with 143 additions and 82 deletions.
27 changes: 17 additions & 10 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
@@ -43,6 +43,7 @@ try_match = "^0.2.2"
# Minimum version working with try_match starting with Rust 1.47.0.
# SEE: https://github.com/rust-lang/rust/issues/77789, https://github.com/dtolnay/syn/issues/906, https://github.com/dtolnay/syn/releases/tag/1.0.44
syn = { version = "^1.0.44", default-features = false }
+tap = "1.0.1"

[dev-dependencies]
cargo-husky = "^1.5.0"
26 changes: 22 additions & 4 deletions docs/grammar.rst
@@ -96,8 +96,10 @@ Empty lines outside of quotes and lines containing only a comment always can be
``taml fix`` can fix your line endings for you without changing the meaning of quotes. (TODO)
It warns about any occurrence of the character it doesn't fix by default, in either sense. (TODO)

-Identifiers
------------
+.. _identifiers:
+
+Identifier
+----------

.. note::

@@ -109,7 +111,7 @@ Identifiers
.. code-block:: regex
-`([^\\`]|\\\\|\\`)*`
+`([^\\`\r]|\\\\|\\`|\\r)*`
Identifiers in TAML are arbitrary Unicode strings and can appear in two forms, verbatim and quoted:

@@ -164,7 +166,7 @@ Value

A value is any one of the following:

-TK
+`data literal`_, decimal_, `enum variant`_, integer_, list_, string_, struct_.

.. warning::

@@ -237,6 +239,22 @@ Additional trailing zeroes are considered idempotent and **must not make a diffe
Absolutely do not make any distinction regarding additional trailing zeroes in decimals when writing a lexer or parser.
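
One way to honour the rule above is to compare decimals only after normalising them. The following is a minimal sketch, assuming decimal literals are made of ASCII digits with a single ``.``; ``canonical_decimal`` and ``decimals_equal`` are hypothetical helpers for illustration and are not part of the taml crate:

```rust
/// Strips redundant trailing zeroes from the fractional part of a decimal
/// literal, keeping at least one digit after the period.
/// Hypothetical helper; assumes ASCII digits and at most one '.'.
fn canonical_decimal(literal: &str) -> &str {
    match literal.find('.') {
        Some(dot) => {
            let trimmed = literal.trim_end_matches('0');
            // Shortest acceptable form: everything up to the period plus one digit.
            let min_len = dot + 2;
            if trimmed.len() < min_len {
                // All fractional digits were zeroes; keep exactly one of them.
                &literal[..min_len.min(literal.len())]
            } else {
                trimmed
            }
        }
        // No period: not a decimal, so leave it untouched.
        None => literal,
    }
}

/// Two decimals are treated as the same value if they only differ in
/// additional trailing zeroes.
fn decimals_equal(a: &str, b: &str) -> bool {
    canonical_decimal(a) == canonical_decimal(b)
}
```

With this sketch, ``decimals_equal("1.50", "1.5")`` holds, while ``decimals_equal("0.10", "0.01")`` does not. Working on string slices avoids any floating-point rounding, which seems closer to the intent of the rule than parsing into ``f64`` first.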


+String
+------
+
+.. note::
+
+TK: Format as regex section
+
+.. code-block:: regex
+"([^\\"\r]|\\\\|\\"|\\r)*"
+Strings are written as quoted Unicode literals. The characters ``\``, ``"`` and `U+000D CARRIAGE RETURN (CR) <https://graphemica.com/000D>`_
+must be escaped as ``\\``, ``\"`` and ``\r``, respectively.
+
+The character `U+0000 NULL <https://graphemica.com/0000>`_ may be unsupported in environments where processing it would be unreasonably error-prone.
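
The escaping rule for strings described above can be sketched in a few lines of Rust. This is only an illustration that follows the regex as written in this section, not the crate's lexer; ``quote_string`` and ``is_valid_string_literal`` are hypothetical names:

```rust
/// Wraps `text` in double quotes, escaping `\`, `"` and CR as `\\`, `\"`
/// and `\r`. Every other character is copied verbatim.
/// Hypothetical helper, not part of the taml crate.
fn quote_string(text: &str) -> String {
    let mut quoted = String::with_capacity(text.len() + 2);
    quoted.push('"');
    for c in text.chars() {
        match c {
            '\\' => quoted.push_str("\\\\"),
            '"' => quoted.push_str("\\\""),
            '\r' => quoted.push_str("\\r"),
            other => quoted.push(other),
        }
    }
    quoted.push('"');
    quoted
}

/// Checks a quoted literal against the grammar `"([^\\"\r]|\\\\|\\"|\\r)*"`.
/// Assumption: only the three escapes named in the section are accepted.
fn is_valid_string_literal(literal: &str) -> bool {
    // The literal must start and end with a (distinct) double quote.
    let inner = match literal
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
    {
        Some(inner) => inner,
        None => return false,
    };
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        match c {
            // A bare quote or a verbatim carriage return is not allowed.
            '"' | '\r' => return false,
            // A backslash must introduce one of the known escapes.
            '\\' => match chars.next() {
                Some('\\') | Some('"') | Some('r') => {}
                _ => return false,
            },
            _ => {}
        }
    }
    true
}
```

Anything produced by ``quote_string`` passes ``is_valid_string_literal``, whereas a literal that contains a verbatim carriage return is rejected. A lenient lexer could still accept such input and report it separately, which appears to be the purpose of the ``Invalid…CarriageReturn`` tokens touched in this commit.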

.. _variants:

Enum Variants
8 changes: 6 additions & 2 deletions src/formatting.rs
@@ -101,7 +101,9 @@ impl CanonicalFormatScanner {
| Token::Integer(_)
| Token::InvalidZeroPrefixedDecimal(_)
| Token::InvalidZeroPrefixedInteger(_) => State::Number,
-Token::Identifier(_) => State::Identifier,
+Token::Identifier(_) | Token::InvalidIdentifierWithVerbatimCarriageReturn(_) => {
+State::Identifier
+}
Token::Colon | Token::Comma => State::ColonOrComma,
Token::Error => State::Error,

@@ -115,7 +117,9 @@ impl CanonicalFormatScanner {
| Token::Thesis
| Token::Period
| Token::String(_)
-| Token::DataLiteral(_) => State::Other,
+| Token::InvalidStringLiteralWithCarriageReturn(_)
+| Token::DataLiteral(_)
+| Token::InvalidDataLiteralWithVerbatimCarriageReturn(_) => State::Other,
};

recommendation
4 changes: 4 additions & 0 deletions src/parsing.rs
@@ -1366,6 +1366,10 @@ fn parse_value<'a, Position: Debug + Clone + PartialEq>(
return Err(());
}

+(lexerToken::InvalidDataLiteralWithVerbatimCarriageReturn(_), span) => todo!(),
+(lexerToken::InvalidIdentifierWithVerbatimCarriageReturn(_), span) => todo!(),
+(lexerToken::InvalidStringWithVerbatimCarriageReturn(_), span) => todo!(),
+
(_, span) => return err(span, reporter),
})
} else {