Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hexidecimal value with underscore #331

Closed
dariodsa opened this issue Aug 1, 2020 · 4 comments
Closed

Hexidecimal value with underscore #331

dariodsa opened this issue Aug 1, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@dariodsa
Copy link
Collaborator

dariodsa commented Aug 1, 2020

Hi,
according to the TOML-0.5.0 version (link) hexadecimal, binary and octal values were introduced. I checked your test modules and you are checking for case insensitivity, negative, non-negative, etc but you missed one sentence in the specification.

Underscores are allowed between digits (but not between the prefix and the value).

So, for example, the following code snippet should pass:

*> Toml.Parser.parse $ "hex3 = 0xdead_beef"
*> Left (TomlParseError {unTomlParseError = "1:19:\n  |\n1 | hex3 = 0xdead_beef\n  |                   ^\nunexpected end of input\nexpecting '-', '.', '=', '_', or alphanumeric character\n"})

The same problem is with binary and octal numbers, classic decimal numbers support underscores.

I noticed the issue #199 is still open. If you want I can look into that and made test cases which will check all changes made from version 0.4.0 to 0.5.0.

@chshersh chshersh added the bug Something isn't working label Aug 3, 2020
@chshersh
Copy link
Contributor

chshersh commented Aug 3, 2020

Hi @dariodsa, thanks for reporting the issue! This indeed looks like a bug. It would be nice to add test-cases for such situations and to fix the problem itself, of course 🙂

Regarding issue #199: I've opened issue in the TOML repo itself a while ago about providing testable example:

But there will be no example in the TOML repo. Instead, the implementation needs to be verified with the official compliance test-suite:

So instead of walking through the examples in documentation, it's much better to perform automatic testing, as described in #167. But this also requires more work. I've also closed #199 as a duplicate of #167 due to some recent changes in the toml-lang org.

@dariodsa
Copy link
Collaborator Author

dariodsa commented Aug 3, 2020

Ok, I will add test cases for such situations and try to fix the problem.

@chshersh
Copy link
Contributor

chshersh commented Aug 3, 2020

@dariodsa No worries and thanks for the help!

Just to give some more context, here is the parser for all integer numbers:

-- | Parser for 'Integer' value.
integerP :: Parser Integer
integerP = lexeme (bin <|> oct <|> hex <|> dec) <?> "integer"
where
bin, oct, hex, dec :: Parser Integer
bin = try (char '0' *> char 'b') *> binary <?> "bin"
oct = try (char '0' *> char 'o') *> octal <?> "oct"
hex = try (char '0' *> char 'x') *> hexadecimal <?> "hex"
dec = signed sc decimalP <?> "dec"

For binary, octal and hexadecimal numbers the primitive parsers from megaparsec are reused. That's why they don't support underscores. The implementation needs to do something similar to the decimal parser:

-- | Parser for decimap 'Integer': included parsing of underscore.
decimalP :: Parser Integer
decimalP = zero <|> more
where
zero, more :: Parser Integer
zero = 0 <$ char '0'
more = check =<< readMaybe . concat <$> sepBy1 (some digitChar) (char '_')
check :: Maybe Integer -> Parser Integer
check = maybe (fail "Not an integer") pure

But I imagine this will be quite a lot of work...

dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 4, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 4, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 4, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 5, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 5, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 5, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 5, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 5, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 9, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 9, 2020
dariodsa added a commit to dariodsa/tomland that referenced this issue Aug 9, 2020
chshersh pushed a commit that referenced this issue Aug 13, 2020
* [#331] adding test for hexadecimal representation

* [#331] adding test for octal representation

* [#331] adding test for binary representation with underscore

* [#331] corrected wrong types in function parseInteger

* [#331] add Data.Char module for ascii value and parsec modules for hex,oct and bin digits

* [#331] add more tests with underscore

* [#331] implemented parser for hex, oct and bin with underscore feature

* [#331] haskell style guide

* [#331] removed trailing spaces

* [#311] removed additional trailing spaces

* [#331] changed tests to check for exceptions and added new test group

* [#331] implemented parser for hex, oct and bin with underscore using megaparsec lib function

* [#331] added test cases that check when number ends with underscore

* [#331] resolved issues with code style and naming convention
@chshersh
Copy link
Contributor

Fixed by #332

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants