Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`,` is not allowed in an ID #5

Closed
erkal opened this Issue Apr 3, 2019 · 5 comments

Comments

Projects
None yet
2 participants
@erkal
Copy link

erkal commented Apr 3, 2019

If we want to encode things like point in the way graphviz does, than we need to allow , in IDs.
I believe that adding , to this line would solve the problem.

Maybe all the special symbols which are used in DOT language (I mean things like {,=,;, [ and so on) should be allowed in IDs.
For example, when I tried to put json strings as ID's, the decoding didn't work because those symbols like { which are common to both json and dot, caused errors when parsing.

@brandly

This comment has been minimized.

Copy link
Owner

brandly commented Apr 3, 2019

you're right. toString has some deficiencies.

the language spec says node IDs include

Any string of alphabetic ([a-zA-Z\200-\377]) characters, underscores ('_') or digits ([0-9]), not beginning with a digit;

what do they mean by \200-\377? is that a range of utf-8 characters?

potentially, within toString, the code should say "if any character is outside this whitelist, wrap it in quotation marks" instead of having a blacklist of symbols. any idea what that looks like in Elm? hah

@brandly brandly referenced this issue Apr 4, 2019

Closed

Problem with `#` #3

@erkal

This comment has been minimized.

Copy link
Author

erkal commented Apr 4, 2019

what do they mean by \200-\377? is that a range of utf-8 characters?

I think it is ASCII
The octal codes for Extended Character Set (ANSI) go from 200 to 377:
https://www.autoitscript.com/autoit3/docs/appendix/ascii.htm

@erkal

This comment has been minimized.

Copy link
Author

erkal commented Apr 4, 2019

I fear that allowing characters like } in the IDs can break the parser.

@brandly

This comment has been minimized.

Copy link
Owner

brandly commented Apr 4, 2019

any characters, like {, should be fine as long as they're wrapped in quotation marks.

in the failing test i posted, parsing the # in an ID is fine already since it's wrapped in quotes. similarly, parsing arbitrary JSON in an ID should be fine, now that i fixed handling escaped quotation marks.

as mentioned above, toString needs work. the current implementation of showId is short-sighted. i could keep adding characters like { to the blacklist that triggers the need for quotation marks, but i think the correct way is to define a whitelist of characters that don't need quotation marks and wrap the ID if any character falls outside those bounds.

The octal codes for Extended Character Set (ANSI) go from 200 to 377:

good find! i'll figure out how Elm handles these things, and get both these issues fixed. thanks again

@erkal

This comment has been minimized.

Copy link
Author

erkal commented Apr 5, 2019

Any string of alphabetic ([a-zA-Z\200-\377]) characters, underscores ('_') or digits ([0-9]), not beginning with a digit;

Here is a suggestion:

isInWhiteList : Char -> Bool
isInWhiteList char =
    let
        asciiOctalFrom200To377 : Set Char
        asciiOctalFrom200To377 =
            Set.fromList [{- here come the characters with that code -}]
    in
    List.any identity
        [ Char.isLower char
        , Char.isUpper char
        , '_' == char
        , Char.isDigit char
        , Set.member char asciiOctalFrom200To377
        ]


beginningWithADigit : String -> Bool
beginningWithADigit string =
    case String.uncons string of
        Just ( c, _ ) ->
            Char.isDigit c

        Nothing ->
            False


hasACharacterNotInWhiteList : String -> Bool
hasACharacterNotInWhiteList =
    String.any (isInWhiteList >> not)


shouldBeQuoted : String -> Bool
shouldBeQuoted s =
    {- `String.isEmpty s` could prevent current parsing errors caused the empty strings. -}
    String.isEmpty s || beginningWithADigit s || hasACharacterNotInWhiteList s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.