Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Formatter]: Able to normilze symbols #13024

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

meatball133
Copy link
Contributor

@meatball133 meatball133 commented Jan 28, 2023

This one line change make so the formatter can normalize symbols, it uses the inspect method which isn't used else where in the formatter, I am interested in hearing if that is something that should be avoided and if I should do it some other way?

resolves #12938

@HertzDevil
Copy link
Contributor

This removes all "s and \s from the symbol even if they are valid or form part of an escape sequence, e.g. it turns :"\\\c\n\"" to :cn.

@meatball133
Copy link
Contributor Author

Thanks for the input, I will take a look into fixing that

@meatball133
Copy link
Contributor Author

meatball133 commented Jan 29, 2023

It should turn out to be :"\\c\n\"", right, or am I mistaken?

@straight-shoota
Copy link
Member

straight-shoota commented Jan 30, 2023

It's not clear what exactly the purpose of that normalize_symbols method is. It might be similar to String#inspect_unquoted? If we need some special behaviour for writing symbols, I think this should be an isolated method that can be tested individually.
That would also be my main recommendation for getting on with this: Write tests. We definitely needs tests for the final formatter behaviour. #12938 already presents some examples that would make good tests. Probably more edge cases are necessary as well.
Once the intended behaviour is clearly specified, we can improve on how to implement it.

@straight-shoota straight-shoota marked this pull request as draft January 30, 2023 15:53
@meatball133
Copy link
Contributor Author

meatball133 commented Jan 30, 2023

Well we need some way of taking a "raw" string and then confirming if it has a valid escape sequence or not. Although I don't know if it makes so much sense to have it as a method for a string? Or did I misunderstand what you meant there?. But surely we could have a method for strings that removes all "invalid" escape sequences.

I will look into updating/adding tests for this behavior.

@meatball133
Copy link
Contributor Author

Giving the formatter_spec.cr file this line:
assert_format ":\"\\c\n\"\"", ":\"\\\c\n\"\""
Raises a (Crystal::SyntaxException) with the message: DELIMITER_START

But in my normal test environment can it format :"\\\c\n\"" fine to: :"\\c\n\""

Did I do something wrong or is this an error with the formatter, since I tested it without my changes and it still raises an error?

@straight-shoota
Copy link
Member

The source in your example equals to :"c\n"" with (most) escapes replaced (except newline). You're probably missing an escape for the second to last quote. I'd recommend to use a percent string literal or heredoc (for longer strings) when a string contains quotes. That helps a lot avoiding such escape mistakes.

@meatball133
Copy link
Contributor Author

meatball133 commented Jan 30, 2023

Thanks ✨, I will use percent string literals, didnt think of that.

@HertzDevil
Copy link
Contributor

HertzDevil commented Jan 30, 2023

String literals do it like this for the string contents:

if @token.invalid_escape
write @token.value
else
write @token.raw
end

Symbol literals shouldn't be much different.

@meatball133
Copy link
Contributor Author

Yes I imagined it was something around this, I did go through and tried to search for what was causing it and then I compared the raw and value and noticed it had to be something around that.

@meatball133
Copy link
Contributor Author

I tried to write it like this: assert_format %(:\\c\n\"), %(:\\\c\n\")

Given it to the formater give: ":\\c\n\""

Am I doing something wrong?

@straight-shoota
Copy link
Member

Why don't you have crystal print the string to see what it actually looks like: puts %(:\\c\n\"). This prints :\c\n". That's again an invalid syntax. You're missing a quote after the colon.

@meatball133
Copy link
Contributor Author

Hmmm interesting, when I debug I use p and not puts. Using p gives ":\\c\n\"" and using puts gives what you said.

Any way changing it to this: assert_format %(:"\\c\n\""), %(:"\\\c\n\"") doesn't fix it

@meatball133
Copy link
Contributor Author

I know that it translate the string to: ":\c\n"" and those 2 " creates error, although at the same time is :"a\"" not equal to :"a"

@meatball133
Copy link
Contributor Author

@straight-shoota sorry for taking some time, been busy with exercism. I have now thought through the test files and they are now how I intend them. But that may be incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Formatter should normalize symbol literals and similar fragments
4 participants