-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parser: rewrite type parser #9208
Conversation
Does this still allow for the star after a type like in libs? fun open_device = OpenDevice(chr : Char*) : UInt8* |
My guess is yes because CI passes and that syntax is in our code. |
@jwoertink This represents pointer type and it is allowed of course. In type grammar, prefix I think this PR fix does not break any existing codes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! Since it passes CI I expect it to have few if any regressions.
I remembered the infra that tries to build external Crystal projects with the new changes (that I think @bcardiff owned). Should that be involved? |
https://github.com/bcardiff/test-ecosystem I think he runs that as part of the release process, but sure, no harm in doing it on this branch too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how I spotted a typo. 😅
Thanks for the BNF!
To try How should I do? or could anyone try it? |
You should try again, crystal-lang/shards#379 probably fixed this. |
@makenowjust nightlies are stuck until a new version of shards is releases. The easiest way to use test-ecosystem today would be to use the
That way you can use your working copy of both projects (or even use the current shards binary) |
if things stuck again ping me and I'll check what can I do. It's hard to automate things in test-ecosystem. In every release I need to change the content in 10-clone-repos.sh to point to forked branches with overrided dependencies. If the scope is to check if master of those projects work, things are doable, but IME is hardly the case. for example checking PR:8893 db drivers requires some update, and that mean forking lots of things later. Shards overrides will be helpful of course. |
I'm pretty sure this change is fine, though. Perhaps there might be some rare glitch that hasn't shown up on the radar yet. But this might be easier to spot when this PR is merged and in nightly. |
I agree, if you have troubles getting it running I think we can do without! |
If all specs pass, this can be merged. This isn't a language feature that's changing or a breaking change. The compiler and std use some C bindings too. I think merging this is fine. We can revert it if it breaks things just before releasing the next version. |
Actually, I'll review this today. |
spec/compiler/parser/parser_spec.cr
Outdated
it_parses "def foo(var : self?); end", Def.new("foo", [Arg.new("var", restriction: Crystal::Union.new([Self.new, Path.global("Nil")] of ASTNode))]) | ||
it_parses "def foo(var : self?); end", Def.new("foo", [Arg.new("var", restriction: Crystal::Generic.new(Path.global("Union"), [Self.new, Path.global("Nil")] of ASTNode))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this change? Same for other place where we now produce a Generic instead of Union. I think Union is simpler...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generic
has question
property to represent suffix ?
. (see here). So I think it is better than using Union
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is that we have code in restrictions.cr
for Union and I think it should work the same as for Generic, but I'm not sure. So changing this might indeed break someones code because we are changing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure too. But your concern is semantic analyzer's bug, and this is also parser's bug fix.
For instance, currently Foo(Bar?)?
is parsed to Generic.new(Path.global("Union"), [Generic.new(Path.new("Foo"), [Union.new([Path.new("Bar"), Path.global("Nil")])]), Path.global("Nil")])
(inner Bar?
uses Union
and outer Foo(...)?
uses Generic
.)
When such a bug is found really, we can fix analyzer correctly. Leaving a certain bug sounds not good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's wait and see if bugs happen then.
Who can merge this PR... @asterite If possible, could you merge or approve this? |
Sorry, I won't have time to review this. But if others feel confident this is okay, please merge. I'm still hesitant about the Union -> Generic change, I think representing it with |
Aha, mixing
So I will try to keep this behavior. @asterite Is this okay? |
@makenowjust I think so. It's just those changes in the parser that didn't need to change. Before they were Union and now they are Generic. My only comment in this PR. |
For some reasons, the parser for types and generic type path literals is known as one of the most complex part of the parser. (Another one is `do ... end` block parsing.) - **too much sharing**: the parser for a single type, type arguments, proc type arguments, tuple element types and others are unified to just one method. This code-sharing is excessive. They are not share syntax really, so there are many conditions and branches, thus it is hard to follow code flow. - **historical naming**: a type path (or part ot this) is called as `ident` for historical reason. However `ident` means different today. As the above reasons, gazillion bugs live in there unfortunately. For instance, they are valid type restriction for now: - `*Foo` (orphan splat) - `(Foo, Bar) | Baz` (comma + union type) - `{(Foo, Bar), Baz}` (comma + tuple type) This commit refines type grammer and rewrite parsers.
Co-authored-by: Brian J. Cardiff <bcardiff@gmail.com>
crystal-lang#9208 (comment) and, some bugs are fixed.
Now Note that 3235714 introduces a tiny breaking change. See the following. This code can be parsed by the current compiler (Of course it is failed on semantic phase.). class Foo < Bar?
include Baz?
end After 3235714, it becomes parsing error. I think any problems are not happened by this change because every such code is not compilable. |
@makenowjust That's great, thank you! I'll try to review it this week if I can. |
Well, sorry again, I don't think I'll have time to review this 😓 |
FYI test-ecosystem is happy with this PR. |
Problem
For some reasons, the parser for types and generic type path literals is known as one of the most complex part of the parser. (Another one is
do ... end
block parsing.)ident
for historical reason. Howeverident
means different today.As the above reasons, gazillion bugs live in there unfortunately. For instance, they are valid type restriction for now:
*Foo
(orphan splat)(Foo, Bar) | Baz
(comma + union type){(Foo, Bar), Baz}
(comma + tuple type)So, refining type grammar and rewriting type parser is necessary.
Type Grammar
The following is complete type grammar which is supported by this PR.
I think this grammar behaves as we expect.
(Repeatedly the type parser is broken, so we cannot know the current right grammar. I tried to observe the parser behavior and reject unexpected ones.)
At least, modifying stdlib and all specs for this grammar compatibility was not needed.
Fixes
Rewriting type parser is done in this PR, then some bugs are fixed.
Reject orphan splat
Now, splat (
*
) can appear only in type arguments (proc input type arguments also).The following example is syntax error by this PR:
Disable auto-flatting comma list
Currently
{(Foo, Bar), Baz}
is parsed to{Foo, Bar, Baz}
. However no one wants this behavior. In addition(Foo, Bar) | Baz
is valid for near reason. They should be disabled.The following example is syntax error by this PR:
And, some minor bugs are fixed:
(self)?
restriction correctly.Unify parsed AST forself?
andA?
. (Current compiler generatesself | ::Nil
and::Union(A, ::Nil)
. I think the latter is correct AST becauseGeneric
hasquestion
flag to represent nilable type.)/cc @asterite