refactor: Implement Ident as vec #2305
Part of PRQL#1535, this changes our Idents in PL from `$namespace.$name` to an arbitrary hierarchy.

It's full of `.clone()` & `.into()` -- I've been doing this by replacing one definition and then adding lots of `.into()` until it passes (not the most conceptual work!). Doing it incrementally at least means I can't end up in a quagmire of not knowing why the new version doesn't work; it's easy to revert to the last known good state. We can do another pass to improve the Rust / reduce allocations.

The plan here would be to:
- replace any other usages in the compilation prior to the ident being resolved. Once it's resolved, it doesn't necessarily need a full hierarchy. Possibly we can remove the old `Ident` / rename the new one to `Ident`.
- adjust how the resolution works so we can have arbitrary hierarchies of schema (`a.b.c`). Still some work to think about how this should work (some initial comments in PRQL#1535)
- make backticks fully opaque, so PRQL#1535 works -- then using parquet files will be easy, we won't need to quote schemas, and the semantics will be simple & consistent
This is now ready for review. The final PR ends up being very small. It felt like a lot of work, because each change undid some of the previous change as I moved through the pipeline swapping out

If others agree the path above is good, we can merge this, and then we can work on the name resolution, as per bullet (2) above.
Can you convince me that this is an improvement?
I see that:
- Ident was capable of representing arbitrarily deeply nested names before this PR,
- Ident can now also represent names with zero path chunks, which is not desired. The methods guard against that, but one could still just construct the struct directly from an empty vec.
- it does more cloning.
I may just not understand your plan entirely. We already have nested modules (with no syntax to express that) and lookups arbitrarily deep into them. And the linked issue with table names lies mostly in how table names are translated in the SQL backend.
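To make the second point concrete, here is a minimal sketch of an identifier type that cannot be empty by construction. The field names `path` and `name` and the helper `from_path` are assumptions for illustration, not necessarily the definitions in this PR or in the compiler.

```rust
use std::fmt;

/// Hypothetical sketch: an identifier that always has at least one segment.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    /// Leading segments, e.g. ["a", "b"] for `a.b.c`; may be empty.
    pub path: Vec<String>,
    /// Final segment; always present, so a zero-segment Ident cannot exist.
    pub name: String,
}

impl Ident {
    /// Build from a list of segments; returns `None` for an empty list.
    pub fn from_path<S: ToString>(mut segments: Vec<S>) -> Option<Ident> {
        // The last segment becomes the name; `pop` yields None when empty.
        let name = segments.pop()?.to_string();
        let path = segments.iter().map(|s| s.to_string()).collect();
        Some(Ident { path, name })
    }
}

impl fmt::Display for Ident {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Join all segments with dots: path segments first, then the name.
        for part in &self.path {
            write!(f, "{}.", part)?;
        }
        write!(f, "{}", self.name)
    }
}
```

With this shape, `Ident::from_path(vec!["a", "b", "c"])` yields a value that displays as `a.b.c`, while an empty list is rejected at construction. A bare `Vec<String>` wrapper with a public field offers no such guarantee.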
// Q: @aljazerzen do we need these now? I couldn't immediately see what they
// do (but can look more if needed).
No, we don't. This made the Ident serialize into an array, even though it was a struct. But now it should have no effect, because it gets serialized into an array anyway.
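As a rough illustration of what "a struct serialized into an array" means, the sketch below flattens a struct-shaped ident into a list of segments and rebuilds it. It uses only the standard library and leaves out the actual serialization framework; the `Ident` definition and both conversions are assumptions for illustration, not the code removed in this PR.

```rust
use std::convert::TryFrom;

/// Hypothetical struct-shaped ident (field names are assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    pub path: Vec<String>,
    pub name: String,
}

/// Flatten into the array form: all path segments followed by the name.
impl From<Ident> for Vec<String> {
    fn from(ident: Ident) -> Vec<String> {
        let mut parts = ident.path;
        parts.push(ident.name);
        parts
    }
}

/// Rebuild from the array form; an empty array is rejected.
impl TryFrom<Vec<String>> for Ident {
    type Error = &'static str;

    fn try_from(mut parts: Vec<String>) -> Result<Self, Self::Error> {
        // The last element is the name; everything before it is the path.
        let name = parts.pop().ok_or("an ident needs at least one segment")?;
        Ok(Ident { path: parts, name })
    }
}
```

Once the type itself is a vec of segments, the flattened form and the in-memory form coincide, which is why a custom conversion of this kind would no longer change the output.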
Ok, reflecting more, I think I might have made a mistake here. That's annoying; I should have raised the suggestion earlier. Thanks a lot for pushing back.

Do you remember what in the library currently stops us from having

Probably we should start there and then work out what's required, rather than my previous approach.
Ah, it happens :) It's a long explanation of how table inference works, so I'll just point you toward `Module::wildcard` and `Module::default`.

Scratch that, look at this: https://github.com/PRQL/prql/blob/main/prql-compiler/src/semantic/context.rs#L53-L54