fix(sql): ASOF and LT JOINS no longer convert SYMBOL keys to STRING #3087
this prevents converting symbol keys to strings
…keys_wrapping_to_master
the main point of AbstractJoinCursor is to provide symbol tables. but these methods are overridden in the LtJoinRecordCursor anyway
…keys_wrapping_to_master
also some of temporary comments removed
…keys_wrapping_to_master
Are there tests with LT join on symbol columns between different tables with unmatching symbol sets?
So that if table T1 has symbols
1 = A
2 = B
3 = C
And table T2 has symbols
1 = C
2 = D
3 = A
and then join them
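To make the scenario concrete, here is a small illustrative sketch in plain Java. It is not QuestDB code: the class name, the `dict` helper, and the map-based dictionaries are hypothetical stand-ins for per-table symbol tables. It only shows why the same symbol ID can mean different strings in T1 and T2, so a join has to compare string values rather than raw IDs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SymbolSetMismatchSketch {
    // Hypothetical per-table symbol dictionaries, mirroring the example above.
    static final Map<Integer, String> T1 = dict("A", "B", "C"); // 1=A, 2=B, 3=C
    static final Map<Integer, String> T2 = dict("C", "D", "A"); // 1=C, 2=D, 3=A

    // Assigns IDs 1..n to the given values, in order.
    static Map<Integer, String> dict(String... values) {
        Map<Integer, String> m = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            m.put(i + 1, values[i]);
        }
        return m;
    }

    // Comparing raw IDs across tables: ID 1 is "A" in T1 but "C" in T2,
    // so this reports a match that is not a real one.
    static boolean matchesById(int t1Id, int t2Id) {
        return t1Id == t2Id;
    }

    // Comparing the resolved string values gives the intended join semantics.
    static boolean matchesByValue(int t1Id, int t2Id) {
        return T1.get(t1Id).equals(T2.get(t2Id));
    }
}
```

With these dictionaries, `matchesById(1, 1)` is true even though it pairs "A" with "C", while `matchesByValue(1, 3)` correctly pairs "A" with "A".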
I'm still not sure I can fully understand the mechanics.
Perhaps @bziobrowski you can have a look here too?
core/src/main/java/io/questdb/griffin/engine/join/AbstractSymbolWrapOverCursor.java
core/src/main/java/io/questdb/griffin/engine/join/AbstractSymbolWrapOverCursor.java
core/src/main/java/io/questdb/griffin/engine/join/LtJoinRecordCursorFactory.java
Outdated
One more question, what if the Master table does not have a |
…ction All classes implementing RecordMetadata were also subclasses of AbstractRecordMetadata and thus implementing ColumnMetadataCollection. Having `TableColumnMetadata getColumnMetadata(int columnIndex);` directly in `RecordMetadata` makes sense. So no need to keep the hierarchy separated and have code branches which are never taken.
@ideoma can you please elaborate on this:
Do you mean joining on a symbol key where the slave cursor has no matching record?
…keys_wrapping_to_master
core/src/main/java/io/questdb/griffin/engine/join/AsOfJoinRecordCursorFactory.java
…keys_wrapping_to_master
…keys_wrapping_to_master
[PR Coverage check] 😍 pass: 91 / 99 (91.92%)
@bziobrowski can you please have another look? Thanks!
The failure on windows-other is due to #2941.
Looks good now 👍
fixes #2976
The issue is best demonstrated with multiple levels of LT JOINS.
It goes like this:
The idea behind this fix: join keys are present in both master and slave records, so we can still convert key symbols from the slave to strings in order to join with the master. But when downstream code asks for a key column from the slave, we can cheat and return the symbol ID from the master cursor, because the symbol (string value) must be the same; otherwise, the join would not match.
I call this "wrapping over from slave columns to master columns". Why? Columns in a join record are organized like this:
So when downstream code asks for a symbol column that sits after the slave key-value divide, we can "wrap over" to the master columns and fetch the symbol ID from the master record, because the symbol must have the same string value in both the master and slave records (assuming the slave record is non-null). And obviously, the join cursor has to return the right symbol table on `get/newSymbolTable()`.
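A minimal sketch of the wrap-over lookup, under assumptions that this description does not confirm: the slave part of the join record is taken to be laid out as value columns followed by key columns, and the flat `int[]` rows, `keyValueDivide`, and `slaveKeyToMasterCol` are hypothetical stand-ins for QuestDB's record and cursor classes, not its actual API.

```java
final class WrapOverSketch {
    // Returns the symbol ID for a slave column of the join record.
    //   masterRow / slaveRow : symbol IDs per column (hypothetical flat rows)
    //   keyValueDivide       : first slave column index that is a join key
    //                          (assumed layout: value columns, then key columns)
    //   slaveKeyToMasterCol  : for the i-th slave key column, the index of the
    //                          master column it was joined against
    static int symbolId(int[] masterRow, int[] slaveRow, int slaveCol,
                        int keyValueDivide, int[] slaveKeyToMasterCol) {
        if (slaveCol >= keyValueDivide) {
            // Key column: the slave key was stored as a string, so no slave
            // symbol ID is available. Wrap over to the matching master column,
            // whose symbol must have the same string value.
            return masterRow[slaveKeyToMasterCol[slaveCol - keyValueDivide]];
        }
        // Value column: the slave's own symbol ID is still valid.
        return slaveRow[slaveCol];
    }

    // The symbol table used to resolve the returned ID has to come from the
    // same side as the ID itself.
    static boolean usesMasterSymbolTable(int slaveCol, int keyValueDivide) {
        return slaveCol >= keyValueDivide;
    }
}
```

For example, with `masterRow = {7, 3}`, `slaveRow = {42, -1}`, `keyValueDivide = 1` and `slaveKeyToMasterCol = {1}`, slave column 1 resolves to the master's ID 3, while slave column 0 keeps the slave's own ID 42.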
AN ALTERNATIVE SOLUTION:
An alternative could be to sink symbol IDs from keys into a map value. When a JOIN key uses a symbol, the join would sink the string representation of the symbol into the map key (for joining with a master record), and the `int` representation of the symbol would also go into the map value. Then, when a downstream operator asks for the symbol ID of a column used as a key, the JOIN record would return the ID from the value portion of the map entry.
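A rough sketch of that alternative, assuming a plain `HashMap` in place of QuestDB's join map; `SlaveValue`, `putSlave`, `slaveSymbolIdFor`, and the `price` payload column are hypothetical names chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class SinkSymbolIdSketch {
    static final int NO_MATCH = Integer.MIN_VALUE;

    // Hypothetical map value: the slave's own symbol ID for the key column,
    // stored next to an ordinary value column.
    static final class SlaveValue {
        final int keySymbolId;
        final double price;

        SlaveValue(int keySymbolId, double price) {
            this.keySymbolId = keySymbolId;
            this.price = price;
        }
    }

    // Map key: string form of the symbol, so master and slave rows can be
    // matched even though their symbol dictionaries differ.
    private final Map<String, SlaveValue> map = new HashMap<>();

    // Sinks a slave row: string into the key, int ID into the value.
    void putSlave(String symbolText, int symbolId, double price) {
        map.put(symbolText, new SlaveValue(symbolId, price));
    }

    // Looks up by the master's string value and returns the slave's symbol ID
    // from the value portion, or NO_MATCH when no slave row was stored.
    int slaveSymbolIdFor(String masterSymbolText) {
        SlaveValue v = map.get(masterSymbolText);
        return v == null ? NO_MATCH : v.keySymbolId;
    }
}
```

The ID returned here belongs to the slave's dictionary, so in this variant the join cursor could keep handing out the slave's symbol table for key columns, at the cost of a wider map value.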