fix(oracle): denormalize column names where applicable #24471
Ping @rumbin @agusfigueroa-htg
"name": name,
"column_name": name,
A recently merged PR (#24248) changed "name" to "column_name", but left "name" in place. To be consistent with this, I'm setting both.
@villebro we were worried about backwards compatibility, since some clients/users are only using the API. Looking to deprecate name in the future
Thanks for the context @hughhhh - makes sense 👍 Let's try to remove these when we do the next round of breaking changes.
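For readers following the thread, here is a minimal sketch of the "set both keys" idea discussed above. The helper name `build_column_payload` is hypothetical and is not taken from the PR; only the two keys, "name" and "column_name", come from the diff.

```python
from typing import Dict


def build_column_payload(name: str) -> Dict[str, str]:
    """Hypothetical helper: emit both keys so API-only clients keep working."""
    return {
        "name": name,         # legacy key, kept for backwards compatibility
        "column_name": name,  # newer key introduced by the earlier PR
    }


payload = build_column_payload("MY_COL")
```

Both keys carry the same value, so a client reading either one sees the same column name until "name" is dropped in a later round of breaking changes.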
Codecov Report
@@ Coverage Diff @@
## master #24471 +/- ##
==========================================
- Coverage 69.03% 68.95% -0.08%
==========================================
Files 1901 1901
Lines 74002 74008 +6
Branches 8116 8116
==========================================
- Hits 51086 51034 -52
- Misses 20805 20863 +58
Partials 2111 2111
... and 16 files with indirect coverage changes
@villebro this is great!
Loving this! @villebro and I have been talking about this for a while now, and I'm happy to see that this solution tackles the intricate issue in a clear and straightforward way. The PR description of the fix, the tests, and the code itself all look fantastic to me. If anyone reading this thread has access to Oracle/Snowflake test DBs, or knows their intricacies, please consider adding your name/handle to the Database Rolodex. Unfortunately we have no additional folks to ping for Oracle/Snowflake... yet.
Late to the party here, thank you very much @villebro! Also looking forward to seeing what name/column_name ends up looking like after the next round of breaking changes (we have quite a few API flows in place).
Hi @villebro, I believe we found a breaking change with this PR when a chart has been created with lowercase columns that are later synced and changed to uppercase. The chart can no longer find those columns. Have you heard of any other reports? Do you think we could fast-follow with a fix? (This is currently reverted from our local branch but we'd like to pull it back in!)
@villebro Should we revert this on 3.0 or are we planning on a follow-up?
Hi @michael-s-molina, I'm working on a fix, and I should have it ready by the end of this week.
SUMMARY
Currently all databases store physical datasets with normalized column names, and virtual datasets with denormalized column names. This is problematic for Oracle-like databases where the normalized name is typically all lowercase, and the denormalized one is ALL UPPERCASE. This causes issues when using native filters on dashboards, as the column references will be different, depending on whether or not the chart is built on a physical or virtual dataset.
To fix this, this PR proposes calling `dialect.denormalize_name()` on dialects that have set the `requires_name_normalize` property to `True`. This ensures that databases like Oracle and Snowflake will have the same case for both physical and virtual datasets.
We add a unit test to verify that Oracle changes the case correctly, and that MSSQL doesn't do any conversion. We don't add a Snowflake test, as our current testing requirements don't include the necessary Snowflake drivers, and I don't want to bloat our testing dependencies.
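As a rough illustration of the approach described in the summary, the sketch below applies denormalization only when a dialect reports `requires_name_normalize`. `StubDialect` and `denormalize_column_name` are hypothetical stand-ins written for this example; they are not the actual SQLAlchemy classes or the code in this PR, and the upper-casing rule is a simplification of what a real dialect does.

```python
from dataclasses import dataclass


@dataclass
class StubDialect:
    """Hypothetical stand-in for a SQLAlchemy dialect (not the real class)."""

    requires_name_normalize: bool = False

    def denormalize_name(self, name: str) -> str:
        # Simplified assumption: all-lowercase names map to the database's
        # upper-case form; mixed-case names are left untouched.
        return name.upper() if name == name.lower() else name


def denormalize_column_name(dialect, name: str) -> str:
    """Denormalize only for dialects that opt in via requires_name_normalize."""
    if getattr(dialect, "requires_name_normalize", False):
        return dialect.denormalize_name(name)
    return name


oracle_like = StubDialect(requires_name_normalize=True)   # Oracle/Snowflake-style
mssql_like = StubDialect(requires_name_normalize=False)   # no conversion expected

oracle_result = denormalize_column_name(oracle_like, "my_col")
mssql_result = denormalize_column_name(mssql_like, "my_col")
```

Under these assumptions, the Oracle-like dialect yields the upper-case name while the MSSQL-like dialect returns the name unchanged, which mirrors the two cases the unit test is described as covering.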
Note that there's no migration for existing physical datasets, as that would have introduced significant risks. Therefore only datasets created after this PR will be affected. Consequently, anyone who wants to migrate old datasets to the new format needs to manually sync the columns in the dataset modal.
AFTER
After the change, Snowflake will denormalize column names, typically resulting in ALL UPPERCASE:
![image](https://private-user-images.githubusercontent.com/33317356/247571159-97690263-7752-4391-b11e-91e600a87b3e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA1MjMwNDksIm5iZiI6MTcyMDUyMjc0OSwicGF0aCI6Ii8zMzMxNzM1Ni8yNDc1NzExNTktOTc2OTAyNjMtNzc1Mi00MzkxLWIxMWUtOTFlNjAwYTg3YjNlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA5VDEwNTkwOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTM3YjU5NTc3M2FhYzk2NjIwZGJiMDg4NzFlMTE1M2Y5M2ZkNjZmNzgzY2IxYmY5MGUwZDZjNmE0ZTkzYTJkYzMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.rR3Aque3LShuU_gn9S2QEjsUnXyuh2UL-ivebYIwhLg)
BEFORE
Before the change, physical datasets would usually store the column names as all lowercase on Snowflake:
![image](https://private-user-images.githubusercontent.com/33317356/247570237-a25ca91a-2415-4f88-b6d9-a93fd607fb40.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA1MjMwNDksIm5iZiI6MTcyMDUyMjc0OSwicGF0aCI6Ii8zMzMxNzM1Ni8yNDc1NzAyMzctYTI1Y2E5MWEtMjQxNS00Zjg4LWI2ZDktYTkzZmQ2MDdmYjQwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA5VDEwNTkwOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTViNGIzMTRkYzdkODg4NDZiMDg3Njc4MTFiNmIzYjExZmJkOWRiM2IzOWQ1OGJlODY1ZDQzNGEyZDlkMjZmMWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.ILNGZGZpuIxR7JB6bjKj3C2w6CZ1_DoOpNL-djOT228)
TESTING INSTRUCTIONS
select * from ...
virtual dataset referencing the same table
ADDITIONAL INFORMATION