New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support max_identifier_length
session variable
#20999
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me. Can you work with @mjibson/ @jkosh44 to ensure that no one is using identifiers larger than this before we ship this in a release?
@morsapaes also heads up that we should adjust the dbt adapter for this accordingly.
SELECT octet_length(name), name FROM
(SELECT name FROM mz_objects
UNION ALL
SELECT name FROM mz_clusters
UNION ALL
SELECT name FROM mz_cluster_replicas
UNION ALL
SELECT name FROM mz_databases
UNION ALL
SELECT name FROM mz_schemas
UNION ALL
SELECT name FROM mz_roles
UNION ALL
SELECT name FROM mz_columns)
WHERE octet_length(name) > 255; Should do the trick. I'll post in the release channel to ask the release manager to run this in prod. EDIT: I added |
Looks like this won't break any environments: https://materializeinc.slack.com/archives/CTESPM7FU/p1691430142889159?thread_ts=1691420649.092829&cid=CTESPM7FU, so this is good to merge assuming those nightly failures are unrelated. I'll add a step to the release playbook to check again during the release. |
Though it does look like the Parallel workload tests need to be updated
|
Thanks a lot for taking a look, Joe. I've fixed the tests that failed because of too long identifies. The feature benchmark still seems to be failing, but I don't see how I can possibly introduce a regression there. |
@nrainer-materialize @philip-stoev Is that something to worry about? @sthm as a side note I left column names and aliases in view definitions out of my query above. For column names we can add the |
The feature benchmark failure can be ignored. It is unrelated to the PR at hand. |
Adapts dbt-materialize to #20999.
This PR adds a max length to the `Ident` struct. In #20999 we added the concept of a "max_identifier_length" which was enforced when parsing SQL, but it was not enforced for identifiers created internally, e.g. when generating a name for a progress subsource by appending "_progress" to the source name. The largest changes in this PR are the following APIs: * **`Ident::new(...)`** will return an `Err(IdentError)` if the provided string is longer than our max, which is 255 bytes. * **`ident!("some static str")`** macro, which enforces our invariants at compile time. This prevents a pattern like `Ident::new("some static str").expect("known correct")` from polluting the code base. * **`Ident::new_unchecked(...)`** checks the invariants with a `soft_assert!`. This pattern is an escape hatch for cases we know are valid, but can't express in the type system, e.g. appending a number to a static string. Wherever possible I used `Ident::new(...)` or `ident!(...)`, but there were some callsites that didn't already return an error, and didn't have a `&'static str` as an argument. For those cases I used `Ident::new_unchecked(...)` to prevent this PR from getting too large. At the very least by using `new_unchecked(...)` our tests will catch invalid `Ident`s via soft asserts. ### Motivation Fixes #22535 But also has the goal of eliminating these kinds of bugs entirely. Chatted about this on [Slack](https://materializeinc.slack.com/archives/C02FWJ94HME/p1699476924158659) and it seemed desirable. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] This PR includes the following [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note): - Internal only change
This PR introduces a session variable to report the maximum identifier length of 255 and enforces that limit on identifiers. Previously, the identifier length was unconstrained, so this is a backwards incompatible change.
This PR adds a known-desirable feature #20931.
Motivation
For compatibility reasons (SQL Server Management Studio), its desirable to report this value for clients.
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.max_identifier_length
session variable.