sql: refactor type types #10624

benesch · 2022-02-13T08:56:28Z

This mega PR refactors the type types as described in #10571. More details are in that issue and in the commit messages within. The net result is, IMO, a set of types whose purposes are clearer, safer and more correct conversions between them, and the removal of a lot of duplicate code.

I'm sorry this diff is so large, but these changes were massively interdependent, and I don't have any better ideas for further splitting it into chunks.

Fix #10571.
Fix #10576.

Motivation

This PR refactors existing code.

Checklist

This PR has adequate test coverage / QA involvement has been duly considered.
This PR adds a release note for any user-facing behavior changes.

sploiselle · 2022-02-14T17:09:01Z

src/pgrepr/src/types.rs

+}
+
+impl NumericConstraints {
+    fn from_typmod(typmod: i32) -> Option<NumericConstraints> {


🤯 This seems like it wants to share an implementation w/ `mz_render_typemod` (materialize/src/expr/src/scalar/func.rs, line 5691 in e49579c), though maybe I'm being overeager.

Once this PR lands, will see if I can find a place to tie these together.

Oh 100% it should!! Good catch!
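For readers following the thread, here is a minimal sketch of the kind of decoding a `from_typmod` for numeric has to do. It assumes the PostgreSQL 14 layout, where the typmod is `((precision << 16) | scale) + VARHDRSZ`. The struct fields and the free-function names are hypothetical stand-ins, not the code in this PR.

```rust
/// PostgreSQL's VARHDRSZ, which is added to every packed numeric typmod.
const VARHDRSZ: i32 = 4;

/// Hypothetical stand-in for the PR's `NumericConstraints`; the field names
/// are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NumericConstraints {
    max_precision: i32,
    max_scale: i32,
}

/// Decodes a numeric typmod, assuming the PostgreSQL 14 packing.
fn numeric_from_typmod(typmod: i32) -> Option<NumericConstraints> {
    // A typmod below VARHDRSZ (conventionally -1) means "no constraints".
    if typmod < VARHDRSZ {
        return None;
    }
    let packed = typmod - VARHDRSZ;
    Some(NumericConstraints {
        max_precision: (packed >> 16) & 0xffff,
        max_scale: packed & 0xffff,
    })
}

/// Inverse of `numeric_from_typmod`, useful for round-trip checks.
fn numeric_to_typmod(constraints: &NumericConstraints) -> i32 {
    ((constraints.max_precision << 16) | constraints.max_scale) + VARHDRSZ
}
```

Under these assumptions `numeric(10,2)` packs to the typmod 655366, and decoding 655366 gives back precision 10 and scale 2. A shared helper would let the rendering side and the parsing side agree on this layout in one place.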

sploiselle

I obviously didn't spend time reverse-engineering the motivation behind all of these changes. However, I "spot checked" all of the major points of refactoring and things seem really great/clear.

If you have the time or interest, I'd love to hear your high-level thinking behind:

PgScalarType being the unnecessary pivot between pgrepr and ScalarType. I believe you mentioned removing this to me previously, but did you know it was chaff because you had to write it originally, or was this something you discovered by means of "looking at"/understanding the dependency graph?
New typing the typmods in adt. Like, is it nice to tidy up the semantic boundaries of these things? I saw a "greatest common denominator" for all of them, so never would have been motivated to specialize the code. Would love to get a clearer sense of why you think this approach is superior (asking to learn, not in any way a disagreement).

sploiselle · 2022-02-14T17:12:09Z

src/postgres-util/src/lib.rs

-/// [`VARHDRSZ`](https://github.com/postgres/postgres/blob/REL_14_0/src/include/c.h#L627) constant
-const PG_HEADER_SIZE: i32 = 4;
-
-pub enum PgScalarType {


nharring-adjacent

This all looks reasonable. I mostly focused on the postgres_util and pgrepr bits since I'm more familiar with that code; however, everything is clear and readable. Thanks very much for doing this; these type relationships seem demonstrably easier to grok and use.

benesch · 2022-02-14T21:25:04Z

PgScalarType being the unnecessary pivot between pgrepr and ScalarType. I believe you mentioned removing this to me previously, but did you know it was chaff because you had to write it originally, or was this something you discovered by means of "looking at"/understanding the dependency graph?

A combination of both, I think! When I wrote pgrepr::Type I meant for it to be an exact representation of Postgres's view of the type. So seeing a type called PgScalarType pop up definitely triggered my spidey sense. But pgrepr::Type wasn't directly usable as PgScalarType for sure, because it had accumulated some cruft over the years and didn't handle typmods directly.

New typing the typmods in adt. Like, is it nice to tidy up the semantic boundaries of these things? I saw a "greatest common denominator" for all of them, so never would have been motivated to specialize the code. Would love to get a clearer sense of why you think this approach is superior (asking to learn, not in any way a disagreement).

Oh, I just get stressed out about having representable states that are invalid. Like, nothing stopped someone from inadvertently constructing ScalarType::Numeric { scale: Some(400) }, and falling into completely unexercised code paths. This doesn't matter too much when you're only constructing ScalarType::Numeric in one place, but since there are a few different ways of getting there now (via the SQL parser -> planner, via pgrepr, via the catalog, and via interchange), it just seemed worth completely eliminating the possibility of error. I can now say with confidence that nothing in our codebase can construct a ScalarType::Numeric with a scale that exceeds 38. On the other hand: still not convinced that everything that constructs a decimal is properly restricting the precision to 38!
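To make the "no invalid representable states" point concrete, here is a minimal sketch of a bounds-checked newtype. The module name, type names, constructor, and error type are assumptions for illustration; only the limit of 38 and the `max_scale` field name come from this thread.

```rust
mod adt {
    use std::fmt;

    /// Largest scale a numeric may carry, per the discussion above.
    pub const NUMERIC_MAX_SCALE: u8 = 38;

    /// Hypothetical newtype. The inner field is private to this module, so
    /// code elsewhere can only obtain a value through `new`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct NumericMaxScale(u8);

    /// Hypothetical error carrying the rejected input.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct InvalidNumericMaxScale(pub i64);

    impl fmt::Display for InvalidNumericMaxScale {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "scale {} is outside 0..={}", self.0, NUMERIC_MAX_SCALE)
        }
    }

    impl NumericMaxScale {
        /// The only way to build a `NumericMaxScale`; rejects out-of-range input.
        pub fn new(scale: i64) -> Result<Self, InvalidNumericMaxScale> {
            if (0..=i64::from(NUMERIC_MAX_SCALE)).contains(&scale) {
                // The range check above guarantees the cast cannot truncate.
                Ok(NumericMaxScale(scale as u8))
            } else {
                Err(InvalidNumericMaxScale(scale))
            }
        }

        pub fn into_u8(self) -> u8 {
            self.0
        }
    }
}

/// Heavily trimmed stand-in for the real enum.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ScalarType {
    Numeric { max_scale: Option<adt::NumericMaxScale> },
}
```

With this shape, `ScalarType::Numeric { max_scale: Some(400) }` no longer type-checks outside the `adt` module, and `adt::NumericMaxScale::new(400)` returns an error at the boundary instead of letting the value travel into downstream code.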

…types

Redistribute how PostgreSQL types are handled. Creating a PostgreSQL type from an OID and typmod is now handled entirely by `pgrepr::Type`. The `postgres_util::PgColumn` struct now stores a `pgrepr::Type` directly, rather than a separate `PgScalarType` enum that was relatively duplicative with `pgrepr::Type`. The `sql` crate now generates views for a PostgreSQL source by converting a `pgrepr::Type` to an `UnresolvedDataType` by way of a new `fmt::Display` implementation on `pgrepr::Type`.

For safety, the modifiers on `ScalarType::{Numeric,VarChar,Char}` now have newtype wrappers that enforce the bounds at construction.

For clarity, the `scale` parameter of `ScalarType::Numeric` has been renamed `max_scale` and the `length` parameter of `ScalarType::VarChar` has been renamed `max_length`, to better reflect that they are maximums, not hard requirements. This is in contrast to `ScalarType::Char`, where the length is in fact a hard requirement, not a maximum.
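As a rough illustration of the `fmt::Display` route described in this commit message, the sketch below renders a type and its modifiers as SQL type syntax that a parser could read back. `PgType` and its variants are hypothetical, trimmed stand-ins for `pgrepr::Type`, and the exact strings the real implementation emits are not shown in this thread.

```rust
use std::fmt;

/// Hypothetical, heavily trimmed stand-in for `pgrepr::Type`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum PgType {
    Int4,
    Text,
    /// Optional (precision, scale) pair.
    Numeric { constraints: Option<(u32, u32)> },
    /// For `varchar` the modifier is an upper bound.
    VarChar { max_length: Option<u32> },
    /// For `char` the modifier is an exact length.
    Char { length: Option<u32> },
}

impl fmt::Display for PgType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PgType::Int4 => f.write_str("int4"),
            PgType::Text => f.write_str("text"),
            PgType::Numeric { constraints: None } => f.write_str("numeric"),
            PgType::Numeric { constraints: Some((p, s)) } => write!(f, "numeric({},{})", p, s),
            PgType::VarChar { max_length: None } => f.write_str("varchar"),
            PgType::VarChar { max_length: Some(n) } => write!(f, "varchar({})", n),
            PgType::Char { length: None } => f.write_str("char"),
            PgType::Char { length: Some(n) } => write!(f, "char({})", n),
        }
    }
}
```

Going through a textual form means the view-generation code only needs one code path for type names: whatever the SQL parser already accepts.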

benesch · 2022-02-15T03:57:26Z

I had to ditch the last commit that updated the test that compares PostgreSQL OIDs to Materialize OIDs, because it was failing in CI for reasons entirely unrelated to this commit. I'll do that in a separate PR to get this unblocked.

Store types in the catalog via a new CatalogType enum. This enum is nearly one-to-one with ScalarType, except that modifier fields are removed and embedded types are replaced with GlobalId references. This type is used throughout the SQL planner, in the plans returned by `CREATE TYPE` and in the type name resolution code. The motivation for this refactor is the removal of the "lossy" conversions from `pgrepr::Type` to `ScalarType`. Now, all conversions are either full fidelity or report an error if they would discard data.
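A minimal sketch of the shape this commit message describes: a catalog-side enum with no modifier fields, whose embedded types are ID references, resolved into a fully nested type on demand. Every name here other than `CatalogType`, `ScalarType`, `GlobalId`, and `max_scale` is hypothetical, the variants are a tiny illustrative subset, and the `HashMap` stands in for the real catalog.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real `GlobalId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct GlobalId(u64);

/// Catalog-side view: no modifiers, embedded types referenced by ID.
#[derive(Debug, Clone, PartialEq, Eq)]
enum CatalogType {
    Int32,
    Numeric,
    List { element_id: GlobalId },
}

/// Planner-side view: modifiers present, embedded types nested inline.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ScalarType {
    Int32,
    Numeric { max_scale: Option<u8> },
    List { element_type: Box<ScalarType> },
}

/// Resolves an ID to a nested `ScalarType`, or `None` if any referenced ID is
/// missing. Assumes the catalog contains no reference cycles.
fn resolve(catalog: &HashMap<GlobalId, CatalogType>, id: GlobalId) -> Option<ScalarType> {
    Some(match catalog.get(&id)? {
        CatalogType::Int32 => ScalarType::Int32,
        // The catalog entry carries no modifier, so none is invented here.
        CatalogType::Numeric => ScalarType::Numeric { max_scale: None },
        CatalogType::List { element_id } => ScalarType::List {
            element_type: Box::new(resolve(catalog, *element_id)?),
        },
    })
}
```

Note that a missing reference surfaces as `None` rather than as a silently substituted default, which is in the spirit of the "full fidelity or report an error" goal stated above.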

benesch · 2022-02-15T04:46:26Z

Hallelujah! Thanks for the reviews, everyone!

benesch requested review from petrosagg, sploiselle and nharring-adjacent February 13, 2022 08:56

benesch force-pushed the repr-improvements branch 3 times, most recently from 4bdee16 to 21943fc Compare February 14, 2022 04:40

sploiselle reviewed Feb 14, 2022

sploiselle approved these changes Feb 14, 2022

nharring-adjacent approved these changes Feb 14, 2022

benesch force-pushed the repr-improvements branch from 21943fc to a055ebc Compare February 14, 2022 22:29

benesch added 2 commits February 14, 2022 18:53

repr: remove deref impls on Char and VarChar wrappers

1e4b031

They don't seem to be used. Also improve the comments on the wrapper types.

benesch force-pushed the repr-improvements branch from a055ebc to 1776c9b Compare February 15, 2022 03:50

benesch added 2 commits February 14, 2022 23:21

ci-builder: silence warning with shellcheck 0.7.0

6141324

benesch force-pushed the repr-improvements branch from b4c7572 to 6141324 Compare February 15, 2022 04:24

benesch merged commit 522e7e2 into MaterializeInc:main Feb 15, 2022

benesch deleted the repr-improvements branch February 15, 2022 04:46

materialize-bot mentioned this pull request Feb 16, 2022

release: v0.21.0-rc1 required reviews #10727

Closed

29 tasks

benesch mentioned this pull request Mar 4, 2022

repr,pgrepr: add PostgreSQL "char" type #11003

Merged

1 task
