Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for "enum tables" #635

Merged
merged 7 commits into from
Aug 4, 2020
Merged

feat: add support for "enum tables" #635

merged 7 commits into from
Aug 4, 2020

Conversation

benjie
Copy link
Member

@benjie benjie commented Aug 4, 2020

PostgreSQL enums cannot be added to inside of transactions, and cannot have values removed. This makes them very cumbersome to use. For years I've been recommending to use "enum tables" instead, where you write your enum values to a table and reference them via foreign key, but I never got around to adding support for this pattern to PostGraphile... until now!

An enum table must have a text (or varchar or char) primary key, and may have other columns. It must have the @enum smart comment, and this must be done via a smart comment (and not a smart tag file or plugin) due to the way in which PostGraphile v4's introspection engine works.

Example:

create table abcd (
  letter text primary key,
  description text
);
comment on table abcd is E'@enum';
insert into abcd (letter, description) values
  ('A', 'The letter A'),
  ('B', 'The letter B'),
  ('C', 'The letter C'),
  ('D', 'The letter D');

If one of the columns is named 'description' or has the smart comment (NOT smart tag) @enumDescription then its contents will be used as the description of the enum value in GraphiQL.

The enum table will not show up in your GraphQL schema, it's automatically omitted. (Don't bypass this with a smart tags plugin, bad things will occur.) To use a value from another table, use a foreign key constraint:

create table letter_descriptions(
  id serial primary key,
  letter text not null references abcd,
  description text
);

PostGraphile will represent this field as if it were an enum rather than text. This works for queries, mutations, conditions and ordering.

NOTE: we currently don't have an official way to mark function input/output parameters as enum-table enums, so they will remain as text.

NOTE: --watch won't monitor for new enum values being added.

@benjie benjie merged commit e6bde66 into v4 Aug 4, 2020
@benjie benjie deleted the enum-tables branch August 4, 2020 13:35
@bfelbo
Copy link

bfelbo commented Aug 5, 2020

This is amazing, thanks @benjie! 🙌

We are using a slight variant of your enum table, where we have a single table of all our enums instead of one table per enum. This prevents us from specifying a description, but allows us to only have one table across all enums, thereby dramatically reducing the number of tables.

Following your example with the post_topic enum in the tutorial, our definition would be implemented as:

create table enums (
  post_topic text unique
);
comment on table enums is E'@enum';
insert into enums (post_topic) values
  ('discussion'),
  ('inspiration'),
  ('help'),
  ('showcase');
create table post(
  id serial primary key,
  topic text not null references enums(post_topic)
);

Adding a second enum would only require adding a column on enums w/o adding any new tables. We believe this approach could also be useful for others with lots of enums. Would it be possible to support this enum table variant as well?

@benjie
Copy link
Member Author

benjie commented Aug 5, 2020

Could you demonstrate with a few enums?

@bfelbo
Copy link

bfelbo commented Aug 5, 2020

Sure! Imagine we want to add an enum column visibility to the post table above and a new table comment with an enum column kind. This is how we'd do it:

alter table enums add column post_visibility text unique;
insert into enums (post_visibility) values
    ('hidden'),
    ('unlisted'),
    ('visible');
alter table post add column visibility text not null references enums(post_visibility);

alter table enums add column comment_kind text unique;
insert into enums (comment_kind) values
    ('reply'),
    ('quote');
create table comment(
  id serial primary key,
  post_id int not null references post(id),
  kind text not null references enums(comment_kind)
);

We're doing this in production with lots of enums and it's working well :)

@benjie
Copy link
Member Author

benjie commented Aug 6, 2020

Hmmm... Maybe we could tag the unique constraint; e.g.

comment on constraint your_comment_kind_unique_constraint
  is E'@enum description=comment_kind_description';

That'd allow adding the description and make the enum-ness more obvious. It'd also allow you to use the table normally and just @omit the unique constraint which would drop the relevant relations without necessarily dropping the table entirely (which obviously you could do with a separate smart comment).

@bfelbo
Copy link

bfelbo commented Aug 6, 2020

Hmm. Since this is a separate enum table, we're not planning to expose it to users. Allowing use of the table normally is therefore not important, at least to us. Do you often expose your enum tables?

We don't currently use/need enum descriptions, but it's definitely nice to have that ability in the future. And I can see that it would be useful for others with more cryptic enums :)

We sometimes use the same enum as foreign keys for multiple tables. For that reason, it would be great to only have 1 source of truth for the description of the enum. Perhaps it would therefore make sense to tag the column instead of the constraint?

We'd be more than happy to tag all the columns in our enum table if using smart tags is the preferred approach. I can see how that would also provide flexibility of having tables that are a mix of enum and normal data, which might be useful for others :)

Anyway, this is such an important step towards better typing of the GraphQL schema. Thanks again!

@benjie
Copy link
Member Author

benjie commented Aug 7, 2020

Okay so if the table has a PK it’s one enum; otherwise each unique constraint becomes its own enum, and we can just use the description column for all of them since it’s likely you’d have nulls in the other enum columns.

@bfelbo
Copy link

bfelbo commented Aug 7, 2020

That sounds perfect!

@benjie
Copy link
Member Author

benjie commented Aug 11, 2020

@bfelbo Could you give this a scan urgently? #638 Plan to release it shortly.

@bfelbo
Copy link

bfelbo commented Aug 11, 2020

Thanks for looking into this so quickly! It looks great! 🎉

I have one question though. As you said above, PostGraphile could treat each unique column on an @enum table as enums:

Okay so if the table has a PK it’s one enum; otherwise each unique constraint becomes its own enum, and we can just use the description column for all of them since it’s likely you’d have nulls in the other enum columns.

I'm therefore curious why the PR you linked uses a smart tag for each enum column in the enum table? https://github.com/graphile/graphile-engine/pull/638/files#diff-c8bb689a58197b1592c6e704ad935d16R1125-R1142

@benjie
Copy link
Member Author

benjie commented Aug 11, 2020

I decided I wanted to support the table having a PK that wasn't an enum; all tables should have primary keys. An enum table with just unique constraints and no notnulls could contain the entry (null, null, null) as many times as you like which seems wrong. So instead you mark the constraints themselves as @enum.

I wrote some docs too: https://github.com/graphile/graphile.github.io/blob/develop/src/pages/postgraphile/enums.md

@bfelbo
Copy link

bfelbo commented Aug 11, 2020

Okay. I've found that smart comments can be a bit confusing for people new to PostGraphile, particularly because it can be unclear whether they should be on the table, column, constraint, etc. I wonder if there would be an acceptable way to achieve the global enum table w/o individual smart comments?

Some thoughts on your comments:

all tables should have primary keys.

I would agree that all normal tables should have primary keys, but we've found a few special cases. For instance, we have an env table that allows for controlled rollout using feature flags across stages, which doesn't have a primary key. Similarly, I think the global enum table could be a special case?

An enum table with just unique constraints and no notnulls could contain the entry (null, null, null) as many times as you like which seems wrong.

I see what you're saying 😄 I guess the way I think about the global enum table is that each column is interpreted independently w/o the nulls. I did some quick tests inserting 1 million such rows into the enum table and there was no impact on FK performance so I don't think there's much to fear in terms of such rows being a "gotcha".

@benjie
Copy link
Member Author

benjie commented Aug 11, 2020

I would agree that all normal tables should have primary keys, but we've found a few special cases.

Could you expand more on these special cases? Why does your feature flags table not have a primary key, for example - what's its' definition?

I wonder if there would be an acceptable way to achieve the global enum table w/o individual smart comments?

Currently no (hopefully in V5 the introspection infrastructure can be overhauled such that this doesn't have to be special-cased). But I think the global enum table is a bit of an edge case and just having an enum table marked as enum is the right choice for most people (including beginners) - this way they don't need to specify columns on their foreign keys (foo text references foo rather than foo text references all_enums(foo)) and they can just use plain table comments.

@bfelbo
Copy link

bfelbo commented Aug 11, 2020

Could you expand more on these special cases? Why does your feature flags table not have a primary key, for example - what's its' definition?

Here's a simplified version of the definition spanning two features (we have many more features in our table). The table contains a single row with values for all columns. The columns are different types based on what's needed for the feature.

CREATE TABLE tango.env (
    feature1_enabled boolean NOT NULL,
    feature1_minimum_duration integer NOT NULL,
    feature2_algorithm_type text NOT NULL REFERENCES enums(feature2_algorithm_type)
);

These env are read in SQL and plpgqsl functions.

I think the global enum table is a bit of an edge case and just having an enum table marked as enum is the right choice for most people (including beginners) - this way they don't need to specify columns on their foreign keys (foo text references foo rather than foo text references all_enums(foo)) and they can just use plain table comments.

I see where you're coming from. However, if we used a separate enum table per enum, we'd have more enum tables than non-enum tables. This would also be the case for previous companies I've worked at so I think a global enum table might be more attractive to developers and less of an edge case than you think.

@benjie
Copy link
Member Author

benjie commented Aug 12, 2020

I’m happy with the current trade-offs, we can still expand it later if need be.

@bfelbo
Copy link

bfelbo commented Aug 12, 2020

Okay. Thanks for adding this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants