[RFC] Tagged type #733

benjie · 2020-06-12T16:54:57Z

THIS RFC HAS BEEN SUPERSEDED by @oneof, for now at least... See: #825

This is an RFC for a new "Tagged type" to be added to GraphQL. It replaces the "@oneField directive" proposal following feedback from the Input Unions Working Group. Please note that "Tagged type" is the working name, and may change if we come up with a better name for it.

A Tagged type defines a list of named members each with an associated type (like the fields in Object types and Input Object types), but differs from Object types and Input Object types in that exactly one of those members must be present.

The aim of the Tagged type is to introduce a form of polymorphism in GraphQL that can be symmetric between input and output. In output, it can generally be used as an alternative to Union (the differences will be outlined below). It goes beyond interfaces and unions in that it allows the same type to be specified more than once, which is particularly useful to represent filters such as this pseudocode {greaterThan: Int} | {lessThan: Int}.

If merged, Tagged would be the first non-leaf type kind (i.e. not a Scalar, not an Enum) that could be valid in both input and output. It is also the first kind of type where types of that kind may have different input/output suitability.

In SDL, a tagged type could look like one of these:

# suitable for input and output:
tagged StringFilter {
  contains: String!
  lengthAtLeast: Int!
  lengthAtMost: Int!
}

# output only:
tagged Pet {
  cat: Cat!
  dog: Dog!
  colony: ColonyType!
}

# input only:
tagged PetInput {
  cat: CatInput!
  dog: DogInput!
  colony: ColonyType!
}

(Note a number of alternative syntaxes were mooted by the Input Unions working group; the one above was chosen to be the preferred syntax.)

If we queried a StringFilter with the following selection set:

{
  contains
  lengthAtLeast
  lengthAtMost
}

then this could yield one of the following objects:

{ "contains": "Awesome" }
{ "lengthAtLeast": 3 }
{ "lengthAtMost": 42 }

Note that each of these objects specifies exactly one key.

Similarly the above JSON objects would be valid input values for the StringFilter where it was used as an input.
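
The "exactly one member" rule can be sketched as a small shape check. This is only an illustration in TypeScript, not part of the RFC or of any GraphQL library: `isValidTaggedValue` and `STRING_FILTER_MEMBERS` are names invented here. The sketch checks only the key count and membership. Coercion of the member's own value, including how null is treated, is left to the RFC's input coercion rules.

```typescript
// Hypothetical helper: validates the *shape* of a tagged value only,
// not the type of the member's value.
const STRING_FILTER_MEMBERS: readonly string[] = [
  "contains",
  "lengthAtLeast",
  "lengthAtMost",
];

function isValidTaggedValue(
  value: unknown,
  memberNames: readonly string[]
): boolean {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const keys = Object.keys(value);
  // Exactly one key, and that key must be a declared member.
  return (
    keys.length === 1 && keys.every((key) => memberNames.indexOf(key) !== -1)
  );
}

console.log(isValidTaggedValue({ contains: "Awesome" }, STRING_FILTER_MEMBERS)); // true
console.log(isValidTaggedValue({ contains: "a", lengthAtMost: 42 }, STRING_FILTER_MEMBERS)); // false
console.log(isValidTaggedValue({}, STRING_FILTER_MEMBERS)); // false
```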

Tagged vs Union for output

Tagged does not replace Union; there are things that Union can do that tagged cannot:

{
  myUnionField {
    ... on Node {
      id # If the concrete type returned by `myUnionField` implements
         # the `Node` interface, we can query `id`.
    }
  }
}

And things that Tagged can do that Union cannot:

tagged Filter {
  equalTo: Int!
  lessThan: Int!
  greaterThan: Int!
  isNull: Boolean!
}

Tagged allows for exploring the various polymorphic outputs without requiring fragments:

{
  pets {
    cat { name numberOfLives }
    dog { name breed }
    parrot { name favouritePhrase }
  }
}

When carefully designed and queried, the data output by a tagged output could also be usable as input to another (or the same, if it's suitable for both input and output) tagged input, giving polymorphic symmetry to your schema.

Nullability

Tagged is designed in the way that it is so that it may leverage the existing field logic relating to nullability and errors. In particular, if you had a schema such as:

type Query {
  pets: [Pet]
}

tagged Pet {
  cat: Cat
  dog: Dog
}

type Cat {
  id: ID!
  name: String!
  numberOfLives: Int
}

type Dog {
  id: ID!
  name: String!
  breed: String
}

and you issued the following query:

{
  pets {
    cat { id name numberOfLives }
    dog { id name breed }
  }
}

and for some reason the name field on Cat were to throw, then the result might come out as:

{
  "data": {
    "pets": [
      { "cat": null },
      { "dog": { "id": "BUSTER", "name": "Buster" } }
    ]
  },
  "errors": [{ ... }]
}

where we can tell an error occurred and the result would have been a Cat but something went wrong. This may potentially be useful, particularly for debugging, compared to returning "pets": null or "pets": [null, {"dog": {...}}]. It also makes implementation easier because it's the same algorithm as for object field return types.

FAQ

Can a tagged type be part of a union?

Not as currently specified.

Can a tagged type implement an interface?

No.

What does `__typename` return?

It returns the name of the tagged type. (This is new behaviour: previously __typename would always return the name of an object type, but now we have two concrete composite output types.)

What happens if I don't request the relevant tagged member?

You'll receive an empty object. For example if you issue the selection set { cat } against the tagged type below, but the result is a dog, you'll receive {}.

tagged Animal {
  cat: Cat
  dog: Dog
}

How can I determine which field would have been returned without specifying all fields?

There is currently no way of finding out what the field should have been other than querying every field; however there's room to solve this later with an introspection field like __typename (e.g. __membername) should this show sufficient utility.
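
On the client side, the behaviour described in this FAQ could be handled roughly as follows. This is a sketch under assumptions, not an API from the RFC: `presentMember` is a hypothetical helper, it assumes the selection set uses no aliases, and it treats any key starting with `__` (such as `__typename`) as introspection rather than a member.

```typescript
// Hypothetical client-side helper for a tagged *output* object.
type TaggedResult = { [key: string]: unknown };

function presentMember(result: TaggedResult): string | undefined {
  // Ignore introspection keys such as __typename.
  const keys = Object.keys(result).filter((key) => key.slice(0, 2) !== "__");
  // With no aliases, at most one member key is expected. An empty object
  // means the relevant member was not part of the selection set.
  return keys.length === 1 ? keys[0] : undefined;
}

console.log(presentMember({ dog: { id: "BUSTER", name: "Buster" } })); // "dog"
console.log(presentMember({ cat: null })); // "cat" (the member errored, but we still know which one it was)
console.log(presentMember({})); // undefined
```

Because the check looks at keys rather than values, `{ "cat": null }` still identifies the member, which is the debugging benefit described in the Nullability section.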

Open questions

Should we add isInputType / isOutputType to __Type for introspection? [Author opinion: separate RFC.]
Should we use TAGGED_INPUT and TAGGED_OUTPUT types separately, rather than sharing just one type? [Author opinion: no.]
Should we prevent field aliases? [Author opinion: no.]
What exactly should the input coercion rules be, particularly around variables being omitted, e.g. {a: $a, b: $b} [Author opinion: as currently specified.]

spawnia · 2020-07-20T07:06:41Z

Great job @benjie, this looks really solid overall.

benjie · 2020-07-21T16:37:47Z

All but one piece of feedback addressed. Will ponder the final point. Thanks @spawnia 🙌

Changing from WIP to ready for review, and adding it to the next GraphQL Spec WG 👍

andimarek · 2020-08-06T19:00:44Z

Great work @benjie and everybody else involved.
The two fundamental questions that come to my mind:

What is the impact of this new type on the underlying GraphQL structure? What I mean with that is: does it make static analysis harder (or easier)? See https://www.graphql.de/blog/static-query-analysis/ for some background. Does it make certain advanced topics like query rewriting harder or maybe even impossible? What is the impact on validation in terms of complexity?
What is the impact on schema design? Are we making it harder or easier to design schemas? What can we do to create pits of success?

These questions are relevant to all type systems changes I believe.

leebyron

This is looking amazing, great work!

spec/Appendix B -- Grammar Summary.md

spec/Section 2 -- Language.md

spec/Section 3 -- Type System.md

leebyron · 2020-08-06T18:22:01Z

spec/Section 3 -- Type System.md

@@ -266,6 +267,10 @@ A `Union` defines a list of possible types; similar to interfaces, whenever the
 type system claims a union will be returned, one of the possible types will be
 returned.

+A `Tagged Type` defines a list of possible members; whenever the type system


at this moment it's useful to explain, briefly, what a "member" is - a tag named in the form of a field, and the type of that field.

I've addressed this, but I'm not happy with the wording so I'm not marking this conversation resolved.

leebyron · 2020-08-06T18:23:16Z

spec/Section 3 -- Type System.md

+  * If {type} is a Tagged type:
+    * If there exists a member in {type} where {IsInputType(memberType)} is {false}
+      * Return {false}
+    * Return {true}


This is nice recursion. I like that IsInputType remains fully defined in terms of itself.
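
The recursion in the hunk above can be sketched over a toy type model. Everything here is an assumption made for illustration: `TypeRef`, `isSuitable` and the sample types are invented names, not the graphql-js API or the spec text. One deliberate addition is the `inProgress` list, which treats a tagged type already being visited as acceptable so that self-referential tagged types (such as the hypothetical `Tree` below) do not recurse forever. How the spec should handle such cycles is not stated in this thread.

```typescript
type NamedKind = "SCALAR" | "ENUM" | "INPUT_OBJECT" | "OBJECT" | "INTERFACE" | "UNION";

// Toy type model invented for this sketch.
type TypeRef =
  | { kind: NamedKind; name: string }
  | { kind: "LIST" | "NON_NULL"; ofType: TypeRef }
  | { kind: "TAGGED"; name: string; members: () => { [member: string]: TypeRef } };

const INPUT_KINDS: NamedKind[] = ["SCALAR", "ENUM", "INPUT_OBJECT"];
const OUTPUT_KINDS: NamedKind[] = ["SCALAR", "ENUM", "OBJECT", "INTERFACE", "UNION"];

function isSuitable(type: TypeRef, allowed: NamedKind[], inProgress: string[] = []): boolean {
  if (type.kind === "LIST" || type.kind === "NON_NULL") {
    return isSuitable(type.ofType, allowed, inProgress); // unwrap and recurse
  }
  if (type.kind === "TAGGED") {
    // Assumption: a tagged type we are already checking does not disqualify itself.
    if (inProgress.indexOf(type.name) !== -1) return true;
    inProgress.push(type.name);
    const members = type.members();
    // Suitable only if every member's type is suitable.
    return Object.keys(members).every((m) => isSuitable(members[m], allowed, inProgress));
  }
  return allowed.indexOf(type.kind) !== -1;
}

const isInputType = (type: TypeRef): boolean => isSuitable(type, INPUT_KINDS);
const isOutputType = (type: TypeRef): boolean => isSuitable(type, OUTPUT_KINDS);

const STRING: TypeRef = { kind: "SCALAR", name: "String" };
const INT: TypeRef = { kind: "SCALAR", name: "Int" };
const CAT: TypeRef = { kind: "OBJECT", name: "Cat" };
const CAT_INPUT: TypeRef = { kind: "INPUT_OBJECT", name: "CatInput" };

const stringFilter: TypeRef = {
  kind: "TAGGED",
  name: "StringFilter",
  members: () => ({
    contains: { kind: "NON_NULL", ofType: STRING },
    lengthAtLeast: { kind: "NON_NULL", ofType: INT },
  }),
};
const pet: TypeRef = { kind: "TAGGED", name: "Pet", members: () => ({ cat: CAT }) };
const petInput: TypeRef = { kind: "TAGGED", name: "PetInput", members: () => ({ cat: CAT_INPUT }) };
// Hypothetical self-referential tagged type, used to exercise the cycle guard.
const tree: TypeRef = {
  kind: "TAGGED",
  name: "Tree",
  members: () => ({ leaf: INT, children: { kind: "LIST", ofType: tree } }),
};

console.log(isInputType(stringFilter), isOutputType(stringFilter)); // true true
console.log(isInputType(pet), isOutputType(pet)); // false true
console.log(isInputType(petInput), isOutputType(petInput)); // true false
console.log(isInputType(tree), isOutputType(tree)); // true true
```

A tagged type that mixed `Cat` and `CatInput` members would come out as neither input nor output under this sketch.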

leebyron · 2020-08-06T19:14:01Z

spec/Section 3 -- Type System.md

+      no two members may share the same name.
+   2. The member must not have a name which begins with the
+      characters {"__"} (two underscores).
+3. {IsOutputType(taggedType)} and {IsInputType(taggedType)} cannot both be {false}.


leebyron · 2020-08-06T19:39:14Z

@andimarek Excellent questions, I agree they're very relevant.

What is the impact of this new type on the underlying GraphQL structure? What I mean with that is: does it make static analysis harder (or easier)? See https://www.graphql.de/blog/static-query-analysis/ for some background. Does it make certain advanced topics like query rewriting harder or maybe even impossible? What is the impact on validation in terms of complexity?

I think from a fundamentals point of view, this shouldn't make static analysis or query rewriting any different. In most ways a Tagged type operates equivalently to an Object type and provides the same level of static knowledge of types.

Some cases we should want to stress test are the edge cases and corner cases of using fragment spreads along with the additional constraints Tagged types offer. That will inform how strict we need to be about these constraints.

For example, we have an implicit rule that a fragment spread should always be able to be inlined without changing meaning or validity, e.g. { ...Frag } fragment Frag on T { name } is equivalent to { name } (provided the type condition always applies). The area that deserves some careful consideration is aliased fields. For example, on a tagged type, is { alias1: name, alias2: name } legal? If not, then are fragment Frag1 on Tagged { alias1: name } and fragment Frag2 on Tagged { alias2: name } separately legal? And if so, is { ...Frag1, ...Frag2 } legal on a tagged type Tagged since both fragment types apply?

Can you think of other scenarios which could pose problems for static analysis or query rewriting?

What is the impact on schema design? Are we making it harder or easier to design schemas? What can we do to create pits of success?

This is an excellent question because this moves counter to the good design principle that there should be one way to model something. That's not a principle explicitly listed but it's still a good one. Now when you consider a case where you have a field which can return one of many options, we're expanding your choices from 2 (interface, union) to 3 (+ tagged). There was already pushback during original design years ago from @schrockn that perhaps there should be only one (interface) where unions are modeled as interfaces which have no fields. There are other modeling tradeoffs there, but the benefit was ensuring only one way to do abstract types - potentially creating a pit of success.

Something critical we'll need to do is well document the tradeoffs for when you should consider using a Union vs Tagged type (or Interface), and in fact if we can be terse and clear about it that would be quite valuable to include within the spec itself so there's a canonical explanation of why multiple options exist.

Can you see other schema design pitfalls?

benjie · 2020-08-06T19:59:22Z

Thanks for the review @leebyron!

I've 👍'd the input I agree with without hesitation. Some others need discussion, or I need to think about them more. I'm unlikely to have time in August to implement the suggestions, but hopefully will have more time in September - I welcome others' feedback in the meantime 👍

For example, on a tagged type, is { alias1: name, alias2: name } legal

I currently propose that this is definitely legal: it's allowed in the RFC as is, I see no reason to forbid it, and it simplifies the concerns you raise 👍

The main reason for this is that { name __typename } is valid, so already you may have more than one key returned.

If we chose not to allow { alias1: name, alias2: name } then it would be done, IMO, by forbidding aliases on tagged types, which also solves the {...Frag1, ...Frag2} issue.

dschafer · 2020-08-07T01:18:16Z

Thanks for putting this together!

My one concern here is that we're allowing tagged to exist in the output type system as well; I think that might lead to some interesting and not entirely desirable downstream effects. There was discussion above about how to decide between tagged and union in designing a schema, but my worry is that by allowing tagged to exist in both, we might create some tricky cases where it's not suitable for either.

For example, I can write an "input union", but then discover that my type got accidentally restricted by someone putting it in the output type set.

i.e. if I write an input union

tagged Filter {
  startsWith: String
  endsWith: String
}

and then come back two days later to add an optional input object parameter:

tagged Filter {
  startsWith: String
  endsWith: String
  range: Range
}

input Range {
  start: Int
  end: Int
}

I might not be able to, if someone decided to include Filter as a field in an object type in the meantime -- because that would mean Filter is now in both input and output, and hence can only use things in both schemas, and input objects can't be in the output schema. Similarly, if I write a tagged intending it for use in the output type system, but then it gets used somewhere else in the input type system, I can no longer add an object/interface/union field to that tagged.

That is to say: right now, there's no way for any type to "change" its input/output availability. Scalars and Enums are always "both"; input types are always "input only"; object types, interfaces, and unions are always "output only". But tagged as proposed today can do so -- it can start out as "both", but then when someone adds an input to it, it becomes "input-only", or when someone adds an interface to it, it becomes "output-only". And that mutability can cascade through the schema, because if a type changes from "both" to "input-only", then every type that recursively references that type via fields also must become "input-only"... which might not be possible.

As I understand it, the impetus for this proposal was to enable input unions. If the solution for input unions naturally extended to the output types, then it might make sense to include that extension for cleanliness, and it seems like that was the intention here. In this case, though, I think adding output type system support might add complexity, and I don't immediately see the benefit that it yields since we already have unions in the output type system.

So I'd consider restricting tagged to just the input type system (i.e. both tagged and input can only contain fields of tagged, input, enum and scalar, and just as input cannot appear in the output type system, neither can tagged). That seems to solve the problem of input unions, and it also creates a nice parallel between the relationship of type to union and that of input to tagged -- the former in the output type system, the latter in the input type system.

dschafer · 2020-08-07T01:31:58Z

If merged, Tagged would be the first non-leaf type kind (i.e. not a Scalar, not an Enum) that could be valid in both input and output. It is also the first kind of type where types of that kind may have different input/output suitability.

I do think this is an interesting problem in general -- the desire to have a type that can be shared between input and output makes a lot of sense to me. My concern is more around trying to solve that problem at the same time that we're solving the "input unions" problem, when those feel like separate issues to me.

benjie · 2020-08-07T06:49:02Z

@dschafer This is my biggest concern too, as discussed on the WG. However tagged brings “unions” of objects, unions, interfaces, scalars, enums and lists to the output schema, whereas GraphQLUnion only allows for objects, so I think it definitely has value there (see e.g. #215).

As mentioned on the WG, I think an alternative solution to your specific issue would be a schema validation rule stating that a specific tagged type may only be used in an input context or in an output context but not both. This validation rule could be dropped in future if we find value in using the same tagged in both contexts. That said, I trust users to make this decision for themselves and implement the restriction in their own tooling should they need it.
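
A validation rule of the kind described above could be sketched along these lines. It assumes the tooling has already collected where each tagged type is referenced. `TaggedUsage` and `taggedTypesUsedInBothContexts` are hypothetical names for this illustration, not part of the RFC or of any GraphQL implementation.

```typescript
// Hypothetical record of one place where a tagged type is referenced:
// "input" for arguments and input object fields, "output" for object fields.
interface TaggedUsage {
  taggedType: string;
  context: "input" | "output";
}

// Returns the names of tagged types that appear in both contexts,
// i.e. the ones such a validation rule would reject.
function taggedTypesUsedInBothContexts(usages: TaggedUsage[]): string[] {
  const seen: { [name: string]: { input: boolean; output: boolean } } = {};
  for (const usage of usages) {
    const entry =
      seen[usage.taggedType] ||
      (seen[usage.taggedType] = { input: false, output: false });
    entry[usage.context] = true;
  }
  return Object.keys(seen)
    .filter((name) => seen[name].input && seen[name].output)
    .sort();
}

const offenders = taggedTypesUsedInBothContexts([
  { taggedType: "Filter", context: "input" },  // e.g. used as an argument type
  { taggedType: "Filter", context: "output" }, // e.g. later added as an object field
  { taggedType: "Pet", context: "output" },
]);
console.log(offenders); // ["Filter"]
```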

Changes:
* Comparison operators: `{eq: 1, not: yes}` => `{notEqual: 1}`
* Logical: `{anyOf: [...], not: yes}` => `not { anyOf: [...]}`

The `not` became more consistent with `allOf`/`anyOf` and compatible with [Tagged Type](graphql/graphql-spec#733 input](graphql/graphql-spec#825).

Also resolved ambiguity for Relation:

```
# this means "where count(...) != 2" now
relation: {
  where: {....}
  eq: 2
  not: yes
}
```

jamiehodge · 2021-09-14T08:21:26Z

Considering how ubiquitous and well-understood { ..., _tag: "MyTag" } is, I don't see why there is currently a proposal using directives. Tagged unions should complement the existing __typename mechanism. I realise that there has been a lengthy discussion and iteration through several issues and proposals, but this still feels like the simplest and most general approach.

benjie · 2021-09-14T08:44:40Z

I don't see why there is currently a proposal using directives

There are plenty of alternative syntaxes to directives that we can use to represent the exact same thing; which of those would you prefer and would it overcome your "directives" concern?

https://gist.github.com/benjie/5e7324c64f42dd818b9c3ac2a91b6b12

Tagged unions should complement the existing __typename mechanism.

Please note that both this PR #733 and the oneof PR #825 would involve you sending the same data to the server: {"MyTag": ...}.

It sounds like your comment is in favour of solution 1 in the InputUnions proposal; one of the main issues that we found with this was that it would cause complexity around when to/not to include __typename. E.g. adding __typename to your existing input objects would throw errors in existing GraphQL implementations, so you'd have to only send it when the field is an input union; but doing so means that you need to have two representations for your type - e.g. User when used in a regular input, and UserWithTypename when used in a union-expecting interface. This would be generally unpleasant to work with.

jamiehodge · 2021-09-14T09:10:44Z

It sounds like your comment is in favour of solution 1 in the InputUnions proposal; one of the main issues that we found with this was that it would cause complexity around when to/not to include __typename.

Is this primarily due to overloading __typename? I haven't read all of the discussions, but would an __inputtypename avoid this? We already have the Object vs InputObject distinction.

I apologise if I'm rehashing previous discussions. I guess my main point is that so-called tagged unions/variants are such a well-established pattern in programming languages, that it would be a shame to deviate from them. That said oneOf is present in both JSON Schema and Protobufs (for better or for worse).

wyfo · 2021-09-14T10:23:28Z

I don't see why there is currently a proposal using directives.

@jamiehodge Actually, I did kind of make a proposal using a directive. You can look at it in my comment on #825. I've even implemented it (for input only, but it's easily extensible to output).
My point was (and I haven't changed my mind) that the GraphQL specification doesn't really need to be modified, because @oneOf could be implemented as a validation directive by frameworks which want to include it.

benjie · 2021-09-14T18:31:44Z

Is this primarily due to overloading __typename? I haven't read all of the discussions, but would an __inputtypename avoid this? We already have the Object vs InputObject distinction.

No, that's not a concern. I tried to describe this concern above; namely that (TypeScript syntax) you'd sometimes have:

const user: UserInput = { name: "Benjie" };

and sometimes you'd instead have to have:

const user: UserInputWithTypename = { name: "Benjie", __typename: "User" }

depending on where the object is to be submitted. With oneof you would always just have the first one, and when you send it in as part of a oneof it would be explicitly wrapped ({ user: user }) rather than modified/cloned. Keep in mind that GraphQL targets all programming languages, so this kind of concern is much bigger than it might at first seem.
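
The wrapping approach can be sketched in TypeScript like this. The `OrganizationInput` and `ActorInput` types and the `wrapUser` helper are invented for illustration. Note also that a plain TypeScript union of single-key object types only approximates "exactly one key": the compiler does not strictly reject values that carry more than one of the keys.

```typescript
interface UserInput {
  name: string;
}

// Hypothetical second member, only here to make the union meaningful.
interface OrganizationInput {
  title: string;
}

// Approximation of a tagged/oneof input: one key per variant.
type ActorInput = { user: UserInput } | { organization: OrganizationInput };

// The plain input object is wrapped, not modified or cloned.
function wrapUser(user: UserInput): ActorInput {
  return { user: user };
}

const user: UserInput = { name: "Benjie" };
const actor: ActorInput = wrapUser(user);

console.log(JSON.stringify(user)); // {"name":"Benjie"}
console.log(JSON.stringify(actor)); // {"user":{"name":"Benjie"}}
```

The same `UserInput` value can therefore be sent on its own to a field expecting `UserInput`, or wrapped when a polymorphic input is expected, without a second `...WithTypename` representation.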

jamiehodge · 2021-09-15T09:20:52Z

I guess I wouldn't think of the entity example as the primary use case, nor do I see the requirement of adding a tag property as onerous. Isn't a more realistic use of a tagged union, using a contrived example, a payment where it may be cash or credit and the two inputs are not entities, but have distinct fields? Another example, which I am dealing with, is a list of changes to an aggregate, each of which contains the type of change and then the appropriate fields. Neither of these examples involves entities, and they are constructed at submission time. In neither case would the tag property be an unreasonable requirement.

benjie · 2021-09-15T10:19:51Z

I... don't follow at all. When you say "entity" do you mean "input object" or something else? If they're not "input objects" then what are they? I used UserInput (edited for clarity) in my example but didn't intend any meaning to be construed - it's just any input object. The point of input polymorphism would be to be able to use these input objects (and in the case of my proposals also scalars, enums and list types) directly as well as part of a polymorphic input; e.g. (deliberately using foo, bar, etc so that no meanings can be inferred)

input Foo { ... }
input Bar { ... }

tagged Tagged {
  foo: Foo
  bar: Bar
}

type Mutation {
  doSomethingWithFoo(foo: Foo!): Int
  doSomethingWithBar(bar: Bar!): Int
  doSomething(input: Tagged!): Int
}

Here Foo is usable on its own (in doSomethingWithFoo) and also within a tagged type (doSomething). Are you suggesting that your "non-entity"s would only be usable as part of an input union of some kind and never used in their own right?

jamiehodge · 2021-09-15T12:03:02Z

I see. Now I understand your reuse example. My mind went to entities when you initially described reuse, but I can see that something as simple as a union of scalar types would be a compelling argument for externalising the tagging.

I don't know enough about the GraphQL parser to know how difficult it would be to allow for an untagged union of scalar types, making object types the exception.

I don't think I have any real objections to the above example.

benjie · 2021-09-16T11:34:33Z

I don't know enough about the GraphQL parser to know how difficult it would be to allow for an untagged union of scalar types

Basically impossible - we can't tell the difference between an Int and a Float (unless it has a decimal period in it) or a String and an ID, and of course there's custom scalars which we have absolutely no understanding of whatsoever. The types have to be declared, or there needs to be some kind of rule to help determine which it's coerced as.
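
To make the ambiguity concrete, here is a rough sketch of which built-in scalars could accept a given JSON input value. It reflects my reading of the GraphQL spec's input coercion rules for the built-in scalars (Int takes 32-bit integers, Float takes any finite number, String takes strings, ID takes strings and integers). `candidateScalars` is a hypothetical helper, and custom scalars are left out entirely because nothing can be said about them in general.

```typescript
type BuiltInScalar = "Int" | "Float" | "String" | "Boolean" | "ID";

// Hypothetical helper: lists every built-in scalar that could accept `value`.
function candidateScalars(value: unknown): BuiltInScalar[] {
  if (typeof value === "boolean") return ["Boolean"];
  if (typeof value === "string") return ["String", "ID"];
  if (typeof value === "number" && isFinite(value)) {
    const result: BuiltInScalar[] = ["Float"];
    const isInteger = Math.floor(value) === value;
    if (isInteger) result.push("ID");
    // Int is limited to the signed 32-bit range.
    if (isInteger && value >= -2147483648 && value <= 2147483647) {
      result.unshift("Int");
    }
    return result;
  }
  return [];
}

console.log(candidateScalars(1)); // ["Int", "Float", "ID"]
console.log(candidateScalars(1.5)); // ["Float"]
console.log(candidateScalars("abc")); // ["String", "ID"]
```

Only a value such as `1.5` or `true` has a single candidate here; everything else would need a declared type or a tie-breaking rule, which is the problem described above.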

jamiehodge · 2021-09-17T08:06:05Z

For reference, io-ts has adopted a somewhat similar solution, albeit a mixture of external/internal tagging: https://github.com/gcanti/io-ts/blob/master/Decoder.md#the-sum-combinator. There is a separate combinator for untagged unions: https://github.com/gcanti/io-ts/blob/master/Decoder.md#the-union-combinator. If anything the language distinction is helpful.

jamiehodge · 2021-09-17T10:05:42Z

I've been imitating the tagged input pattern for some schemas and can confirm that it feels pretty natural. I will say that I'm unsure about the reuse story. For the sake of flexibility/extensibility, I would probably err on the side of keeping the input types distinct, even when they are identical at first.

benjie · 2024-07-01T16:26:54Z

Closing in favour of #825

benjie added 6 commits June 12, 2020 16:25

First pass

3d43179

More edits

d411b60

Input coercion

16de657

Change tagged "fields" to "members"

a73d1cc

Merge branch 'master' into tagged-type

ceb81d5

Sync type system

e65126d