Proposal: Support union scalar types #215

Open
stubailo opened this issue Sep 30, 2016 · 91 comments
Labels: 👻 Needs Champion (RFC needs a champion to progress; see CONTRIBUTING.md), 💭 Strawman (RFC 0) (RFC Stage 0; see CONTRIBUTING.md)

Comments

@stubailo
Contributor

It often makes sense to have a field that could return, for example, an Int or a String. It would be convenient to be able to use a resolveType-like mechanism to determine how the value is serialized. This would be even more useful for custom scalars.

The specification currently says: "GraphQL Unions represent an object that could be one of a list of GraphQL Object types", which basically means that unions are the same as interfaces, but without any shared fields. Supporting unions of scalars would further differentiate unions from interfaces.

@rmosolgo

Could you share a couple of concrete use cases? I ask because, personally, I'm usually reading values right out of MySQL (or running those values through transformations with predictably-typed output), so I haven't needed this yet!

@migueloller

@rmosolgo,

Here is a concrete use case that we've come across on my team. I have a couple more if you need them.

We've got an algorithm that produces results based on location. We have an enum AreaOfTown that contains all possible areas of town to filter the results. Here's the catch: the algorithm can also filter locations by "near me". Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, we have to create a completely new enum that contains all of the values in AreaOfTown plus "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more values (this will 100% happen in the future), we have to make sure we also update our other enum.

If we could write scalar unions then it would solve this problem for us, among others (we would like to have constructs like Int | [Int!] or Int | String like @stubailo mentioned).

@stubailo
Contributor Author

stubailo commented Oct 3, 2016

In our case, it's because we were returning a structure that represented a JSON tree, where keys can be integers (in the case of arrays) or strings (in the case of objects). We ended up compromising for now to just use strings, but I can imagine how this would be extra useful in the case where you want to return one of a set of custom scalars, which might have different serialization logic.

@migueloller

Here are other relevant discussions:

@leebyron made some good points in his comments here.

@marcintustin

Allowing union types to contain primitives allows us to model any kind of source whose data can be polymorphic, for example Avro data or Elasticsearch.

This is a severe limitation on exposing existing data, rather than creating new systems that are defined by GraphQL from the ground up.

@leebyron
Collaborator

leebyron commented Feb 9, 2017

What's still unclear to me is what a valid query would look like which encounters such a union, and how should well-typed clients determine what to expect from such a query?

Consider:

type MyType {
  field: String
}

union MyUnion = String | MyType

type Query {
  example: MyUnion
}

@marcintustin

marcintustin commented Feb 9, 2017 via email

@leebyron
Collaborator

leebyron commented Feb 9, 2017

For the above schema consider the query:

{ example }

It's clear what this query should return if the example field returns a String at runtime, but what should it return if the value was a MyType?

Similarly:

{ 
  example {
    ... on MyType {
      field
    }
  }
}

It's clear what this would return for MyType, but what should it return for a String?

@marcintustin

marcintustin commented Feb 9, 2017 via email

@leebyron
Collaborator

leebyron commented Feb 9, 2017

It's not clear to me what that means. What would a valid query look like that would clearly give a value for a primitive type or an object type?

@leebyron
Collaborator

leebyron commented Feb 9, 2017

Specifically http://facebook.github.io/graphql/#sec-Leaf-Field-Selections is what I'm referring to as "valid query"

@stubailo
Contributor Author

stubailo commented Feb 9, 2017

I think the original issue is about having unions where every member is a scalar, right? Not unions between scalars and objects?

@OlegIlyenko
Contributor

OlegIlyenko commented Feb 9, 2017

Just an idea: maybe it would make sense to introduce special inline fragment semantics for scalars:

{ 
  example {
    ... on String {
      value
    }
    ... on MyType {
      field
    }
  }
}

This can be generalized by allowing scalars to be used both as leaves (current semantics) and as objects with a single field (value). A huge advantage of this approach is that one can use __typename introspection on scalars.

In our API we have a similar issue. For some parts of the schema, we even have introduced object types like StringAttribute, IntAttribute, etc. in order to unify object attributes ("attributes" are a part of user-defined data structures which our service supports) and scalar attributes and provide same introspection capabilities for both.

@migueloller

migueloller commented Feb 9, 2017

What if, when the result is a scalar, execution simply ignored the selection set, and when it is an object type, it applied the selection set? Statically, the query can still be analyzed, because the schema has enough information to know whether the union contains a scalar, an object, or both.

If there is no selection set and the value is an object then the returned value can just be {}. If there is a selection set and the value is a scalar then the returned value can just be the scalar value (ignoring the selection set, just like it is done for object type unions).
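
The rule proposed in the two paragraphs above could be sketched roughly as follows. This is only an illustration and not how graphql-js behaves: `completeUnionValue` is a hypothetical name, and the selection set is simplified to a flat array of field names.

```javascript
// Hypothetical sketch of the proposed rule for a union that mixes scalars and objects.
// `selectionSet` is simplified to an array of field names, or undefined when the
// query has no selection set for the field.
function completeUnionValue(value, selectionSet) {
  const isObject = value !== null && typeof value === 'object';
  if (!isObject) {
    // Scalar at runtime: return it as-is and ignore any selection set.
    return value;
  }
  if (!selectionSet) {
    // Object at runtime but nothing selected: return an empty object.
    return {};
  }
  // Object with a selection set: pick only the requested fields.
  const result = {};
  for (const field of selectionSet) {
    result[field] = value[field];
  }
  return result;
}

const scalarResult = completeUnionValue('hello', ['field']);
const emptyResult = completeUnionValue({ field: 'x' }, undefined);
const objectResult = completeUnionValue({ field: 'x', other: 1 }, ['field']);
```

Note that the client only learns which branch was taken by inspecting the shape of the returned value.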

On the client, most type systems allow unions of arbitrary types, so the client can expect a result that is string | { name: string } in Flow, for example.

Nevertheless, this starts getting into the design choices around the simplicity of GraphQL, about which I think @leebyron made great points here: graphql/graphql-js#207 (comment)

@calebmer

calebmer commented Feb 9, 2017

Mixing scalars and object types would require a change to the schema which is too complex to consider. Schemaless GraphQL clients depend on the guarantee that { field } is scalar and { field { a b c } } is composite. Given that a scalar field's value could itself be an object, this is the only reliable way to tell a scalar apart from a composite value without looking at the schema. By breaking that guarantee you introduce a lot of unnecessary complexity into the GraphQL client, when you could easily just add a composite StringBox type or similar:

type StringBox {
  value: String
}
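
To illustrate the guarantee described above, here is a minimal sketch of how a schemaless client might walk a response using only the query shape. The nested-object selection format and the `collectLeafPaths` helper are assumptions made for this example (lists are not handled), not part of any real client.

```javascript
// Hypothetical, simplified selection shape: { fieldName: null } marks a leaf,
// { fieldName: { ... } } marks a composite field with a sub-selection.
function collectLeafPaths(selection, data, prefix = []) {
  const leaves = [];
  for (const [field, sub] of Object.entries(selection)) {
    const path = [...prefix, field];
    const value = data == null ? undefined : data[field];
    if (sub === null) {
      // No selection set: treat the value as an opaque scalar,
      // even if it happens to be a JSON object.
      leaves.push({ path: path.join('.'), value });
    } else if (value != null) {
      // Selection set present: the value must be composite, so descend.
      leaves.push(...collectLeafPaths(sub, value, path));
    }
  }
  return leaves;
}

// Equivalent of the query { settings author { name } }, where `settings`
// is assumed to be a custom scalar that serializes as an object.
const selection = { settings: null, author: { name: null } };
const data = { settings: { theme: 'dark' }, author: { name: 'Ada' } };
const leaves = collectLeafPaths(selection, data);
```

If a field without a selection set could also resolve to an object type, the `sub === null` branch would no longer be safe without consulting the schema.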

@migueloller

migueloller commented Feb 9, 2017

@calebmer, agreed! It's a balance between feature-set and simplicity and I think the GraphQL designers wanted to err on the side of simplicity. Please correct me if I'm wrong @leebyron.

Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).

@rmosolgo

rmosolgo commented Feb 9, 2017

It seems like the Box approach could solve most of these issues without any changes to the spec, right?

# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String 
  enumValue: AreaOfTown
}

# or 
type IntOrStringBox {
  stringValue: String
  intValue: Int 
}

It pushes the complexity to the schema but keeps GraphQL simple and predictable. Are there downsides to that approach other than verbosity?

@migueloller

migueloller commented Feb 9, 2017

@rmosolgo, you are right that it is possible but it comes at the cost of added complexity in the schema, verbosity in the queries, and having to check each value in the client side at runtime to determine which field was returned.

@calebmer

calebmer commented Feb 9, 2017

Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:

union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}

With a query that looks like:

fragment intOrString on IntOrString {
  __typename
  ... on IntBox { intValue: value }
  ... on StringBox { stringValue: value }
}
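
On the client, a result fetched with a fragment like this could be unwrapped with a small helper. The sketch below assumes the aliases `intValue` and `stringValue` from the fragment; `unboxIntOrString` is a hypothetical name.

```javascript
// Unwraps a boxed IntOrString result, dispatching on __typename.
function unboxIntOrString(box) {
  switch (box.__typename) {
    case 'IntBox':
      return box.intValue;
    case 'StringBox':
      return box.stringValue;
    default:
      // A new union member was added that this client does not know about.
      throw new Error(`Unexpected type: ${box.__typename}`);
  }
}

const fromInt = unboxIntOrString({ __typename: 'IntBox', intValue: 42 });
const fromString = unboxIntOrString({ __typename: 'StringBox', stringValue: 'forty-two' });
```

The explicit `default` branch is where a client decides how to react when the server later adds another member to the union.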

@IvanGoncharov
Member

Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, we have to create a completely new enum that contains all of the values in AreaOfTown plus "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more values (this will 100% happen in the future), we have to make sure we also update our other enum.

@migueloller This problem can be solved directly in your code without any change to GraphQL.
When you define your types in source code, just create a small utility function like extendEnumValues in the example below:

new GraphQLEnumType({
  name: 'NewName',
  // extendEnumValues is a small user-defined helper, not part of graphql-js
  values: extendEnumValues(AreaOfTown.getValues(), ['NEAR_ME']),
});
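
One possible shape for such a helper is sketched below. It assumes the input is an array of `{ name, value, description }` entries, as returned by `GraphQLEnumType#getValues()`, and produces the name-keyed map that the `values` option expects. The sample area names are invented for illustration.

```javascript
// Builds a `values` map from existing enum values plus extra value names.
// This is a user-land sketch, not a graphql-js API.
function extendEnumValues(baseValues, extraNames) {
  const values = {};
  for (const { name, value, description } of baseValues) {
    values[name] = { value, description };
  }
  for (const name of extraNames) {
    if (Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`Duplicate enum value: ${name}`);
    }
    // Assumption: the internal value of an added entry equals its name.
    values[name] = { value: name };
  }
  return values;
}

// Stand-in for AreaOfTown.getValues(); DOWNTOWN and UPTOWN are hypothetical.
const areaValues = [
  { name: 'DOWNTOWN', value: 'DOWNTOWN' },
  { name: 'UPTOWN', value: 'UPTOWN' },
];
const extended = extendEnumValues(areaValues, ['NEAR_ME']);
```

Because the new enum is derived from the old one at schema construction time, adding a value to AreaOfTown propagates automatically.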

As for IDL, I think it would be great to have support for it there, for example allowing extend to be used on enums.
But this is a topic for separate issue/PR.

Allowing union types to contain primitives allows us to model any kind of source whose data can be polymorphic, for example Avro data or Elasticsearch.
This is a severe limitation on exposing existing data, rather than creating new systems that are defined by GraphQL from the ground up.

@marcintustin There are a number of common JSON patterns that you can't express in GraphQL: key-value pairs (aka Maps), Tuples, etc.
I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.

Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf-nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).

@migueloller There are many cases when API client can't detect type based on the value returned.
For example, union of enums having the same value. It is pretty common for enums to have common values like None, Undefined, etc.
Another example is a union of custom scalars with the same base type like AbsoluteUrl|RelativeUrl.
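
A small sketch of the enum case: given only the serialized value, a client may find more than one matching type. The `Visibility` and `Compression` enums below are hypothetical.

```javascript
// Hypothetical enums that happen to share the value NONE.
const enums = {
  Visibility: ['PUBLIC', 'PRIVATE', 'NONE'],
  Compression: ['GZIP', 'BROTLI', 'NONE'],
};

// Returns every enum type whose values include the serialized value.
function candidateTypes(serialized) {
  return Object.keys(enums).filter((name) => enums[name].includes(serialized));
}

const unambiguous = candidateTypes('GZIP');
const ambiguous = candidateTypes('NONE');
```

For `'NONE'` both enums match, so the value alone cannot tell the client which member of a `Visibility | Compression` union it received.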

Also, the API client should explicitly specify the types it can handle using the ... on TypeName { construct. Without it, changing Number|Boolean to Number|Boolean|String will break clients that don't expect the value to be a string.

That means you can't serialize a union of scalar types as a plain scalar value, and you are forced to support some equivalents of __typename and ... on ScalarType { constructs. So the end result will look very similar to the solution proposed by @calebmer.

Personally, I think unions of scalar types are frequently abused.
A common example of this is returning either an array of strings or a single string, only because an array with one element doesn't look so nice :)
It forces all API clients to add a few more lines of code for every instance of this anti-pattern.

For the cases when you return arbitrary data, you can always fall back to providing the value as a JSON string. This pattern is already used by GraphQL introspection to return defaultValue:
http://facebook.github.io/graphql/#sec-The-__InputValue-Type

Also, in SQL you can have only one type per column, and that doesn't prevent it from being the dominant language for databases.

@ianks

ianks commented Mar 9, 2017

A common example of this is returning either an array of strings or a single string, only because an array with one element doesn't look so nice

There are legitimate use cases that are not simply cosmetic, however.

Here is an example from the Facebook Marketing API. When attempting to model filtering in the API, you have an object that represents a filter expression, like so:

filtering: [{ field: "clicks", operator: "LESS_THAN", value: 42 }]

However, value is by nature polymorphic, as it has to support expressions like this, as well:

filtering: [{ field: "campaign_id", operator: "IN", value: ["1", "2", ...] }]

Currently, this is not possible to express safely AFAIK. Scalar-union types would make this implementation trivial. Falling back to a JSON string for this seems like a poor solution. Some downsides:

  1. the value can no longer be validated against the schema
  2. interactive tools like GraphiQL become less useful
  3. it introduces another data serialization format into your resolvers, which can be a point of failure
  4. lack of type safety

I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.

This is a strawman. The argument does not necessitate GQL to be a superset of all protocols. Adding scalar union inputs certainly would not make GQL a superset of all protocols.


It seems like the Box approach could solve most of these issues without any changes to the spec, right?

# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String 
  enumValue: AreaOfTown
}

# or 
type IntOrStringBox {
  stringValue: String
  intValue: Int 
}

@rmosolgo The problem I see with this is you assume that the two fields are mutually exclusive; however, there is no way to statically define that in the current type system.
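
Since the type system cannot state that exactly one field is populated, that invariant ends up being checked at runtime. A minimal sketch, with `readBox` as a hypothetical helper name:

```javascript
// Reads a box object such as IntOrStringBox and enforces, at runtime,
// that exactly one of its value fields is non-null.
function readBox(box) {
  const populated = Object.entries(box).filter(
    ([key, value]) => key !== '__typename' && value != null
  );
  if (populated.length !== 1) {
    throw new Error(`Expected exactly one populated field, got ${populated.length}`);
  }
  const [field, value] = populated[0];
  return { field, value };
}

const intBox = readBox({ stringValue: null, intValue: 5 });
```

A box with both fields set, or with neither, is valid according to the schema and is only rejected by this check.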

@IvanGoncharov
Member

IvanGoncharov commented Mar 9, 2017

A common example of this is RETURNING either an array of strings or a single string, only because an array with one element doesn't look so nice

@ianks The entire issue is about returning data, it was clearly stated in the initial comment

It often makes sense to have a field that could RETURN, for example, an Int or a String

It has nothing to do with input objects since they are not only missing unions of scalars but don't support unions at all. If you feel that supporting unions of types (including scalars) is essential for GraphQL please open a separate issue.

Here is an example from the Facebook Marketing API

In this example, you know all possible fields in advance, moreover, you know which filtering is possible on which field. So you can represent this in terms of the current GraphQL spec like below:

filtering: {
  clicks: {
    LESS_THAN: 42 
  },
  campaign_id: {
    IN: ["1", "2"]
  }
}

Advantages of this are much better autocompletion and validation on the client side than when using unions of scalars.

@calebmer

calebmer commented Mar 9, 2017

@ianks here are some related issues on union input types: #202 and graphql/graphql-js#207

@RomanHotsiy
Contributor

RomanHotsiy commented Mar 9, 2017

@calebmer

Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:

union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}

There is one more alternative approach: you can just use custom scalars. GraphQL doesn't specify how custom scalars should be serialised, so you can return anything, even free-form JSON. See, for example, graphql-type-json and the corresponding article in the Apollo docs.
Custom scalars lack type information, though, but this does not prevent using them. A notable example is the GitHub API, which has a number of custom scalars like DateTime, HTML, and URI, where the underlying type of those scalars (string) is stated ONLY in the description.

@j0nd0n7

j0nd0n7 commented May 13, 2017

I was trying to filter by a single ID or an array of IDs and I ended up here :-(
my case:

union IDFilter = ID | [ID]
friends(id:IDFilter): [Friend]

now I have to go like this:
friends(id:ID, ids:[ID]): [Friend]

In my opinion and preference, the first option is better, because you always filter using id:xxx

@josepot

josepot commented Sep 19, 2017

@jdonzet In your case, wouldn't it be better to just do:

friends(idsFilter: [ID!]): [Friend]

If idsFilter is null there is no filter, you return all Friends... If it's not null, the Array must have at least one ID...
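
A resolver following this convention might look like the sketch below. `filterFriends` and the in-memory friends array are assumptions for illustration. Rejecting an empty list follows the "at least one ID" rule stated above, which the type [ID!] itself does not enforce.

```javascript
// Applies the idsFilter convention: null or undefined means "no filter",
// otherwise only friends whose id is in the list are returned.
function filterFriends(friends, idsFilter) {
  if (idsFilter == null) {
    return friends;
  }
  if (idsFilter.length === 0) {
    // Convention from the comment above, enforced by hand.
    throw new Error('idsFilter must contain at least one ID');
  }
  const wanted = new Set(idsFilter);
  return friends.filter((friend) => wanted.has(friend.id));
}

const friends = [{ id: '1' }, { id: '2' }, { id: '3' }];
const everyone = filterFriends(friends, null);
const single = filterFriends(friends, ['2']);
```

A caller with a single ID simply wraps it in a one-element list, so no union of ID and [ID] is needed.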

@acjay

acjay commented Jan 3, 2018

I posed a somewhat related question here: https://stackoverflow.com/q/47933512/807674

Although, my question deals with the awkwardness of modeling union variants that should contain no data other than their mere presence (i.e. singletons). I do think it would be extremely helpful to have one blessed syntactic way of approaching this data modeling question, rather than a hodgepodge of ad hoc approaches.

At the end of the day, these all seem to be special cases of the question of "how far GraphQL should go from a data modeling perspective?" I would argue that obvious ways of representing algebraic data types would be extremely helpful. Mostly, this boils down to union allowing singletons (and at that point, may as well allow all scalars too). There's already the concept of an enum, which is really just a union of singletons. There's just no obvious bridge between that and proper union.

More than just a question of syntactic convenience, it means bumpy transitions for data models as they grow in complexity.

@tsirolnik

Consider having a MongoDB instance having two collections - Users and Activities.

The user model has a field named "activities" that holds a reference (Mongoose: mongoose.Schema.Types.ObjectId) to the activities a given user is participating in.

Consider I want to use a union of Activities | String as I want to either load the activities IDs or the activities' documents -

i.e. - union ActivityOrActivityID = Activity | String

Currently, this won't work and instead will return -

 Union type ActivityOrActivityID can only include Object types, it cannot include String.

I'm not sure if I'm missing something as I'm not a GraphQL Pro, but it seems like this should be supported as otherwise, to my understanding, one would need to create two different definitions - one for a "flat" user and one for a populated user...

@Tiedye

Tiedye commented Feb 18, 2022

This feature would be a big help in migrating schemas that don't use the ID type to schemas that do without introducing breaking changes.

@acao
Member

acao commented Feb 18, 2022

@benjie it seems ~1+ year later @oneOf RFC would be the place for this, yes?

@benjie
Member

benjie commented Feb 19, 2022

I'm not sure what @Tiedye has in mind, but I don't think @oneOf can achieve that.

@Tiedye

Tiedye commented Feb 21, 2022

@benjie For migrating argument types: union scalar types would allow it to be done more directly as only one intermediate schema would be required to support the old and new types. My use case is migrating Integer ids to ID ids. For example:

# with union scalar types
type Query {
  post(id: Integer!): Post
}
# to
type Query {
  post(id: Integer! | ID!): Post
}
# to
type Query {
  post(id: ID!): Post
}
# without union types
type Query {
  post(id: Integer!): Post
}
# to
type Query {
  post(id: Integer, new_id: ID): Post # schema no longer encompasses that an id is required
}
# to
type Query {
  post(id: ID, new_id: ID): Post
}
# to
type Query {
  post(id: ID!): Post # front-end has to migrate twice instead of once, very annoying
}

@benjie
Member

benjie commented Feb 23, 2022

With @oneOf:

type Query {
  post(by: PostSpec!): Post
}
input PostSpec @oneOf {
  pk: Integer
}
# to
input PostSpec @oneOf {
  pk: Integer
  id: ID
}
# to
input PostSpec @oneOf {
  id: ID
}

I don't think there's any way to support union scalar types in general (without boxing like @oneOf does) because GraphQL cannot tell the difference between a String, an ID and a custom scalar that's encoded as a string.

@Tiedye

Tiedye commented Feb 23, 2022

Why would it need to tell the difference? Argument types are specified in the query, and there is no overloading in GraphQL

@benjie
Member

benjie commented Feb 23, 2022

Keep in mind custom scalars can do basically anything. Say for example you have:

scalar Base64EncodedJSON

When you send this to the server, you'll send something like: "eyJoZWxsbyI6IndvcmxkIn0="

When the server receives it, it would then "parse" it into an internal representation, which may well be an object such as (JSON syntax): {"hello":"world"}.

From a client perspective, Base64EncodedJSON and ID and String all look the same - they're strings.

From a schema point of view, there's nothing that tells GraphQL that the argument to myField(with: Base64EncodedJSON! | String!): ReturnType is not valid. The schema author might even expect it to work - give me a string, or give me the JSON. However, when we come to CoerceVariableValues() or similar, GraphQL does not know what parse method to call - is it dealing with a string, or base64-encoded JSON? How can it know?
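
The ambiguity can be made concrete with a small sketch. The two parse functions below are simplified stand-ins for scalar input coercion, not graphql-js APIs, and `Buffer` assumes a Node.js runtime.

```javascript
// Simplified stand-in for String input coercion.
const parseString = (raw) => {
  if (typeof raw !== 'string') throw new TypeError('String expected');
  return raw;
};

// Simplified stand-in for the custom Base64EncodedJSON scalar.
const parseBase64EncodedJSON = (raw) => {
  if (typeof raw !== 'string') throw new TypeError('String expected');
  return JSON.parse(Buffer.from(raw, 'base64').toString('utf8'));
};

const input = 'eyJoZWxsbyI6IndvcmxkIn0=';
const asString = parseString(input);          // the raw string, unchanged
const asJson = parseBase64EncodedJSON(input); // the decoded object
```

Both calls succeed on the same input and yield different internal values, so the input alone does not determine which member of the union was intended.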

@Tiedye

Tiedye commented Feb 23, 2022

I think it is acceptable to run all applicable parse methods and if multiple succeed either: throw an error, or preferably have some tiebreak like first type in the union. There is likely a more refined way of doing the tiebreaks, but I think it is far from not doable. The extra execution cost is minimal and worth it for the use case I see myself using it for.
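
A rough sketch of this idea, supporting both the error and the first-member tiebreak. Everything here is hypothetical: `coerceUnion`, the `onAmbiguous` option, and the simplified `Int` and `ID` parsers are illustrations, not existing GraphQL behavior.

```javascript
// Simplified parsers standing in for scalar input coercion.
const parseIntScalar = (raw) => {
  if (!Number.isInteger(raw)) throw new TypeError('Int expected');
  return raw;
};
const parseIdScalar = (raw) => {
  if (typeof raw === 'string') return raw;
  if (Number.isInteger(raw)) return String(raw);
  throw new TypeError('ID expected');
};

// Tries every member in declaration order and collects the ones that parse.
function coerceUnion(raw, members, { onAmbiguous = 'first' } = {}) {
  const matches = [];
  for (const { name, parse } of members) {
    try {
      matches.push({ type: name, value: parse(raw) });
    } catch {
      // This member rejected the value; try the next one.
    }
  }
  if (matches.length === 0) {
    throw new TypeError('Value does not match any member of the union');
  }
  if (matches.length > 1 && onAmbiguous === 'error') {
    throw new TypeError(`Ambiguous value, matches: ${matches.map((m) => m.type).join(', ')}`);
  }
  // Tiebreak: the first member listed in the union wins.
  return matches[0];
}

const members = [
  { name: 'Int', parse: parseIntScalar },
  { name: 'ID', parse: parseIdScalar },
];
const numeric = coerceUnion(5, members);
const textual = coerceUnion('abc', members);
```

With the first-member tiebreak, the order in which the union lists its members becomes part of the schema's observable behavior.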

@rivantsov
Contributor

Being on the server side, I agree with @benjie: this brings more trouble (recognizing what the heck this token is) than benefit. A simple oneOf wrapper type would provide equivalent functionality without ambiguities.

@Tiedye

Tiedye commented Feb 23, 2022

oneOf just doesn't solve the problem sadly :(

@aitorllj93

Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:

union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}

With a query that looks like:

fragment intOrString on IntOrString {
  __typename
  ... on IntBox { intValue: value }
  ... on StringBox { stringValue: value }
}

This sounds like ValueObjects to me so I think it's the best approach right now:

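union ScalarValue = IntValue | StringValue

type IntValue {
  value: Int
}

type StringValue {
  value: String
}

fragment scalarValue on ScalarValue {
  __typename

  # Aliases are needed here: selecting `value` unaliased in both branches fails
  # validation because the two fields return different types (Int vs. String).
  ... on IntValue {
    intValue: value
  }
  ... on StringValue {
    stringValue: value
  }
}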

@ThisIsRuddy

I too would enjoy the benefits of using enums in unions; it would allow me to modularise my directive input options, which I currently have to use one big enum for 😅

@psirenny

psirenny commented Dec 1, 2022

The lack of a union scalar type exacerbates the null/undefined problem, and API authors/consumers largely avoid boxing because of how unergonomic it is. A None enum combined with scalars would make it possible to disambiguate between the many meanings of null.

@CRC-Mismatch

CRC-Mismatch commented May 17, 2023

Please correct me if I'm wrong at any rate, but I believe a use-case example would be the GraphQL spec's own "error type" specification in itself... According to the examples in https://spec.graphql.org/draft/#sec-Errors, the error's path element would be a list of both Strings and Ints. As it currently stands, a GraphQL API would be unable to reproduce its own GraphQL errors in a valid schema as-is (for example if I want to reproduce such errors in a log query of sorts).
