Proposal: Support union scalar types #215

Open
stubailo opened this Issue Sep 30, 2016 · 35 comments

@stubailo
Contributor

stubailo commented Sep 30, 2016

It makes sense often to have a field that could return, for example, an Int or a String. It would be convenient to be able to use a resolveType-like mechanism to determine how the value is serialized. This would be even more useful for custom scalars.

The specification currently says: "GraphQL Unions represent an object that could be one of a list of GraphQL Object types", which basically means that unions are the same as interfaces, but without any shared fields. Supporting unions of scalars would further differentiate unions from interfaces.

@rmosolgo

rmosolgo commented Sep 30, 2016

Could you share a couple of concrete use cases? I ask because, personally, I'm usually reading values right out of MySQL (or running those values through transformations with predictably-typed output), so I haven't needed this yet!

@migueloller

migueloller commented Oct 3, 2016

@rmosolgo,

Here is a concrete use case that we've come across on my team. I have a couple more if you need them.

We've got an algorithm that produces results based on location. We have an enum AreaOfTown that contains all possible areas of town to filter the results. Here's the catch: the algorithm can also filter locations by "near me". Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, we have to create a completely new enum that contains all of the values in AreaOfTown and "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more (this will 100% happen in the future), then we have to make sure we also update our other enum.

If we could write scalar unions then it would solve this problem for us, among others (we would like to have constructs like Int | [Int!] or Int | String like @stubailo mentioned).

@stubailo

Contributor

stubailo commented Oct 3, 2016

In our case, it's because we were returning a structure that represented a JSON tree, where keys can be integers (in the case of arrays) or strings (in the case of objects). We ended up compromising for now by just using strings, but I can imagine how this would be extra useful in the case where you want to return one of a set of custom scalars, which might have different serialization logic.

@migueloller

migueloller commented Oct 5, 2016

Here are other relevant discussions:

@leebyron made some good points in his comments here.

@marcintustin

marcintustin commented Feb 9, 2017

Allowing union types to contain primitives allows us to model any kind of source that can be polymorphic in its source data. For example, Avro data or Elasticsearch.

This is a severe limitation on exposing existing data, rather than creating new systems that are defined by graphql from the ground up.

@leebyron

Collaborator

leebyron commented Feb 9, 2017

What's still unclear to me is what a valid query would look like which encounters such a union, and how should well-typed clients determine what to expect from such a query?

Consider:

type MyType {
  field: String
}

union MyUnion = String | MyType

type Query {
  example: MyUnion
}
@marcintustin

marcintustin commented Feb 9, 2017

@leebyron

Collaborator

leebyron commented Feb 9, 2017

For the above schema consider the query:

{ example }

It's clear what this query should return if the example field returns a String at runtime, but what should it return if the value was a MyType?

Similarly:

{ 
  example {
    ... on MyType {
      field
    }
  }
}

It's clear what this would return for MyType, but what should it return for a String?

@marcintustin

marcintustin commented Feb 9, 2017

@leebyron

Collaborator

leebyron commented Feb 9, 2017

It's not clear to me what that means. What would a valid query look like that would clearly give a value for a primitive type or an object type?

@leebyron

Collaborator

leebyron commented Feb 9, 2017

Specifically http://facebook.github.io/graphql/#sec-Leaf-Field-Selections is what I'm referring to as "valid query"

@stubailo

Contributor

stubailo commented Feb 9, 2017

I think the original issue is about having unions where every member is a scalar, right? Not unions between scalars and objects?

@OlegIlyenko

OlegIlyenko commented Feb 9, 2017

Just an idea. Maybe it would make sense to introduce special inline fragment semantics for scalars:

{ 
  example {
    ... on String {
      value
    }
    ... on MyType {
      field
    }
  }
}

This can be generalized by allowing scalars to be used as leaves (current semantics) and as objects with a single field (value). A huge advantage of this approach is that one can use __typename introspection on scalars.

In our API we have a similar issue. For some parts of the schema, we have even introduced object types like StringAttribute, IntAttribute, etc. in order to unify object attributes ("attributes" are a part of user-defined data structures which our service supports) and scalar attributes and to provide the same introspection capabilities for both.

@migueloller

migueloller commented Feb 9, 2017

What if, when the result is a scalar, the selection set were simply ignored, and when it is an object type, the selection set were applied? Statically, the query can be analyzed because the schema should have enough information to know whether the union contains a scalar, an object, or both.

If there is no selection set and the value is an object then the returned value can just be {}. If there is a selection set and the value is a scalar then the returned value can just be the scalar value (ignoring the selection set, just like it is done for object type unions).
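
One way to read the rule proposed above, as a minimal sketch in plain JavaScript. The function name `completeUnionValue` and the flat list of selected field names are assumptions made for illustration; nothing here is part of graphql-js or the specification.

```javascript
// Hypothetical sketch of the proposed completion rule for a union that
// may hold either a scalar or an object at runtime.
function completeUnionValue(value, selectedFields) {
  const isObject = value !== null && typeof value === "object";
  if (!isObject) {
    // Scalar at runtime: the selection set, if any, is ignored.
    return value;
  }
  // Object at runtime: keep only the selected fields.
  // With no selection set this yields {}.
  const result = {};
  for (const name of selectedFields || []) {
    result[name] = value[name];
  }
  return result;
}
```

With this reading, `completeUnionValue("abc", ["name"])` gives back the string unchanged, while an object queried with no selection set comes back as an empty object.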

On the client, most type systems allow unions of arbitrary types, so the client can expect a result that is string | { name: string } in Flow, for example.

Nevertheless, this starts getting into the design choices around the simplicity of GraphQL, on which I think @leebyron made great points here: graphql/graphql-js#207 (comment)

@calebmer

calebmer commented Feb 9, 2017

Mixing scalars and object types would require a change to the schema which is too complex to consider. Schemaless GraphQL clients depend on the guarantee that { field } is scalar and { field { a b c } } is composite. Given that a scalar field could be an object, this is the only reliable way to tell a scalar apart from a composite value without looking at the schema. By breaking that guarantee, you introduce a lot of unnecessary complexity into the GraphQL client when you could easily just add a composite StringBox type or similar:

type StringBox {
  value: String
}
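
To illustrate the guarantee described above, here is a rough sketch of how a schemaless client might walk a response using only the shape of the query. The selection format (`null` for a leaf, a nested object for a sub-selection) and the name `collectLeaves` are assumptions for this example, and lists are left out for brevity.

```javascript
// Hypothetical, simplified selection shape:
//   { a: null }        -> `a` was queried without a sub-selection (leaf)
//   { b: { c: null } } -> `b` was queried with a sub-selection (composite)
function collectLeaves(selection, data, path = []) {
  const leaves = [];
  for (const [field, sub] of Object.entries(selection)) {
    const value = data[field];
    if (sub === null) {
      // No sub-selection: treat the value as an opaque scalar,
      // even if it happens to be a JSON object.
      leaves.push({ path: [...path, field].join("."), value });
    } else if (value !== null && value !== undefined) {
      // Sub-selection present: the value must be composite, so recurse.
      leaves.push(...collectLeaves(sub, value, [...path, field]));
    }
  }
  return leaves;
}
```

The client never inspects the value to decide whether to recurse; the query alone tells it. A field that could be either scalar or composite for the same selection would take that shortcut away.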
@migueloller

migueloller commented Feb 9, 2017

@calebmer, agreed! It's a balance between feature-set and simplicity and I think the GraphQL designers wanted to err on the side of simplicity. Please correct me if I'm wrong @leebyron.

Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).

@rmosolgo

rmosolgo commented Feb 9, 2017

It seems like the Box approach could solve most of these issues without any changes to the spec, right?

# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String 
  enumValue: AreaOfTown
}

# or 
type IntOrStringBox {
  stringValue: String
  intValue: Int 
}

It pushes the complexity to the schema but keeps GraphQL simple and predictable. Are there downsides to that approach other than verbosity?

@migueloller

migueloller commented Feb 9, 2017

@rmosolgo, you are right that it is possible but it comes at the cost of added complexity in the schema, verbosity in the queries, and having to check each value on the client side at runtime to determine which field was returned.
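
The runtime check mentioned above could look roughly like the following sketch. It assumes a response shaped like the IntOrStringBox type from the previous comment; the helper name `unwrapIntOrStringBox` is made up for illustration.

```javascript
// Unwraps a hypothetical IntOrStringBox result such as
// { stringValue: null, intValue: 3 }.
function unwrapIntOrStringBox(box) {
  const hasString = box.stringValue !== null && box.stringValue !== undefined;
  const hasInt = box.intValue !== null && box.intValue !== undefined;
  // The schema cannot express "exactly one of", so the client checks it.
  if (hasString === hasInt) {
    throw new Error("Expected exactly one of stringValue or intValue");
  }
  return hasString ? box.stringValue : box.intValue;
}
```

Every consumer of such a box needs a check like this, which is the extra client-side work being described.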

@calebmer

calebmer commented Feb 9, 2017

Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:

union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}

With a query that looks like:

fragment intOrString on IntOrString {
  __typename
  ... on IntBox { intValue: value }
  ... on StringBox { stringValue: value }
}
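
On the client, a result of the fragment above could be read along these lines. This is only a sketch: the function name `readIntOrString` is an assumption, and the field names follow the aliases used in the fragment.

```javascript
// Reads one result of the intOrString fragment, dispatching on __typename.
function readIntOrString(result) {
  switch (result.__typename) {
    case "IntBox":
      return result.intValue;
    case "StringBox":
      return result.stringValue;
    default:
      // A member added to the union later is surfaced explicitly
      // instead of being silently misread.
      throw new Error(`Unexpected type: ${result.__typename}`);
  }
}
```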
@IvanGoncharov

Collaborator

IvanGoncharov commented Feb 9, 2017

Because we can't simply make a union that is AreaOfTown | "NEAR_ME" for the input type, we have to create a completely new enum that contains all of the values in AreaOfTown and "NEAR_ME". This is undesirable because if we ever change the AreaOfTown enum by adding more (this will 100% happen in the future), then we have to make sure we also update our other enum.

@migueloller This problem can be solved directly in your code without any change to GraphQL.
When you define your types in source code, just create a small utility function like extendEnumValues in the example below:

new GraphQLEnumType({
  name: 'NewName',
  values: extendEnumValues(AreaOfTown.getValues(), ['NEAR_ME']),
});

As for IDL, I think it would be great to have support for it there, for example allowing extend to be used on enums.
But this is a topic for a separate issue/PR.
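
The comment does not show extendEnumValues itself, so the following is one possible sketch and not the author's implementation. It assumes the first argument has the shape graphql-js returns from `getValues()` (an array of objects with at least `name` and `value`) and produces the name-to-config map that the `values` option of an enum type expects.

```javascript
// Hypothetical helper: merges existing enum values with extra names.
function extendEnumValues(existingValues, extraNames) {
  const config = {};
  for (const { name, value, description } of existingValues) {
    config[name] = { value, description };
  }
  for (const name of extraNames) {
    if (Object.prototype.hasOwnProperty.call(config, name)) {
      throw new Error(`Duplicate enum value: ${name}`);
    }
    // Assumption: an added value is represented internally by its own name.
    config[name] = { value: name };
  }
  return config;
}
```

Because the extended enum is derived from AreaOfTown at schema construction time, adding a value to AreaOfTown later carries over without a second manual edit.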

Allowing union types to contain primitives allows us to model any kind of source that can be polymorphic in its source data. For example, Avro data or Elasticsearch.
This is a severe limitation on exposing existing data, rather than creating new systems that are defined by graphql from the ground up.

@marcintustin There are a number of common JSON patterns that you can't express in GraphQL: key-value pairs (aka Maps), Tuples, etc.
I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.

Given this, I think that the original discussion of scalar-scalar (or more specifically, leaf-nodes) unions is what we should perhaps consider (as opposed to object-scalar unions).

@migueloller There are many cases when an API client can't detect the type based on the value returned.
For example, a union of enums that share a value. It is pretty common for enums to have common values like None, Undefined, etc.
Another example is a union of custom scalars with the same base type, like AbsoluteUrl|RelativeUrl.

Also, an API client should explicitly specify the types it can handle using the ... on TypeName { construct. Without it, changing Number|Boolean to Number|Boolean|String will break clients that don't expect the value to be a string.

That means you can't serialize a union of scalar types as a scalar value and are forced to support some equivalents of the __typename and ... on ScalarType { constructs. So the end result will look very similar to the solution proposed by @calebmer.
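
The enum case can be made concrete with a small sketch. The two enums and their values below are invented for illustration; the point is only that a serialized value shared by both members does not identify its type.

```javascript
// Hypothetical enum definitions; names and values are made up.
const enums = {
  SortOrder: ["ASC", "DESC", "NONE"],
  Highlight: ["BOLD", "ITALIC", "NONE"],
};

// Returns every enum that could have produced the serialized value.
function candidateTypes(value, enumDefs) {
  return Object.keys(enumDefs).filter((name) => enumDefs[name].includes(value));
}
```

Here `candidateTypes("ASC", enums)` narrows to a single enum, but `candidateTypes("NONE", enums)` matches both, so the client would need something like __typename to tell them apart.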

Personally, I think unions of scalar types are frequently abused.
A common example of this is returning an array of strings or a single string only because an array with one element doesn't look as nice :)
It forces all API clients to add a few more lines of code for every instance of this anti-pattern.

For the cases when you return arbitrary data, you can always fall back to providing the value as a JSON string. This pattern is already used by GraphQL introspection to return defaultValue
http://facebook.github.io/graphql/#sec-The-__InputValue-Type
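
A minimal sketch of the JSON-string fallback, assuming the field is declared as a plain String in the schema. The function names are hypothetical.

```javascript
// Server side: encode an arbitrary value into a String field.
function serializeArbitraryValue(value) {
  return JSON.stringify(value);
}

// Client side: decode the String field back into a value.
function readArbitraryValue(fieldValue) {
  return JSON.parse(fieldValue);
}
```

The schema then only sees a String, so the structure of the payload is not validated or documented by GraphQL itself.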

Also, in SQL, you can have only one type per column, and that doesn't prevent it from being the dominant language for databases.

@ianks

ianks commented Mar 9, 2017

A common example of this is returning an array of strings or a single string only because an array with one element doesn't look as nice

There are legitimate use cases that are not simply cosmetic, however.

Here is an example from the Facebook Marketing API. When attempting to model filtering in the API, you have an object that represents a filter expression, like so:

filtering: [{ field: "clicks", operator: "LESS_THAN", value: 42 }]

However, value is by nature polymorphic, as it has to support expressions like this, as well:

filtering: [{ field: "campaign_id", operator: "IN", value: ["1", "2", ...] }]

Currently, this is not possible to express safely AFAIK. Scalar-union types would make this implementation trivial. Falling back to a JSON string for this seems like a poor solution. Some downsides:

  1. no way to validate against the schema
  2. interactive tools like graphiql become less useful
  3. introducing another data serialization format into your resolvers, which can be a point of failure
  4. lack of type safety

I believe it is a bad idea to make GraphQL a superset of all popular formats/protocols.

This is a strawman. The argument does not necessitate GQL to be a superset of all protocols. Adding scalar union inputs certainly would not make GQL a superset of all protocols.


It seems like the Box approach could solve most of these issues without any changes to the spec, right?

# These two fields are mutually exclusive:
type AreaOfTownBox {
  stringValue: String 
  enumValue: AreaOfTown
}

# or 
type IntOrStringBox {
  stringValue: String
  intValue: Int 
}

@rmosolgo The problem I see with this is you assume that the two fields are mutually exclusive; however, there is no way to statically define that in the current type system.

@IvanGoncharov

Collaborator

IvanGoncharov commented Mar 9, 2017

A common example of this is RETURNING an array of strings or a single string only because an array with one element doesn't look as nice

@ianks The entire issue is about returning data; it was clearly stated in the initial comment:

It makes sense often to have a field that could RETURN, for example, an Int or a String

It has nothing to do with input objects since they are not only missing unions of scalars but don't support unions at all. If you feel that supporting unions of types (including scalars) is essential for GraphQL please open a separate issue.

Here is an example from the Facebook Marketing API

In this example, you know all possible fields in advance; moreover, you know which filtering is possible on which field. So you can represent this in terms of the current GraphQL spec like below:

filtering: {
  clicks: {
    LESS_THAN: 42 
  },
  campaign_id: {
    IN: ["1", "2"]
  }
}

Advantages of this are much better autocompletion and validation on the client side than when using unions of scalars.
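
For callers that already hold filters in the list form from the earlier comment, the conversion to the nested form could be sketched as follows. The function name `toNestedFiltering` is an assumption for this example.

```javascript
// Converts [{ field, operator, value }, ...] into
// { [field]: { [operator]: value } }.
function toNestedFiltering(filters) {
  const nested = {};
  for (const { field, operator, value } of filters) {
    // Several operators on the same field are merged under one key.
    nested[field] = { ...(nested[field] || {}), [operator]: value };
  }
  return nested;
}
```

In the nested form each operator key can be given its own input type in the schema (for example an Int for LESS_THAN and a list for IN), which is what allows per-field validation.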

@calebmer

calebmer commented Mar 9, 2017

@ianks here are some related issues on union input types: #202 and graphql/graphql-js#207

@RomanGotsiy

Contributor

RomanGotsiy commented Mar 9, 2017

@calebmer

Another approach inspired by @rmosolgo’s recommendation would be to have a type system that looks like:

union IntOrString = IntBox | StringBox

type IntBox {
  value: Int
}

type StringBox {
  value: String
}

There is one more alternative approach. You can just use custom scalars. GraphQL doesn't specify how custom scalars should be serialised, so you can return anything, even free-form JSON. For example, see graphql-type-json and the corresponding article in the Apollo docs.
Custom scalars lack type information, though, but this does not prevent using them. A notable example is the GitHub API, which has a number of custom scalars like DateTime, HTML, and URI; the information about the underlying type of those scalars (string) is present ONLY in the description.
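
As a sketch of the custom scalar route, the function below shows the kind of coercion logic a scalar's `serialize` and `parseValue` hooks (as accepted by graphql-js's GraphQLScalarType) could delegate to. The scalar name IntOrString and the function name are assumptions for illustration.

```javascript
// Coercion for a hypothetical IntOrString custom scalar:
// accepts strings and integers, rejects everything else.
function coerceIntOrString(value) {
  if (typeof value === "string") {
    return value;
  }
  if (Number.isInteger(value)) {
    return value;
  }
  throw new TypeError(`IntOrString cannot represent: ${JSON.stringify(value)}`);
}
```

As noted above, clients only see the scalar's name through introspection; the fact that it may be an integer or a string would have to be stated in its description.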

@ianks ianks referenced this issue Mar 9, 2017

Closed

Union input types #202

@j0nd0n7

j0nd0n7 commented May 13, 2017

I was trying to filter by a single ID or an array of IDs and I ended up here :-(
My case:

union IDFilter = ID | [ID]
friends(id:IDFilter): [Friend]

Now I have to write it like this:
friends(id:ID, ids:[ID]): [Friend]

In my opinion and preference, the first option is better, you always filter using id:xxx

@josepot

josepot commented Sep 19, 2017

@jdonzet In your case, wouldn't it be better to just do:

friends(idsFilter: [ID!]): [Friend]

If idsFilter is null there is no filter, you return all Friends... If it's not null, the Array must have at least one ID...
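
The resolver logic behind this suggestion could be sketched like so. The function name `filterFriends` and the shape of a friend (an object with an `id`) are assumptions for this example.

```javascript
// Applies the idsFilter argument: null or undefined means "no filter".
function filterFriends(friends, idsFilter) {
  if (idsFilter === null || idsFilter === undefined) {
    return friends;
  }
  const wanted = new Set(idsFilter);
  return friends.filter((friend) => wanted.has(friend.id));
}
```

A single ID is then just the one-element list case, so the caller always uses the same argument.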

@acjay

acjay commented Jan 3, 2018

I posed a somewhat related question here: https://stackoverflow.com/q/47933512/807674

Although, my question deals with the awkwardness of modeling union variants that should contain no data other than their mere presence (i.e. singletons). I do think it would be extremely helpful to have one blessed syntactic way of approaching this data modeling question, rather than a hodgepodge of ad hoc approaches.

At the end of the day, these all seem to be special cases of the question of "how far GraphQL should go from a data modeling perspective?" I would argue that obvious ways of representing algebraic data types would be extremely helpful. Mostly, this boils down to union allowing singletons (and at that point, may as well allow all scalars too). There's already the concept of an enum, which is really just a union of singletons. There's just no obvious bridge between that and proper union.

More than just a question of syntactic convenience, it means bumpy transitions for data models as they grow in complexity.

@JarredMack

JarredMack commented Feb 14, 2018

Just commenting to say I've also got a use case similar to what @acjay describes - I'm working on a query which can either resolve a user ID as a string, or optionally go resolve the user data from a remote service.

My first pass was to basically define a custom field, resolve, which returns a resolved user, and just map it in the query, e.g.:

query {
  theResource(id: "foo") {
    user: resolve {
      id
      name
    }
  }
}

query {
  theResource(id: "foo") {
    user
  }
}

But ideally I'd like to implement it without needing to specify the custom type, i.e., dropping the resolve from above and either returning immediately as a string, or if additional fields are requested, resolve them.

I'm aware that I could get around this by just having id as part of the returned User object and conditionally resolve the rest, but I'm working with an existing REST API which returns String | User, and I want to mirror the behaviour.
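
The workaround mentioned in the last paragraph, exposing `id` on a User object instead of a bare string, might be sketched as a small normalization step over the REST payload. The name `normalizeUser` is hypothetical.

```javascript
// Turns the REST API's `String | User` value into a User-shaped object.
function normalizeUser(userOrId) {
  // A bare string is taken to be the user's ID.
  return typeof userOrId === "string" ? { id: userOrId } : userOrId;
}
```

The remaining User fields can then be resolved lazily, only when the query asks for them.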

@falconmick

falconmick commented Mar 7, 2018

My use case:

I am building a GraphQL proxy in front of an API. This API has an array of component data for a given page. Now I have 20 possible components and I would rather not make 20 queries that get all of the fields the API returns just so that I can use resolvers to fetch secondary items like forms. So what I want to do is define a set of components that need to be part of a union so I can specify what to resolve; however, I want a catch-all custom scalar that basically JSON-wraps all fields returned.

If union types supported this I could specify what components need a type (to be added to the union) and the ones which don’t fetch extra data get caught in the JSON scalar type and all the data is serialised and no further resolvers are called.

I’ll edit this when I get to work, super Zapped but needed a brain dump

@GroofyIT

GroofyIT commented Apr 11, 2018

The recommendation of @calebmer seems the best fit to me; as a nice-to-have, it could be partially built in as a standard (meaning no boxes have to be created by developers).

The issue at hand is that scalar types are not object types. However, as in many languages, a developer can choose to handle a scalar type as an object type or not (consider string vs String in C#, for example).
Following that perspective, a scalar type can be considered a "shorthand" for the alternative object type, i.e. "my string" === { __typename:"string", value:"my string" }.

Within a query, the user/developer is the one who chooses how to handle it.
Given:

type MyType {
  field: String
}

union MyUnion = String | MyType

type Query {
  example: MyUnion
  exampleString: String
}

Could allow for:

example {
  ... on String {
    value
  }
  ... on MyType {
    field {
      value
    }
  }
}

example {
  ... on String
  ... on MyType {
    field
  }
}

exampleString
exampleString {
  value
}

As for my use case:

In a task management system, a task can be supplied with contextual data. What this contextual data consists of is defined by a task definition, which is created by a user. So it can be anything from a scalar to an object.
Say for example a task's definition states it can have contextual data:

  • "name" (a string field)
  • "client" (selection from a list of clients)

When a (frontend) application wants to show the task information, it sends over a query.
We could just send over the client id and let the frontend look it up in another query, but that's not really harnessing the power of graphql. It would be cooler (and more performant) to let the application handle it inline:

task(id:"1234") {
  name
  fields {
    fieldName
    fieldValue {
      ... on String { value }
      ... on Client { id name }
    }
  }
}

There are of course alternative ways to implement this (dynamic type system/describers/...), though a mixed union of scalar and object types would simplify a lot.

@mrdulin

mrdulin commented Apr 18, 2018

In my case, the payload field can be either the Int or the String scalar type.
When I write it as a union type:

const schema = `
  input QuickReply {
    content_type: String
    title: String
    payload: Int | String
    image_url: String
  }
`

I got an error:

GraphQLError: Syntax Error GraphQL request (45:18) Expected Name, found |

    44:     title: String
    45:     payload: Int | String
                         ^
    46:     image_url: String
@mike-marcacci

mike-marcacci commented May 14, 2018

I've been wrestling with something that's related to this, but isn't this exactly. I'm curious if my issue is what's been driving many of these use cases.

My need is not necessarily to have a codified scalar union, but to ergonomically query a heterogeneous list that currently results in a field merge conflict. For example:

type Foo {
  payload: Number
}

type Bar {
  payload: String
}

union MyUnion = Foo | Bar

type Query {
  example: [MyUnion]
}

The current field merging logic prohibits the following query:

query {
  example {
    payload
  } 
}

because payload can be either String or Number.

To get around this, we must either run example twice with conditional fragments (which is incredibly inefficient if the list is large), or alias the payload property to payloadString and payloadNumber in conditional fragments (which is intensely frustrating or impossible when the field list comes from component fragments somewhere downstream).

In order to support scalar unions, the merge rules would have to be loosened, and the above would become valid even without an explicit scalar union.

So I guess I'm curious if the rule change itself would solve the problems of people here. If so, that might be a good separate first step (and I can create an issue if there's interest).

@speller

speller commented Sep 13, 2018

I support this proposal. In my project, I want to implement logic such that when the manufacturer field is requested in a query without sub-fields, a string value is returned. If the client requests manufacturer(id, name), then an object value with the fields id and name is returned.

@Kadrian

Kadrian commented Sep 14, 2018

Have a similar real-world example here. A Product has a list of attributes. Each attribute always has a name and a value. In case it's a number attribute, let's say: "weight", there's an additional unit possible, e.g. "kg".

Coming from Gatsby here, this structure won't work during Schema generation, because value has conflicting types. I guess the solution is still that I'd have to do something like valueString, valueNumber, valueBool, valueRange, right?

type ProductType = {
  name: string,
  attributes: AttributeType[]
};

type AttributeType =
  | StringAttributeType
  | NumberAttributeType
  | BoolAttributeType;

type StringAttributeType = {
  name: string,
  value: ?string,
};

type BoolAttributeType = {
  name: string,
  value: ?boolean
};

type NumberAttributeType = {
  name: string,
  value: ?(number | RangeType),
  valueVariance?: number,
  unit?: string, // If no unit, then it's probably a count or similar
};

type RangeType = {
  from: number,
  to: number
};
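
The valueString / valueNumber / valueBool / valueRange workaround mentioned above could be sketched as a preprocessing step applied before the data reaches schema generation. The function name `splitValue` is hypothetical, and it assumes a range is the only object-shaped value, as in the types above.

```javascript
// Splits one polymorphic `value` into consistently typed fields.
function splitValue(value) {
  return {
    valueString: typeof value === "string" ? value : null,
    valueNumber: typeof value === "number" ? value : null,
    valueBool: typeof value === "boolean" ? value : null,
    // Assumption: any non-null object is a RangeType ({ from, to }).
    valueRange: value !== null && typeof value === "object" ? value : null,
  };
}
```

Each resulting field has a single type across all attributes, at the cost of clients having to look at four fields instead of one.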
@leebyron

Collaborator

leebyron commented Oct 2, 2018

Marking this as Strawman and Needs Champion, however I'm still very skeptical since not all target server or client languages support scalar unions.

GraphQL already prevents (via validation rules) other scenarios where a field could return more than one kind of type.

If anyone decides to take ownership of this proposal, I think the bar will be set high to prove value and early attention should be placed on how it would affect code generation systems, especially ones based in Java/C/C++/ObjC.
