Whole API versioning #175

Closed
charlieschwabacher opened this Issue May 10, 2016 · 11 comments

@charlieschwabacher

commented May 10, 2016

We have been using GraphQL as a backend to both web and iOS apps at Chloe and Isabel for around five months, in production for around a month and a half. It is a joy to work with, and introducing it into our large existing app has been nearly frictionless. I'm hoping to see it used all over the place in the future - for everything from quick prototypes to public APIs. The one place we have run into problems so far is needing to make changes to existing fields.

I understand that the lack of versioning in GraphQL now is deliberate, and that there are good arguments against including it (e.g. @leebyron's comment here). That said, I think it is worth reconsidering: in real use, you are likely to make mistakes when designing your API, have requirements change over time, or need to make changes for other reasons.

Without versioning, when the behavior of an existing field needs to change, it is necessary to deprecate the old field and add a new field w/ a new name. Usually the existing name will be the ideal name for the field, and the new name will by definition be a second choice. As you continue to make changes to an API, these compromises add up, and the quality of the API (measured by names that make sense, are consistent w/ other areas of your code, are easy to guess and remember, etc.) will degrade over time.

I'm going to give a concrete example of a possible breaking change to a field, go through the ways I can think of to deal w/ it without versioning, with versioning but without any changes to the GraphQL spec, and then suggest an ideal solution that would require spec changes.

Example requiring field versioning or replacement:

Imagine we have the following schema, with both a web app and native mobile app displaying a feed of images in pages:

type Image {
  url: String
  caption: String
}
type Query {
  images(count: Int, offset: Int): [Image]
}

with clients making queries like:

query ImagesQuery {
  images(count: 10, offset: 0) {
    url
    caption
  }
}

On the web we add the ability to upload .gif images to the feed, but we want to only display still images in the mobile app because of bandwidth concerns. Say also we know in the future we will want to support stereo images, so we want to be able to add image types and query different combinations of them as needed. Our ideal schema could now look something like this, with images taking an additional argument specifying which types we want to receive, and by default returning unfiltered images.

enum ImageType { GIF, STILL }
type Image {
  url: String
  caption: String
}
type Query {
  images(types: [ImageType], count: Int, offset: Int): [Image]
}

The web client would not require changes, and we would update our mobile app to query for only still images like:

query ImagesQuery {
  images(types: [STILL], count: 10, offset: 0) {
    url
    caption
  }
}

The problem with this is that we have installed versions of the mobile app still using the original query without the types argument. If we modify the schema as described above, the existing clients would get gif images in their results. So, we need to make sure that any changes we make preserve the existing behavior of the images field.

Possible approaches w/o spec changes or versioning:

  • We could make the default value of the types argument [STILL], and modify the query in our web app to ask for types: [GIF, STILL]. This solves our immediate problem, but having the default behavior of the field include filtering is not predictable. It's easy to imagine this change leading to bugs as new users of the API expect the images field to be unfiltered.

    type Query {
      images(types: [ImageType] = [STILL], count: Int, offset: Int): [Image]
    }
  • We could deprecate the images field, and add a new field either w/ a version number like images2, a semantic name like imagesByType, or a synonym like pictures. This feels unsustainable: if we continue to make changes, we could end up with very poorly named fields like images7, imagesByTypeAndRatingAndUploader, or doodles.

    type Query {
      images(count: Int, offset: Int): [Image] # deprecated
      imagesByType(types: [ImageType], count: Int, offset: Int): [Image]
    }

    With this second approach, we could track use of the deprecated field and eventually clean up the API. Still, at least in the short term, either one of these changes seems to me to damage API quality, and if the API is public, we might need to support the deprecated fields and keep the changed fields in perpetuity.
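
Tracking use of the deprecated field could be as simple as counting calls to its resolver. The following is a minimal sketch in plain JavaScript, not a graphql-js feature: trackDeprecated, usageCounts, queryResolvers, and the stubbed getImages are hypothetical names introduced only for illustration.

```javascript
// Hypothetical in-memory counter; a real server would likely report to its metrics system.
const usageCounts = new Map();

// Wrap a resolver so that every call to a deprecated field is counted.
function trackDeprecated(fieldName, resolve) {
  return (obj, args, ctx) => {
    usageCounts.set(fieldName, (usageCounts.get(fieldName) || 0) + 1);
    return resolve(obj, args, ctx);
  };
}

// Stub standing in for the real data access; it simply echoes its arguments.
const getImages = ({count, offset, types}) => ({count, offset, types});

const queryResolvers = {
  // Old field: keeps its original still-only behavior and is tracked.
  images: trackDeprecated('Query.images', (obj, {count, offset}) =>
    getImages({count, offset, types: ['STILL']})),
  // Replacement field: passes the requested types straight through.
  imagesByType: (obj, args) => getImages(args),
};

queryResolvers.images(null, {count: 10, offset: 0});
queryResolvers.images(null, {count: 10, offset: 10});
console.log(usageCounts.get('Query.images')); // 2
```

Once the count for the old field stays at zero for long enough, the field can be removed with some confidence that no shipped client still depends on it.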

Possible approaches implementing versioning without spec changes

  • It's possible to version types directly. This would have required us to create a versioned API from the beginning, but lets us replace fields without needing to degrade field names by creating a new version of their type. We are also able to maintain documentation for API versions through metadata on the schema.

    enum ImageType { GIF, STILL }
    type Image {
      url: String
      caption: String
    }
    type V1 {
      images(count: Int, offset: Int): [Image]
    }
    type V2 {
      images(types: [ImageType], count: Int, offset: Int): [Image]
    }
    type Query {
      v1: V1
      v2: V2
    }

    Another similar approach would be to maintain versioned schemas, send the version in a post variable or http header along with graphql requests, and resolve using the appropriate schema.

    The problem with these approaches is that they quickly become impractical with more complex schemas. If we want to replace a field on a deeply nested type, we need to create new versions of every type referencing the changed type, every type referencing one of those types, and so on. I think this makes this approach unreasonable in most cases.

  • We could provide a whole api version argument in the query context, and check it during individual field resolution. I think this is the best solution so far - we keep our ideal field names, required changes don't proliferate outside of the field in question, and we have room to continue to make changes in the future. This will handle changing behavior of the field, but fails to handle changing types.

    resolve: (obj, args, ctx) => {
      if (ctx.version >= 2) {
        return getImages(args);
      } else {
        const {count, offset} = args;
        return getImages({count, offset, types: ['STILL']});
      }
    }

    Another downside to this approach is that we lose the ability to provide documentation for older versions. Without any changes to the GraphQL spec, the documentation will be able to include only the arguments and description of the latest version of the field.

Ideal solution with spec changes (defining max version on fields):

The only extra information needed to make whole API versioning possible is a maximum version on fields, and a desired version when resolving queries. Both of these could be optional, so they wouldn't require any breaking changes to the GraphQL spec or reference implementation, and wouldn't add complexity to definition of non-versioned APIs. Versioning could be added to initially non versioned APIs without requiring any planning for it ahead of time. Metadata would be preserved for every version of the API, allowing tools like GraphiQL to easily switch between versions, and providing deprecation reasons for replaced fields could create a useful log of changes to fields over time.

As an example, the Images schema above could potentially be defined by passing an array of objects as the field definition in graphql-js like:

var Query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    images: [
      {
        maxVersion: '1.0.0',
        deprecationReason: `
          This version of the images field returns only still images. Future versions
          return both still images and animated gifs, with the ability to filter by
          type using the 'types' argument.
        `,
        type: new GraphQLList(Image),
        args: {
          count: {type: GraphQLInt},
          offset: {type: GraphQLInt}
        },
        resolve: (obj, args) => {
          const {count, offset} = args;
          return getImages({count, offset, types: ['STILL']});
        }
      },
      {
        type: new GraphQLList(Image),
        args: {
          count: {type: GraphQLInt},
          offset: {type: GraphQLInt},
          types: {type: new GraphQLList(ImageType)}
        },
        resolve: (obj, args) => getImages(args)
      }
    ]
  }
});

Then, when executing queries, we could provide a version argument - if we were to provide 1.0.0 or an earlier version, GraphQL would use the first definition:

const version = '1.0.0';
graphql(Schema, query, root, context, variables, operationName, version);

If we did not provide a version, or provided '1.0.1' or greater, graphql would use the second:

graphql(Schema, query, root, context, variables, operationName);

or:

const version = '1.0.1';
graphql(Schema, query, root, context, variables, operationName, version);
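
To make the selection rule concrete, here is a minimal sketch of how an implementation might choose between versioned definitions of one field. It is an assumption about how the proposal could work, not existing graphql-js behavior: compareVersions, selectDefinition, and the name property are hypothetical, and versions are assumed to be plain 'major.minor.patch' strings.

```javascript
// Compare two 'major.minor.patch' strings.
// Returns a negative number, 0, or a positive number; missing parts count as 0.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Pick the definition that applies to the requested version.
// Definitions with a maxVersion are tried from lowest to highest; the one
// without a maxVersion (if any) is the latest and serves as the fallback.
function selectDefinition(definitions, version) {
  const latest = definitions.find(d => d.maxVersion === undefined);
  if (version === undefined) return latest;
  const bounded = definitions
    .filter(d => d.maxVersion !== undefined)
    .sort((x, y) => compareVersions(x.maxVersion, y.maxVersion));
  return bounded.find(d => compareVersions(version, d.maxVersion) <= 0) || latest;
}

// The two definitions of the images field, reduced to a label for brevity.
const imagesDefinitions = [
  {maxVersion: '1.0.0', name: 'still images only'},
  {name: 'filterable by types'},
];

console.log(selectDefinition(imagesDefinitions, '1.0.0').name); // still images only
console.log(selectDefinition(imagesDefinitions, '1.0.1').name); // filterable by types
console.log(selectDefinition(imagesDefinitions).name);          // filterable by types
```

If no definition matches and there is no unversioned one, selectDefinition returns undefined, which a real implementation would presumably treat as "this field does not exist in the requested version".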

TLDR / Summary:

Whole API versioning could be achieved easily w/ two spec additions:

  • an optional maximum version when defining fields
  • an optional version when resolving queries

These additions:

  • would not require any breaking changes to the GraphQL spec or reference implementation
  • would not add complexity to definition of non-versioned APIs
  • would make versioning available, without requiring any planning for it ahead of time
  • would prevent degradation of API quality (measured by names that make sense, are consistent w/ other areas of your code, and are easy to guess and remember) when mistakes are fixed or requirements change
  • would preserve metadata for every version of the API, allowing tools like GraphiQL to easily switch between versions
@dylanahsmith

Contributor

commented May 10, 2016

It's easy to imagine this change leading to bugs as new users of the API expect the images field to be unfiltered.

It sounds like you wouldn't want to allow images to be returned unfiltered, since you would then end up with the same backwards compatibility concerns when you add another type. So your ideal behaviour might actually be to require the types argument, although there isn't a way to indicate in the schema that using the default value for an argument is deprecated.

If you really want to version your API, then you can do that the same way it is done in REST, you could version your URL. However, adding and deprecating fields seems like a more maintainable way of evolving the API.

@sgwilym

commented May 10, 2016

@dylanahsmith Focusing on the specific example of the image filtering is missing the forest for the trees. You shouldn't have to puzzle out ways to make sure edited schemas support all kind of different clients with a single implementation.

Also, creating a new endpoint URL every time you make breaking changes to your GraphQL schema doesn't sound sustainable in the long-term, like, at all.

@dylanahsmith

Contributor

commented May 10, 2016

Also, creating a new endpoint URL every time you make breaking changes to your GraphQL schema sounds like a recipe for madness.

You wouldn't be creating another code path; you would just be getting the parameter from the URL without having to specify const version = '1.0.0'; in the query itself. This is similar to how you aren't creating a separate URL endpoint in your code for /users/1 and /users/2.

You shouldn't have to puzzle out ways to make sure edited schemas support all kind of different clients with a single implementation.

I don't see anything stopping you from implementing support for versioning on a server-side graphql implementation. There is nothing in the graphql spec that would prevent you from implementing your graphql-js example.

@leebyron

Collaborator

commented May 10, 2016

Thanks for this clear and thoughtful writeup.

I still believe that versioning is too complicated to be added within the GraphQL spec itself; however, having tools to support other forms of versioning is reasonable.

Another similar approach would be to maintain versioned schemas, send the version in a post variable or http header along with graphql requests, and resolve using the appropriate schema.

The few times this has come up before - "what do you do when you just have to add a breaking change" - I recommend this approach. When constructing the GraphQL schema, allow for a "version" to be provided which then produces the appropriate schema. This enables all existing tools to operate correctly: validation, execution, GraphiQL, any code-generation you may have - they all see a given "snapshot" of the schema in its current state; presumably your tools default to the "latest".

If you really want to version your API, then you can do that the same way it is done in REST, you could version your URL.

And this is often how such a thing is exposed. my.endpoint/graphql/v2/? - this sort of thing is really common in long-lived versioned APIs.
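
One possible reading of "allow for a version to be provided which then produces the appropriate schema", combined with a versioned URL, is sketched below. Everything here is a hypothetical illustration: buildSchema returns a plain object standing in for a real GraphQLSchema so the sketch has no dependencies, and schemaForVersion, versionFromPath, and LATEST_VERSION are made-up names.

```javascript
// Stand-in for real schema construction. In an actual server this would
// return a GraphQLSchema; a plain object keeps the sketch dependency-free.
function buildSchema(version) {
  const imagesArgs = version >= 2
    ? ['types', 'count', 'offset']   // v2 and later: filterable images
    : ['count', 'offset'];           // v1: original, still-only images
  return {version, query: {images: {args: imagesArgs}}};
}

const LATEST_VERSION = 2;
const schemaCache = new Map();

// Build each version's schema once and reuse it, so validation and tooling
// always see one complete snapshot per version.
function schemaForVersion(version) {
  if (!schemaCache.has(version)) {
    schemaCache.set(version, buildSchema(version));
  }
  return schemaCache.get(version);
}

// Extract the version from a path such as '/graphql/v2'.
// Paths without a version, or with a version newer than the latest, get the latest.
function versionFromPath(path) {
  const match = /^\/graphql\/v(\d+)\/?$/.exec(path);
  if (!match) return LATEST_VERSION;
  return Math.min(Number(match[1]), LATEST_VERSION);
}

const v1Schema = schemaForVersion(versionFromPath('/graphql/v1'));
console.log(v1Schema.query.images.args.join(', ')); // count, offset
const latestSchema = schemaForVersion(versionFromPath('/graphql'));
console.log(latestSchema.query.images.args.join(', ')); // types, count, offset
```

The request handler would then execute the incoming query against the schema returned by schemaForVersion, so each client only ever sees the snapshot it was written against.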

From a sustainability point of view, this is up to you and how often you introduce breaking changes. As a point of fact for example: Facebook's GraphQL schema (over 4 years old, 1,000s of types at this point, and under active change by 100s of engineers) has never needed a versioned breaking change, and still supports 4-year old shipped versions of iOS and Android apps (which unfortunately are still being used).

Here are a couple of concerns I have for introducing versions in a more sanctioned way:

Schema Validation

Schema Validation is the step when building a schema which ensures that all types make sense. If an object implements an interface - it's checked to ensure that it actually does, for example.

Schema validation can be performed fairly simply on a given schema, but that's because every portion of a schema is well formed and complete as soon as the schema is assembled. Introducing versions makes this much more difficult, as these sorts of type-checking validations need to happen for every possible combination. This is true regardless of whether you perform that versioning control within the schema, as proposed, or if it happens earlier during the construction of a schema as I suggested as an alternative.

This, by the way, is also an issue for dynamic schema which are constructed on the fly according to any other input (version or otherwise). Schema validation must occur before the schema can be used, and this becomes either very expensive if validating a schema for every possible version, or potentially problematic if only validating just before execution and an invalid schema exists for some under-tested version.

This isn't a deal-breaker by any means, but it does add a lot of maintenance cost to your GraphQL endpoint. This is a tradeoff you'll have to make.

Server code maintenance

This is another tradeoff. Versions can result in conditional logic spread throughout your server which can be a real legibility and maintenance cost. It becomes hard to understand all the possible states your server can be in when multiple similar but not the same conditions appear:

fields: {
  one: [
    { maxVersion: '1.0.0', ... },
    { maxVersion: '2.0.0', ... }
  ],
  two: [
    { maxVersion: '1.1.0', ... },
  ],
  three: [
    { maxVersion: '1.0.1', ... },
    { ... },
  ]
}

The above example has many different combinatoric possibilities due to versioning, and this one is simple.
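
As a rough illustration of the validation burden, the sketch below counts how many distinct schema snapshots the example above implies if versions are strictly linear. It is only one way to read the example: the elided definitions are reduced to their maxVersion, and fieldVersions, boundaries, and snapshotCount are hypothetical names.

```javascript
// The example above, with each definition reduced to its maxVersion.
// `undefined` marks a definition with no maxVersion (the latest one).
const fieldVersions = {
  one: ['1.0.0', '2.0.0'],
  two: ['1.1.0'],
  three: ['1.0.1', undefined],
};

// Every distinct maxVersion is a boundary at which the schema's shape can change.
const boundaries = new Set();
for (const versions of Object.values(fieldVersions)) {
  for (const v of versions) {
    if (v !== undefined) boundaries.add(v);
  }
}

// With linear versioning there is one snapshot per boundary, plus one for
// every version newer than the highest boundary.
const snapshotCount = boundaries.size + 1;

// Plain string sorting happens to be correct for these single-digit versions.
console.log([...boundaries].sort().join(', ')); // 1.0.0, 1.0.1, 1.1.0, 2.0.0
console.log(snapshotCount); // 5
```

Three fields already yield five schemas that each need to be validated and tested, and every new maxVersion on any field adds another.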

Numbers vs Migrations

A lot of the problems above are made worse because of the non-linear nature of versioning within many APIs. For example, say there are two parts of your API that need to undergo a breaking change at around the same time. You have your images that you want to start filtering, but another part of your team is working on a different project for your feed so that feed now accepts an order argument and the default behavior should be something else to make the most sense. The feed change is ready first on the server, so it bumps the API version to v2. Later the images change is ready, which bumps the API to v3.

Now, changing both images and feed requires work for your mobile applications. The mobile client team is prepared to support gif images, but not the feed change. But they can't upgrade to v3 without first supporting the feed change! Oh no!

From here there are two paths forward:

1) Migrations

Migrations (sometimes called Capabilities) are often used similarly to API versions; however, they don't require a linear scale. Instead they're represented as a list of boolean flags which always default to false when first introduced, and can be turned to true for individual uses to indicate support for a change. Oftentimes these come with a cut-off date after which the migration defaults to true and very old clients may break. Newcomers to an API start with all existing migrations defaulted to true unless they explicitly request otherwise, to ensure new users start on the latest "version". Using this tool, the server could provide newImages and newFeed as two migrations, and our mobile client team could first start passing newImages to indicate they now have support for that, and only later when they get to support for the changes to feed do they send newFeed. No need to add support for things on a client in the same explicit order they were added on the server!

However, if this sounds potentially complicated to you, then I'm with you. This can be a great control tool for a widely consumed API which has regular breaking changes, but maintaining these migrations can be a pain.
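
A minimal sketch of how a server might resolve migration flags for a single request follows. The cut-off dates, the isNewClient switch, and the rule that an explicit flag always wins are assumptions made for illustration; migrations and resolveMigrations are hypothetical names.

```javascript
// Hypothetical registry of migrations with made-up cut-off dates.
// After its cut-off, a migration defaults to true for everyone.
const migrations = {
  newImages: {cutoff: new Date('2017-01-01')},
  newFeed: {cutoff: new Date('2017-06-01')},
};

// Resolve the effective flags for one request.
// `requested` holds the flags a client sent explicitly (true or false).
function resolveMigrations(requested, {now, isNewClient = false}) {
  const flags = {};
  for (const [name, {cutoff}] of Object.entries(migrations)) {
    if (name in requested) {
      flags[name] = Boolean(requested[name]);      // explicit choice wins
    } else {
      flags[name] = isNewClient || now >= cutoff;  // otherwise fall back to the default
    }
  }
  return flags;
}

const early = new Date('2016-06-01');

// An existing client that has adopted newImages but not yet newFeed.
console.log(JSON.stringify(resolveMigrations({newImages: true}, {now: early})));
// {"newImages":true,"newFeed":false}

// A newcomer to the API starts on the latest behavior.
console.log(JSON.stringify(resolveMigrations({}, {now: early, isNewClient: true})));
// {"newImages":true,"newFeed":true}
```

With n independent flags there are up to 2^n combinations of server behavior, which is where much of the maintenance cost comes from.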

2) Additive

Our server engineers instead decide to avoid a breaking change and instead introduce a new field filteredImages and a new field orderedFeed to represent these new features in an additive way (there are other ways, but this one is fine). Now, when our mobile client team wants to support the new way to request images, they can do so explicitly by changing their query to no longer include images and instead include filteredImages. They didn't have to worry at all that the orderedFeed change also occurred, let alone the order in which the change occurred.

@leebyron

Collaborator

commented May 10, 2016

Ultimately, this all boils down to tradeoffs.

From working with GraphQL over the last 4 years, and with other APIs before that, we decided that it was best if GraphQL itself not attempt to represent versions and instead strongly suggest additive changes only. Not because versioning is wrong per se, but because it's costly, and that cost grows exponentially with the amount of versioning that occurs due to combinatorics (as opposed to linear cost for additive changes).

It's also not that you can't use versioning with GraphQL: you certainly can, via the same methods you would use in a versioned REST API. However, excluding it from the spec itself creates a better "pit of success" that leads people towards field deprecation and additive changes and away from the more intuitive but costly versioning.

@charlieschwabacher

Author

commented May 11, 2016

@leebyron - thanks for the explanation. I hadn't thought through the issue w/ linear versioning and concurrent changes. I see that linear versioning won't scale to huge teams, and migrations (which are still a cool idea) won't scale w/ the number of flags for things like validation. I'm convinced it doesn't make sense to add to the spec as suggested.

I'm curious how you handle the naming issue at Facebook w/ such a large add only graph - do you commonly replace fields? Are there any names that have become strange after multiple changes? Do you have any naming conventions that work well? And if you are able to share, what's the approximate ratio of deprecated to active fields?

@leebyron

Collaborator

commented Jul 2, 2016

I'm curious how you handle the naming issue at Facebook w/ such a large add only graph - do you commonly replace fields?

Very uncommon actually. Teams put some forethought into the structure of the new data they're representing and typically don't make very many critical mistakes that result in replacing fields. Though in these early stages of quick development they've also not shipped anything yet so typically they can change things quickly after adding them. While technically breaking, they're only breaking themselves. However once they ship something, it's pretty rare that the underlying data model makes extreme changes that result in replacing fields rather than just the addition of new fields.

Are there any names that have become strange after multiple changes?

A little bit. I can't really think of many examples of fields that had to undergo multiple additive changes. One example I can think of is for News Feed units. We initially had a field called style that told clients what rendering strategy they should use. Later we introduced new styles but older clients didn't know how to render them. Rather than just leaving older clients broken-looking, we decided that style should actually be a list of strategies rather than just one. Clients would just use whichever they knew how to render, like a list of fallbacks. Rather than making a breaking change to make style a list, we just introduced styleList. I suppose we could have also called it styles. A little awkward? Sure. But still beats having to deal with versioning.
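
The fallback behavior described for styleList can be sketched on the client side in a few lines. The style names, knownStyles, and pickStyle are hypothetical and are not taken from Facebook's schema.

```javascript
// Rendering strategies this (hypothetical) client build knows about.
const knownStyles = new Set(['PHOTO', 'TEXT']);

// Pick the first style in the server's fallback list that this client can render.
// Returns null when none of the styles is known.
function pickStyle(styleList) {
  return styleList.find(style => knownStyles.has(style)) || null;
}

// A feed unit as a client might receive it: the server lists the preferred
// style first, followed by fallbacks. The old `style` field is still served.
const unit = {style: 'TEXT', styleList: ['VIDEO_360', 'PHOTO', 'TEXT']};

console.log(pickStyle(unit.styleList)); // PHOTO
```

Clients shipped before styleList existed keep reading style, while newer clients read styleList, so neither group needs a versioned schema.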

Do you have any naming conventions that work well?

I'll have to think more about this. We have some convention guides written internally by various engineers (@dschafer wrote most of the great ones); perhaps we can work on generalizing and sharing some of the better ones. I think a good thought exercise is to consider what something's semantic value should be rather than just how you intend to use it in your UI. I think most mistakes are made when you have tunnel vision with your single UI in mind and don't consider how it might evolve over time when features are added. It's a good idea to pretend that you have many totally different looking UIs and think about what data you might need to support all those theoretical UIs. That might help highlight some things you over-simplify in the short term that would be the first to break as your product evolves.

An example: do you have some display text in your app that's very likely to need to support links in version 2? Either have some way to represent links in your text now which might be something other than a String, or have a naming strategy in mind to introduce such a new change in a least-awkward way when the time comes.
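
One way to leave room for links from the start is to return the display text as a small object instead of a bare String. The shape below (text plus ranges) and the renderHtml helper are hypothetical illustrations, not an established convention.

```javascript
// Hypothetical shape for display text that can carry links from day one.
// `ranges` can stay empty until links are needed, so early clients keep working.
const message = {
  text: 'Read the terms before posting.',
  ranges: [{offset: 9, length: 5, url: 'https://example.com/terms'}],
};

// A client that understands ranges can render links.
// Note: no HTML escaping is done here, to keep the sketch short.
function renderHtml({text, ranges}) {
  const sorted = [...ranges].sort((a, b) => a.offset - b.offset);
  let html = '';
  let cursor = 0;
  for (const {offset, length, url} of sorted) {
    html += text.slice(cursor, offset);
    html += `<a href="${url}">${text.slice(offset, offset + length)}</a>`;
    cursor = offset + length;
  }
  return html + text.slice(cursor);
}

// A client that ignores ranges can still show message.text unchanged.
console.log(renderHtml(message));
// Read the <a href="https://example.com/terms">terms</a> before posting.
```

Because the field is an object from the beginning, adding links later is purely additive and no field has to be renamed or replaced.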

And if you are able to share, what's the approximate ratio of deprecated to active fields?

I don't have the answer to this off the top of my head, but I'd be surprised if more than 10% of fields are deprecated after ~4 years of use.

@leebyron leebyron closed this Jul 2, 2016

@raderio

commented Aug 7, 2017

@Bessonov

commented Mar 16, 2018

@leebyron

When constructing the GraphQL schema, allow for a "version" to be provided which then produces the appropriate schema.

Thank you for your insights! I'm very new to GraphQL and don't get how this can help me. Can you give me a small example or point me to more explanation of this approach, please?

@Bessonov

commented Jul 19, 2018

@leebyron just a friendly reminder

@leebyron

Collaborator

commented Jul 29, 2018

That was an off the cuff idea, so there isn’t more on it. My thought was that you would handle this the same way most REST APIs do, where a version number is part of the “route”. That is, if you’re linearly versioning your GraphQL API, you’re effectively serving a different GraphQL schema per version.

I still wouldn’t encourage the approach, though, compared to GraphQL’s more typical additive evolution.
