Whole API versioning #175
Comments
It sounds like you wouldn't want that to allow images to be returned unfiltered, since then you would end up with the same backwards compatibility concerns when you add another type. So your ideal behaviour might actually be to require that If you really want to version your API, then you can do that the same way it is done in REST, you could version your URL. However, adding and deprecating fields seems like a more maintainable way of evolving the API. |
@dylanahsmith Focusing on the specific example of the image filtering is missing the forest for the trees. You shouldn't have to puzzle out ways to make sure edited schemas support all kinds of different clients with a single implementation. Also, creating a new endpoint URL every time you make breaking changes to your GraphQL schema doesn't sound sustainable in the long term, like, at all.
You wouldn't be creating another code path, you would just be getting the parameter from the URL without having to specify
I don't see anything stopping you from implementing support for versioning on a server-side graphql implementation. There is nothing in the graphql spec that would prevent you from implementing your graphql-js example. |
Thanks for this clear and thoughtful writeup. I still believe that versioning is too complicated to be added within the GraphQL spec itself; however, having tools to support other forms of versioning is reasonable.
The few times this has come up before - "what do you do when you just have to add a breaking change" - I recommend this approach. When constructing the GraphQL schema, allow for a "version" to be provided which then produces the appropriate schema. This enables all existing tools to operate correctly: validation, execution, GraphiQL, any code-generation you may have - they all see a given "snapshot" of the schema in its current state; presumably your tools default to the "latest".
And this is often how such a thing is exposed. From a sustainability point of view, this is up to you and how often you introduce breaking changes. As a point of fact for example: Facebook's GraphQL schema (over 4 years old, 1,000s of types at this point, and under active change by 100s of engineers) has never needed a versioned breaking change, and still supports 4-year old shipped versions of iOS and Android apps (which unfortunately are still being used).

Here are a couple of concerns I have for introducing versions in a more sanctioned way:

Schema Validation

Schema Validation is the step when building a schema which ensures that all types make sense. If an object implements an interface, it's checked to ensure that it actually does, for example. Schema validation can be performed fairly simply on a given schema, but that's because every portion of a schema is well formed and complete as soon as the schema is assembled. Introducing versions makes this much more difficult, as these sorts of type checking validations need to happen for every possible combination. This is true regardless of whether you perform that versioning control within the schema, as proposed, or if it happens earlier during the construction of a schema as I suggested as an alternative.

This, by the way, is also an issue for dynamic schemas which are constructed on the fly according to any other input (version or otherwise). Schema validation must occur before the schema can be used, and this becomes either very expensive if validating a schema for every possible version, or potentially problematic if only validating just before execution and an invalid schema exists for some under-tested version. This isn't a deal-breaker by any means, but it does add a lot of maintenance cost to your GraphQL server.

Server code maintenance

This is another tradeoff. Versions can result in conditional logic spread throughout your server which can be a real legibility and maintenance cost.
It becomes hard to understand all the possible states your server can be in when multiple similar but not the same conditions appear:

    fields: {
      one: [
        { maxVersion: '1.0.0', ... },
        { maxVersion: '2.0.0', ... }
      ],
      two: [
        { maxVersion: '1.1.0', ... },
      ],
      three: [
        { maxVersion: '1.0.1', ... },
        { ... },
      ]
    }

The above example has many different combinatoric possibilities due to versioning, and this one is simple.

Numbers vs Migrations

A lot of the problems above are made worse because of the non-linear nature of versioning within many APIs. For example, say there are two parts of your API that need to undergo a breaking change at around the same time. You have your Now, both changing From here there are two paths forward:

1) Migrations

Migrations (or sometimes called Capabilities) are often used similar to API versions, however they don't require a linear scale. Instead they're represented as a list of boolean flags which always default to However, if this sounds potentially complicated to you, then I'm with you. This can be a great control tool for a widely consumed API which has regular breaking changes, but maintaining these migrations can be a pain.

2) Additive

Our server engineers instead decide to avoid a breaking change and instead introduce a new field
Ultimately, this all boils down to tradeoffs. From working with GraphQL over the last 4 years, and with other APIs before that, we decided that it was best if GraphQL itself not attempt to represent versions and instead strongly suggest additive changes only. Not because versioning is wrong per se, but because it's costly, and that cost grows exponentially with the amount of versioning that occurs due to combinatorics (as opposed to the linear cost of additive changes). It's also not that you can't use versioning with GraphQL; you certainly can, via the same methods you would use in a versioned REST API. However, excluding it from the spec itself creates a better "pit of success" that leads people towards field deprecation and additive changes and away from the more intuitive but costly versioning.
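One way to read the combinatorics argument is to count how many distinct schema "snapshots" a server has to be able to build and validate. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the `fields` object mirrors the shape of the `maxVersion` example above with the elided parts left out, and it assumes every definition really differs from its neighbours.

```javascript
// Hypothetical helpers, not part of graphql-js or the GraphQL spec.

// Linear version numbers: every distinct maxVersion boundary splits the
// version line once, so k distinct boundaries give at most k + 1 snapshots.
function countLinearSnapshots(fields) {
  const boundaries = new Set();
  for (const definitions of Object.values(fields)) {
    for (const definition of definitions) {
      if (definition.maxVersion !== undefined) {
        boundaries.add(definition.maxVersion);
      }
    }
  }
  return boundaries.size + 1;
}

// Migration flags: each boolean flag is on or off independently,
// so n flags give 2^n possible combinations.
function countFlagCombinations(flagCount) {
  return 2 ** flagCount;
}

const fields = {
  one: [{ maxVersion: '1.0.0' }, { maxVersion: '2.0.0' }],
  two: [{ maxVersion: '1.1.0' }],
  three: [{ maxVersion: '1.0.1' }, {}],
};

console.log(countLinearSnapshots(fields)); // 5
console.log(countFlagCombinations(4)); // 16
```

Each of those snapshots needs its own schema validation pass, whereas a purely additive schema only ever has one snapshot to validate.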
@leebyron - thanks for the explanation. I hadn't thought through the issue w/ linear versioning and concurrent changes. I see that linear versioning won't scale to huge teams, and migrations (which are still a cool idea) won't scale w/ the number of flags for things like validation. I'm convinced it doesn't make sense to add to the spec as suggested. I'm curious how you handle the naming issue at Facebook w/ such a large add only graph - do you commonly replace fields? Are there any names that have become strange after multiple changes? Do you have any naming conventions that work well? And if you are able to share, what's the approximate ratio of deprecated to active fields? |
Very uncommon actually. Teams put some forethought into the structure of the new data they're representing and typically don't make very many critical mistakes that result in replacing fields. Though in these early stages of quick development they've also not shipped anything yet so typically they can change things quickly after adding them. While technically breaking, they're only breaking themselves. However once they ship something, it's pretty rare that the underlying data model makes extreme changes that result in replacing fields rather than just the addition of new fields.
A little bit. I can't really think of many examples of fields that had to undergo multiple additive changes. One example I can think of is for News Feed units. We initially had a field called
I'll have to think more about this. We have some convention guides written internally by various engineers (most of the great ones @dschafer wrote); perhaps we can work on generalizing and sharing some of the better ones. I think a good thought exercise to use is to consider what something's semantic value should be rather than just how you intend on using it in your UI. I think most mistakes are made when you have tunnel-vision with your single UI in mind and don't consider how it might evolve over time when features are added. It's a good idea to pretend that you have many totally different looking UIs and think what data you might need to support all those theoretical UIs. That might help highlight some things you over-simplify in the short-term that would be the first to break as your product evolves. An example: do you have some display text in your app that's very likely to need to support links in version 2? Either have some way to represent links in your text now which might be something other than a String, or have a naming strategy in mind to introduce such a new change in a least-awkward way when the time comes.
I don't have the answer to this off the top of my head, but I'd be surprised if more than 10% of fields are deprecated after ~4 years of use. |
Thank you for your insights! I'm very new to GraphQL and don't get how it can help me. Can you give me a small example or point me to more explanation of this approach, please?
@leebyron just a friendly reminder |
That was an off-the-cuff idea, so there isn’t more on it. My thought was that you would handle this the same way most REST APIs do, where a version number is part of the “route”. That is, if you’re linearly versioning your GraphQL API, you’re effectively serving a different GraphQL schema per version. I still wouldn’t encourage the approach, though, compared to GraphQL’s more typical additive evolution.
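A minimal sketch of that route-based idea, under stated assumptions: the `/graphql/v1` path shape, the `schemasByVersion` registry, and the placeholder schema objects are all hypothetical, and a real server would store full GraphQL schema objects and hand the selected one to its executor.

```javascript
// Hypothetical registry: one schema per released API version.
// Plain objects stand in for real GraphQL schema instances.
const schemasByVersion = {
  v1: { name: 'schema-v1', imageArgs: [] },
  v2: { name: 'schema-v2', imageArgs: ['types'] },
};
const LATEST = 'v2';

// Extracts the version segment from a path such as "/graphql/v1".
// A bare "/graphql" falls back to the latest version.
function schemaForPath(path) {
  const match = /^\/graphql(?:\/(v\d+))?\/?$/.exec(path);
  if (!match) return null; // not a GraphQL route
  const version = match[1] ?? LATEST;
  return schemasByVersion[version] ?? null; // null for an unknown version
}

console.log(schemaForPath('/graphql/v1').name); // schema-v1
console.log(schemaForPath('/graphql').name); // schema-v2
```

Because each version resolves to one complete schema, validation, execution, and tooling such as GraphiQL only ever see a single snapshot per request.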
What an incredibly good thread. Everyone who contributed above has clearly tried to think this through; I appreciate the time all of you took to outline the complex considerations involved as it has really helped me to wrap my head around the problem and the available solutions. I can see why GraphQL promotes an ever-evolving API rather than a versioned API but can also see the argument for versioning.

After thinking it over, I'm going to version my API using the URL (e.g. https://api.my.com/graphql/v1), because I think I can trust myself and my team to hold both sets of ideals at the same time. I think "the GraphQL way" is going to help us to avoid incrementing the version nearly as often as we increment our existing REST API versions; no doubt GraphQL provides us with tools like addition and in-line deprecation, and I think we can expect major version increases to be rare as long as we actively try to avoid them using the techniques and methodologies listed above.

That being said, in my semi-long career as a programmer, I've yet to find myself in a continuum that allowed all changes to remain non-breaking. If current patterns hold, no matter how diligent we try to be, we'll always need to introduce a breaking change, at some point, inevitably. I think the anecdotal success stories, like the ones provided by @leebyron, while probably representative of the vast majority of GraphQL implementations, will meet an increasing number of counter-examples as GraphQL continues to gain traction and APIs built on it get larger and older. Although GraphQL probably would allow us to carry on indefinitely using additive approaches, the schema will eventually, inevitably, suffer (abandoned data paths, redundant field and type names, etc) and I think that, given enough time, the technical debt created by that schema convolution will begin to tax the client experience... and when it does, we're likely to regret not having any way to delineate the API version. So why not come out of the gate with a little future-proofing, even if we never need to use it?

Anyway, for us, on the backend, we'll use Git branching to delineate the API versions, rather than expect the GraphQL spec or even the HTTP pre/post processors to weigh in (header, etc) and we'll deploy the API versions as whole products, rather than trying to define code that serves more than one version. What I'm describing here is the strategy that most software products take and I can't see how GraphQL introduces anything that makes that strategy less advantageous. Sure, we have additional options for slowing down the version creep, but the fundamental advantages of software versioning remain constant.

Anyway, the only reason I bothered to post is that I think the GraphQL community should introduce some nuance into the versioning discussion rather than taking the, seemingly, hard-line approach against versioning in the various documents and tutorials. e.g.:

i.e. As I'm implementing my versioning strategy I'm going to feel like I'm going against the grain, when I really don't think that should be the case. I think versioning should be considered as being within the realm of "best practices", provided that it is maximally de-emphasized.

TLDR; I think the GraphQL community, at-large, should be more open to versioning and should try to assure newcomers that versioning does not break the community standards or best practices as long as priority is given to the non-breaking approaches that GraphQL promotes.
@vmadman ack. In fact, GitHub has a lot of breaking changes. And a lot more with previous versions.
We have been using GraphQL as a backend to both web and iOS apps at Chloe and Isabel for around five months, in production for around a month and a half. It is a joy to work with, and introducing it into our large existing app has been nearly frictionless. I'm hoping to see it used all over the place in the future - for everything from quick prototypes to public APIs. The one place we have run into problems so far is needing to make changes to existing fields.
I understand that the lack of versioning in GraphQL now is deliberate, and that there are good arguments against including it (e.g. @leebyron's comment here). That said, I think it is worth reconsidering - in real use, you are likely to make mistakes when designing your API, have requirements change over time, or need to make changes for other reasons.
Without versioning, when the behavior of an existing field needs to change, it is necessary to deprecate the old field and add a new field w/ a new name. Usually the existing name will be the ideal name for the field, and the new name will by definition be a second choice. As you continue to make changes to an API, these compromises add up, and the quality of the API (measured by names that make sense, are consistent w/ other areas of your code, are easy to guess and remember, etc.) will degrade over time.
I'm going to give a concrete example of a possible breaking change to a field, go through the ways I can think of to deal w/ it without versioning, with versioning but without any changes to the GraphQL spec, and then suggest an ideal solution that would require spec changes.
Example requiring field versioning or replacement:
Imagine we have the following schema, with both a web app and native mobile app displaying a feed of images in pages:
with clients making queries like:
On the web we add the ability to upload .gif images to the feed, but we want to only display still images in the mobile app because of bandwidth concerns. Say also we know in the future we will want to support stereo images, so we want to be able to add image types and query different combinations of them as needed. Our ideal schema could now look something like this, with images taking an additional argument specifying which types we want to receive, and by default returning unfiltered images.

The web client would not require changes, and we would update our mobile app to query for only still images like:

The problem with this is that we have installed versions of the mobile app still using the original query without the types argument. If we modify the schema as described above, the existing clients would get gif images in their results. So, we need to make sure that any changes we make preserve the existing behavior of the images field.

Possible approaches w/o spec changes or versioning:
We could make the default value of the types argument [STILL], and modify the query in our web app to ask for types: [GIF, STILL]. This solves our immediate problem, but having the default behavior of the field include filtering is not predictable. It's easy to imagine this change leading to bugs as new users of the API expect the images field to be unfiltered.

We could deprecate the images field, and add a new field either w/ a version number like images2, semantic name like imagesByType, or synonym like pictures. This feels unsustainable; if we continue to make changes, we could end up with very poorly named fields like images7, imagesByTypeAndRatingAndUploader, or doodles.

With this second approach, we could track use of the deprecated field and eventually clean up the API. Still, at least in the short term, either one of these changes seems to me to damage API quality, and if the API is public, we might need to support the deprecated fields and keep the changed fields in perpetuity.
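To make the first trade-off concrete, here is a small sketch of how the default of the types argument decides what an already-installed client receives. Everything in it is an assumption for illustration: the in-memory `feed`, the `makeImagesResolver` helper, and the simplified resolver signature are hypothetical and not taken from the original schema.

```javascript
// Hypothetical in-memory feed; field and enum names are illustrative only.
const feed = [
  { id: 1, type: 'STILL' },
  { id: 2, type: 'GIF' },
  { id: 3, type: 'STILL' },
];

// Builds an "images" resolver whose `types` argument falls back to
// `defaultTypes` when the client omits it. A null default means unfiltered.
function makeImagesResolver(defaultTypes) {
  return function resolveImages(args = {}) {
    const types = args.types ?? defaultTypes;
    if (types === null) return feed;
    return feed.filter((image) => types.includes(image.type));
  };
}

const unfilteredByDefault = makeImagesResolver(null);
const stillByDefault = makeImagesResolver(['STILL']);

// An installed mobile client sends the original query with no `types`.
const legacyArgs = {};
console.log(unfilteredByDefault(legacyArgs).map((image) => image.id)); // [ 1, 2, 3 ]
console.log(stillByDefault(legacyArgs).map((image) => image.id)); // [ 1, 3 ]
```

The [STILL] default keeps old clients working, at the price that every new consumer who omits the argument silently gets filtered results.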
Possible approaches implementing versioning without spec changes
It's possible to version types directly. This would have required us to create a versioned API from the beginning, but lets us replace fields without needing to degrade field names by creating a new version of their type. We also are able to maintain documentation for API versions through metadata on the schema.
Another similar approach would be to maintain versioned schemas, send the version in a POST variable or HTTP header along with GraphQL requests, and resolve using the appropriate schema.
The problem with these approaches is that they quickly become impractical with more complex schemas. If we want to replace a field on a deeply nested type, we need to create new versions of every type referencing the changed type, every type referencing one of those types, and so on. I think this makes this approach unreasonable in most cases.
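The proliferation described above is a transitive closure over "references" edges. A rough sketch, assuming a hypothetical `references` map in which each type lists the types its fields point at (the type names are invented for illustration):

```javascript
// Hypothetical schema shape: each type lists the types its fields reference.
const references = {
  Query: ['Feed'],
  Feed: ['Page'],
  Page: ['Image'],
  Image: ['User'],
  User: [],
  Settings: ['User'],
};

// Returns every type that needs a new version when `changed` is replaced:
// the changed type itself plus every type that reaches it through references.
function typesNeedingNewVersion(references, changed) {
  const affected = new Set([changed]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const [type, refs] of Object.entries(references)) {
      if (!affected.has(type) && refs.some((ref) => affected.has(ref))) {
        affected.add(type);
        grew = true;
      }
    }
  }
  return [...affected].sort();
}

console.log(typesNeedingNewVersion(references, 'Image'));
// [ 'Feed', 'Image', 'Page', 'Query' ]
```

Changing one field on Image forces new versions of four of the six types in this toy schema, and the deeper the changed type sits, the larger that set becomes.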
We could provide a whole API version argument in the query context, and check it during individual field resolution. I think this is the best solution so far - we keep our ideal field names, required changes don't proliferate outside of the field in question, and we have room to continue to make changes in the future. This will handle changing behavior of the field, but fails to handle changing types.
Another downside to this approach is that we lose the ability to provide documentation for older versions. Without any changes to the GraphQL spec, the documentation will be able to include only the arguments and description of the latest version of the field.
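A sketch of what checking the version during field resolution could look like. It assumes a hypothetical `context.apiVersion` string that the server sets per request, a simple dotted-number comparison rather than full semver, and the common (parent, args, context) resolver shape; none of these names come from the original post.

```javascript
// Compares dotted numeric versions such as "1.0.1"; returns <0, 0, or >0.
// Assumption: plain numeric segments only, no pre-release tags.
function compareVersions(a, b) {
  const partsA = a.split('.').map(Number);
  const partsB = b.split('.').map(Number);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical resolver for the images field.
function resolveImages(parent, args, context) {
  const isLegacy =
    context.apiVersion !== undefined &&
    compareVersions(context.apiVersion, '1.0.0') <= 0;
  // Legacy clients keep the old behavior: still images only.
  const types = isLegacy ? ['STILL'] : args.types;
  if (!types) return parent.images; // current behavior: unfiltered by default
  return parent.images.filter((image) => types.includes(image.type));
}

const parent = {
  images: [
    { id: 1, type: 'STILL' },
    { id: 2, type: 'GIF' },
  ],
};

console.log(resolveImages(parent, {}, { apiVersion: '1.0.0' }).length); // 1
console.log(resolveImages(parent, {}, { apiVersion: '1.0.1' }).length); // 2
```

The branch lives entirely inside one resolver, which is the appeal of the approach, but the schema still advertises only the latest arguments and description, so older versions stay undocumented.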
Ideal solution with spec changes (defining max version on fields):
The only extra information needed to make whole API versioning possible is a maximum version on fields, and a desired version when resolving queries. Both of these could be optional, so they wouldn't require any breaking changes to the GraphQL spec or reference implementation, and wouldn't add complexity to definition of non-versioned APIs. Versioning could be added to initially non-versioned APIs without requiring any planning for it ahead of time. Metadata would be preserved for every version of the API, allowing tools like GraphiQL to easily switch between versions, and providing deprecation reasons for replaced fields could create a useful log of changes to fields over time.
As an example, the Images schema above could potentially be defined by passing an array of objects as field definition in graphql-js like:
Then, when executing queries, we could provide a version argument - if we were to provide 1.0.0 or an earlier version, GraphQL would use the first definition:

If we did not provide a version, or provided '1.0.1' or greater, GraphQL would use the second:
or:
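The selection rule described in this section can be sketched independently of graphql-js. This is an illustration of the proposal only, not an existing graphql-js feature: `pickDefinition`, the `args` lists, and the dotted-number comparison are assumptions, and definitions are assumed to be ordered oldest first with the current one carrying no maxVersion.

```javascript
// Compares dotted numeric versions such as "1.0.1"; returns <0, 0, or >0.
function compareVersions(a, b) {
  const partsA = a.split('.').map(Number);
  const partsB = b.split('.').map(Number);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical: picks the field definition that applies to `requested`.
function pickDefinition(definitions, requested) {
  if (requested === undefined) {
    // No version supplied: use the current (unbounded) definition.
    return definitions.find((d) => d.maxVersion === undefined) ?? null;
  }
  for (const definition of definitions) {
    if (
      definition.maxVersion === undefined ||
      compareVersions(requested, definition.maxVersion) <= 0
    ) {
      return definition;
    }
  }
  return null; // the field does not exist at this version
}

const imagesDefinitions = [
  { maxVersion: '1.0.0', args: [] },
  { args: ['types'] },
];

console.log(pickDefinition(imagesDefinitions, '1.0.0').args); // []
console.log(pickDefinition(imagesDefinitions, '1.0.1').args); // [ 'types' ]
console.log(pickDefinition(imagesDefinitions, undefined).args); // [ 'types' ]
```

Running this selection once per field at schema construction time would yield one ordinary schema per requested version, which is close to the "snapshot" approach suggested earlier in the thread.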
TLDR / Summary:
Whole API versioning could be achieved easily w/ two spec additions:
These additions: