RFC: Use GraphQL/Apollo for structured data access in the CMS #6245

Closed
chillu opened this issue Oct 27, 2016 · 12 comments

@chillu
Member

chillu commented Oct 27, 2016

Overview

This RFC recommends the introduction of a GraphQL API in SilverStripe 4 to expose structured data required for the CMS UI, as well as the Apollo JavaScript library as a smart client to consume this data.

We have implemented a Proof of Concept (see below), and are looking for feedback from the community. Do you think these technologies will lead to a more extensible and maintainable CMS UI over the next years? Should we focus on better core web service APIs for SilverStripe 4?

Context

SilverStripe is moving to a more frontend-focused CMS UI through the introduction of React, which requires its CMS controllers to expose more structured data through APIs. While the wider SilverStripe ecosystem has modules for REST APIs and webservices (1, 2, 3, 4), there's nothing built into SilverStripe core.

For the existing React interfaces, we've created temporary (mostly REST) API controllers (AssetAdmin and CampaignAdmin). This approach duplicates logic, and creates a fragile system which is hard to scale. We need a clear API strategy in SilverStripe core.

On the frontend, we've created a Backend.js helper to abstract HTTP request/response details from React components - they'll just need to accept a Promise without worrying about HTTP data fetching. This approach works together with redux-thunk in action creators (e.g. publishCampaign).

Going forward, we'll need to retrieve a multitude of data sets to construct a fully decoupled CMS view. For example, an "edit page" view might require the current user, menu options, page tree, batch actions, page-type permissions, configuration, form schema, form state. We'll need an API which can retrieve a dozen data sets without affecting the user experience by delayed renders through multiple individual 300ms+ XHR requests. This is particularly important for user experience on slower network connections (e.g. a tablet accessing the CMS over a 3G connection).

Why GraphQL

GraphQL is an alternative to REST, and an attempt to build smarter clients which can tell the API exactly which data they need in a single request. This contrasts with REST's principles of a single URI for one type of data structure. By describing your data as types in a "schema", GraphQL provides input validation and introspection. GraphQL enables "co-location" of data requirements with view components, decoupling them from how data is retrieved (no more XHR logic in components). Don't get hung up on the "graph" terminology, this is a great fit for the type of related data structures which make up your typical SilverStripe project, even if you're not building a social network.
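To make "the client tells the API exactly which data it needs" a bit more concrete, here's a toy sketch of field selection. It is not how graphql-php or any real GraphQL server resolves queries; the `select` helper, the `Selection` type and the sample data are all made up for illustration.

```typescript
// A selection is a tree: `true` means "return this field",
// a nested object means "descend and select within it".
type Selection = { [field: string]: true | Selection };
type Data = { [field: string]: unknown };

// Return only the fields named in the selection, recursing into objects and lists.
function select(source: Data, selection: Selection): Data {
  const result: Data = {};
  for (const field of Object.keys(selection)) {
    const wanted = selection[field];
    const value = source[field];
    if (wanted === true || value === null || typeof value !== "object") {
      result[field] = value;
      continue;
    }
    const nested: Selection = wanted;
    result[field] = Array.isArray(value)
      ? value.map((item) => select(item as Data, nested))
      : select(value as Data, nested);
  }
  return result;
}

// Hypothetical data behind an "edit page" view.
const cmsData: Data = {
  currentUser: { firstName: "Alex", email: "alex@example.com" },
  page: { id: 1, title: "Home", content: "<p>...</p>" },
  menu: [{ title: "Pages", url: "admin/pages", icon: "sitemap" }],
};

// One "request" describing three data sets, each trimmed to the fields the view uses.
const editPageView = select(cmsData, {
  currentUser: { firstName: true },
  page: { title: true },
  menu: { title: true, url: true },
});
```

The point of the sketch is only that the shape of the result mirrors the shape of the request, which is what lets a view component declare its data needs next to its markup.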

I encourage you to check graphql.org/learn and Facebook's introduction blog post before continuing here. Or watch the "Zero to GraphQL in 30 mins" video.

GraphQL is a (surprisingly readable) specification which has been implemented in many languages, incl. PHP. It has been widely adopted: Facebook has been using it in production for years, GitHub has announced a GraphQL API, and big players like Twitter/Fabric.io, Pinterest, the Financial Times and Shopify are on board. Even Drupal is considering it, and there's talk in WordPress Calypso land.

Check the "awesome graphql" collection if you're still unsure about how it all works, particularly the "awesome graphql" posts.

SilverStripe Content API

Looking beyond SilverStripe's own use of GraphQL to power the CMS UI, this technology also gives us a better shot at creating a "SilverStripe Content API" which exposes core data structures like File, Page or Member to third party developers as well as your own apps. When you're building the next "content blocks" module in the SilverStripe CMS, you should be thinking in React components rather than server-rendered HTML, Entwine and PJAX. Which means you'll need structured data to expose those content blocks to your CMS authors.

While a SilverStripe Content API doesn't rely on choosing GraphQL, it's a lot easier to evolve those APIs with GraphQL - no more api/v1 (see GraphQL: The next generation of API design). It's auto-documenting, no more Swagger docs boilerplate. The static nature of GraphQL queries means your IDE can tell you when you're using the API endpoint wrong before any code is compiled or HTTP requests are performed (see benefits of static GraphQL queries). A "SilverStripe Content API" (GraphQL or otherwise) will open the door to other CMS clients - publish from your Apple Watch? ;) The Symfony CMS Project believes it'll be a driver for better CMS interoperability, without all the conceptual overhead of CMIS.

GraphQL and the SilverStripe ORM

Since GraphQL relies on "types" to describe your data and "queries" to retrieve them, we need to make it aware of the SilverStripe ORM data structures. I've written a silverstripe-graphql module to create those GraphQL artefacts and expose a GraphQL API endpoint. The heavy lifting here is done by the graphql-php dependency. Please check the README for code examples.

At the moment there's a bit too much boilerplate, which I'm hoping to reduce through more introspection into the SilverStripe ORM. A "DataObject<->GraphQL" scaffolder should take care of the 80% use case, and expose the underlying objects for customisations (e.g. restrict field access, implement search by property in queries). It'll be more code than $api_access = true in the good old restfulserver module, but provide a good balance between configurability and brevity.

I have put this approach into practice by rewriting File and Folder CRUD into GraphQL in an experiments/graphql branch (check the code/GraphQL folder).

GraphQL Clients (Apollo vs. Relay)

While you can use GraphQL without client libraries (it's just an HTTP POST with a JSON payload), it makes most sense when combined with a smart client JavaScript library. There are essentially two choices: Facebook's Relay and Meteor's Apollo.
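As a rough illustration of the "just an HTTP POST with a JSON payload" point, a library-free request could be assembled like this. The `readFiles` field, the `parentId` variable and the `HttpRequest` shape are assumptions for the sketch, not the actual schema or endpoint of the module.

```typescript
// What a GraphQL request body conventionally carries: the query text plus variables.
interface GraphQLPayload {
  query: string;
  variables?: Record<string, unknown>;
}

// Minimal stand-in for the options object an HTTP client would accept.
interface HttpRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function toHttpRequest(payload: GraphQLPayload): HttpRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Hypothetical query: field and argument names are not taken from the real schema.
const readFilesRequest = toHttpRequest({
  query: "query ReadFiles($parentId: ID) { readFiles(parentId: $parentId) { id title } }",
  variables: { parentId: 0 },
});
```

An object like `readFilesRequest` could be passed to any HTTP client. What a smart client adds on top is everything around this call: caching, batching and wiring results into components.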

They both share some characteristics:

  • Integrates well with React
  • Wrap around React components, can retain props-based data flow and "dumb" components
  • Declare data requirements and let the library figure out how/when to fetch
  • Co-locate queries and data requirements with components
  • Keep a client-cache of previous queries and retrieved records (same path, same object)
  • Send create/update/delete from client ("mutations")
  • Batch multiple queries into a single network request
  • Support for optimistic updates (before API response comes in)
  • Both systems have a "footprint" in your React components, since the whole point is declarative data fetching (see Replacing Relay with Redux). With proper smart/dumb component separation, that's not a big issue though.
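The batching point above can be sketched as a small queue that collects queries and emits one request body for all of them. This is a simplified model, not the Apollo or Relay implementation; `QueryBatcher` and the sample queries are hypothetical, and it assumes an endpoint that accepts an array of operations.

```typescript
interface QueuedQuery {
  query: string;
  variables: Record<string, unknown>;
}

class QueryBatcher {
  private queue: QueuedQuery[] = [];

  // Called by individual components; nothing is sent at this point.
  enqueue(query: string, variables: Record<string, unknown> = {}): void {
    this.queue.push({ query, variables });
  }

  // Called once per batching interval: returns a single request body
  // for everything queued so far, or null if there is nothing to send.
  flush(): string | null {
    if (this.queue.length === 0) {
      return null;
    }
    const body = JSON.stringify(this.queue);
    this.queue = [];
    return body;
  }
}

const batcher = new QueryBatcher();
batcher.enqueue("{ currentUser { firstName } }");
batcher.enqueue("query Page($id: ID!) { page(id: $id) { title } }", { id: "1" });

const firstFlush = batcher.flush();  // one body containing both operations
const secondFlush = batcher.flush(); // null, the queue is empty again
```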

There are some differences:

  • Relay uses its own client data store, while Apollo uses the same Redux store already in place for SilverStripe 4. You can use Relay+Redux together (see Clearing up React Data Management Confusion).
  • Relay bakes in the GraphQL schema into the JS bundle. Given that there's one schema per SilverStripe installation, the contents of it will depend on installed modules. Since we don't want to force devs to rebuild framework dist files on installing a new module with GraphQL types, this will require some thought. Apollo works without bundling the schema.
  • Relay is more opinionated about the design of your GraphQL endpoint (we're unlikely to benefit from its cursor-based pagination)
  • Relay's mutation API is very powerful, but too complex unless you're Facebook
  • Apollo has about 1/5th of Relay's "attention" on Github (stars/forks/contributors), but is still considered a healthy project with 40+ contributors
  • Apollo has much better documentation
  • Apollo is not stable yet (0.x releases), while Relay is on 1.x releases. Relay 1.x is being refactored into Relay 2.x already, with Facebook committing to the 1.x API for the foreseeable future.

See Choosing a GraphQL client for a detailed comparison, A look at Relay vs. Apollo for some code comparisons, and Apollo's own comparison.

I've tried to use both (see "Proof of Concept Implementations" below), and overall I think Apollo fits better into our existing product strategy, particularly because it embraces rather than replaces our existing Redux architecture. Relay 2 is promising simplifications which might make us reconsider, but there's not even a clear roadmap yet (just a meetup talk from last month). Overall, I think the risk of potentially switching Apollo for Relay 2 later on is manageable, and probably not much more effort than upgrading Relay 1 to Relay 2. Co-located GraphQL queries stay the same, we just replace some "plumbing" around describing mutations and mapping of state into component props (~100 LOC in the current PoC).

GraphQL and Form Schema

In order to move form rendering from SilverStripe templates to React, we've created the form schema API. It describes fields as JSON, and passes the current form state down to the client. Form submissions are handled by SilverStripe controllers as usual, but submission results are returned as JSON (form schema and form state).
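As a rough picture of "fields described as JSON plus current form state", the client could combine the two like this before rendering. The interfaces below are guesses for illustration only; the actual form schema format may well look different.

```typescript
// Hypothetical shapes, not the real form schema format.
interface FieldSchema { name: string; type: string; title: string; }
interface FormSchema { id: string; fields: FieldSchema[]; }
interface FieldState { name: string; value: unknown; message?: string; }
interface FormState { id: string; fields: FieldState[]; }

interface RenderableField {
  name: string;
  type: string;
  title: string;
  value: unknown;
  message: string | null;
}

// Merge the static description (schema) with the current values and messages (state).
function combine(schema: FormSchema, state: FormState): RenderableField[] {
  const stateByName: { [name: string]: FieldState } = {};
  for (const fieldState of state.fields) {
    stateByName[fieldState.name] = fieldState;
  }
  return schema.fields.map((field) => {
    const current = stateByName[field.name];
    return {
      name: field.name,
      type: field.type,
      title: field.title,
      value: current ? current.value : null,
      message: current && current.message !== undefined ? current.message : null,
    };
  });
}

const pageFormSchema: FormSchema = {
  id: "EditForm",
  fields: [
    { name: "Title", type: "Text", title: "Page name" },
    { name: "Content", type: "HtmlEditor", title: "Content" },
  ],
};

// What a submission response might carry back: new values and validation messages.
const pageFormState: FormState = {
  id: "EditForm",
  fields: [
    { name: "Title", value: "Home" },
    { name: "Content", value: "", message: "Required" },
  ],
};

const renderable = combine(pageFormSchema, pageFormState);
```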

In an ideal world, the CMS would use general purpose APIs to both read and write data. This goes against the current practice of using getCMSFields() on models to create CMS forms, since general purpose APIs shouldn't be reliant on a particular form or controller context. APIs also shouldn't be reliant on access control by form fields and conditional field logic, as commonly implemented via getCMSFields().

The thinking around using APIs for CMS form submissions is still early stage, and out of scope for this RFC. What's clear at the moment is that the solution won't be heavily impacted by choosing GraphQL or any other API technology; the problems are mainly conceptual rather than dependent on a particular library.

Proof of Concept Implementations

I've partially ported the current AssetAdmin functionality over to GraphQL+Apollo (list files, create folders) - haven't gotten around to uploading files. Check out the framework branch and asset-admin branch if you're interested (and run composer require chillu/silverstripe-graphql).

Additionally, I've started a Relay PoC (framework branch, asset-admin branch), but it's been harder to get going - currently broken, but check the commits to get a feel for the approach.

Next Steps

We're getting to the end of the SilverStripe 4 alpha phase (with one more alpha release planned before 4.0.0-beta1 early next year), so the window of opportunity for API changes is closing. Either we work on this now, or wait at least another year until development focus shifts to SilverStripe 5. If this proposal is successful, the next steps would be:

  • Create an RFC for a "GraphQL Type Scaffolder" based on DataObjects (reduce boilerplate)
  • Create an RFC for GraphQL-powered form submissions (this might be too much change for SilverStripe 4)
  • Stabilise the chillu/silverstripe-graphql module
  • Finish GraphQL migration of AssetAdmin (mainly file upload and form submissions)
  • Migrate CampaignAdmin to use GraphQL

FAQ

  • How does this affect my upgrade path?: In the short term, you wouldn't need to spend more upgrade effort. Once we switch over more CMS UIs to React (e.g. CMSMain and ModelAdmin), your own models and page types will need to be exposed as GraphQL types in order to expose them through a GraphQL API to React. We'll try to infer as much as possible from your existing DataObject information and getCMSFields() there, so don't expect a lot of custom code in your project (approach TBC).
  • Does this lock us into React (vs. Angular)?: You can still use whatever frontend stack for your own website. The SilverStripe CMS UI has made a large investment into React already, so using React-based tools won't increase that lock-in. GraphQL/Apollo can be used in Angular or VueJS as well.
  • Isn't this a bit overkill?: When you break down the CMS UI into truly decoupled components fed by structured data (rather than rendered HTML), it's much more complex than retrieving forms and page data as JSON - and I believe requires a powerful API.
  • But I want to use REST/SOAP on my website!: There's no limitation on technologies used in your own code. The SilverStripe GraphQL endpoints give you an easy way to access structured data already exposed through core APIs, but you can still write your own REST endpoints.
  • Why not just use GraphQL with plain Redux?: Going through the trouble of creating an expressive schema in our API is only really worth it if you have a client which can use it (rather than just dealing with raw JSON responses) - although there's opinions that a lightweight client should only concern itself with caching instead of data co-location.
  • Does this make our existing Redux work obsolete?: We'll still have client-only state to manage (e.g. unsubmitted form values). We likely won't need much redux-thunk any more, but Redux itself isn't going anywhere. Relay2 might include client-only fields, but that looks quite complex.
  • Doesn't HTTP/2 make GraphQL query batching obsolete?: While HTTP/2 reduces the handshake delay, you're still booting SilverStripe for each individual API request (~50-100ms each) - which can accumulate to perceived user delays.
  • Does this decision affect potential offline support in the CMS?: It's a hard problem, particularly around offline mutations and conflict resolutions - but a smart client cache like Apollo/Relay will be helpful in the long run (discussion).
  • Can't this wait until SilverStripe 5?: Potentially, but it's likely to lead to wasted effort on "intermediary APIs" in the meantime. We'll be supporting 4.x LTS until 2020, which is a long time to base major UIs like Campaigns and AssetAdmin off the current API endpoints (lots of boilerplate and duplication). It'll also send mixed signals to modules wanting to adopt a more frontend-driven approach in 4.x (e.g. userforms, React-driven content blocks editors).
  • Why did you choose the graphql-php library?: There are really only two libraries to choose from: webonyx/graphql-php and Youshido/GraphQL. They're both fairly new, but webonyx/graphql-php has a bit more traction (e.g. has working Laravel and Symfony bundles). Check the GraphQL PHP ecosystem for details.
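On the HTTP/2 question above, a back-of-the-envelope calculation shows why the per-request boot cost matters. It uses the ~50-100ms figure from that answer and the "dozen data sets" from the Context section, and assumes the worst case where the requests are handled one after another (parallel requests would overlap, so real numbers will differ).

```typescript
// Rough model: every API request pays the framework boot cost once,
// regardless of how cheaply HTTP/2 reuses the connection.
function totalBootCostMs(requests: number, bootMsPerRequest: number): number {
  return requests * bootMsPerRequest;
}

const dataSets = 12; // "a dozen data sets" for an edit page view

// One request per data set.
const separate = {
  best: totalBootCostMs(dataSets, 50),   // 600 ms
  worst: totalBootCostMs(dataSets, 100), // 1200 ms
};

// Everything fetched through a single GraphQL request.
const batched = {
  best: totalBootCostMs(1, 50),   // 50 ms
  worst: totalBootCostMs(1, 100), // 100 ms
};
```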
@sminnee
Member

sminnee commented Oct 27, 2016

I think that GraphQL is a great choice of API for the CMS and so I'm broadly in support of this, and of using Apollo for the client side code.

My one caution would be around form submissions: for the introduction of GraphQL to be an effective way of introducing clarity and consistency to our APIs, GraphQL and FormSchema-based submissions need to work nicely together.

Although it's not detailed explicitly in the RFC above, this is something that Ingo and I have discussed offline. GraphQL is based on "types". A type corresponds roughly to a DataObject, and the design of the GraphQL APIs would be fairly straightforward for these.

However, if we are delivering a FormSchema via GraphQL (which would be the goal), it's a little less clear how to design the API:

  • Is the FormSchema a type (or maybe the Form)? Is it explicitly linked to an underlying data type?
  • Are form submissions mutations on the underlying data type, or on the Form/FormSchema type?
  • Is it always one or the other, or do we use the underlying data type in some cases and the form in other?

I think that we'd want to clarify some of these issues before getting fully into the implementation of GraphQL.

@chillu
Member Author

chillu commented Oct 30, 2016

I've written a follow-up enhancement card on pagination with GraphQL: https://github.com/chillu/silverstripe-graphql/issues/7

@chillu
Member Author

chillu commented Nov 2, 2016

Created a uservoice item as well so this pops up on our roadmap: https://silverstripe.uservoice.com/admin/forums/251266-new-features/suggestions/16924327-graphql-api-support

@dhensby
Contributor

dhensby commented Nov 4, 2016

I think using GraphQL would be wonderful.

It looks to fit with our complex data types and models really well and it provides a ton of power to frontend components.

I do think the ever-shrinking window to get this into 4.x is going to be an issue, and it seems like something that needs a bit of real-world usage to make sure we're actually providing solutions to how it will be used in production.

Do we have to decide between v4 and v5 or could we bring this in for 4.1?

On the other hand, it could form the foundation of an overhaul for how we resolve values for templates as well as for GraphQL responses. The value resolvers that GraphQL works with look like something that could form the basis of a decoupling of our DBFields and templates, further allowing us to decouple the template engine...

Summary: I'd support a push for getting this in 4 if it's realistic, but I think holding off and using this as a foundation to decouple our datatypes from our view could be better in the long term of the framework.

@sminnee
Member

sminnee commented Nov 6, 2016

I don't see this as being a replacement of our type system. I see this as being an API with a suitable bridge to our ORM DataObjects.

We can always make a more pervasive change in v5 if we decide that's worthwhile; I don't see adopting it in v4 would hinder that – on the contrary, it would help us get some real experience with GraphQL.

@chillu
Member Author

chillu commented Nov 7, 2016

Two further discussion threads relevant to this:

@stevie-mayhew
Contributor

I really love that SS is looking to take this approach. While GraphQL could still be considered in its infancy there is a real, tangible future benefit to work being done now.

I agree with @sminnee that there is a case for getting as much of this as possible into 4 and then working towards (perhaps) full integration for 5. Getting experience using GraphQL earlier is going to be better for us as a platform and for the developers who use SilverStripe moving into the future.

With regards to the RFC, I agree getting something done completely in GraphQL would be a great starting point.

@anselmdk
Contributor

anselmdk commented Nov 7, 2016

I'd love to see this in 4, having this as the suggested way of developing for the CMS. I'm afraid that waiting for 5 will lead to way too much custom code that will need to be rewritten again if GraphQL is introduced in 5.

@chillu
Member Author

chillu commented Nov 17, 2016

More thoughts on extensibility: For example, an "extended toolbar" module might show the avatar next to the "hello " label in the CMS menu bar. This means a new React component needs to be injected into the React hierarchy, the details of which aren't important here. But this new React component also needs new data to be fetched (the avatar URL). Relay has this built-in, explained nicely here.

Apollo has a similar concept based on co-locating fragments.
We could define a "fragment registry" (similar to the already existing "route registry"), but the problem is that we likely can't get it in early enough before the graphql() HOC needs the final query.
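A very rough sketch of what such a "fragment registry" could look like follows. The `FragmentRegistry` class is hypothetical and composes plain field lists rather than real GraphQL fragments; it also sidesteps the timing problem by assuming all registrations happen before the query is built.

```typescript
// Hypothetical registry: modules register extra fields for a named query
// before the final query text is built.
class FragmentRegistry {
  private fragments: { [queryName: string]: string[] } = {};

  register(queryName: string, fields: string): void {
    const existing = this.fragments[queryName] || [];
    this.fragments[queryName] = existing.concat(fields);
  }

  // Build the final query from the core fields plus everything registered so far.
  buildQuery(queryName: string, rootField: string, coreFields: string): string {
    const extra = this.fragments[queryName] || [];
    const selection = [coreFields].concat(extra).join(" ");
    return `query ${queryName} { ${rootField} { ${selection} } }`;
  }
}

const registry = new FragmentRegistry();

// What an "extended toolbar" module might contribute (field name assumed).
registry.register("MenuBar", "avatarUrl");

// Core only knows about firstName; the module's field is merged in.
const menuBarQuery = registry.buildQuery("MenuBar", "currentUser", "firstName");
// "query MenuBar { currentUser { firstName avatarUrl } }"
```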

Note: Query merging in Apollo is only a way to merge queries into a single network request, not to merge fragments into a single query. Also, graphql-php doesn't support server-side batching yet, because it's hard to run async in PHP. In Apollo, it has been factored out into a separate package (according to comments there's not a lot of focus on this). Note that it's different from query batching, which combines multiple GraphQL query requests into a single HTTP request.

@chillu
Member Author

chillu commented Dec 13, 2016

@silverstripe/core-team I'm sending a pull request against silverstripe/asset-admin on this shortly, which will make silverstripe/graphql a core dependency (task). We've discussed this in the Open Sourcerers team a bit, Sam wants to see a few more people convinced specifically about Apollo as a way forward for using GraphQL on the client. If you read the discussion above, it hasn't been an entirely straightforward path. Here's an overview of ongoing initiatives, in case you haven't had the time to keep tabs on things - it's moving quickly!

  • Scaffold types and queries for DataObject: @unclecheese has been tirelessly working on this, the API is shaping up nicely. Creating GraphQL types from scratch can get pretty verbose with our inheritance structure and breadth of models, so that's crucial for developer experience.
  • Pagination in GraphQL: @wilr has been doing most of this already. In short, we're following the Relay connection spec, but not using cursors (since it's too hard)
  • Fragment registry to extend GraphQL queries: GraphQL will only return fields you list in a query, so any third party code relying on more data needs a way to get it into queries. That's been de-risked now, I've created a PoC.
  • Update GraphQL models after form submission: @wilr is working on this, Apollo is introducing a state middleware which allows us to hook into this via official APIs
  • Submit forms through GraphQL: Too hard basket for now, since we need more underlying changes in the Form.php API around model bindings to make this work nicely.
  • Drive a React-based GridField through GraphQL, with user-selectable columns. No issue for this yet, but on a high level I'd expect a GraphQL query to be sent through with GridField form schema data, which then gets executed by the component directly on the Apollo client. So GridField wouldn't be a react-apollo component (with the graphql() higher order component) because we don't know which query it'll run when the JS initialises. Which means rebuilding a few hundred LOC from react-apollo, which I'm not too worried about. A React-based GridField probably won't be SS 4.0 anyway.
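For the pagination item above ("following the Relay connection spec, but not using cursors"), the response shape could look roughly like the sketch below: edges wrapping nodes, plus page info, driven by offset and limit. The exact field names in the module may differ; `paginate` and `totalCount` here are assumptions.

```typescript
interface Edge<T> { node: T; }

interface PageInfo {
  totalCount: number;
  hasNextPage: boolean;
  hasPreviousPage: boolean;
}

interface Connection<T> {
  edges: Edge<T>[];
  pageInfo: PageInfo;
}

// Offset/limit pagination wrapped in a connection-style structure (no cursors).
function paginate<T>(items: T[], offset: number, limit: number): Connection<T> {
  const start = Math.max(0, offset);
  const page = items.slice(start, start + Math.max(0, limit));
  return {
    edges: page.map((node) => ({ node })),
    pageInfo: {
      totalCount: items.length,
      hasNextPage: start + page.length < items.length,
      hasPreviousPage: start > 0,
    },
  };
}

const files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"];

// Second page with two items per page: c.jpg and d.jpg, with pages on both sides.
const secondPage = paginate(files, 2, 2);
```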

@chillu
Member Author

chillu commented Dec 19, 2016

chillu added a commit to open-sausages/silverstripe-asset-admin that referenced this issue Dec 22, 2016
At the moment this is limited to "read files"
and "create folder". The remaining API endpoints will be
successively moved over to GraphQL.

The GraphQL type creators are likely temporary,
to be replaced with a shorter scaffolding-based version:
silverstripe/silverstripe-graphql#9
silverstripe/silverstripe-graphql#22
silverstripe/silverstripe-graphql#23

RFC at silverstripe/silverstripe-framework#6245