Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Browse API Parameter Design #5463

Closed
2 of 5 tasks
ErisDS opened this issue Jun 21, 2015 · 4 comments
Closed
2 of 5 tasks

Browse API Parameter Design #5463

ErisDS opened this issue Jun 21, 2015 · 4 comments
Assignees
Labels
affects:api Affects the Ghost API

Comments

@ErisDS
Copy link
Member

ErisDS commented Jun 21, 2015

This issue recommends that we restructure the top level parameters for the API to only permit include, page*, limit, filter, order and fields, and explains why.

Posts, Users & Tags are the most important resources in Ghost, and will be the most commonly queried. Our Browse API needs to support an advanced set of querying tools so that those accessing the API directly, via the {{#get}} helper or via apps can perform the functions they need.

Current state

Currently, the Browse endpoints support the following query parameters:

  • Post: include, page, limit, status, staticPages, featured, tag, author
  • User: include, page, limit, status, role
  • Tag: include, page, limit

The include parameter is used to declare which related objects should be included in an expanded state in the result. E.g. it is common to add ?include=tags,author to a post request.

The page and limit parameters are used to control pagination. page defaults to 1, limit defaults to 15 and accepts all as a value.

All of the other parameters are intended for pre-filtering the result set.

Across other endpoints, e.g. READ we have url parameters like :id or :slug to fetch a specific resource (where these parameters are unique and guaranteed to return a single result). These ‘filters’ are encoded in the URL. When using method-calls, url & query parameters get merged together into ‘options’.

The problem with the current pattern

Currently in the API, we’re adding top-level parameters every time we want to add a new filter - e.g. adding tag, then author, then featured & so on. One small issue with this is the inconsistency across the endpoints, although that’s relatively explicable - each resource is different.

The larger problem is that it creates an instant limitation. If I write /api/posts?featured=true&tag=photo&author=joe this reads as ‘posts which are featured and have the tag photo and the author joe’ - but what if I want to do an or? Or a not? There’s no room to grow in complexity.

A recommendation for the future

My recommendation is to move our filtering parameters down a level and create a new filter parameter to house them. E.g. /api/posts?filter="tag:photo".

In the short term, we can support a very minimal set of combinations as the value of filter, and long term this can be expanded to the rough equivalent of where. There’s a bit more info on the filter values in the advanced filtering section at the end…

As well as filter, the top level parameters for the Browse API would keep include, page* & limit.

In addition we need to add support for two more API features:

Order

Currently posts and users have a default order which cannot be overridden, and tags don't even have a default order the database returns it's own default. Ordering is a major missing feature of the API (also needing its own issue), the order top level parameter would allow us to do something like ?order=name:desc,id:asc.

See issue #5602 for more details

Fields

Much less high priority, providing a fields parameter would allow the user to specify exactly which properties to return. See issue #5601 for more details.

This gives us the following, consistent set of parameters: include, page*, limit, filter, order, fields.

* page is meant to be changing to offset #5093

The funny thing about status

It seems to me that status is, in some senses, a bit of an oddball in our API. Whether or not to return published/draft posts or active/inactive users is INCREDIBLY critical to our business logic.

If a request asks for publicly available resources (published posts, active users) then all is well - but if a request asks for private resources, then permissions need to be taken into account. Our permissions system isn't very good at coping with logic like if this is either public or the user has permission and it also falls down massively when you get to relations.

Case in point, including the post_count when querying tags does a count which always includes draft posts, regardless of the user's permissions.

This needs an issue of its own, to determine how to properly declare the difference between public and private access, to ensure that the status is propagated through relations, and to ensure that permissions are properly taken into account. I'm mentioning it here because ?filter="status:draft" seems to me to obscure the importance of status.

One way to simplify this logic in the short term might be to remove the ability to do status=all, and/or require status to be a url parameter - e.g. /api/posts/ is published only and /api/posts/draft/ returns drafts. This has some implications when we add more non-published statuses in future (like scheduled) and also complicates the UI, forcing multiple requests.

We may instead need to keep status as a top-level param, and add some way to indicate switch the public vs privateness of the request, always overriding status if the request is public.

Advanced Filtering

This is covered in extensive detail in #5604, but just for clarity, here’s a quick run-down of how I envisage handling advanced queries via the filter= parameter:

  • combos: , for OR, + for AND, support for () to change precedence (seriously advanced).
  • operators: - for NOT, > for greater than, [] for in
  • paths: . for relations

E.g. filter="author.name:-'Joe Bloggs'+tag:[photo,video]+featured:true"
E.g. filter="(author:joe,author:joanne)+tag:photo"

Depending on where this is used, it may need to be URL encoded. The syntax works for URLs, method calls, handlebars helpers and search boxes. I think it's a good balance between readable, flexible,

The assumption is that author and tag are aliases for the slug property, which can only contain a-z, 0-9 & -, meaning queries with these are largely easy to compose, but there's a fall back to quoting strings with spaces etc.

Also for clarity, I am very aware that this is super duper complex - I have been working on PEG & JISON grammars to see what's possible, and am confident this recommendation will work for us.

Summary

I've written this rather long issue to explain the full-thinking behind moving some parameters under filter and the big picture plan for how the API ought to look. In the immediate future #2758 could be used to put in place the beginnings of permitting only include, page*, limit, filter, order and fields at the top-level.

Related:

@ErisDS ErisDS added the affects:api Affects the Ghost API label Jun 21, 2015
This was referenced Jul 4, 2015
@vickychijwani
Copy link
Member

Regarding the public vs private requests for status, you've probably considered this already but: one solution could be to have all public endpoints prefixed with /public, so:

/api/public/posts does not accept a filter=status parameter and returns only published posts
/api/posts/?filter=status:draft,status:published returns the requested set of posts (drafts + published posts in this case) to authenticated clients

Now, I understand this makes the API slightly un-RESTful but there are 2 advantages to this:

  1. The /api/public/ prefix makes the scope of these endpoints very clear -- nobody will expect to get drafts when calling /api/public/posts
  2. The Ghost admin UI and client apps that need write access can still use the non-public /api/posts endpoint and get all posts in one request

This could also be done the other way round -- prefixing private endpoints with /api/private/ instead. Depends on which case needs to be more explicit. Also I realize this approach may not be feasible for various reasons and / or may involve too much work but just thought I'd throw this out there.

@vickychijwani
Copy link
Member

Another note wrt advanced filtering with ?filter=..., there are some cons to using an ad-hoc query language that may be worth considering:

  1. Complex implementation in Ghost (as mentioned, this will require a full-fledged expression parser in Jison, PEG.js, etc)
  2. Difficult for clients to work with (client developers must learn the query syntax, do some error-prone string concatenation to format it correctly, etc)

An alternative approach may be to have a dedicated endpoint for advanced filtering, like POST /posts/search that takes a JSON body, which can express arbitrarily complex queries (as suggested by this answer on StackOverflow), along with simple filters in GET /posts like ?status=draft&tag=photo. This might be both easier to implement and easier for clients to work with. Just my $0.02 :)

@ErisDS
Copy link
Member Author

ErisDS commented Jul 7, 2015

@vickychijwani really appreciate your input here - it's good to have more ideas being thrown into the pot :)

Just a quick initial reply on both of your comments - the thing that I'd be concerned with in those suggestions is how flexible they are in terms of the various different ways that Ghost's API can be used. HTTP is just one way to talk to the api, there's also method calls & handlebars helpers to take into account.

So for example, the nice thing about the filter syntax suggested is that it's compatible with all of those methods:

  • HTTP: GET api/posts/?filter="(author:joe,author:joanne)+tag:photo"
  • method call: api.posts.browse({filter: "(author:joe,author:joanne)+tag:photo"})
  • Handlebars helper: {{#get "posts" filter="(author:joe,author:joanne)+tag:photo"}}

You can also extend this thinking to things like a search box where same syntax could work, and adding settings for things like channels where again I'd suggest having a string representation of a query, rather than an object, would be very useful.

I agree that the string concatenation problem needs considering though.

Still, if we used a JSON object representation for queries, it would not be any less complex in terms of needing to be able to represent and parse a query, in my opinion.

@vickychijwani
Copy link
Member

Ah I see. Clearly this is more complex than I imagined. I agree that JSON queries would become complex quickly in terms of representation and would not be as flexible.

ErisDS added a commit to ErisDS/Ghost that referenced this issue Aug 28, 2015
refs TryGhost#5601, TryGhost#5463, TryGhost#5343

- adds rudimentary support for a 'fields' parameter on browse requests
@ErisDS ErisDS self-assigned this Oct 13, 2015
@ErisDS ErisDS modified the milestone: Public API v1 Oct 13, 2015
ErisDS added a commit to ErisDS/Ghost that referenced this issue Oct 22, 2015
refs TryGhost#5604, refs TryGhost#5463

- deps: ghost-gql@0.0.2
- adds code to wire up the filtering to a paginated query
- updated pagination plugin count query to use 'distinct' so it's more robust
- rename paginationUtils.query to addLimitAndOffset to be more explicit and make the code clearer
- add a new 'advanced browsing spec' set of tests for tracking these features as they are built out
@ErisDS ErisDS modified the milestone: Public API v1 Dec 3, 2015
@laran
Copy link

laran commented Mar 5, 2016

Some ideas from HATEOS can be used instead of the include parameter. Instead of returning the embedded objects which immediately raises issues about paging the nested collection and so on, you can provide an href to a different URL where the related objects can be retrieved. Retrieving the parent and the children then is a two step process.

You can see some examples of this over on the spring-data-rest project. Ignore all the java. Just look at the JSON and how the URLs are structured.

ALPS (Application-Level Profile Semantics)
http://docs.spring.io/spring-data/rest/docs/current/reference/html/#metadata.alps

And you can find some ideas on there for nesting child objects and returning only some fields instead of the whole object (which they call projections).

Projections
http://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts

Happy to talk through this if you're interested.

@ErisDS
Copy link
Member Author

ErisDS commented Mar 5, 2016

You can read the lengthly discussion which resulted in our current API format here: #2362.

@ErisDS ErisDS added the later [triage] Things we intend to work but are not immediate priority label Sep 20, 2016
@ErisDS
Copy link
Member Author

ErisDS commented Sep 20, 2016

I'm closing most API issues temporarily with the later label.

JSON API Overhaul & OAuth access are currently scheduled next on the roadmap

@ErisDS ErisDS closed this as completed Sep 20, 2016
@ErisDS ErisDS removed later [triage] Things we intend to work but are not immediate priority labels Jan 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects:api Affects the Ghost API
Projects
None yet
Development

No branches or pull requests

4 participants
@laran @ErisDS @vickychijwani and others