Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requesting filters or conditions #21

Closed
indirect opened this issue May 5, 2013 · 22 comments
Closed

Requesting filters or conditions #21

indirect opened this issue May 5, 2013 · 22 comments
Labels

Comments

@indirect
Copy link
Contributor

indirect commented May 5, 2013

Okay, last one that immediately jumps out at me: in order to filter or limit the data that will be returned by some criteria, Rails has scopes and SQL has WHERE. Will there be a (presumably optional) section of the specification that provides a convention for clients that want, say, only the open issues on a specific repo, and none of the closed ones? Or only the draft posts, and none of the posted ones?

@steveklabnik
Copy link
Contributor

I'd imagine this would happen the normal way, via the URI template. You'd include parameters to make this happen.

@indirect
Copy link
Contributor Author

indirect commented May 5, 2013

Since I'm just arriving, I can't figure out from the Reading page how to use templates for anything other than relations, but I'm excited to find out what that looks like when an example gets added. :)

@steveklabnik
Copy link
Contributor

@steveklabnik
Copy link
Contributor

(that should get added, yes)

@steveklabnik
Copy link
Contributor

Yup. Also easily accomplished through profiles.

@indirect
Copy link
Contributor Author

indirect commented May 9, 2013

Just to make things super explicit — there will be an example of doing this in the Extending section? Or no?

@steveklabnik
Copy link
Contributor

Sure, if it's useful. Seems good.

@jokeyrhyme
Copy link

Seems related to #45

@jokeyrhyme
Copy link

Any chance we can get one or two examples of how to implement filters/conditions via Profiles and URL Templates? I've read both RFCs and I'm afraid I don't feel I'm any closer to figuring out how they apply to querying resources. Would it be possible to re-open this ticket until such examples are published with the JSON API specification?

Imagine I have a /api/v1/books resource type with individual resources looking like:

{
  id: 1234,
  title: "Profiles, WTF?",
  keywords: [ "rfc", "ietf", "http" ],
  authors: [
    { name: "bob", email: "bob@bob.bob" }
  ],
  publisher: {
    name: "alice inc",
    date: "2013-11-07",
  }
}

What could I put in my /api/v1/profile document to describe how to search on each of the available fields (e.g. all books by Bob, all books published by Alice between given dates, all books with the "ietf" keyword, all books with "WTF" in the title, etc) ?

What would HTTP requests look like to perform these example searches?

What if I have a bunch of different resource types with similar search characteristics (e.g. videos might also have publisher.date)? Do I have a separate profile document for each resource type? How could I arrange multiple profile documents to minimise duplication?

JSON API already uses id, includes and fields keywords in the query string. Is there a possible collision between these keywords and a field I may need to filter upon? What could I do (besides avoiding these in my resource type definitions) to avoid keyword collisions?

Thanks in advance for any assistance you can provide. :)

@steveklabnik
Copy link
Contributor

I don't have time to get into it right this moment, but this will end up in the FAQ.

The super short answer is a link in the meta that looks something like http://amundsen.com/media-types/collection/examples/#ex-queries

@steveklabnik
Copy link
Contributor

There's also http://jsonapi.org/extending/ which describes an example of pagination, but you could see how it would work with any kind of query.

@garysweaver
Copy link

@steveklabnik Thanks! Some questions:

Param names that are composites of various behavior like that which is provided via has_scope (used by Inherited Resources) might be difficult to describe, e.g. scope chaining with URL path /graduations?params[by_period][started_at]=20100701&params[by_period][ended_at]=20101013, which is an example provided in their doc. How would you suggest that be described in a "meta" block such as the one in http://jsonapi.org/extending/ ?

The abilities provided for filtering by can_filter_by/can_order_by/etc. in Irie wouldn't be quite as bad to describe in that format with param names at least, e.g. /posts?created_at.gteq=..., but ordering can be done in a single value, e.g. /posts?order=foo_date,-foo_color or in chain like /posts?order=foo_date&order=-foo_color- how would either and/or both of those options be adequately represented? Also, the + is optional to indicate ascending, so it could be /posts?order=+foo_date,-foo_color. I'm just not totally sure how I should describe that in the meta or what the intention is for how the client should plan to interpret "cursors". Or is that just something they are supposed to look at the URL and read the comments to figure out (which is fine with me)?

Thanks again!

@jokeyrhyme
Copy link

To summarise my understanding of this:

  • JSON API specifies that query string parameters be documented via the Profile RFC
  • the Profile RFC suggests that we use the URI Template RFC
  • neither of these attempt to specify what the values of such query string parameters should be in order for the HTTP client to request filtered results
  • JSON API does not attempt to specify this either, currently

The thing I love about JSON API is that it takes what we already know from HTTP and REST and fills in many of the questions surrounding URL schemes, MIME types and request/response bodies. Awesome.

In that spirit, I personally would find it awesome if one day JSON API could specify how searches should be requested: what the URL query parameters look like and how we filter based on any data structure our JSON API would otherwise be able to return.

I realise that it's incredibly difficult to come up with a specification that solves every problem, so it's totally fair if this is a long-term nice-to-have feature of an otherwise already productive specification.

@garysweaver your order chain example /posts?order=foo_date&order=-foo_color might be problematic depending on the implementation environment. PHP, for example, would only expose a single QSA for order with the last value declared.

@garysweaver
Copy link

@jokeyrhyme Thanks! Irie supports comma-delimited order value also, e.g. order=-color,-price,name. I'm also a fan of its filtering syntax but I'm partial. :)

@jokeyrhyme
Copy link

In case anyone was interested, I settled on MongoDB's query syntax:
http://docs.mongodb.org/manual/reference/operator/query/
http://docs.mongodb.org/manual/core/read-operations/

  • reasonably well documented
  • JSON-specific query language, unlike other approaches that focus on SQL
  • specifies how to query properties within nested objects and arrays, so I never have to wonder how a complex JSON object ought to be filtered
  • for complex object/array filters, it has much shorter syntax than JSONiq
  • most of our resources are stored in MongoDB, so we can mostly use requested queries as-is with a little validation and extra filtering based on permissions
  • it is DB-vendor-specific :S

Our implementation:

  • as all queries are JSON objects themselves, we'll assume the outer enclosing braces for brevity
  • we will need to convert these to SQL WHERE clauses for our MySQL resources, but we don't intend to implement every single MongoDB query feature, just the basics
  • use a ?mongo= query string parameter to make room for whatever JSON API specifies later

Simple queries look simple, but cover all the bases. My URLs look like this:

  • /api/v1/resources?mongo=name:"jokeyrhyme"
  • /api/v1/resources?mongo=quantity:{$gt:10}

We can mix it with the JSON API ?include= query string keyword just like the specified ?fields= keyword. So to find only libraries that are open and have books with "cat" in the title:

  • /api/v1/libraries?include=books
    • &mongo[library]=open:true
    • &mongo[book]=title:{$regex:"cat",$options:"i"}

Note: this is just proposed syntax specification. This helps document to API consumers what they can put in their request URLs, but the actual server-side implementation is separate.

I looked at JSONiq, CouchDB and a few other JSON-specific query specifications, but MongoDB's seemed the shortest and most comprehensive. I don't feel any solution based only on RDBMS with columns and no true appreciation of JSON structures should be considered for inclusion in the JSON API specification.

@garysweaver
Copy link

@jokeyrhyme I'm not averse to more syntax if it is necessary, but I think the URL query syntax used in Irie, such as ?color=red&created_at.gteq=2013-11-08 is sufficient for basic querying. Irie makes you specifically declare what attributes are exposed and you can alias attributes at the root level of the current resource, or through a number of levels of associations deep, or calculated attributes. In addition, it can support attribute naming that match associations via dot-delimited syntax, e.g. ?color.hue.code.gt=15. The supported predicates could be returned for each attribute via the meta Steve mentioned, and I would do that, but we currently have no need for it since no one I know of yet is going to query the service to get meta data to allow a client to integrate on the fly (as of Nov 2013).

Brackets such as seen in has_scope's syntax, e.g. /graduations?params[by_period][started_at]=20100701&params[by_period][ended_at]=20101013 could be of use, although I think a single character delimiter could be sufficient.

In both has_scope and Mongo's URL query syntax, I'm not really a fan of params or mongo as the start of a query parameter- that doesn't seem DRY. I could be in favor of a single letter to start a query like q, and even potentially using a single string for the entire query. But on the other hand, when a service implements layer upon layer of service/controller functionality within the same service/controller, using a prefix like "mongo" that provides the name of the part of the service that is handling the request could be of use. I don't know that it should be required though. To me, it just looks ugly and I see little reason for it at the moment.

...unfortunately this discussion will go into a rathole if we discuss differences in opinion about what the URL should look like. IMO, the real question is: "Is there anything specifically about json-api as it currently stands that could be changed so that we could move forward in an effort to define standards for querying data through JSON?"

I think the answer is "maybe". Basically, I think the structure of meta is loose enough to support data provided in it to be used by a client to determine how it could query data. And profile is a great way to define a URL to document how all or part of the API should be used. But:

  • profile looks to be primarily informational for integration developers rather than to be used by humans using an interface that has integrated with the API, but the example lacks formatting, accessibility information, or i18n'd versions of each doc. Should it say somewhere that, "profile is meant to be used both as a identifier of the compatible version if behavior is described, e.g. http://douglasadams.com/answer_to_life_the_universe_and_everything/json_service_description/v42. When an HTTP GET is called on that URL, it should accessibly communicate information to the user how to use the API, which could involve redirection or further links for additional information, other language translations of the document, etc. It could just be a text doc in the language of choice, depending on the audience."? Or, is the profile URL not intended to be used in that way? Extensible functionality of the service via profile and meta mean little without their own versioning and it shouldn't just be in English or Chinese (because I can't yet read Chinese and Google Translate isn't perfect yet).
  • If profile and meta are good enough for extensions, why can't they be good enough for all of json-api? A more future proof solution would just be an API that has a list of profile links to describe each feature, including the data structure. As long of all of it is in JSON, what difference does it make what it looks like? That is the idea guiding many XML-based APIs before. If you think this is a bad idea, then we need more standardization around more complex querying ability as we started discussing above, because otherwise it isn't going to be very consistent.

@steveklabnik
Copy link
Contributor

Should the example be changed to just say "The profile is meant to be used both as a unique identifier of the compatible version if behavior is described,

This is basically true.

If profile and meta are good enough for extensions, why can't they be good enough for all of json-api?

In theory, they are! Except that application/json does not have any parameters, so you'd have to include a profile via a Link header.

The reason to mint a new type is so that we can share the 80% that's the same, and only specialize on the 20% that is different. If it were entirely in a profile, you'd still need to fully re-implement for each one, which doesn't gain you anything.

@garysweaver
Copy link

@steveklabnik The thing I worry about though is that the extras are as much part of the overall definition as the 80% that is the same. API composition can be as piecemeal as you wish but integration is most likely a singular activity, i.e. if you have fuzzy things like profile and meta, that isn't going to be automatically capable of integration by a client until we have a Skynet/Terminator-level of AI on the client-side, so a human is going to have to read all those profile URL docs and then integrate, which is a PITA that the 80% doesn't solve. In short- the 80% done in 100% of cases provides little, but 100% done in 80% of the cases buys a lot more. Without query param standardization, you aren't 100% done in 80% of the cases.

@jokeyrhyme
Copy link

@garysweaver I don't want to force MongoDB syntax on everyone. I just wish there was a better alternative. Irie is a good start. Here are my problems with it:

  • It puts object attributes in the same namespace as JSON API's reserved query string keywords (e.g. ?include=, ?ids=, ?fields=). This means you cannot use the current (and future) reserved keywords as part of your own object definition.
  • It puts comparison operations in the same namespace as attributes within nested objects. So things look ambiguous when you have { nested: { eq: "abc" } } and you query ?nested.eq=: are you testing the top-level attribute for equality? or are you referring to the nested attribute?

I feel having clearly separate namespaces makes parsing easier, syntax less ambiguous and makes knowledge of the JSON API specification more portable. It also frees developers to use whatever JSON structure they feel depicts their domain objects without fear of collisions with reserved keywords.

I know ?mongo= is a bit gross. However, it does mean that you can add ?jsoniq= or ?where= or JSON API's official search parameter in parallel without needing weird detection rules or immediately ending support for what you offered before.

@steveklabnik an advantage of making some searching syntax part of the JSON API specification: if I already know JSON API, then I can immediately start using an API without needing to first read about the specific quirks the owner implemented regarding searches.

@garysweaver
Copy link

@jokeyrhyme Typically when you expose an API for an application, you are doing so for the specific behavior required by the application. Irie does that. Like typical controller development, you have to declare what is possible, specifically. The syntax that you are proposing to be standard used by Mongo would appear to be not-so-specific, which would appear to make an application developer more likely to expose data that their application does not need, which is usually not a good idea. Is that the case, or am I missing something?

Also, Irie does not inhibit the user from adding a prefix to any parameter name. The examples I provided were to show the simplest case. I'm not against a prefix, if it is required. I just as of yet have not seen a reason for one. For parameter values in the syntax you are proposing, it appears to require a lot of parameter value manipulation just to get the value out for something that is specific and declarative, and if it is unspecific, you are probably exposing more than you need.

I'm not against such an elaborate syntax if it were the most common case and it were to become standard, but allowing that level of access to the data to a public user without it being specifically declared to be so is not common in my experience.

@jokeyrhyme
Copy link

@garysweaver

  • MongoDB is an open-source yet commercially successful implementation of a JSON query language
  • many (but not all) new JSON API projects use MongoDB as their persistent store

you have to declare what is possible, specifically

True, but conventions are so powerful, and JSON API already specifies conventions for compound documents (?include=) and sparse fieldsets (?fields=)

The syntax that you are proposing to be standard

I'm not proposing the MongoDB syntax, rather I am using it as an example of a good one. Above all else, I think it exemplifies a JSON-first query language, unlike all the SQL-first stuff

make an application developer more likely to expose data

Only if they implement it that way. I'm primarily interested in implementing filters for GET and HEAD, not for DELETE, PATCH, POST or PATCH (although others are free to if they need it). What exactly is the harm in being able to filter on any field in the JSON that you can already retrieve with GET?

  • At my company, we're only going to be implementing single-field equality filters to start with, as much of our data is currently in MySQL. But when we do need to implement range filters and multi-field filters, we'll already have a document to tell us what it should look like. The design is already done, we'll just need to implement it.
  • Design is tough, tougher than implementation in many cases. Don't underestimate the value in being able to avoid bespoke design, especially for HTTP APIs which aren't going to be directly accessed by non-developers.

@garysweaver
Copy link

many (but not all) new JSON API projects use MongoDB as their persistent store

I'm not saying Mongo is bad; I'm saying that if the standard for any application that wanted to provide a standard JSON API to query data is to provide a way to query exactly like Mongo, you'd have to ensure that query could be locked down to only provide the specific access to the data that the server allowed, and it's not clear to me what benefit there would be to using the Mongo syntax if only specific parts are allowed, because it is a little verbose and requires parsing of the request param value to get values that form part of the query vs. the entire param value being the dynamic part. In Irie's case, order is the only one like that that requires parsing.

What exactly is the harm in being able to filter on any field in the JSON that you can already retrieve with GET?

Nothing, as long as there is no way to construct the query to provide access that was not intended. For example, if the data is extremely large, you may want to cache certain types of queries and only allow those queries, while still conforming to the json-api spec for those types of queries. I just want to ensure that if you are suggesting that anything can be queried that is part of the resource or the resource "tree" with a custom query, then that may be going further than many application developers might want to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants