-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace tag_search with tag_key and tag_value. #33
Conversation
Also add filtering by state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for simplifying the tag_key
/tag_value
combo to a single filter
object parameter.
in: query | ||
required: false | ||
type: string | ||
- name: tag_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding this right, this filter would look like GET ".../workflows?tag_key=foo&tag_value=bar"
— correct? I can see the argument for arbitrarily complex queries, but doesn't seem like the current OpenAPI spec supports that very well. Because we're already defining tags
in the POST request to be an object, I think it's reasonable to use a single parameter with the same format here, e.g.:
- name: filter
description: |-
OPTIONAL
Filter workflow runs to only return those where exact key-value pairs are present in the
"tags" object specified by the client to track submissions.
in: query
required: false
type: object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does OpenAPI specify how type: object
is encoded when it is part of a query? The main reason I proposed doing it this way is to avoid jamming ugly url-encoded json in the query portion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default style (form
) and "explode" behavior (true
) for object parameter serialization in queries would apparently look like this:
Object id = {"role": "admin", "firstName": "Alex"}
=> /users?role=admin&firstName=Alex
You can specify more fine-grained and/or complex types for the parameter with schema (example from this page:
parameters:
- in: query
name: filter
# Wrap 'schema' into 'content.<media-type>'
content:
application/json: # <---- media type indicates how to serialize / deserialize the parameter content
schema:
type: object
properties:
type:
type: string
color:
type: string
... but I think that only works if we define specific keys for the filter. (?)
Not sure if that answers your question, or avoids encoding issues. I can play around with some examples after today's call to see how it looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I think you're proposing that any query parameter that isn't one of page_size
, page_token
or state
would be considered a filter on tags. That seems quite reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to URL-encoding a JSON object and including it as a query parameter is to put the object in the request body (and use POST instead of GET for the HTTP verb). The advantage is that if the object becomes large the URL will not run up against the maximum URL length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW we ran into the URL length in Cromwell, which led to us having both a GET
and a POST
for the equivalent behavior (why not just POST
? Because people normally see it as a pain). So this is definitely a potential issue.
in: query | ||
required: false | ||
type: string | ||
- name: state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the state
parameter description just be replaced with:
- $ref: '#/definitions/State'
...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, since it only makes sense to accept a string in the state enum.
A key-value map of arbitrary metadata which may be used by | ||
the client to organize and manage workflow runs. Tags must | ||
not be considered part of the workflow run input. Clients may | ||
use the "tag_key" and "tag_value" query parameters to filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... use the "tag_key" and "tag_value" query parameters to filter ...
to
... use the key-value query parameters to filter ...
(see above comment re: filter
parameter)
Just to surface this, I think @jaeddy was proposing that any URL query parameter that isn't one of |
@tetron I'm not sure if I think that'd be more or less confusing to a user, but have no strong feelings. |
What about @geoffjentry ? He said if it's on the spec they will implement... |
@briandoconnor I don't have a strong opinion on the topic and we'll likely need to do roughly the same amount of work either way |
@jaeddy will resolve the conflicts after the other PRs get merged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenAPI 2.0 doesn't support object types for query parameters. My suggestion of mirroring the "tags" map from the workflow run request with a single "filter" parameter" won't work for now. We could investigate @brucehoff's suggestion of a "POST /workflows/_search" type endpoint as a workaround, but I think it's OK to merge the current PR and move forward.
I might suggest that filtering on this service is a nice-to-have, but unless there is a specific use case that requires it I would suggest leaving this as an optional upgrade a WES implementing service could provide. WES is not a system of record for provenance and even if the number of workflows is in the thousands, retrieving the entirety of the index of running workflows will be in the kilobytes, making client side filtering a breeze. |
Thanks, @david4096. I think we were hoping to get @mckinsel's input on whether this would be required for HCA (not sure about other Driver Projects at this point). I agree that paging and limits should work fine for most current use cases. Originally, this PR was to clarify the "tag_search" parameter in the WES spec — maybe the better answer for now is to remove it? |
If none of our driver projects need search now, I agree that removing it is the right short-term answer. (And if we do decide to keep it in, I'd like to confirm that whatever syntax we agree on meets the needs of Cromwell's existing search API users -- otherwise we're pretty much guaranteeing we'll have to rev it soon.) |
@dglazer if it helps, I'm fairly confident (read: I've been starting to argue for w/ an increasing amount of vigor) that Cromwell will need to change its search API in the foreseeable future. My $0.02 is to find something off the shelf which makes sense instead of going down the otherwise inevitable path of making a bespoke query language (the irony of that last statement is not lost on me) |
It does help, in that it makes me feel even more strongly that since WES and Cromwell are both inventing new search APIs, we should only invent it once. And since it we'd probably like to take more time to get it right, leaving it out entirely from WES 1.0 would be ideal. (Unless of course an existing driver project needs it -- then we can understand their needs, do something minimal, and expect it to change later.) |
Added this issue to remove tag_search: #66 |
I believe the HCA interest in filtering and querying is mostly driven by Job Manager UI, which presumably is relying some intersection of the filtering capabilities of Cromwell and dsub. @dglazer, you may have more detail about that than anybody. It doesn't sound like there's confidence that that intersection is something we'd want to write into an API standard at this point, so let's not. I assume we could have a situation where Cromwell implements WES 1.0 and a couple additional routes for JMUI, and more filtering eventually is included in WES when it's a little more fully baked. |
@mckinsel yeah i'd rather not codify JMUI needs as A Thing yet as i can already see it's going in a bad direction and For Now we can handle it w/ Cromwell. I do agree that there's a There There but we can discuss it after Basel (again, ideally w/ OTS solutions) |
Thanks @mckinsel for confirming that you only depend on filtering as implemented by JMUI. GIven that, I agree with you and @geoffjentry that API standardization is premature. In the future, I agree with Jeff that the Right Thing is to define a WES API that meets JMUI (and other) needs directly, and have that be the one way that Cromwell supports filtering. |
To make a slightly less confusing way of phrasing what I stated yesterday - a lot of the changes we've added to the cromwell search API in the recent past have been specifically to support JMUI but also are highlighting why I think we need to start from scratch. So I think that JMUI makes for a good use case on a redesign, but not the current JMUI/Cromwell interaction. |
Who is JMUI, and what are the search projects being worked on? Is there a place where the specifications for such use cases are defined in order to better understand the deliverables? Also was there a call yesterday I was not aware of? If there are meeting notes for that call that would be very helpful, in order to catch up on what was discussed. Thanks, |
@pgrosu JMUI is the Job Manager UI from Data Biosphere. It was originally created by and for HCA (and thus something a driver project relies on). There was no call yesterday, at least not one I was on. |
Thanks Jeff (@geoffjentry) for the clarification - appreciate it, |
Spinoff from discussion on #30
We allow key-value "tags" on workflow runs. These are most useful if the client can filter on them. The current description for "tag_search" is too vague. This adds query parameters tag_key and tag_value and describes their usage more precisely.
Also adds "state" query parameter for filtering by state.