Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace tag_search with tag_key and tag_value. #33

Closed
wants to merge 1 commit into from
Closed

Conversation

tetron
Copy link
Contributor

@tetron tetron commented Jun 7, 2018

Spinoff from discussion on #30

We allow key-value "tags" on workflow runs. These are most useful if the client can filter on them. The current description for "tag_search" is too vague. This adds query parameters tag_key and tag_value and describes their usage more precisely.

Also adds "state" query parameter for filtering by state.

Also add filtering by state.
@tetron
Copy link
Contributor Author

tetron commented Jun 7, 2018

Copy link
Member

@jaeddy jaeddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for simplifying the tag_key/tag_value combo to a single filter object parameter.

in: query
required: false
type: string
- name: tag_value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding this right, this filter would look like GET ".../workflows?tag_key=foo&tag_value=bar" — correct? I can see the argument for arbitrarily complex queries, but doesn't seem like the current OpenAPI spec supports that very well. Because we're already defining tags in the POST request to be an object, I think it's reasonable to use a single parameter with the same format here, e.g.:

- name: filter
  description: |-
    OPTIONAL
    Filter workflow runs to only return those where exact key-value pairs are present in the 
    "tags" object specified by the client to track submissions. 
  in: query
  required: false
  type: object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does OpenAPI specify how type: object is encoded when it is part of a query? The main reason I proposed doing it this way is to avoid jamming ugly url-encoded json in the query portion

Copy link
Member

@jaeddy jaeddy Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default style (form) and "explode" behavior (true) for object parameter serialization in queries would apparently look like this:

Object id = {"role": "admin", "firstName": "Alex"} => /users?role=admin&firstName=Alex

You can specify more fine-grained and/or complex types for the parameter with schema (example from this page:

parameters:
  - in: query
    name: filter
    # Wrap 'schema' into 'content.<media-type>'
    content:
      application/json:  # <---- media type indicates how to serialize / deserialize the parameter content
        schema:
          type: object
          properties:
            type:
              type: string
            color:
              type: string

... but I think that only works if we define specific keys for the filter. (?)

Not sure if that answers your question, or avoids encoding issues. I can play around with some examples after today's call to see how it looks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. I think you're proposing that any query parameter that isn't one of page_size, page_token or state would be considered a filter on tags. That seems quite reasonable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to URL-encoding a JSON object and including it as a query parameter is to put the object in the request body (and use POST instead of GET for the HTTP verb). The advantage is that if the object becomes large the URL will not run up against the maximum URL length.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW we ran into the URL length in Cromwell, which led to us having both a GET and a POST for the equivalent behavior (why not just POST? Because people normally see it as a pain). So this is definitely a potential issue.

in: query
required: false
type: string
- name: state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the state parameter description just be replaced with:

- $ref: '#/definitions/State'

...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, since it only makes sense to accept a string in the state enum.

A key-value map of arbitrary metadata which may be used by
the client to organize and manage workflow runs. Tags must
not be considered part of the workflow run input. Clients may
use the "tag_key" and "tag_value" query parameters to filter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... use the "tag_key" and "tag_value" query parameters to filter ...

to

... use the key-value query parameters to filter ...

(see above comment re: filter parameter)

@tetron
Copy link
Contributor Author

tetron commented Jun 25, 2018

Just to surface this, I think @jaeddy was proposing that any URL query parameter that isn't one of page_size, page_token or state would be considered a filter on tags. I don't know how to express that with swagger but otherwise I don't have a problem with in principle, what do you think @geoffjentry @mckinsel ?

@geoffjentry
Copy link
Contributor

@tetron I'm not sure if I think that'd be more or less confusing to a user, but have no strong feelings.

@briandoconnor
Copy link
Contributor

@mckinsel is this a requirement for HCA?

@jaeddy is this needed for the testbed?

What about other drivers

@briandoconnor
Copy link
Contributor

What about @geoffjentry ? He said if it's on the spec they will implement...

@geoffjentry
Copy link
Contributor

@briandoconnor I don't have a strong opinion on the topic and we'll likely need to do roughly the same amount of work either way

@briandoconnor
Copy link
Contributor

@jaeddy will resolve the conflicts after the other PRs get merged

Copy link
Member

@jaeddy jaeddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenAPI 2.0 doesn't support object types for query parameters. My suggestion of mirroring the "tags" map from the workflow run request with a single "filter" parameter" won't work for now. We could investigate @brucehoff's suggestion of a "POST /workflows/_search" type endpoint as a workaround, but I think it's OK to merge the current PR and move forward.

@david4096
Copy link
Member

I might suggest that filtering on this service is a nice-to-have, but unless there is a specific use case that requires it I would suggest leaving this as an optional upgrade a WES implementing service could provide. WES is not a system of record for provenance and even if the number of workflows is in the thousands, retrieving the entirety of the index of running workflows will be in the kilobytes, making client side filtering a breeze.

@jaeddy
Copy link
Member

jaeddy commented Jul 9, 2018

Thanks, @david4096. I think we were hoping to get @mckinsel's input on whether this would be required for HCA (not sure about other Driver Projects at this point). I agree that paging and limits should work fine for most current use cases.

Originally, this PR was to clarify the "tag_search" parameter in the WES spec — maybe the better answer for now is to remove it?

@dglazer
Copy link
Member

dglazer commented Jul 10, 2018

If none of our driver projects need search now, I agree that removing it is the right short-term answer.

(And if we do decide to keep it in, I'd like to confirm that whatever syntax we agree on meets the needs of Cromwell's existing search API users -- otherwise we're pretty much guaranteeing we'll have to rev it soon.)

@geoffjentry
Copy link
Contributor

@dglazer if it helps, I'm fairly confident (read: I've been starting to argue for w/ an increasing amount of vigor) that Cromwell will need to change its search API in the foreseeable future. My $0.02 is to find something off the shelf which makes sense instead of going down the otherwise inevitable path of making a bespoke query language (the irony of that last statement is not lost on me)

@dglazer
Copy link
Member

dglazer commented Jul 10, 2018

It does help, in that it makes me feel even more strongly that since WES and Cromwell are both inventing new search APIs, we should only invent it once. And since it we'd probably like to take more time to get it right, leaving it out entirely from WES 1.0 would be ideal.

(Unless of course an existing driver project needs it -- then we can understand their needs, do something minimal, and expect it to change later.)

@david4096
Copy link
Member

Added this issue to remove tag_search: #66

@mckinsel
Copy link

I believe the HCA interest in filtering and querying is mostly driven by Job Manager UI, which presumably is relying some intersection of the filtering capabilities of Cromwell and dsub. @dglazer, you may have more detail about that than anybody.

It doesn't sound like there's confidence that that intersection is something we'd want to write into an API standard at this point, so let's not. I assume we could have a situation where Cromwell implements WES 1.0 and a couple additional routes for JMUI, and more filtering eventually is included in WES when it's a little more fully baked.

@geoffjentry
Copy link
Contributor

@mckinsel yeah i'd rather not codify JMUI needs as A Thing yet as i can already see it's going in a bad direction and For Now we can handle it w/ Cromwell. I do agree that there's a There There but we can discuss it after Basel (again, ideally w/ OTS solutions)

@dglazer
Copy link
Member

dglazer commented Jul 12, 2018

Thanks @mckinsel for confirming that you only depend on filtering as implemented by JMUI. GIven that, I agree with you and @geoffjentry that API standardization is premature.

In the future, I agree with Jeff that the Right Thing is to define a WES API that meets JMUI (and other) needs directly, and have that be the one way that Cromwell supports filtering.

@geoffjentry
Copy link
Contributor

To make a slightly less confusing way of phrasing what I stated yesterday - a lot of the changes we've added to the cromwell search API in the recent past have been specifically to support JMUI but also are highlighting why I think we need to start from scratch. So I think that JMUI makes for a good use case on a redesign, but not the current JMUI/Cromwell interaction.

@pgrosu
Copy link

pgrosu commented Jul 12, 2018

Who is JMUI, and what are the search projects being worked on? Is there a place where the specifications for such use cases are defined in order to better understand the deliverables?

Also was there a call yesterday I was not aware of? If there are meeting notes for that call that would be very helpful, in order to catch up on what was discussed.

Thanks,
Paul

@geoffjentry
Copy link
Contributor

@pgrosu JMUI is the Job Manager UI from Data Biosphere. It was originally created by and for HCA (and thus something a driver project relies on).

There was no call yesterday, at least not one I was on.

@pgrosu
Copy link

pgrosu commented Jul 12, 2018

Thanks Jeff (@geoffjentry) for the clarification - appreciate it,
Paul

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants