-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tags can hold list values #622
Conversation
@skrawcz haven't tried it yet, but just flagging my thoughts: Do both of these return the same thing nodes?
|
f022807
to
2cc83e9
Compare
yep it's order invariant. I even handle the case: @tag(business_line="US")
def my_function() -> ...:
... # does not match
dr.list_available_variables({"business_line": ["US", "UK"]})
# does match
dr.list_available_variables({"business_line": ["US"]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts -- largely on API
# gets all | ||
dr.list_available_variables() | ||
# gets exact matching | ||
dr.list_available_variables({"TAG_NAME": "TAG_VALUE"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API is confusing me quite a bit TBH. TAG_VALUE
is a list means that they all have to match? BUt if its an individual it means they have to match any?
We should be opinionated and simple here. If not, we should leave querying up to the user... I was thinking:
- Only query for individual tag_values in dict. Dicts imply
and
, and an individual one means its one of the decorated values - We leave open the option for more kw params later
dr.select_variables(tag_match={"foo" : "bar", "baz": "qux"})
This would match:
@tag(foo="bar")
@tag(baz="qux")
- `@tag(foo=["bar", "qux"])
But not:
@tag(foo="not_bar")
@tag(foo=["not_baz", "not_bar"])
@tag(foo="bar", baz="not_qux")
@tag(foo=["bar", "qux"], baz=["bar", "not_qux"])
This is being opinionated that matching means any do exist, anything else is up to the user to specify/query independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single value, i.e. no list, is an exact match if the tag doesn't have a list, else if there is a list, it just needs to match one of them.
The issue/grey area, is how do you interpret a list of tag values? To me it stands to reason if we can tag values with lists, we can also query with lists.
E.g. Given:
@tag(business_lines=["CA", "TX"])
def my_func() -> ...:
pass
# e.g. I want to get all nodes applicable to a particular business line (it could be variable in length)
# should this be an exact match if someone passes a single item list?
dr.list_available_variables(tag_filter=dict(business_lines=["CA"]))
# to me the above is equivalent to this -- because a user could also write this:
dr.list_available_variables(tag_filter=dict(business_lines="CA"))
# Why?
# Well, why would someone put a tag on a function with only a single value in it?
@tag(business_lines=["CA"]) # < -- this wouldn't make sense to me.
def my_func() -> ...:
pass
# e.g. I want to get all nodes that correspond to Cali AND Texas AND Florida:
# what about this? I think this is an exact match requirement (i.e. AND), not an OR.
dr.list_available_variables(tag_filter=dict(business_lines=["CA", "TX", "FL"]))
# e.g. I want to get all nodes that hit Cali OR Texas OR Florida
# If you want an OR you would have to do multiple calls and union the results.
(
dr.list_available_variables(tag_filter=dict(business_lines="CA")) |
dr.list_available_variables(tag_filter=dict(business_lines="TX")) |
dr.list_available_variables(tag_filter=dict(business_lines="FL")) |
) # you'd have to make them sets for this to work but you get the picture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kzk2000 any thoughts on lists of values? It would be simpler to not allow it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skrawcz Whatever feels more intuitive is all right by me.
Say
@tag( business_value=["CA", "US"] )`
def combo_node():
...
@tag( business_value=["US"] )`
def only_us_node():
...
@tag( business_value=["CA"] )`
def only_ca_node():
...
so
dr.list_available_variables(tag_filter=dict(business_lines=["CA"])) # case A: returns only_ca_node() and combo_node()
dr.list_available_variables(tag_filter=dict(business_lines="US")) # case B: returns only_us_node() and combo_node()
dr.list_available_variables(tag_filter=dict(business_lines=["CA", "US"] )) # case C: returns all 3 nodes
That is, I would treat the last case as OR operator, if any one of the list entries matches any of the tag's list entries , return the nodes for which this is true.
But I'm also totally fine with making this more explicit as you suggested and do not allow lists in the tag_filter
at all, that is, only allow for Case B above. And then do this as you suggested:
(
dr.list_available_variables(tag_filter=dict(business_lines="CA")) |
dr.list_available_variables(tag_filter=dict(business_lines="TX")) |
dr.list_available_variables(tag_filter=dict(business_lines="FL")) |
) # you'd have to make them sets for this to work but you get the picture
Bottom line as long as `dr.list_available_variables()``` returns all nodes, I can easily run any list comprehension filter on this myself, e.g.
[ x for x in dr.list_available_variables() if x.tag.get('business_line", None) in ['US', 'CA']]
or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I might switch to the OR interpretation then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will try to have something by Friday
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this case:
dr.list_available_variables(tag_filter=dict(business_lines=["CA"], some_other_tag="foo" ))
I'm interpreting it as an AND query, while the values in the list are OR. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So yeah, offline we've synced up and decided that tags are or queries. This leaves the door open with:
tag_filter=dict(foo=or_("bar", "baz"))
tag_filter=dict(foo=and_("bar", "baz"))
or something like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To confirm:
dr.list_available_variables(tag_filter=dict(business_lines=["CA", "US"], some_other_tag="foo" ))
this to me reads:
business_lines=CA OR business_lines=US
AND
some_other_tag=foo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This completes #620. This implements the following behavior: Behavior is now this. Given ```python @tag( business_value=["CA", "US"] )` def combo_node(): ... @tag( business_value=["US"] )` def only_us_node(): ... @tag( business_value=["CA"], some_other_tag="BAR" )` def only_ca_node(): ... ``` then the following queries will return: ```python dr.list_available_variables(tag_filter=dict(business_lines=["CA"])) dr.list_available_variables(tag_filter=dict(business_lines="US")) dr.list_available_variables(tag_filter=dict(business_lines=["CA", "US"] )) dr.list_available_variables(tag_filter=dict(business_lines=None )) dr.list_available_variables(tag_filter=dict(business_lines="UK" )) dr.list_available_variables(tag_filter=dict(business_lines="US", some_other_tag="FOO" ) dr.list_available_variables(tag_filter=dict(business_lines="CA", some_other_tag="BAR" ) ``` So values in lists are read as OR clauses, and multiple tags in the query dict are read as AND clauses. We settled on this design since it seemed the most intuitive to read, while also leaving the door for more complex querying capabilities. Squashed commits: [f7a9752] Makes matches_query return True on empty dict [f93282b] Updates logic for tag querying This moves the tag filter logic to the node module, since it's kind of coupled to it. Behavior is now this. Given ```python @tag( business_value=["CA", "US"] )` def combo_node(): ... @tag( business_value=["US"] )` def only_us_node(): ... @tag( business_value=["CA"], some_other_tag="BAR" )` def only_ca_node(): ... ``` then the following queries will return: ```python dr.list_available_variables(tag_filter=dict(business_lines=["CA"])) dr.list_available_variables(tag_filter=dict(business_lines="US")) dr.list_available_variables(tag_filter=dict(business_lines=["CA", "US"] )) dr.list_available_variables(tag_filter=dict(business_lines=None )) dr.list_available_variables(tag_filter=dict(business_lines="UK" )) dr.list_available_variables(tag_filter=dict(business_lines="US", some_other_tag="FOO" ) dr.list_available_variables(tag_filter=dict(business_lines="CA", some_other_tag="BAR" ) ``` [94c58de] Fixing whitespace and type annotation for list_available_variables These were incorrect. [2cc83e9] Adds exposing tag_filter open on listing variables #620 This is a quick way to improve the ergonomics of listing all available variables and filtering by some criteria. ```python dr.list_available_variables() # gets all dr.list_available_variables({"TAG_NAME": "TAG_VALUE"}) # gets all matching tag name & value dr.list_available_variables({"TAG_NAME": ["TAG_VALUE", "TAG_VALUE2"]}) # gets all matching tag name & all values order invariant dr.list_available_variables({"TAG_NAME": None}) # gets all with matching tag, irrespective of value ``` This completes 620, and handles the filtering logic in the driver function. I think this is an reasonable place to put it for now. Assumptions: - if passed a list it's an exact match - AND clause across all values passed in. - None means just get me anything with that tag. - it's keyword only argument to enable us to maintain backwards compatibility for future changes. More complex clauses seem out of scope for this change. But for the future, if we want to change things, e.g. take in a query object that can express `not in, or, etc` We can do that. Another idea I didn't pursue was enabling query syntax: ```python {"TAG_NAME": "=FOO"}, # equal {"TAG_NAME": "!FOO"}, # not equal etc ``` Since with lists of values of strings, this could get quite complex to parse/understand...
Implements a version of #620 .
then the following queries will return:
Changes
@tag
now can take in a list of stringsdriver.list_available_variables()
now takes in a tag_filter argumentHow I tested this
Notes
Checklist