Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add contains and contains_any filtering #1592

Merged
merged 20 commits into from
Apr 17, 2018
Merged

Conversation

Natim
Copy link
Member

@Natim Natim commented Apr 16, 2018

Fixes #343

  • Add documentation.
  • Add tests.
  • Add a changelog entry.
  • Add your name in the contributors file.
  • If you changed the HTTP API, update the API_VERSION constant and add an API changelog entry in the docs

@@ -336,6 +342,8 @@ def apply_filters(records, filters):
COMPARISON.IN: operator.contains,
COMPARISON.EXCLUDE: lambda x, y: not operator.contains(x, y),
COMPARISON.LIKE: lambda x, y: re.search(y, x, re.IGNORECASE),
COMPARISON.CONTAINS_ALL: contains_all_filtering,
COMPARISON.CONTAINS: lambda record_value, search_term: set(record_value[1]).intersection(set(search_term[1]))
Copy link
Member Author

@Natim Natim Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand where this [1] comes from it is a bit unclear to me right now.

@Natim Natim force-pushed the 343-add-contains-filtering branch 2 times, most recently from 0370823 to 84b21dc Compare April 16, 2018 15:09
Rémy HUBSCHER and others added 4 commits April 16, 2018 17:10
Fixes #343

Co-authored-by: Julien Lebunetel <julien@lebunetel.com>
Co-authored-by: Julien Lebunetel <julien@lebunetel.com>
Co-authored-by: Julien Lebunetel <julien@lebunetel.com>
Paired with @jlebunetel

Co-authored-by: Julien Lebunetel <julien@lebunetel.com>
@Natim Natim force-pushed the 343-add-contains-filtering branch from 84b21dc to a886fcf Compare April 16, 2018 15:11
@glasserc
Copy link
Contributor

This is kind of bikesheddy, but I think contains_foo=x,y should mean "the foo attribute contains ["x", "y"]" (i.e. what you called contains_all here). If you want to write "the foo attribute overlaps with ["x", "y"]", I think you should write intersect or contains_any.

@Natim
Copy link
Member Author

Natim commented Apr 16, 2018

If you have any idea why pip is failling, I would be interested in knowing.

Edited: 503 Backend is unhealthy

@Natim Natim requested a review from leplatrem April 16, 2018 15:51
@Natim Natim changed the title 343 add contains filtering Add contains and contains_any filtering Apr 16, 2018
filtr.operator.value)
# Thanks https://dba.stackexchange.com/a/157186
cond = "TRANSLATE(({})\:\:jsonb\:\:text, '[]','{}')\:\:{}[] {} :{}".format(
sql_field, '{}', array_type, sql_operator, value_holder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is all way too complicated and I am not crazy about the use of string processing to convert from one type to another. I think you should be converting the filter value to JSONB. This way we can support checking for fields containing complex objects like [1, 2] (in a field like [[1,2], [5, 9]]). There's no way to do "overlap" testing with this type, but I think that's OK -- do we really need contains_any at this point in time or did you just figure it was easy to implement at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need contains_any

Yes I do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is all way too complicated and I am not crazy about the use of string processing to convert from one type to another.

That's the only way I found so far to be able to add it in the WHERE clause.

We could use jsonb_to_text_array but it would be have to be a JOIN.

If you find another way of doing it, I would be pleased to use it.

It was also a bit complex to use the SQLAlchemy execute parameters with an array.

@glasserc
Copy link
Contributor

Pip failing might not be your fault... on mozilla/pyjexl#13 I'm getting "too many redirects".

tox.ini Outdated
@@ -9,7 +9,6 @@ commands =
py.test --cov-report term-missing --cov-branch --cov-fail-under 100 --cov kinto {posargs}
deps =
-rdev-requirements.txt
-crequirements.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was this about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah maybe we should keep it.

# However, it does support Postgres arrays of any
# type. Assume that the referenced field is a JSON
# array and convert it to a Postgres array.
# N.B. This won't work on JSON objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean "won't work" ? "will always be false" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've got an array of json objects it won't work. The array should be only strings or int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just understood your remark, I guess we should add a test to make sure of it.

'''''''''''''''''

- New `contains_` and `contains_any` filter operators (fixes #343)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @jlebunetel is working on the documentation part 👍

CHANGELOG.rst Outdated

**New feature**

- Add ``contains`` and ``contains_any`` filtering (Fixes #343).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: usually when we add new features to the HTTP API we highlight it as such (see older entries) :)

@Natim Natim merged commit a9b8d07 into master Apr 17, 2018
@Natim Natim deleted the 343-add-contains-filtering branch April 17, 2018 12:25
# However, it does support Postgres arrays of any
# type. Assume that the referenced field is a JSON
# array and convert it to a Postgres array.
# N.B. This won't work on JSON objects and will never match.
Copy link
Contributor

@glasserc glasserc Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's not right.

http --auth 'user:user' PUT 'http://localhost:8888/v1/buckets/default/collections/def/records/r1' 'data:={"user": {"first": "Ethan", "id": 1234}, "aliases": [{"ll": "ls -l"}, {"gti": "git"}, {"vi": "vim"}]}'
http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_any_aliases=[{"ll":"ls -l"}]'   # matches
http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_any_aliases=[{"sl":"ls -l"}]'   #doesn't match
http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_any_user={"id":1234}'   # 503

Compare with contains:

http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_aliases=[{"ll":"ls -l"}]'   # matches
http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_aliases=[{"sl":"ls -l"}]'   #doesn't match
http --auth 'user:user' 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_user={"id":1234}'  # matches

N.B. these behaviors may not match in the memory backend.

* ``/collection?contains_colors=[red,blue]``

Matches any record whose ``colors`` array field contains ``red`` and
``blue`` elements.
Copy link
Contributor

@glasserc glasserc Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work.

http --auth 'user:user' POST 'http://localhost:8888/v1/buckets/default/collections/def/records' data:='{"name": "Superman", "colors": ["red", "blue"]}'
http --auth 'user:user' GET 'http://localhost:8888/v1/buckets/default/collections/def/records?contains_colors=[red,blue]'

Doesn't match..

* ``/collection?contains_any_colors=[red,blue]``

Matches any record whose ``colors`` array field contains ``red`` or
``blue`` elements.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same problem above is present here.

Matches any records whose ``field`` array field contains ``value``. Values can
be an integer, a string, a list of integers or a list of strings. In case of
values are a list, only matches records whose field contains all the values
listed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if field is not an array?

This can also match objects, but I guess it doesn't matter because almost any object contains you could consider doing is better expressed as a bunch of equality filters?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should raise a 400 for objects here.

result = self.resource.collection_get()
values = [item['fib'] for item in result['data']]
for value in values:
assert 3 in value or 5 in value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are inaccurately named.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whould test_contains_with_string_select_records_with_color be better? Any suggestion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants