Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow st.lists() to accept a tuple of functions for unique_by #1916

Closed
bleakley opened this issue Apr 3, 2019 · 3 comments · Fixed by #1962
Closed

Allow st.lists() to accept a tuple of functions for unique_by #1916

bleakley opened this issue Apr 3, 2019 · 3 comments · Fixed by #1962
Labels
enhancement it's not broken, but we want it to be better

Comments

@bleakley
Copy link

bleakley commented Apr 3, 2019

Let's say I want to generate a list of users, but I want each user to be unique by id and email independently, not unique by the pair of properties.

For example, the strategy should not generate [{'id': 1, 'email': 'john@aol.com'}, {'id': 1, 'email': 'jane@aol.com'}], because they both have the same id. This might occur if I were to do the following:

strategies.lists(_user_generator(), min_size=1, max_size=2, unique_by=lambda x: (x['id'], x['email']))

What I would really like is something along the lines of:

strategies.lists(_user_generator(), min_size=1, max_size=2, unique_by=[lambda x: x['id'], lambda x: x['email']])

Am I missing a common approach here, or is something like this needed as a new feature?

@Zac-HD Zac-HD added the enhancement it's not broken, but we want it to be better label Apr 3, 2019
@Zac-HD
Copy link
Member

Zac-HD commented Apr 3, 2019

Sounds like a useful feature request to me! Tagged as enhancement because it's upgrading an existing strategy rather than adding a new one.

You could currently approximate this by generating two unique lists, or by filtering, but that will be inefficient to shrink or to generate respectively. (check out the @composite docs if you don't mind that)

For anyone who wants to implement this: check out hypothesis.searchstrategies.collections.UniqueListStrategy, and think about using a list of key functions. As a design point I think we should stick to a single "seen" set for all the key functions; this is a slightly stronger uniqueness property in that it avoids possible collisions - for example unique_by=(get_first_element, get_last_element) nothing can be generated with a last element that is the same as any other last or first element. It's much easier to relax this later than to tighten the uniqueness constraint!

Then thread that back through to the public interface. For validation, I think we'll insist on a single callable or a tuple of two or more callables, without allowing lists (because of mutation).

@bleakley
Copy link
Author

bleakley commented Apr 3, 2019

Thank you, it sounds like @composite will be helpful.

@Zac-HD Zac-HD changed the title Allow strategies to accept a list of functions for unique_by Allow st.lists() to accept a tuple of functions for unique_by May 6, 2019
@mjsir911
Copy link
Contributor

mjsir911 commented May 6, 2019

I can have a go at this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement it's not broken, but we want it to be better
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants