Add support for set difference #14
This adds support for Redis' SDIFF and SDIFFSTORE operations, allowing us to efficiently calculate set difference and, optionally, store the result in another Redis key.
fwiw, I just knocked out the union ( |
Thanks for looking at this! We generally don't want to just match Redis commands in Ruby, but rather base the types on use cases from within Ruby. That's why we have e.g. UniqueList which uses multiple commands underneath the hood. I think the |
Hey @kaspth, thanks for taking a look. I enjoyed hacking on Kredis, so good job so far!

Here's what I'm working on, which might explain my use case: I need to sync data with a rather terrible API (basically a mailing list). I need to grab all the email addresses from the API, grab the user list from our database, and then work out which emails to delete from the API and which to add so I can keep them in sync. The current script is straight Ruby and uses the Array difference operator to determine both lists.

Things are getting a little slow now that there are tens of thousands of entries, and I'd like to split this up into separate jobs. Instead of storing the entries in a Ruby array, I'll be inserting them into a Redis set to coordinate the data from the various jobs. Then, just like the Ruby code, I'll use set difference to determine which email addresses to add and which to delete.

In Kredis, using this PR looks like:

```ruby
paying_customers = Kredis.set 'customers'
api_contacts = Kredis.set 'contacts'

emails_to_delete = api_contacts - paying_customers
emails_to_delete.each { |e| DeleteContactJob.perform_later e }

emails_to_add = paying_customers - api_contacts
emails_to_add.each { |e| AddContactJob.perform_later e }
```

Whenever I use a set data structure, it's almost always because I need set operations, especially diff and intersect, perhaps with some uniqueness constraints; otherwise, I'd just use a list. I'd really like to see those operations here in Kredis too. I think it would be a powerful use of the data structure and would maintain compatibility with the familiar Array interface. With these operators, you could take pure Ruby code using Arrays and convert it to Redis-backed sets using Kredis with a familiar interface.

TBH, I wasn't aware of the |
Heyo, I've been trying to do some more thinking around this and I've been wondering if we should support Set operations right now. Because we're essentially opening the door to cross-type comparisons as shown by the need to call

Meanwhile, you can make your example work with this:

```ruby
emails_to_delete = api_contacts.proxy.sdiff paying_customers.key
```

Thanks for the suggestion! You're more than welcome to share more use cases here or in other issues as you add Kredis to your app 🙏
Gotcha! I think the hardest part of starting an open source project is figuring out what it is and what it's not. The cross-type comparisons are something I considered and would probably need some type-checking to raise an exception if you tried to perform set difference with a non-set type. I'm not sure how Redis handles that situation either. To be honest, I'll probably go back to plain redis commands instead of kredis.proxy… 😞 |
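The type-checking idea mentioned above could look roughly like the sketch below. `SketchSet` and `SketchList` are hypothetical in-memory stand-ins invented for illustration; they are not Kredis classes, and nothing here talks to Redis. On the Redis side, running SDIFF against a key that holds a non-set value is rejected with a WRONGTYPE error, so a client-side check like this would mainly give an earlier and more descriptive failure.

```ruby
require "set"

# Hypothetical stand-in for a set-typed key (not a Kredis class).
class SketchSet
  attr_reader :members

  def initialize(*members)
    @members = Set.new(members)
  end

  # Set difference that refuses operands of any other type.
  def -(other)
    unless other.is_a?(SketchSet)
      raise TypeError, "expected a SketchSet, got #{other.class}"
    end

    SketchSet.new(*(members - other.members))
  end
end

# Hypothetical stand-in for a list-typed key (not a Kredis class).
class SketchList
  def initialize(*elements)
    @elements = elements
  end
end

contacts  = SketchSet.new("a@example.com", "b@example.com")
customers = SketchSet.new("b@example.com")

# Set minus set works as usual.
puts (contacts - customers).members.to_a.inspect

# Set minus list is rejected before any comparison happens.
begin
  contacts - SketchList.new("b@example.com")
rescue TypeError => e
  puts "rejected: #{e.message}"
end
```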
This adds support for Redis' SDIFF and SDIFFSTORE operations allowing us to efficiently calculate set difference and, optionally, store the result in another Kredis key.
It looks like this:
I'm not sure if this is a direction you'd like Kredis to go in, but it would be easy enough to add SUNION (`set + otherset`) and SINTER (`set & otherset`) operations to mirror the same operations in Array. Happy to write PRs for those operations too or expand this one.
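As a rough illustration of the operator mapping proposed here, the sketch below uses Ruby's in-memory `Set` rather than Kredis, so the names `set` and `otherset` are plain local variables and no Redis calls are made. One caveat worth keeping in mind: on `Set`, `+` is an alias of `|` and yields a union, whereas `Array#+` concatenates and keeps duplicates (the Array union operator is `|`).

```ruby
require "set"

set      = Set["a", "b", "c"]
otherset = Set["b", "c", "d"]

difference   = set - otherset # same semantics as SDIFF
union        = set + otherset # same semantics as SUNION (Set#+ aliases Set#|)
intersection = set & otherset # same semantics as SINTER

puts difference.to_a.inspect
puts union.to_a.inspect
puts intersection.to_a.inspect

# Array#+ is concatenation, so duplicates survive; Array#| removes them.
concatenated = ["a", "b"] + ["b", "c"]
array_union  = ["a", "b"] | ["b", "c"]
puts concatenated.inspect
puts array_union.inspect
```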