Added ZSET command ZDIFFSTORE #448

Open
wants to merge 3 commits into
from

3 participants

parrish commented Apr 11, 2012

I've posted to an existing topic on the Google Group (pending approval) to explain our use case, but I'll duplicate it here.

We use Redis to select the next object to be shown to a user for classification.

Two things are important in this selection:

  1. Users are shown objects only once (unique selection)
  2. Objects are scored based on prior classifications (scores are updated in real time), and users should be shown the highest-scored object they haven't seen.

Currently we do the following:

  • Store many keys for object scores (objects_score_<id>)
  • Store a set of object ids seen by a user (seen_objects_for_user_<id>)
  • Store a set of object ids available for classification (objects)
  • Delete the temporary diff set once selection is done (its score-based ordering goes stale shortly after creation)

Then perform:

sdiffstore user_<id>_unseen_objects objects seen_objects_for_user_<id>
sort user_<id>_unseen_objects DESC BY objects_score_* LIMIT 0 1
del user_<id>_unseen_objects
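
For readers who want to see what those three commands compute, here is a minimal in-memory sketch in Python. It needs no Redis server. The function name `next_object`, the sample ids, and the scores are all made up for illustration. Treating a missing score as 0.0 and breaking ties by id are assumptions of the sketch, not something stated above.

```python
# In-memory model of the SDIFFSTORE / SORT / DEL sequence above.
# Names and data are hypothetical; this is a sketch, not the production code.

def next_object(objects, seen, scores):
    """Return the highest-scored object id the user has not seen, or None."""
    # SDIFFSTORE: every available object minus the ones this user has seen.
    unseen = objects - seen
    if not unseen:
        return None
    # SORT ... DESC BY objects_score_* LIMIT 0 1: rank by the external score
    # and keep only the top entry. Missing scores default to 0.0 here, and
    # ties are broken by id purely to keep the result deterministic.
    return max(unseen, key=lambda oid: (scores.get(oid, 0.0), oid))
    # DEL has no counterpart: `unseen` is a local and is simply discarded.

objects = {"a", "b", "c", "d"}
seen_by_user = {"b"}
scores = {"a": 1.5, "b": 9.0, "c": 4.0, "d": 2.0}

print(next_object(objects, seen_by_user, scores))  # c
```

Object "b" has the highest score but is skipped because the user has already seen it. The sketch also shows where the cost comes from: the full diff is materialized on every request even though only one element is kept.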

Given we have over 600,000 users and well over 10,000,000 objects between our projects, staying responsive while handling rates above 100 requests/second is non-trivial. Optimizing this specific use-case is pretty critical for us.

Please feel free to suggest changes, revisions, etc.

Thanks!

-Michael

@parrish parrish referenced this pull request Apr 19, 2012

Open

zdiffstore #446

Plasma commented Jan 11, 2014

This would be appreciated; I have a use case for this as well:

  1. Maintain a sorted set of "New Item Ids", capped at 100 items, sorted/scored by insertion timestamp
  2. Maintain a set/sorted set of "Recently Seen Item Ids", capped at 100 items, sorted/scored by insertion timestamp

To answer the query of "What New Item Ids have not been seen recently?" I would like to ZDIFF the sets.
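
A rough sketch of the result I have in mind, modelling each sorted set as a Python dict of member to score. The helper name `zdiff` and the sample data are hypothetical, and keeping the first set's scores is an assumption of this sketch rather than a description of what the patch does.

```python
# Sorted sets modelled as {member: score}; no Redis server involved.

def zdiff(first, *others):
    """Members of `first` absent from all `others`, as (member, score) pairs.

    Scores come from `first`; the result is ordered by (score, member).
    """
    excluded = set().union(*others)  # iterating a dict yields its members
    kept = [(m, s) for m, s in first.items() if m not in excluded]
    return sorted(kept, key=lambda pair: (pair[1], pair[0]))

# Scores are insertion timestamps, as in the use case above.
new_items = {"item:1": 1000.0, "item:2": 1010.0, "item:3": 1020.0}
recently_seen = {"item:2": 1015.0}

print(zdiff(new_items, recently_seen))
# [('item:1', 1000.0), ('item:3', 1020.0)]
```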
