Add SINTERCARD/ZINTERCARD Commands #8946

jonahharris · 2021-05-14T19:42:14Z

Add SINTERCARD and ZINTERCARD commands that are similar to
ZINTER and SINTER but only return the cardinality with minimum
processing and memory overheads.

For set-based operations, a painful consequence of requiring only the resulting cardinality is a substantial memory overhead in either returning the entire resulting set or storing it in another key. This adds SINTERCARD/ZINTERCARD commands, which have zero memory overhead and return only the resulting cardinality. With these commands, performing Jaccard-type calculations on two sets is substantially faster and less resource-intensive - it's simply an SCARD of both sets and one SINTERCARD. Unfortunately, there is no easy way to implement a similar cardinality for unions given the underlying implementation. ZINTERCARD is kinda nasty from a factoring perspective given zunionInterDiffGenericCommand's handling of all use-cases.
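
To illustrate the idea, here is a minimal Python sketch (not the C code in this PR) of how an intersection cardinality can be counted without materializing the result, and how a Jaccard index then follows from two set sizes and that one count. The function names are hypothetical, plain Python sets stand in for Redis sets, and probing from the smallest set is an assumption of the sketch.

```python
from typing import Iterable, Set


def intersection_cardinality(sets: Iterable[Set[str]]) -> int:
    """Count members common to all sets without building the result set."""
    ordered = sorted(sets, key=len)  # probe from the smallest set (assumption)
    if not ordered:
        return 0
    smallest, others = ordered[0], ordered[1:]
    count = 0
    for member in smallest:
        if all(member in other for other in others):
            count += 1  # only a counter grows, never a result set
    return count


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index from two cardinalities and one intersection count."""
    inter = intersection_cardinality([a, b])
    union = len(a) + len(b) - inter  # inclusion-exclusion
    return inter / union if union else 0.0


s1 = {"a", "b", "c", "d"}
s2 = {"c", "d", "e"}
print(intersection_cardinality([s1, s2]))  # 2
print(jaccard(s1, s2))  # 0.4
```

In Redis terms, `len(a)` and `len(b)` would be the two SCARD calls and `intersection_cardinality` the single SINTERCARD call.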

Anyway, I'm interested in thoughts on this, as these commands are required for a good amount of recommendation system work and, while they could be done with modules, it seems nasty to copy the logic of Redis core out into a module. If there's a pro-command response, I'll clean up the zset variant some.

jonahharris · 2021-05-14T19:57:42Z

@oranagra @yossigo @itamarhaber Thoughts on this? At present, it's resulting in a ~4x performance gain for the project I'm working on.

jonahharris · 2021-05-14T20:07:07Z

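local function microtime ()
  -- TIME returns { seconds, microseconds }. Add them numerically: joining
  -- them as 'sec.usec' drops the leading zeros of the microsecond part.
  local ts = redis.call('time');
  return (tonumber(ts[1]) + (tonumber(ts[2]) / 1000000));
end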

local function sinter (s1, s2)
  return #(redis.call('sinter', s1, s2));
end

local function sinterstore (s1, s2)
  return redis.call('sinterstore', 'tempset', s1, s2);
end

local function sintercard (s1, s2)
  return redis.call('sintercard', s1, s2);
end

local function jaccard (s1, s2, interfunc)
  local ic = interfunc(s1, s2);
  local uc = ((redis.call('scard', s1) + redis.call('scard', s2)) - ic);

  if (0 == uc) then
    return tostring(0.0);
  else
    return tostring(ic / uc);
  end 
end

local function main (
  argc,
  argv
)
  if (2 ~= argc) then
    return redis.error_reply('invalid number of arguments');
  end

  local num_iterations = 1000;
  local impl = {
    ['sinter'] = sinter,
    ['sinterstore'] = sinterstore,
    ['sintercard'] = sintercard
  };
  local result = {};
  for k, v in pairs(impl)
  do
    local sum = 0;
    for ii = 1, num_iterations
    do
      local t1 = microtime();
      jaccard(argv[1], argv[2], v);
      sum = (sum + (microtime() - t1));
    end
    local avg = (sum / num_iterations);
    result[k] = avg;
    print(string.format('%s %8.6fs avg', k, avg));
  end

  local comparators = { 'sinter', 'sinterstore' };
  for ii = 1, #comparators
  do
    local comparator = comparators[ii];

    print(string.format('COMPARED TO %s', comparator));
    print(string.format('increase/decrease...... %8.6f %%',
      ((result['sintercard'] - result[comparator]) / result[comparator]) * 100.0));
    print(string.format('performance increase... %8.6f %%',
      ((result[comparator] - result['sintercard']) / result['sintercard']) * 100.0));
    print(string.format('times faster........... %8.6f',
      (result[comparator] / result['sintercard'])));
  end
end

return main(#KEYS, KEYS);

jharris-laptop02:redis jharris$ redis-cli --eval jaccard.lua s1 s2
sinter 0.002803s avg
sintercard 0.000908s avg
sinterstore 0.005039s avg
COMPARED TO sinter
increase/decrease...... -67.620561 %
performance increase... 208.837964 %
times faster........... 3.088380
COMPARED TO sinterstore
increase/decrease...... -81.986088 %
performance increase... 455.126494 %
times faster........... 5.551265
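
For reference, the three figures printed per comparison all derive from two average timings. The Python sketch below restates the script's formulas. The function and key names are made up for illustration, and the inputs are the rounded averages printed above, so the results only approximately match the printed percentages.

```python
def compare(baseline: float, candidate: float) -> dict:
    """Relate a baseline timing to a candidate timing, as the Lua script does."""
    return {
        # relative change in time per call (negative means faster)
        "change_pct": (candidate - baseline) / baseline * 100.0,
        # extra throughput relative to the baseline
        "gain_pct": (baseline - candidate) / candidate * 100.0,
        # plain ratio of the two timings
        "times_faster": baseline / candidate,
    }


# sinter vs. sintercard, using the rounded averages from the run above
metrics = compare(0.002803, 0.000908)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the "performance increase" figure is just (times faster - 1) * 100, so the last two lines of each comparison carry the same information.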

madolson

Seems useful enough to me. We've had an ongoing discussion about trying to reduce the number of commands that exist, but it seems like this is best left as a separate command since it returns a very different return type.

Some options like "WITHSCORES" also don't really make sense with zintercard, and should be blocked.

madolson · 2021-05-14T21:11:02Z

src/t_zset.c

-                    dictAdd(dstzset->dict,tmp,&znode->score);
-                    if (sdslen(tmp) > maxelelen) maxelelen = sdslen(tmp);
+                    if (!cardinality_only) {
+                      tmp = zuiNewSdsFromValue(&zval);


madolson (May 14, 2021): nit: this spacing is off

jonahharris (May 14, 2021): Agreed. Fixed.

itamarhaber (Jun 30, 2021): Still missing the blockage of irrelevant options (WITHSCORES, WEIGHTS, ...)

jonahharris · 2021-05-14T22:48:49Z

> Seems useful enough to me. We've had an ongoing discussion about trying to reduce the number of commands that exist, but it seems like this is best left as a separate command since it returns a very different return type.

Agreed. While the store variant of zunionInterDiffGenericCommand similarly returns only the cardinality of the resulting zset, there wasn't a good way for me to fit this concept into any of the current commands that made sense - with the exception of adding a new intersection-only option which seemed nasty and out of place.

> Some options like "WITHSCORES" also don't really make sense with zintercard, and should be blocked.

Definitely. That's the main thing I'd like to clean up if there is a desire to move this forward.

oranagra · 2021-05-19T11:01:02Z

i also feel that a new command is probably better, but i wanna note that in some sense WITHSCORES also changes the response type (more clearly in RESP3), and that mutually exclusive arguments are also common (NX and XX), so in that sense a CARDONLY argument might have been ok too.

I think the reasoning may be that a separate command is fine when there aren't tons of other common arguments (other than the two inputs).
i.e. the WEIGHTS and AGGREGATE arguments for ZDIFF and ZUNION don't apply here, it's just a matter of defining the operation (cmd name), and two inputs.
that's unlike other commands we unified, where keeping them apart would mean that extending one requires extending the others too (e.g. GEORADIUS vs GEOBOX, and SETEX vs GETSET)

oranagra

sorry for the delay. generally LGTM, few minor suggestions.
i'd like to hear @itamarhaber feedback on the command and if there was anything similar discussed in the past.

src/server.c

src/t_zset.c

yossigo · 2021-06-15T13:44:04Z

New command and syntax LGTM.

src/t_zset.c

itamarhaber · 2021-06-30T15:23:18Z

I upvote the new API and the use case.

oranagra · 2021-07-19T08:26:32Z

@jonahharris do you want to see it through, or shall i pick it up?

oranagra · 2021-07-25T11:36:34Z

@yossigo please review my changes.
p.s. note that unlike ZINTER and ZINTERSTORE, which take several flag arguments, ZINTERCARD takes none.
so in theory we could change it from ZINTERCARD <num-keys> <key1> <key2> ... to be similar to SINTERCARD (SINTERCARD <key1> <key2>).
however i suppose that being consistent with ZINTER and ZINTERSTORE is more important for most users of this command.
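
As a rough illustration of the two syntaxes compared above, the following Python sketch parses both argument shapes as they are described in this thread. The helper names and error messages are hypothetical, and this is not the parsing code in t_set.c or t_zset.c. With the num-keys form, any trailing token such as WITHSCORES or WEIGHTS is rejected because no flags are accepted.

```python
from typing import List


def parse_numkeys_form(args: List[str]) -> List[str]:
    """Parse `<num-keys> <key1> ... <keyN>` (ZINTERCARD style) into keys."""
    if not args:
        raise ValueError("wrong number of arguments")
    try:
        numkeys = int(args[0])
    except ValueError:
        raise ValueError("num-keys is not an integer") from None
    if numkeys < 1:
        raise ValueError("at least 1 input key is needed")
    keys = args[1:]
    if len(keys) != numkeys:
        # Too few keys, or extra tokens such as WITHSCORES / WEIGHTS.
        raise ValueError("syntax error")
    return keys


def parse_variadic_form(args: List[str]) -> List[str]:
    """Parse `<key1> <key2> ...` (SINTERCARD style, as proposed here)."""
    if not args:
        raise ValueError("wrong number of arguments")
    return list(args)  # every argument is a key


print(parse_numkeys_form(["2", "z1", "z2"]))   # ['z1', 'z2']
print(parse_variadic_form(["s1", "s2"]))       # ['s1', 's2']
```

The num-keys form costs one extra argument but leaves room for options after the key list, which is one practical reason to stay consistent with ZINTER and ZINTERSTORE.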

oranagra · 2021-07-25T11:37:44Z

doc PR: redis/redis-doc#1610

itamarhaber · 2021-07-25T15:16:50Z

@oranagra

> however i suppose that being consistent with ZINTER and ZINTERSTORE is more important for most users of this command.

Agreeing with that.

jonahharris · 2021-07-26T15:12:53Z

@oranagra Thanks for picking this up, man. I don't know why GMail only pushes some of these to my Inbox where I can see them, but it never seems to be the right ones :(

oranagra · 2021-07-26T15:46:09Z

@redis/core-team please approve the two new commands.

itamarhaber

Code LGTM

src/t_set.c

src/t_zset.c

Co-authored-by: Itamar Haber <itamar@redislabs.com>

madolson

Sorry for the delay, I missed the update on this. Code and API LGTM.

Add SINTERCARD and ZINTERCARD commands that are similar to ZINTER and SINTER but only return the cardinality with minimum processing and memory overheads.

Co-authored-by: Oran Agra <oran@redislabs.com>

jonahharris added 2 commits May 14, 2021 14:47

Added SINTERCARD command which has zero memory overhead and returns only the resulting cardinality.

861050e

Added ZINTERCARD command which has zero memory overhead and returns only the resulting cardinality.

59eecd1

madolson reviewed May 14, 2021

formatting correction

ff4e0db

oranagra reviewed Jun 7, 2021

src/server.c (outdated, resolved)

src/t_zset.c (outdated, resolved)

oranagra added the state:major-decision, release-notes, approval-needed, and state:needs-doc-pr labels Jun 30, 2021

itamarhaber reviewed Jun 30, 2021

src/t_zset.c (resolved)

oranagra added 3 commits July 25, 2021 14:10

code review comments

d75a5d3

Merge remote-tracking branch 'origin/unstable' into with-intercard

475ed07

fix SINTERCARD on a missing key and add a test

19324d1

itamarhaber previously approved these changes Jul 26, 2021

src/t_set.c (outdated, resolved)

src/t_zset.c (resolved)

Update src/t_set.c

7b78c3b

Co-authored-by: Itamar Haber <itamar@redislabs.com>

oranagra dismissed itamarhaber’s stale review via 7b78c3b July 26, 2021 18:39

madolson self-requested a review August 3, 2021 06:17

madolson previously approved these changes Aug 3, 2021

add docs about cardinality_only argument

4fcac14

oranagra dismissed madolson’s stale review via 4fcac14 August 3, 2021 08:41

oranagra approved these changes Aug 3, 2021

oranagra merged commit 432c92d into redis:unstable Aug 3, 2021

enjoy-binbin mentioned this pull request Aug 29, 2021

Adds limit to SINTERCARD/ZINTERCARD. #9425

Merged

hwware mentioned this pull request Dec 23, 2021

Add new commands called lmmove & blmmove. #9929

Open
