Retrieve all IDs and latest timestamps for a given stream without going through all the data #7640

Open
yoonghm opened this issue Aug 10, 2020 · 8 comments
@yoonghm
Contributor

yoonghm commented Aug 10, 2020

I am using Redis 6. Given a stream, are there ways to retrieve

  1. All IDs in the stream
  2. Latest timestamp of all IDs

Thanks

@yoonghm
Contributor Author

yoonghm commented Aug 10, 2020

I checked that it is not available. I would do this:

  1. Use xrevrange or xrange to read all data to find all IDs and their latest timestamps.

  2. Store the information in Redis using hset.

  3. Update corresponding hset records whenever xadd is used.
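
A minimal sketch of the three steps above, assuming a redis-py client created with `decode_responses=True`, and assuming that "ID" means an application-level identifier stored in a field of each entry (the hypothetical `id_field`) rather than the stream entry ID. The function names and the `index_key` hash are illustrative and not part of Redis.

```python
def build_index(entries, id_field):
    """Map each application-level ID to the latest stream entry ID seen.

    `entries` is a list of (entry_id, fields) pairs in ascending order,
    the shape redis-py returns from xrange() with decode_responses=True.
    """
    index = {}
    for entry_id, fields in entries:
        source = fields.get(id_field)
        if source is not None:
            # Later entries overwrite earlier ones, so the last write wins.
            index[source] = entry_id
    return index


def rebuild_index(r, stream, index_key, id_field):
    """Steps 1 and 2: scan the whole stream once and store the result in a hash."""
    index = build_index(r.xrange(stream, "-", "+"), id_field)
    if index:
        r.hset(index_key, mapping=index)
    return index


def add_and_index(r, stream, index_key, fields, id_field):
    """Step 3: append an entry and update the hash record for its ID."""
    entry_id = r.xadd(stream, fields)
    r.hset(index_key, fields[id_field], entry_id)
    return entry_id
```

Note that `xadd` and `hset` are two separate commands here, so another client could observe the stream and the hash briefly out of sync. If that matters, running both steps inside a Lua script is one option.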

@yoonghm yoonghm closed this as completed Aug 10, 2020
@guybe7
Collaborator

guybe7 commented Aug 10, 2020

@yoonghm can you please elaborate? maybe give an example of what you're trying to do? AFAIU a stream "ID" and its timestamp are the same thing
anyway, try checking out XINFO STREAM <keyname> FULL COUNT <count>
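
To illustrate why the ID and the timestamp coincide: an auto-generated stream ID has the form `<millisecondsTime>-<sequenceNumber>`, so the timestamp can be read straight out of the ID. The sketch below assumes IDs are plain strings; the helper names are hypothetical. It does not apply to explicitly supplied IDs such as `1-0`, which carry no wall-clock meaning.

```python
from datetime import datetime, timezone


def parse_stream_id(entry_id):
    """Split '<ms>-<seq>' into integers (milliseconds, sequence number)."""
    ms, _, seq = entry_id.partition("-")
    return int(ms), int(seq or 0)


def stream_id_to_datetime(entry_id):
    """Interpret the millisecond part of an auto-generated ID as UTC time."""
    ms, _seq = parse_stream_id(entry_id)
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```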

@beasteers

I 100% agree that this would be a useful feature -

i.e. being able to use xrange/xrevrange to return, say the last 50 timestamps from a stream (just the timestamps without any of the data attached), to determine if you want to pull all of the data or just specific portions. It's less important if you're storing small data points, but becomes more of an issue if you're storing larger binary data in a stream. Being able to cherry pick data points is a huge advantage of Redis Streams over using standard queues, but it kind of defeats the point if you still have to read all of the data anyways.

On a similar note, it would also be useful to be able to query a specific field (key?) from a stream instead of returning the whole stream data. That way, you could store some metadata alongside a larger piece of data and query that more efficiently. Being able to mix and match these two things would be so useful.

@guybe7 what do you think?

@guybe7
Collaborator

guybe7 commented Feb 7, 2023

@beasteers just making sure we're on the same page...

the first feature you're talking about is basically an XRANGE that doesn't return the data itself. example (proposed syntax, JUSTID is not an existing XRANGE option):

127.0.0.1:6379> XADD x 1-0 f1 v1
"1-0"
127.0.0.1:6379> XADD x 2-0 f1 v2
"2-0"
127.0.0.1:6379> XADD x 3-0 f1 v3
"3-0"
127.0.0.1:6379> XRANGE x - + JUSTID
1) "1-0"
2) "2-0"
3) "3-0"
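
For comparison, a sketch of what a client can do without such an option: fetch the range and discard the payload. This assumes a redis-py client with `decode_responses=True`; the helper names are hypothetical. The fields still cross the network, which is exactly the cost this request wants to avoid.

```python
def just_ids(entries):
    """Keep only the entry IDs from a list of (entry_id, fields) pairs."""
    return [entry_id for entry_id, _fields in entries]


def fetch_ids(r, stream, count=None):
    """Client-side stand-in: XRANGE transfers full entries, then fields are dropped."""
    return just_ids(r.xrange(stream, "-", "+", count=count))
```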

the second feature is the same as XRANGE, but returns only partial data. example (FILTER is proposed syntax, not an existing option):

127.0.0.1:6379> XADD x 1-0 f1 v1 f2 v1
"1-0"
127.0.0.1:6379> XADD x 2-0 f2 v2 f3 v2
"2-0"
127.0.0.1:6379> XADD x 3-0 f1 v3 f3 v3
"3-0"
127.0.0.1:6379> XRANGE x - + FILTER *f1*
1) 1) "1-0"
   2) 1) "f1"
      2) "v1"
2) 1) "2-0"
   2) (empty array)
3) 1) "3-0"
   2) 1) "f1"
      2) "v3"

did I understand correctly?
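
A client-side sketch of the field filtering shown above, assuming the pattern is a glob matched against field names (an interpretation of `*f1*`, not something the thread specifies) and that entries arrive as `(entry_id, fields)` pairs as redis-py returns them. The `skip_empty` flag is hypothetical and covers the choice between keeping and dropping entries with no matching field.

```python
from fnmatch import fnmatchcase


def filter_fields(entries, pattern, skip_empty=False):
    """Keep only fields whose name matches the glob `pattern`."""
    result = []
    for entry_id, fields in entries:
        kept = {name: value for name, value in fields.items()
                if fnmatchcase(name, pattern)}
        if kept or not skip_empty:
            result.append((entry_id, kept))
    return result
```

With the three entries from the example, `filter_fields(entries, "*f1*")` keeps `2-0` with an empty mapping, while `skip_empty=True` omits it.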

@beasteers

beasteers commented Feb 7, 2023

Yep that's the idea!

@guybe7
Collaborator

guybe7 commented Feb 7, 2023

@oranagra @itamarhaber WDYT?

@itamarhaber
Member

The 1st feature is a good addition, which may also be carried to XREAD. I remember wanting to read only IDs and not the whole message (e.g., imagine a routing process for video frames coming in as a stream [which is what I needed this for]).

The 2nd feature is interesting as well and reminds me of #5827. IMO, we shouldn't return "2-0 (empty array)" and just skip these (or maybe give the option to control), but that's nitpicking. More importantly, we need to be extremely clear that this is still a full scan of the range (i.e. non-indexed), which - depending on the stream's size and server load - could block.
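
On the blocking concern, one way a client can limit the cost of each individual call today is to page through the range with COUNT. This is a sketch only: it assumes a redis-py client with `decode_responses=True` and Redis 6.2 or later for the exclusive `(` range prefix, and the function name is hypothetical. The total work is still a full scan of the range; paging only bounds how much each command does.

```python
def iter_ids_paged(r, stream, page_size=100):
    """Yield every entry ID in `stream`, fetching at most `page_size` per call."""
    start = "-"
    while True:
        page = r.xrange(stream, min=start, max="+", count=page_size)
        if not page:
            return
        for entry_id, _fields in page:
            yield entry_id
        # Resume strictly after the last ID seen (exclusive range, Redis >= 6.2).
        start = "(" + page[-1][0]
```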

Lastly, the original feature requested at the top looks like what we called XHASH (#9574), whereas these two new FRs appear somewhat different.

@oranagra
Member

oranagra commented Feb 7, 2023

Seems useful indeed. I'll reopen this issue and add it to the backlog

@oranagra oranagra reopened this Feb 7, 2023
@hpatro hpatro added the streams label Feb 7, 2023
@guybe7 guybe7 self-assigned this Feb 8, 2023
Status: Backlog

6 participants