Add a 128-bit k-ordered unique id generator #295

Closed

@pyr
pyr commented Jan 13, 2012

This is similar to what snowflake and the recent boundary
solution do, but it makes sense to use redis for this kind of
use case: it gives people a simple way to get incremental ids
in distributed systems without requiring an additional
daemon.

Unique IDs are composed as follows:

  • epoch seconds: 4 bytes
  • epoch mseconds: 4 bytes
  • host name: 6 bytes
  • sequence id: 2 bytes

The host name is truncated to 6 chars, so the config directive
id-generation-name should be set on each machine that yields
ids if truncating the hostname to 6 chars does not suffice to
tell the machines apart.
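
The layout above can be sketched in a few lines of Python. This is only an illustration of the 4 + 4 + 6 + 2 byte split, not the code from the patch: the function name make_id, the big-endian byte order, the NUL padding of short host names and the hex rendering are all assumptions, and the sub-second field is passed in as-is because the description does not pin down its unit.

```python
import struct
import time

def make_id(seconds, subsec, host, seq):
    """Pack the four fields into 16 bytes and return them as 32 hex chars."""
    # Host name: keep the first 6 bytes, pad shorter names with NUL (assumed).
    name = host.encode("ascii", "replace")[:6].ljust(6, b"\0")
    # ">" = big-endian with no padding (assumed), I = 4-byte unsigned,
    # 6s = 6 raw bytes, H = 2-byte unsigned: 4 + 4 + 6 + 2 = 16 bytes.
    raw = struct.pack(">II6sH",
                      seconds & 0xFFFFFFFF,
                      subsec & 0xFFFFFFFF,
                      name,
                      seq & 0xFFFF)
    return raw.hex()

if __name__ == "__main__":
    now = time.time()
    # Milliseconds are used for the second field here purely as an example.
    print(make_id(int(now), int((now % 1) * 1000), "redis-node-1", 0))
```

Putting the timestamp in the leading bytes is what makes the ids roughly time-ordered: with big-endian packing, two ids compare in the same order as their timestamps, and the host name and sequence only break ties.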

@pyr pyr Add a 128-bit k-ordered unique id generator
dd86c60
@pyr
pyr commented Jan 14, 2012

A few more thoughts on this PR

  • I'm not sure about the name, seqid could be fine too
  • Here are the links to similar projects: https://github.com/twitter/snowflake, https://github.com/boundary/flake
  • This might be possible with lua too, but it is only useful in scenarios where it takes huge hits and thus needs the speed of C
  • I'm not sure about having it live in its own file, I'm willing to have it in some other file if need be
  • I wonder about the best possible way to represent a 128-bit integer from redis, and whether this hex string will be sufficient
@pyr
pyr commented Jan 19, 2012

I'm wondering whether it would make sense to also provide an incrid short version which would produce 64-bit snowflake-like ids for people wanting shorter ids.

@antirez
Owner
antirez commented Feb 2, 2012

Why can't this kind of ID be generated client side? I think that exposing these things can be a bad idea, especially since here there is the notion of synchronized time between instances, a premise that none of the other parts of Redis require.

@pyr
pyr commented Feb 2, 2012

The time sync is not crucial. The more in sync you are, the more sequential the IDs will be. The idea behind using redis for it is simple: you can ensure sequential IDs if you have a master / slave setup with failover to the slave in case of failure, and redis would be able to handle a huge generation load.

@Plasma
Plasma commented Feb 21, 2012

Please consider adding this. An alternative is setting up another service (such as snowflake as mentioned), but that's another service that needs monitoring, testing, redundancy, etc.

@tarnfeld
tarnfeld commented Mar 2, 2012

This is exactly what I need. Do you have an idea of how small these generated numbers could start from, and how large they could get? I assume as time goes on, they'll get larger?

@pyr
pyr commented Mar 16, 2012

The format is fixed as described: half of it is a timestamp, the rest
is an id plus a serial.
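
To make the size question concrete, here is a small sketch that splits an id back into its four fields and compares two ids. It assumes the id is returned as 32 hex characters packed big-endian, which the thread does not confirm; parse_id and the sample values are made up for illustration.

```python
import struct

def parse_id(hex_id):
    """Split a 32-hex-char id into (seconds, subsec, host, seq)."""
    # Same 4 + 4 + 6 + 2 byte split as in the PR description (big-endian assumed).
    seconds, subsec, name, seq = struct.unpack(">II6sH", bytes.fromhex(hex_id))
    return seconds, subsec, name.rstrip(b"\0").decode("ascii", "replace"), seq

# Two hypothetical ids from the same host, one second apart.
earlier = "4f63a1c0" "00000000" "72656469732d" "0000"
later = "4f63a1c1" "00000000" "72656469732d" "0000"

if __name__ == "__main__":
    print(parse_id(earlier))
    # The id with the later timestamp is the numerically larger one.
    print(int(earlier, 16) < int(later, 16))
```

Under these assumptions every id has the same width (16 bytes), and its numeric value grows with time because the seconds field occupies the most significant bytes.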

@antirez
Owner
antirez commented Mar 16, 2012

We now have scripting in 2.6, and the TIME command. Is this enough to create decent IDs without adding commands?

@paixaop
paixaop commented Mar 24, 2012

It could be implemented via a Lua script if a bitwise library, like bitOp, was loaded into redis. The important code from Snowflake is:

((timestamp - twepoch) << timestampLeftShift) |
  (datacenterId << datacenterIdShift) |
  (workerId << workerIdShift) | 
  sequence
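
For readers who want to try the expression above, here is a runnable Python sketch of the same composition. The bit widths and the custom epoch are believed to match Snowflake's source, but they are not taken from this thread, so treat them as assumptions; snowflake_id is a hypothetical helper name.

```python
# Assumed constants (not stated in this thread).
TWEPOCH = 1288834974657  # custom epoch, in milliseconds
SEQUENCE_BITS = 12
WORKER_ID_BITS = 5
DATACENTER_ID_BITS = 5

WORKER_ID_SHIFT = SEQUENCE_BITS                                  # 12
DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS             # 17
TIMESTAMP_LEFT_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS  # 22

def snowflake_id(timestamp_ms, datacenter_id, worker_id, sequence):
    """Compose a 64-bit style id: timestamp | datacenter | worker | sequence."""
    return (((timestamp_ms - TWEPOCH) << TIMESTAMP_LEFT_SHIFT)
            | (datacenter_id << DATACENTER_ID_SHIFT)
            | (worker_id << WORKER_ID_SHIFT)
            | sequence)

if __name__ == "__main__":
    # One millisecond after the custom epoch, datacenter 1, worker 1, sequence 1.
    print(snowflake_id(TWEPOCH + 1, 1, 1, 1))
```

Because the fields occupy disjoint bit ranges, OR-ing them is lossless as long as each value fits in its width, and ids from a later millisecond always compare greater than ids from an earlier one.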
@breznik
breznik commented Jun 4, 2012

With scripting + the TIME command, you could presumably generate the ID, but you would then have to make a separate round trip to store the ID, since you can't store IDs after calling TIME in your script, correct?

@pyr pyr closed this Jul 30, 2012