Are Sonyflakes strings always guaranteed to be in order and the same length? #18

Closed
caldempsey opened this issue Jun 10, 2020 · 3 comments

Comments

@caldempsey

caldempsey commented Jun 10, 2020

Hi!

We would like to know whether Sonyflake IDs generated from now on are guaranteed to sort in order and, if not, when their string length will next change. We have noticed an issue in our database where the sort order is computed incorrectly because we sometimes store Sonyflake IDs as strings. Could you kindly confirm whether the length of a Sonyflake string can change, or whether the length is guaranteed?

In particular, the following Python snippet...

l = [
    "1",
    "9999",
    "2",
    "10000",
    "3",
    "4",
    "999",
    "6",
    "7",
    "8",
    "9",
    "5",
    "10",
    "100",
    "1000",
    "90",
]

print(l)
l.sort()
print(l)

...shows that, say, "2" > "1000000000000000" under string comparison. I would like to know how Sonyflake IDs are affected by this, since it could affect the sort order of some production data (and may require a migration). Because of how a Sonyflake ID is generated (I believe they are always the same length), I am not sure whether we would be impacted in the near future.

Go example

package main

import (
    "fmt"
    "sort"
)

func main() {
    sl := []string{"1", "9999", "2", "10000", "3", "4", "999", "6", "7", "8", "9", "5", "10", "100", "1000", "90"}
    fmt.Println(sl)
    sort.Strings(sl)
    fmt.Println(sl)
}

I appreciate that this is not your concern and reflects a mistake made in our systems, but I would appreciate any help on the matter. It would be helpful if I could run unit tests locally that generate Sonyflake IDs over a range from now to years in the future, but I have no idea how to do that without changing my system time. I would appreciate the easiest way to diagnose the impact!

@caldempsey caldempsey changed the title Are Sonyflakes always a guaranteed length Are Sonyflakes strings always guaranteed to be in order? Jun 10, 2020
@caldempsey
Author

caldempsey commented Jun 10, 2020

In other words, we need to identify what our runway is before we have to do a system migration!

@caldempsey caldempsey changed the title Are Sonyflakes strings always guaranteed to be in order? Are Sonyflakes strings always guaranteed to be in order and the same length? Jun 11, 2020
@YoshiyukiMineo
Member

Hi,

Sonyflake guarantees the uniqueness of its generated IDs. However, the IDs are not always in time order if you use multiple machine IDs. As you know, if you convert Sonyflake IDs, which are originally uint64, to strings, they sort in lexical order rather than time order.

I don't think Sonyflake is suitable when you need a time-order guarantee. However, you can guarantee time order if you use Sonyflake under the following constraints:

  • using a single machine ID
  • padding with zeros when you convert Sonyflake IDs to strings like "0000...002"
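The zero-padding constraint can be sketched as follows. This is a minimal illustration, not code from the Sonyflake library: padID is a hypothetical helper, and the width of 20 is chosen here because it covers the largest possible uint64 value (18446744073709551615).

```go
package main

import (
	"fmt"
	"sort"
)

// padID formats an ID as a fixed-width, zero-padded decimal string.
// With equal widths, lexical order matches numeric order.
func padID(id uint64) string {
	return fmt.Sprintf("%020d", id)
}

func main() {
	ids := []uint64{1, 9999, 2, 10000, 3, 999, 10, 100, 1000, 90}

	padded := make([]string, len(ids))
	for i, id := range ids {
		padded[i] = padID(id)
	}

	// Sorting the padded strings now yields the same order as sorting the numbers.
	sort.Strings(padded)
	for _, s := range padded {
		fmt.Println(s)
	}
}
```

Existing rows stored as unpadded strings would still need to be rewritten in the padded form, or compared as integers, for the ordering to be consistent across old and new data.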

@caldempsey
Author

caldempsey commented Jun 12, 2020

@YoshiyukiMineo Thank you for getting back. The configuration we use for Sonyflake is as follows: sonyflake.NewSonyflake(sonyflake.Settings{}). This has reliably generated new Sonyflake IDs in roughly time order. This might be improper usage, but it has worked thus far. Since the machine ID isn't being set, am I correct to understand that the reason this is consistent is that the machine ID is always a calculation of the constants

	BitLenTime      = 39                               // bit length of time
	BitLenSequence  = 8                                // bit length of sequence number
	BitLenMachineID = 63 - BitLenTime - BitLenSequence // bit length of machine id

So 16, which means that all machine IDs have been a single "constant" machine ID up to now, such that BitLenMachineID = 16?
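For readers following the thread, it may help to see that the constants above describe field widths rather than field values: BitLenMachineID = 16 means the machine ID occupies 16 bits, not that the ID itself is 16. According to the Sonyflake README, when Settings.MachineID is nil the default machine ID is the lower 16 bits of the private IP address, so it is constant only as long as the host's address is. The sketch below splits an ID into its three fields under that layout; the parts type and decompose function are hypothetical and written here for illustration.

```go
package main

import "fmt"

// Field widths, mirroring the constants quoted above.
const (
	bitLenTime      = 39
	bitLenSequence  = 8
	bitLenMachineID = 63 - bitLenTime - bitLenSequence // 16 bits wide
)

// parts holds the three fields packed into one ID.
type parts struct {
	Time      uint64 // elapsed time in 10 ms units since the start time
	Sequence  uint64 // counter within one time unit
	MachineID uint64 // any value that fits in 16 bits
}

// decompose splits an ID into its fields by shifting and masking.
func decompose(id uint64) parts {
	const maskSequence = uint64((1<<bitLenSequence - 1) << bitLenMachineID)
	const maskMachineID = uint64(1<<bitLenMachineID - 1)
	return parts{
		Time:      id >> (bitLenSequence + bitLenMachineID),
		Sequence:  (id & maskSequence) >> bitLenMachineID,
		MachineID: id & maskMachineID,
	}
}

func main() {
	// A made-up ID: time 12345, sequence 7, machine ID 0x0A01.
	id := uint64(12345)<<24 | uint64(7)<<16 | 0x0A01
	fmt.Printf("%+v\n", decompose(id))
}
```

Decomposing a few stored IDs this way shows whether they all carry the same machine ID, which is the first of the two constraints mentioned in the reply above.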
