Switch auto-generated IDs to Flake IDs from random UUIDs #7531

Closed
wants to merge 8 commits into from

mikemccand
Contributor

Flake IDs give better lookup performance in Lucene since they share
predictable prefixes (timestamp).

Closes #5941

This PR starts from @GaelTadh's original PR (#6004) and just folds in the last round of feedback ... I think it's ready?

GaelTadh and others added 6 commits April 28, 2014 14:25
Using SecureRandom as a UUID generator is slow and doesn't allow us
to take advantage of some Lucene optimizations around ids with common
prefixes. This commit will allow us to use a
timestamp64bit-macAddr-counter UUID. Since the macAddr may be shared
among several nodes running on the same hardware, we use an xor of the
macAddr with a SecureRandom number generated on startup.

See elastic#5941
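
The XOR step described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `MacAddressMunger` and both method names are hypothetical, and it assumes the mask is drawn once at startup and then reused for every ID.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration only; not a class from this PR.
final class MacAddressMunger {
    private MacAddressMunger() {}

    /** Returns a new array holding address[i] ^ mask[i] for every byte. */
    static byte[] xor(byte[] address, byte[] mask) {
        if (address.length != mask.length) {
            throw new IllegalArgumentException("address and mask must have the same length");
        }
        byte[] result = new byte[address.length];
        for (int i = 0; i < address.length; i++) {
            result[i] = (byte) (address[i] ^ mask[i]);
        }
        return result;
    }

    /**
     * Draws a random mask of the same length as the address and XORs it in,
     * so that two nodes sharing one physical MAC address are very unlikely
     * to end up with the same node identifier.
     */
    static byte[] munge(byte[] macAddress, SecureRandom random) {
        byte[] mask = new byte[macAddress.length];
        random.nextBytes(mask);
        return xor(macAddress, mask);
    }
}
```

Calling `MacAddressMunger.munge(mac, new SecureRandom())` once per process and caching the result keeps the cost of `SecureRandom` out of the per-document path, which is the slowness the commit message complains about.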
…hout an incoming id.

Wire up the timestampUUID generator to indexing.

See elastic#5941
Incorporate some of the changes from @kimchy and @s1monw.
Move the UUID generators into their own classes and provide a common
interface as a first step to moving them under a singleton. Use a better
method of getting the mac address and fall back to a secure random address
if it fails. Add tests to test concurrency and shared prefix integrity of
UUIDs. Use PaddedAtomicLongs to hold the sequence number and last timestamp.
Check to see if a time slip has occurred as described by @s1monw in a CAS loop.
Next step is to move the impls under a singleton.

See elastic#5941
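
A compare-and-set loop that guards against a time slip might look like the sketch below. It is an assumption-laden illustration: it uses a plain `java.util.concurrent.atomic.AtomicLong` rather than the padded variant the commit mentions, the class name `MonotonicTimestamp` is hypothetical, and the policy of reusing the last timestamp when the clock goes backwards is one possible choice, not necessarily the one taken in the PR.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; the PR's actual class and slip-handling policy may differ.
final class MonotonicTimestamp {
    private final AtomicLong lastTimestamp = new AtomicLong(0);

    /**
     * Returns a timestamp that never decreases across calls, even if the
     * supplied wall-clock reading jumps backwards (a "time slip").
     */
    long next(long currentTimeMillis) {
        while (true) {
            long last = lastTimestamp.get();
            if (currentTimeMillis <= last) {
                // Clock stalled or slipped back: keep using the last value seen.
                return last;
            }
            if (lastTimestamp.compareAndSet(last, currentTimeMillis)) {
                // We won the race and advanced the stored timestamp.
                return currentTimeMillis;
            }
            // Another thread updated lastTimestamp in between; re-read and retry.
        }
    }
}
```

Because several calls can return the same timestamp, a generator built on this still needs the sequence number to tell those IDs apart.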
Reduce number of time bytes to 6 reducing total number of bytes to 20.
Validate that we have a mac address that contains data to avoid getting
addresses that are just 00:00:00:00:00:00 which can happen on virtualized
machines. Remove use of ByteBuffer on puts to reduce overhead. Add code to
attempt to prevent time slips.

See elastic#5941
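
The validation described here, together with the random fallback mentioned in the earlier commit, could be sketched as below. The class and method names are hypothetical, and the 6-byte length check is an assumption based on the usual MAC address size.

```java
import java.security.SecureRandom;

// Hypothetical sketch of the checks described above; not the PR's code.
final class MacAddressCheck {
    private MacAddressCheck() {}

    /** A usable address is non-null, 6 bytes long, and not 00:00:00:00:00:00. */
    static boolean isValid(byte[] address) {
        if (address == null || address.length != 6) {
            return false;
        }
        for (byte b : address) {
            if (b != 0) {
                return true; // at least one byte carries data
            }
        }
        return false; // all zeros, as can happen on virtualized machines
    }

    /** Returns the address if it is usable, otherwise 6 securely random bytes. */
    static byte[] orRandom(byte[] address, SecureRandom random) {
        if (isValid(address)) {
            return address;
        }
        byte[] fallback = new byte[6];
        random.nextBytes(fallback);
        return fallback;
    }
}
```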
Simplify mac address validation routine and remove unneeded variable.
/** These are essentially flake ids (http://boundary.com/blog/2012/01/12/flake-a-decentralized-k-ordered-unique-id-generator-in-erlang) but
* we use 6 (not 8) bytes for timestamp, and use 3 (not 2) bytes for sequence number. */

class TimeBasedUUID implements UUIDGenerator {
Member

Odd that it does not end in Generator, which makes it seem like a generated UUID (same for RandomBasedUUID).
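
For readers following the Javadoc under review (6 bytes of timestamp, 3 bytes of sequence number), here is a rough sketch of how such an ID could be packed. Everything beyond those two field widths is an assumption: the 6-byte node identifier, the timestamp-node-sequence field order, the big-endian byte order, the URL-safe Base64 encoding, and the name `FlakeIdLayoutSketch` are illustrative choices rather than the layout the PR actually ships.

```java
import java.util.Base64;

// Hypothetical layout sketch: 6 timestamp bytes + 6 node bytes + 3 sequence bytes.
final class FlakeIdLayoutSketch {
    private FlakeIdLayoutSketch() {}

    static byte[] pack(long timestampMillis, byte[] nodeId, int sequence) {
        if (nodeId.length != 6) {
            throw new IllegalArgumentException("nodeId must be 6 bytes");
        }
        byte[] id = new byte[15];
        // Low 6 bytes of the timestamp, most significant byte first, so IDs
        // generated close together in time share a leading prefix.
        for (int i = 0; i < 6; i++) {
            id[i] = (byte) (timestampMillis >>> (8 * (5 - i)));
        }
        // Node identifier (for example a munged MAC address).
        System.arraycopy(nodeId, 0, id, 6, 6);
        // Low 3 bytes of the sequence number, most significant byte first.
        id[12] = (byte) (sequence >>> 16);
        id[13] = (byte) (sequence >>> 8);
        id[14] = (byte) sequence;
        return id;
    }

    /** Encodes the raw bytes as URL-safe Base64 without padding. */
    static String encode(byte[] id) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(id);
    }
}
```

Two IDs packed with the same timestamp and node start with the same 12 bytes and differ only in the trailing sequence bytes, which is the shared-prefix property the PR description credits for faster lookups in Lucene.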

@pickypg
Member

pickypg commented Sep 1, 2014

@mikemccand LGTM. Just minor fluff.

@mikemccand
Contributor Author

Thanks @pickypg, I pushed a new commit...


public class MacAddressProvider {

private static final ESLogger logger = Loggers.getLogger("MacAddressProvider");
Contributor

Should it be Loggers.getLogger(MacAddressProvider.class) for consistency with other classes?

@jpountz jpountz added review and removed review labels Sep 2, 2014
@mikemccand
Contributor Author

Thanks @jpountz, I pushed a new commit folding in your feedback...

@jpountz
Contributor

jpountz commented Sep 2, 2014

LGTM

@jpountz jpountz removed the review label Sep 2, 2014
@mikemccand mikemccand closed this in 9c1ac95 Sep 2, 2014
mikemccand added a commit that referenced this pull request Sep 2, 2014
Flake IDs give better lookup performance in Lucene since they share
predictable prefixes (timestamp).

Closes #7531

Closes #6004

Closes #5941
mikemccand added a commit that referenced this pull request Sep 8, 2014
Flake IDs give better lookup performance in Lucene since they share
predictable prefixes (timestamp).

Closes #7531

Closes #6004

Closes #5941
@clintongormley clintongormley changed the title Switch auto-ids to Flake IDs from random UUIDs Indexing: Switch auto-ids to Flake IDs from random UUIDs Sep 10, 2014
@clintongormley clintongormley added the :Core/Infra/Core Core issues without another label label Jun 6, 2015
@clintongormley clintongormley changed the title Indexing: Switch auto-ids to Flake IDs from random UUIDs Switch auto-generated IDs to Flake IDs from random UUIDs Jun 6, 2015
Development

Successfully merging this pull request may close these issues.

Explore using timebased decentralized UUID for autogenerated IDs
5 participants