New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a generator that ensures monotonicity of ULIDs. #15

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
5 participants
@aybabtme

aybabtme commented Nov 13, 2017

This implements the method described by @alizain (and used in his ulid package) to enforce monotonicity of ULIDs within a same millisecond.

I'm not sure if the API is to your liking, I'm happy to reshape it into something aesthetically pleasing to you.

Addresses issue #6.

r: @peterbourgon @tsenart
cc: @alizain

@aybabtme

This comment has been minimized.

aybabtme commented Nov 13, 2017

Travis seems to be lacking a value for $COVERALLS_TOKEN.

@peterbourgon

This comment has been minimized.

Member

peterbourgon commented Nov 13, 2017

If you rebase, the build should be fixed.

@tsenart

@aybabtme aybabtme force-pushed the aybabtme:monotonic-ulid branch from 5126ac3 to 824bc14 Nov 13, 2017

@alizain

This comment has been minimized.

alizain commented Nov 27, 2017

@aybabtme LGTM, but I'm no expert in go, so take it with a grain of salt.

@peterbourgon @tsenart, thoughts?

@peterbourgon

This comment has been minimized.

Member

peterbourgon commented Nov 27, 2017

The build looks good now. I defer to @tsenart on the appropriateness of the API.

// Monotonic returns a Generator of ULID that guarantees the ULIDs will be strictly
// monotonically increasing.
func Monotonic(entropy io.Reader) Generator {

This comment has been minimized.

@tsenart

tsenart Nov 27, 2017

Contributor

This breaks the consistency of the library APIs by introducing a new stateful Generator interface for this single use case and its accompanying Monotonic fabric implementation.

While not strictly equivalent to the solution here proposed, the underlying monotonicity problem seems to be solvable by encapsulating the state within the entropy source as an io.Reader which would have to guarantee consecutive calls would return strictly monotonic data, regardless of the value of the time component.

This comment has been minimized.

@tsenart

tsenart Nov 27, 2017

Contributor

Additionally, the behavior of such code under concurrent execution should be clearly documented.

This comment has been minimized.

@aybabtme

aybabtme Nov 28, 2017

@tsenart

This breaks the consistency of the library APIs by introducing a new stateful Generator interface for this single use case and its accompanying Monotonic fabric implementation.

Agreed. I mean this current patch as a starting point, I'm aware it breaks the style of the library.


I'm not totally following what you mean by:

While not strictly equivalent to the solution here proposed, the underlying monotonicity problem seems to be solvable by encapsulating the state within the entropy source as an io.Reader which would have to guarantee consecutive calls would return strictly monotonic data, regardless of the value of the time component.

Can you roughly sketch/pseudo-code your intent?

As I understand it, making an io.Reader that guarantee something about consecutive calls means shifting the statefulness in that io.Reader. Also I'm not sure how putting this into the io.Reader would work, since the io.Reader would need to know something about the current timestamp.

Unless you mean decoupling the monotonicity from time altogether and making an io.Reader that returns monotonically increasing random values? If that's the case, I feel this would skew the distribution of randomness given by the source of entropy, eventually leading to an entropy source that's basically a counter. Note that this is true when increasing the entropy within a single millisecond.

This comment has been minimized.

@tsenart

tsenart Dec 3, 2017

Contributor

Unless you mean decoupling the monotonicity from time altogether and making an io.Reader that returns monotonically increasing random values?

That was my train of thought.

If that's the case, I feel this would skew the distribution of randomness given by the source of entropy, eventually leading to an entropy source that's basically a counter. Note that this is true when increasing the entropy within a single millisecond.

I see, you are right. Can you point me to the original discussion and rational for these monotonic requirements? I'd like to understand the context better.

As I understand it, making an io.Reader that guarantee something about consecutive calls means shifting the statefulness in that io.Reader. Also I'm not sure how putting this into the io.Reader would work, since the io.Reader would need to know something about the current timestamp.

While a bit of hack, could we reserve the first bytes of the passed p []byte for the current timestamp in ms? That way we'd know for which timestamp the io.Reader would be generating entropy. What's your take on this rather dubious approach @peterbourgon? 🗡

This comment has been minimized.

@aybabtme

aybabtme Dec 4, 2017

@tsenart the original issue was #6

This comment has been minimized.

@peterbourgon

peterbourgon Dec 5, 2017

Member

@tsenart Hacky indeed, I think this is worth a post to golang-nuts!

)
func TestMonotonic(t *testing.T) {
for i := int64(0); i < 1000; i++ {

This comment has been minimized.

@tsenart

tsenart Nov 27, 2017

Contributor

Instead of a unit test, I suggest we cover the Monotonicity property by a property based test like we're doing for other properties such as lexicographical order being equivalent to byte order.

This comment has been minimized.

@aybabtme

aybabtme Nov 28, 2017

That's a good idea. When I have a better idea of what the API should look like, I'll rework the tests.

@peterbourgon

This comment has been minimized.

Member

peterbourgon commented Jul 12, 2018

Ping? :)

@aybabtme

This comment has been minimized.

aybabtme commented Jul 12, 2018

@tsenart

This comment has been minimized.

Contributor

tsenart commented Jul 12, 2018

Can we try something like this?

package ulid

import "io"

func Monotonic(entropy io.Reader) io.Reader {
	return &monotonic{entropy: entropy}
}

type monotonic struct {
	entropy io.Reader
	last []byte
}

func (m monotonic) Read(p []byte) (n int, err error) {
	var ts []byte
	if len(p) >= 5 {
		ts = p[:5]
	}

	// Read entropy monotonically based on ts and m.last
}

Then when calling Read we'd need to pass the timestamp part of the ULID in p

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 26, 2018

Any progress on this?

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 27, 2018

@tsenart The one issue with the io.Reader-based approach is that the full ulid is not passed in when the entropy is read. So the timestamp is not accessible. Unless you were thinking it would not be passed as an entropy source?

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 27, 2018

Here is constructor-based version which takes another ULID used to compare the timestamp against. If the timestamp matches, the entropy is incremented, otherwise the default New is called.

package service

import (
	"errors"
	"io"

	"github.com/oklog/ulid"
)

func Monotonic(ms uint64, entropy io.Reader, id ulid.ULID) (ulid.ULID, error) {
	// Time doesn't match, so create as normal.
	if id.Time() != ms {
		return ulid.New(ms, entropy)
	}

	var id2 ulid.ULID
	if err := id2.SetTime(ms); err != nil {
		return ulid.ULID{}, err
	}

	if err := incrementBytes(id2[6:], id[6:]); err != nil {
		return ulid.ULID{}, err
	}

	return id2, nil
}

func MustMonotonic(ms uint64, entropy io.Reader, id ulid.ULID) ulid.ULID {
	id2, err := Monotonic(ms, entropy, id)
	if err != nil {
		panic(err)
	}
	return id2
}

func incrementBytes(dst, src []byte) error {
	// Copy bytes into dist.
	copy(dst, src)

	// Start at the least significant position and increment the first available bit.
	for i := (len(dst) - 1); i >= 0; i-- {
		if dst[i] == 255 {
			dst[i] = 0
			continue
		}

		dst[i]++
		return nil
	}

	// Entropy bits are all set.
	return errors.New("overflowed entropy while incrementing it")
}
@tsenart

This comment has been minimized.

Contributor

tsenart commented Sep 27, 2018

I was thinking you’d pass the timestamp bytes in the buffer passed to Read

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 27, 2018

@tsenart

This comment has been minimized.

Contributor

tsenart commented Sep 27, 2018

No, not the user. The library code is the one calling Read.

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 27, 2018

The library code is the one calling Read.

Yes, but right now the only the entropy portion of the byte slice is being passed (noted before). So where in the lifecycle is this the full slice being passed in? Or is the suggestion to extend the internal implementation to pass in the full slice?

@aybabtme

This comment has been minimized.

aybabtme commented Sep 27, 2018

Hey folks, just wanted to loop back. I don't have an immediate use for this anymore and I'm quite busy on other things, so I don't expect to pick this up myself.

@tsenart

This comment has been minimized.

Contributor

tsenart commented Sep 27, 2018

Or is the suggestion to extend the internal implementation to pass in the full slice?

Precisely.

@tsenart

This comment has been minimized.

Contributor

tsenart commented Sep 27, 2018

OK, perhaps a better idea that doesn't require violating the usual io.Reader semantics would be to break out of the interface:

func New(ms uint64, entropy io.Reader) (id ULID, err error) {
	if err = id.SetTime(ms); err != nil {
		return id, err
	}

	if entropy == nil {
		return id, nil
	}

	switch e := entropy.(type) {
	case *monotonic:
		if e.ms == ms {
			increment(e.bs)
			id.SetEntropy(e.bs)
		} else {
			_, err = io.ReadFull(entropy, id[6:])
			e.bs = id[6:]
		}
		e.ms = ms
	default:
		_, err = io.ReadFull(entropy, id[6:])
	}

	return id, err
}

// Monotonic returns an entropy source that is guaranteed to return
// strictly increasing entropy bytes for the same ULID timestamp.
func Monotonic(entropy io.Reader) io.Reader {
	return &monotonic{Reader: entropy}
}

type monotonic struct {
	io.Reader
	ms uint64
	bs []byte
}

func increment(bs []byte) {
	for i := len(bs) - 1; i >= 0; i-- {
		switch bs[i] {
		case 255:
			bs[i] = 0
		default:
			bs[i]++
			return
		}
	}
	panic(ErrOverflow)
}

Thoughts?

@bruth

This comment has been minimized.

Contributor

bruth commented Sep 27, 2018

tsenart added a commit that referenced this pull request Sep 27, 2018

Monotonic entropy source
Fixes #6
Supersedes #15

@tsenart tsenart closed this Sep 27, 2018

@tsenart tsenart referenced this pull request Sep 27, 2018

Merged

Monotonic entropy source #31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment