Skip to content
This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Shouldn't NextSequence return uint64? #39

Closed
tv42 opened this issue Feb 20, 2014 · 12 comments · Fixed by #209
Closed

Shouldn't NextSequence return uint64? #39

tv42 opened this issue Feb 20, 2014 · 12 comments · Fixed by #209

Comments

@tv42
Copy link
Contributor

tv42 commented Feb 20, 2014

It seems right now NextSequence ends up having different behavior on 32-bit vs 64-bit platforms. I understand that mmap limits mean a 32-bit platform won't ever have a huge boltdb database, but what if one just e.g. cycles through primary key ids relatively quickly? Leaving id=1 allocated and then wrapping a 32-bit counter is definitely plausible.

@benbjohnson
Copy link
Member

@tv42 The NextSequence() function is really just a simple ease-of-use function. The int type is abundantly used in Go and it's pretty common to use int for IDs. Having to cast between uint64 to int is more complicated. I'd rather keep the function targeted at the common use case instead of accounting for edge cases (e.g. large db on a 32-bit machine). It's pretty easy for someone to roll their own sequence generator.

You do bring up a good point about 32-bit vs 64-bit though. Currently, calling NextSequence() on a 32-bit machine 2,147,483,648 times will cause a panic. I added a check for integer overflow in pull request #40. It now returns a ErrSequenceOverflow error.

@extemporalgenome
Copy link
Contributor

If the API will already be broken with the planned Open function changes, this would also be worth breaking to eliminate the portability issue. Converting between uint64 and int is not considered complicated by any means, and any appropriate use of the existing NextSequence would need to modulo the result by 2^31-1 anyway. Arguing that it's easy enough to roll our own correct implementation is really an argument of removing the current implementation of NextSequence altogether. The only issue with our own implementations is that they can't take advantage of the internal counter field without forking the project.

The return type should be any explicitly sized integer type, whatever that may be, but not int or uint

@benbjohnson
Copy link
Member

@extemporalgenome Converting to a uint64 causes issues for people who use int ids as it's not safe to type convert from a uint64 down to an int. Using an always positive int provides a least common denominator that can safely be converted to uint, uint32, uint64, int32, and int64.

The current NextSequence() implementation will error if the next sequence goes above the maximum int value on both 32-bit systems and 64-bit systems. We'll lose that if we allow users to convert down to smaller int types.

@benbjohnson
Copy link
Member

@extemporalgenome Also, all this is a pretty minor edge case since it only affects people with more than 2 billion key/value pairs in a database smaller than 4GB on an 32-bit machine.

@extemporalgenome
Copy link
Contributor

The primary issue is that it exhibits different behavior on 32-bit platforms than it does on 64-bit platforms; it is non-portable. Also, afaict, the output NextSequence isn't required to be used in a key -- one can just call NextSequence a few billion times in a loop while adding no keys; the equivalent can occur with deletions of low keys, resulting in a small bucket with a high sequence counter.

@benbjohnson
Copy link
Member

It does exhibit different behaviors but I think that's better than having to worry about type conversion from uint64 down to whatever the client's type is. Also, int types are by far the most common within the standard library and many stdlib functions and types work with them (e.g. sort.IntSlice).

I'd rather have a simpler, cleaner API than support the edge case of 2 billion sequence calls on a 32-bit machine.

@extemporalgenome
Copy link
Contributor

You can bound NextSequence to 2^31-1 on all architectures while still keeping int as the concrete type -- that will eliminate any portability issues.

@benbjohnson
Copy link
Member

Yes, but it's much more likely that a 64-bit machine will go over 2 billion sequence numbers. It is a portability issue but I don't see it as one to worry about. If someone is doing >2B on 32-bit then they can roll their own counter. Otherwise they can use a 64-bit architecture.

@tv42
Copy link
Contributor Author

tv42 commented Jun 21, 2014

For what it's worth, the stdlib is very consistent that int is only used for things that are expected to be held in memory at once. For example, slice and string indexes, channel buffer sizes, len() results. Things that are not constrained by memory are not ints. For example: http://golang.org/pkg/os/#File.Seek

@tv42
Copy link
Contributor Author

tv42 commented Jun 21, 2014

Probably an even better example: http://golang.org/pkg/io/#Writer vs http://golang.org/pkg/io/#Copy -- a single Write always fits in memory, as Copy is a loop of Writes, so it is not guaranteed to.

@benbjohnson
Copy link
Member

OK, I'll switch it over to use the raw uint64 value without limiting by architecture. I did a GitHub search and I don't see anyone using the API call except one of my projects.

@benbjohnson benbjohnson reopened this Jun 22, 2014
benbjohnson added a commit to benbjohnson/bolt that referenced this issue Jun 22, 2014
This commit changes NextSequence() to return a uint64 instead of an int. This also
removes the ErrSequenceOverflow error condition since overflowing a uint64 is unlikely.

Fixes boltdb#39.
@benbjohnson
Copy link
Member

Fixed with #209.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants