-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Iterator to iterate over keys #22
Conversation
type EntryInfoFilter func(EntryInfo) bool | ||
|
||
// EntryInfoIterator allows to iterate over entries in the cache | ||
type EntryInfoIterator chan EntryInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like this iterator to be as non-blocking as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With channel you add blocking. Take a look at loops you created. You are adding mutex for reading from map and then pushing value to channel which is blocking. If you remove channel and instead return closure to function that returns entry form shard i
on position index
then you don't need to copy all (1024) elements and this will be a lazy iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Channel is buffered
cache *BigCache | ||
currentShard int | ||
currentIndex int32 | ||
elements []*EntryInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be just slice of indexes []uin32
. We don't need all data during iteration.
shard.lock.RLock() | ||
defer shard.lock.RUnlock() | ||
|
||
elements := make([]*EntryInfo, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to estimate initial size for elements as len(shard.queue)/cache.config.MaxEntrySize
for iterator.Next() { | ||
current, err := iterator.Value() | ||
|
||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test when error occures: modify map when it terates, delete element under Value()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also needs bench test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmark - done.
|
||
if entry, err := it.cache.shards[it.currentShard].entries.Get(int(current.index)); err != nil { | ||
return nil, fmt.Errorf("Could not retrieve entry from cache") | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if block ends with a return statement, so drop this else and outdent its block (move short variable declaration to its own line if necessary) (golint)
@@ -33,6 +33,79 @@ type cacheShard struct { | |||
onRemove func(wrappedEntry []byte) | |||
} | |||
|
|||
// EntryInfo holds informations about entry in the cache | |||
type EntryInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not contain entry value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to keep encoded entry instead of parsed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason is simple: entryinfo is lazy - you will only parse what you need instead of everything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it but it's lazy every time it's called instead of loading values on a first call. I think parsing data should not be much slower then copying them but it should be checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So whats whit this? I did a small benchmark and gready extraction is a little bit slower than copy.
diff --git a/bigcache.go b/bigcache.go
index d610216..f15c1de 100644
--- a/bigcache.go
+++ b/bigcache.go
@@ -35,27 +35,30 @@ type cacheShard struct {
// EntryInfo holds informations about entry in the cache
type EntryInfo struct {
- entry []byte
+ timestamp uint64
+ hash uint64
+ key string
+ value []byte
}
// Key returns entry's underlying key
func (e EntryInfo) Key() string {
- return readKeyFromEntry(e.entry)
+ return e.key
}
// Hash returns entry's hash value
func (e EntryInfo) Hash() uint64 {
- return readHashFromEntry(e.entry)
+ return e.hash
}
// Timestamp returns entry's timestamp (time of insertion)
func (e EntryInfo) Timestamp() uint64 {
- return readTimestampFromEntry(e.entry)
+ return e.timestamp
}
// Value returns entry's underlying value
func (e EntryInfo) Value() []byte {
- return readEntry(e.entry)
+ return e.value
}
// EntryInfoIterator allows to iterate over entries in the cache
@@ -140,11 +143,11 @@ func (it *EntryInfoIterator) Value() (EntryInfo, error) {
return EntryInfo{}, fmt.Errorf("Could not retrieve entry from cache")
}
- var dst = make([]byte, len(entry))
- copy(dst, entry)
-
return EntryInfo{
- entry: dst,
+ timestamp: readTimestampFromEntry(entry),
+ hash: readHashFromEntry(entry),
+ key: readKeyFromEntry(entry),
+ value: readEntry(entry),
}, nil
}
$go test ./... -bench=BenchmarkIterateOverCache -run=NONE -count 3 -cpu 1,6 -benchmem > new.txt && git stash && go test ./... -bench=BenchmarkIterateOverCache -run=NONE -count 3 -cpu 1,6 benchmem > old.txt
$ benchcmp old.txt new.txt
benchmark old ns/op new ns/op delta
BenchmarkIterateOverCache/512-shards 224 224 +0.00%
BenchmarkIterateOverCache/512-shards 217 223 +2.76%
BenchmarkIterateOverCache/512-shards 218 248 +13.76%
BenchmarkIterateOverCache/512-shards-6 273 293 +7.33%
BenchmarkIterateOverCache/512-shards-6 273 284 +4.03%
BenchmarkIterateOverCache/512-shards-6 274 284 +3.65%
BenchmarkIterateOverCache/1024-shards 214 216 +0.93%
BenchmarkIterateOverCache/1024-shards 214 215 +0.47%
BenchmarkIterateOverCache/1024-shards 215 218 +1.40%
BenchmarkIterateOverCache/1024-shards-6 268 279 +4.10%
BenchmarkIterateOverCache/1024-shards-6 271 278 +2.58%
BenchmarkIterateOverCache/1024-shards-6 268 281 +4.85%
BenchmarkIterateOverCache/8192-shards 209 210 +0.48%
BenchmarkIterateOverCache/8192-shards 207 210 +1.45%
BenchmarkIterateOverCache/8192-shards 207 211 +1.93%
BenchmarkIterateOverCache/8192-shards-6 265 276 +4.15%
BenchmarkIterateOverCache/8192-shards-6 266 275 +3.38%
BenchmarkIterateOverCache/8192-shards-6 263 279 +6.08%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done, it saves some allocs (20B/op vs 32B/op). Times are the a little smaller.
} | ||
|
||
// Value returns current value from the iterator | ||
func (it *EntryInfoIterator) Value() (*EntryInfo, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return value not a pointer to make it obvious that we return copy of underlying value.
|
||
// partition is empty - move to next | ||
if len(it.elements) == 0 { | ||
goto incrementShard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use for
instead of if
above the label
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? :) I don't understand the pros
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean change if in line 90 (above the label) and remove this condition. You are implementing for
loop with if
and goto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's exactly what I'm doing and code is readable for me. Goto is a standard jmp in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind - I rewrite this part of iterator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to use for
loop and extract condition to well named function so the comment will be redundant.
func (it *EntryInfoIterator) Value() (EntryInfo, error) { | ||
it.Lock() | ||
defer it.Unlock() | ||
current := it.elements[it.currentIndex] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should check if current index is valid.
|
||
for i := it.currentShard + 1; i < it.cache.config.Shards; i++ { | ||
it.currentShard, it.currentIndex, it.elements = i, 0, copyCurrentShardMap(it.cache.shards[i]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is too long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
|
||
it.currentIndex++ | ||
|
||
if it.currentIndex >= len(it.elements) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about reverting condition. Just for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean: if !(len(it.elements) > it.currentIndex) ?
return false | ||
} | ||
|
||
for i := it.currentShard + 1; i < it.cache.config.Shards; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if all shards are empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will iterate to the end and return false
return elements | ||
} | ||
|
||
// Next returns true if there is next element in the iterator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This describes only half of the job that Next()
do becouse it also moves iterator to next element or invalid position if it doesn't exist. This comment better suits HasNext()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right - changed to HasNext()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost ready. Only small fixes need to be done. We need to decide how EntryInfo
should look like. After introducing it it will be recommended to us it as a default type instead of []byte
.
@druminski @medzin Can you take a look at this?
it.valid = false | ||
it.currentIndex++ | ||
|
||
if !(len(it.elements) > it.currentIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was thinking about early return true
so the code below doesn't need to be indent.
if len(it.elements) > it.currentIndex {
it.valid = true;
return true
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 it also simplifies logical statement as negation mark is not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :) Renamed HasNext
to SetNext
also
assert.Equal(t, current.Key(), "key") | ||
assert.Equal(t, current.Hash(), uint64(0x3dc94a19365b10ec)) | ||
assert.Equal(t, current.Value(), []byte("value")) | ||
assert.Equal(t, current.Timestamp(), uint64(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameters order mismatch func Equal(t TestingT, expected, actual interface{}, msgAndArgs ...interface{}) bool
. Check with other places too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -33,6 +33,79 @@ type cacheShard struct { | |||
onRemove func(wrappedEntry []byte) | |||
} | |||
|
|||
// EntryInfo holds informations about entry in the cache | |||
type EntryInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So whats whit this? I did a small benchmark and gready extraction is a little bit slower than copy.
diff --git a/bigcache.go b/bigcache.go
index d610216..f15c1de 100644
--- a/bigcache.go
+++ b/bigcache.go
@@ -35,27 +35,30 @@ type cacheShard struct {
// EntryInfo holds informations about entry in the cache
type EntryInfo struct {
- entry []byte
+ timestamp uint64
+ hash uint64
+ key string
+ value []byte
}
// Key returns entry's underlying key
func (e EntryInfo) Key() string {
- return readKeyFromEntry(e.entry)
+ return e.key
}
// Hash returns entry's hash value
func (e EntryInfo) Hash() uint64 {
- return readHashFromEntry(e.entry)
+ return e.hash
}
// Timestamp returns entry's timestamp (time of insertion)
func (e EntryInfo) Timestamp() uint64 {
- return readTimestampFromEntry(e.entry)
+ return e.timestamp
}
// Value returns entry's underlying value
func (e EntryInfo) Value() []byte {
- return readEntry(e.entry)
+ return e.value
}
// EntryInfoIterator allows to iterate over entries in the cache
@@ -140,11 +143,11 @@ func (it *EntryInfoIterator) Value() (EntryInfo, error) {
return EntryInfo{}, fmt.Errorf("Could not retrieve entry from cache")
}
- var dst = make([]byte, len(entry))
- copy(dst, entry)
-
return EntryInfo{
- entry: dst,
+ timestamp: readTimestampFromEntry(entry),
+ hash: readHashFromEntry(entry),
+ key: readKeyFromEntry(entry),
+ value: readEntry(entry),
}, nil
}
$go test ./... -bench=BenchmarkIterateOverCache -run=NONE -count 3 -cpu 1,6 -benchmem > new.txt && git stash && go test ./... -bench=BenchmarkIterateOverCache -run=NONE -count 3 -cpu 1,6 benchmem > old.txt
$ benchcmp old.txt new.txt
benchmark old ns/op new ns/op delta
BenchmarkIterateOverCache/512-shards 224 224 +0.00%
BenchmarkIterateOverCache/512-shards 217 223 +2.76%
BenchmarkIterateOverCache/512-shards 218 248 +13.76%
BenchmarkIterateOverCache/512-shards-6 273 293 +7.33%
BenchmarkIterateOverCache/512-shards-6 273 284 +4.03%
BenchmarkIterateOverCache/512-shards-6 274 284 +3.65%
BenchmarkIterateOverCache/1024-shards 214 216 +0.93%
BenchmarkIterateOverCache/1024-shards 214 215 +0.47%
BenchmarkIterateOverCache/1024-shards 215 218 +1.40%
BenchmarkIterateOverCache/1024-shards-6 268 279 +4.10%
BenchmarkIterateOverCache/1024-shards-6 271 278 +2.58%
BenchmarkIterateOverCache/1024-shards-6 268 281 +4.85%
BenchmarkIterateOverCache/8192-shards 209 210 +0.48%
BenchmarkIterateOverCache/8192-shards 207 210 +1.45%
BenchmarkIterateOverCache/8192-shards 207 211 +1.93%
BenchmarkIterateOverCache/8192-shards-6 265 276 +4.15%
BenchmarkIterateOverCache/8192-shards-6 266 275 +3.38%
BenchmarkIterateOverCache/8192-shards-6 263 279 +6.08%
it.valid = false | ||
it.currentIndex++ | ||
|
||
if !(len(it.elements) > it.currentIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 it also simplifies logical statement as negation mark is not necessary
} | ||
|
||
// HasNext returns true if there is next element in the iterator, false otherwise | ||
func (it *EntryInfoIterator) HasNext() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method name is misleading right now as its implementation does not only answer the question whether there is another element in iterator but also modifies currentIndex
value. Because of that when you call time after time HasNext() then you can get different answers. I see two options here, we can change this method name to SetNext() bool
or we can implement HasNext() to be idempotent and then Value() convert to Next().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right - renamed method to SetNext().
Waiting for final thoughts @janisz |
|
||
// ErrCannotRetrieveEntry is reported when entry cannot be retrieved from underlying | ||
var ErrCannotRetrieveEntry = errors.New("Could not retrieve entry from cache") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This two should not be public. Use empty struct withe Error method like we do in entry_not_found_error.go or make this private. Take a look at http://dave.cheney.net/2016/04/07/constant-errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean "What with this"? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not this line but I assume you are referring to this discussion #22 (comment)
@@ -33,6 +33,79 @@ type cacheShard struct { | |||
onRemove func(wrappedEntry []byte) | |||
} | |||
|
|||
// EntryInfo holds informations about entry in the cache | |||
type EntryInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What with this?
@@ -17,6 +17,8 @@ func BenchmarkWriteToCacheWith1Shard(b *testing.B) { | |||
func BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard(b *testing.B) { | |||
m := blob('a', 1024) | |||
cache, _ := NewBigCache(Config{1, 100 * time.Second, 100, 256, false, nil, 1, nil}) | |||
|
|||
b.ReportAllocs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If applied to all tests then better use -test.benchmem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will include setup also (creating entries for reads etc)
t.Parallel() | ||
|
||
// given | ||
keysCount := 10000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need so many entries in unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reduced by 10x
Add Iterator to iterate over keys
No description provided.