StoragePacker V2 #6176
base: main
Conversation
s.bucketsCacheLock.Lock()
{
	for _, v := range shards {
		s.bucketsCache.Insert(s.GetCacheKey(v.Key), v)
I noticed two things which possibly need updating.
- PutBucket, after calling storeBucket, inserts the parent bucket into the cache. This is fine if sharding doesn't happen. If the bucket gets sharded, it should not be added to the cache after a successful storeBucket.
- After adding all the sharded buckets into the cache, the parent bucket should be removed from the cache. No?
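The two points above could be addressed together by doing the shard inserts and the parent eviction under one lock acquisition. Here is a minimal sketch of that idea; the `Bucket` and `cache` types (and a plain map in place of the radix tree) are stand-ins for the real storagepacker types, not Vault's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// Bucket is a stand-in for the storagepacker bucket type.
type Bucket struct{ Key string }

// cache guards a map (standing in for the radix tree) with a mutex.
type cache struct {
	mu sync.Mutex
	m  map[string]*Bucket
}

// applyShards inserts all shard buckets and evicts the parent in one
// critical section, so readers never see the stale parent alongside
// its shards.
func (c *cache) applyShards(parentKey string, shards []*Bucket) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, s := range shards {
		c.m[s.Key] = s
	}
	// The parent no longer exists as a storage entry once its shards
	// are persisted, so remove it to avoid serving stale data.
	delete(c.m, parentKey)
}

func main() {
	c := &cache{m: map[string]*Bucket{"00": {Key: "00"}}}
	c.applyShards("00", []*Bucket{{Key: "00/0"}, {Key: "00/1"}})
	fmt.Println(len(c.m)) // prints 2: two shards cached, parent evicted
}
```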
// the base level
//
// The rest of the values can change
config.BaseBucketBits = exist.BaseBucketBits
Shouldn't the BucketShardBits also be a constant?
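If so, the suggestion would amount to pinning both fan-out parameters when merging an existing config, roughly like the following sketch (field names mirror the diff; `mergeConfig` is a hypothetical helper for illustration):

```go
package main

import "fmt"

// Config mirrors the two fan-out fields from the diff.
type Config struct {
	BaseBucketBits  int
	BucketShardBits int
}

// mergeConfig copies both fan-out parameters from the previously
// persisted config, so neither can change after first initialization.
func mergeConfig(config, exist Config) Config {
	// Both values determine where items live on disk, so both are
	// pinned to the originally persisted values.
	config.BaseBucketBits = exist.BaseBucketBits
	config.BucketShardBits = exist.BucketShardBits
	return config
}

func main() {
	got := mergeConfig(
		Config{BaseBucketBits: 10, BucketShardBits: 6},
		Config{BaseBucketBits: 8, BucketShardBits: 4},
	)
	fmt.Println(got.BaseBucketBits, got.BucketShardBits) // prints: 8 4
}
```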
for _, item := range bucket.Items {
items := make([]*storagepacker.Item, 0, len(bucket.Items)+len(bucket.ItemMap))
items = append(items, bucket.Items...)
for id, message := range bucket.ItemMap {
Would there be a performance improvement if we do a range on bucket.Items and bucket.ItemMap separately, instead of populating items first and then ranging over it?
Yeah that's fair. I think in reality only one of these will be non-zero at any given time so it's mostly defense in depth.
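The reviewer's suggestion can be sketched as follows. Visiting each collection in place skips the temporary slice and its allocation; the `Item` and `Bucket` types here are simplified stand-ins for the storagepacker types:

```go
package main

import "fmt"

// Item and Bucket are simplified stand-ins for the storagepacker types.
type Item struct{ ID string }

type Bucket struct {
	Items   []*Item
	ItemMap map[string][]byte
}

// ids ranges over both collections in place rather than first copying
// them into one combined slice, avoiding the intermediate allocation.
func ids(b *Bucket) []string {
	out := make([]string, 0, len(b.Items)+len(b.ItemMap))
	for _, it := range b.Items {
		out = append(out, it.ID)
	}
	for id := range b.ItemMap {
		out = append(out, id)
	}
	return out
}

func main() {
	b := &Bucket{
		Items:   []*Item{{ID: "a"}},
		ItemMap: map[string][]byte{"b": nil},
	}
	fmt.Println(len(ids(b))) // prints 2
}
```

As the reply notes, in practice only one of the two collections should be non-empty at a time, so the gain is mostly defensive.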
* Change item storage from proto.Any to []byte
* Address review feedback
storageLocks []*locksutil.LockEntry
bucketsCache *radix.Tree

// Note that we're slightly loosy-goosy with this lock. The reason is that
I'd generally prefer a more concrete description of how the lock should be used. I checked through the review and I didn't find cases where a method was called on bucketsCache that didn't have the lock, so I'm confused.
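One way to make the rule concrete is to make it structural: hide the cache behind methods that always take the lock, so no caller can touch it unguarded. A minimal sketch, using a plain map as a stand-in for the radix tree (names here are illustrative, not the PR's actual types):

```go
package main

import (
	"fmt"
	"sync"
)

// guardedCache wraps the cache so every access goes through a method
// that holds the lock; callers cannot reach the map directly.
type guardedCache struct {
	mu sync.RWMutex
	m  map[string]string // stand-in for *radix.Tree
}

// Insert takes the write lock for every mutation.
func (g *guardedCache) Insert(k, v string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.m[k] = v
}

// Get takes the read lock, allowing concurrent lookups.
func (g *guardedCache) Get(k string) (string, bool) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	v, ok := g.m[k]
	return v, ok
}

func main() {
	g := &guardedCache{m: map[string]string{}}
	g.Insert("v2/00", "bucket-00")
	v, _ := g.Get("v2/00")
	fmt.Println(v) // prints bucket-00
}
```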
shards := make(map[string]*LockedBucket, numShards)
for i := 0; i < numShards; i++ {
shardKey := fmt.Sprintf("%x", i)
lock := locksutil.LockForKey(s.storageLocks, cacheKey+shardKey)
This doesn't seem deadlock-safe to me. The locks from s.storageLocks can be acquired in an arbitrary order, correct? So any guarantee that depends on lock ordering goes out the window. Maybe it's safe if no non-sharding thread acquires more than one bucket lock, and we do have the lock to ensure only one sharding operation at a time.
However, I am worried about the following sequence:
1. A acquires storageLock 1 in PutBucket.
2. B acquires storageLock 2 in PutBucket for a different bucket.
3. A needs to shard, acquires shardLock.
4. B also needs to shard, blocks on shardLock.
5. A's cacheKey+shardKey maps to storageLock 2: deadlock?
We acquire storageLocks both before and after shardLock.
// Finally, update the original and persist
origBucketItemMap := bucket.ItemMap
bucket.ItemMap = nil
if err := s.storeBucket(ctx, bucket, false); err != nil {
This is safe against errors in storeBucket but I'm not sure it's crash-safe. What happens if we come back up after a failure and find both the shards and the original bucket present in storage?
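One hypothetical answer to the crash-safety question is an idempotent recovery pass at startup: if both the parent bucket and a complete set of shards are found, the shard operation finished except for the final cleanup, so the stale parent can be dropped. A sketch under those assumptions, with a plain map standing in for the physical storage backend:

```go
package main

import "fmt"

// recoverShardCrash resolves the state left by a crash mid-shard. If
// all shards were persisted, the parent is stale and is deleted; if
// any shard is missing, the parent stays authoritative (rolling back
// the partial shards is not shown).
func recoverShardCrash(storage map[string][]byte, parent string, shards []string) {
	for _, s := range shards {
		if _, ok := storage[s]; !ok {
			// Incomplete shard set: keep the parent as the source
			// of truth.
			return
		}
	}
	delete(storage, parent)
}

func main() {
	storage := map[string][]byte{
		"00":   []byte("parent"),
		"00/0": []byte("shard"),
		"00/1": []byte("shard"),
	}
	recoverShardCrash(storage, "00", []string{"00/0", "00/1"})
	fmt.Println(len(storage)) // prints 2: shards kept, stale parent dropped
}
```

Either direction works as long as the choice is deterministic; the key property is that re-running the recovery after another crash reaches the same end state.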
This is the initial PR for StoragePacker V2. There will be follow-on PRs that introduce sharding, and then combine the entity/group SPs into one and introduce a separate type for group memberships.