StoragePacker V2 #6176

Open · wants to merge 60 commits into main

Conversation

@jefferai (Member) commented Feb 6, 2019

This is the initial PR for StoragePacker V2. There will be follow-on PRs that introduce sharding, and then that combine the entity/group SPs into one and introduce a separate type for group memberships.

@hashicorp-cla commented Feb 15, 2019

CLA assistant check: all committers have signed the CLA.

@vishalnayak changed the base branch from 1.1-beta to master on March 12, 2019
s.bucketsCacheLock.Lock()
{
for _, v := range shards {
s.bucketsCache.Insert(s.GetCacheKey(v.Key), v)
Member commented:

I noticed two things which may need some updating (a rough sketch follows the list).

  1. PutBucket, after calling storeBucket, inserts the parent bucket into the cache. This is fine if the sharding doesn't happen. If the bucket gets sharded, it should not be added into the cache after a successful storeBucket.
  2. After adding all the sharded buckets into the cache, the parent bucket should be removed from the cache. No?
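
A minimal sketch of what the comment asks for, assuming a hypothetical helper on the packer that runs once sharding succeeds. The method name and receiver type are illustrative; only bucketsCache, bucketsCacheLock, GetCacheKey, and LockedBucket come from the snippets in this review.

// Hypothetical sketch: keep the cache consistent after a successful shard.
// Evict the parent entry so lookups fall through to the shards, then insert
// the shard buckets under the same lock.
func (s *StoragePackerV2) updateCacheAfterShard(parent *LockedBucket, shards map[string]*LockedBucket) {
    s.bucketsCacheLock.Lock()
    defer s.bucketsCacheLock.Unlock()

    // Remove the parent; it no longer holds items directly.
    s.bucketsCache.Delete(s.GetCacheKey(parent.Key))

    for _, v := range shards {
        s.bucketsCache.Insert(s.GetCacheKey(v.Key), v)
    }
}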

// the base level
//
// The rest of the values can change
config.BaseBucketBits = exist.BaseBucketBits
Member commented:

Shouldn't the BucketShardBits also be a constant?
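
For illustration, a minimal sketch of the change the comment suggests, assuming the config carries a BucketShardBits field alongside BaseBucketBits. The Config type and mergeConfig function here are hypothetical, not the PR's actual code.

// Hypothetical sketch: treat both bit-width settings as constants once a
// packer exists on disk, always carrying forward the stored values.
type Config struct {
    BaseBucketBits  int
    BucketShardBits int
    // ... other, mutable settings omitted
}

func mergeConfig(config, exist *Config) {
    // Both values determine how item IDs map onto bucket keys, so changing
    // either after buckets have been written would strand existing items.
    config.BaseBucketBits = exist.BaseBucketBits
    config.BucketShardBits = exist.BucketShardBits
}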

for _, item := range bucket.Items {
items := make([]*storagepacker.Item, 0, len(bucket.Items)+len(bucket.ItemMap))
items = append(items, bucket.Items...)
for id, message := range bucket.ItemMap {
Member commented:

Would there be a performance improvement if we do a range on bucket.Items and bucket.ItemMap separately instead of populating the items first and then ranging over it?

Member (Author) replied:

Yeah that's fair. I think in reality only one of these will be non-zero at any given time so it's mostly defense in depth.
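
For reference, a minimal sketch of the alternative the reviewer suggests: ranging over the two sources directly instead of building a combined slice first. The field and package names follow the diff context above; the exact import paths in the PR may differ, and forEachItem is an illustrative wrapper rather than code from the branch.

import (
    "github.com/golang/protobuf/ptypes/any"
    "github.com/hashicorp/vault/helper/storagepacker"
)

// Hypothetical sketch: visit every item by iterating both sources directly,
// avoiding the extra allocation and copy. In practice only one of Items or
// ItemMap is expected to be non-empty at any given time.
func forEachItem(bucket *storagepacker.Bucket, handle func(id string, msg *any.Any) error) error {
    for _, item := range bucket.Items {
        if err := handle(item.ID, item.Message); err != nil {
            return err
        }
    }
    for id, message := range bucket.ItemMap {
        if err := handle(id, message); err != nil {
            return err
        }
    }
    return nil
}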

storageLocks []*locksutil.LockEntry
bucketsCache *radix.Tree

// Note that we're slightly loosy-goosy with this lock. The reason is that
Contributor commented:
I'd generally prefer a more concrete description of how the lock should be used. I checked through the review and I didn't find cases where a method was called on bucketsCache that didn't have the lock, so I'm confused.
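
To illustrate the kind of description being asked for, a sketch of a doc comment spelling out one plausible contract. The invariants stated here are assumptions inferred from the snippets in this review, not the PR's actual rules, and bucketsCacheLock is assumed to be a sync.RWMutex.

// Hypothetical sketch of a concrete locking contract, written as a doc
// comment on the fields themselves:
type storagePackerV2 struct {
    // bucketsCacheLock guards bucketsCache. Assumed convention:
    //   - any read of the tree (Get, LongestPrefix, Walk) holds it for reading
    //   - any mutation (Insert, Delete) holds it for writing
    //   - it is never held across a storage call; the per-bucket locks in
    //     storageLocks order writes to storage, this lock only protects the
    //     in-memory index
    bucketsCacheLock sync.RWMutex
    bucketsCache     *radix.Tree

    storageLocks []*locksutil.LockEntry
}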

shards := make(map[string]*LockedBucket, numShards)
for i := 0; i < numShards; i++ {
shardKey := fmt.Sprintf("%x", i)
lock := locksutil.LockForKey(s.storageLocks, cacheKey+shardKey)
Contributor commented:
This doesn't seem deadlock-safe to me. The locks from s.storageLocks can be acquired in some arbitrary order, correct? So any guarantee depending on lock ordering goes out the window. Maybe it's safe if no non-sharding thread acquires more than one bucket lock, and we do have the lock to ensure only one sharding operation at a time.

However, I am worried about the following sequence:

  1. A acquires storageLock 1 in PutBucket.
  2. B acquires storageLock 2 in PutBucket for a different bucket.
  3. A needs to shard, acquires shardLock.
  4. B also needs to shard, blocks on shardLock.
  5. A's cacheKey+shardKey maps to storageLock 2. Deadlock?

We acquire storageLocks both before and after shardLock.
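
For reference, a rough sketch of one conventional way to avoid this class of deadlock, assuming the full set of shard keys is known before any work starts: collect every storage lock the operation needs and acquire them in one fixed order (here, the order of the storageLocks slice itself). This is a sketch of the general technique, not code from the PR; it assumes the caller does not already hold any of these locks, and the import path may differ from the branch.

import (
    "fmt"

    "github.com/hashicorp/vault/helper/locksutil"
)

// Hypothetical sketch: gather the locks covering the parent bucket and every
// shard key, then acquire them in the fixed order of the storageLocks slice,
// so two concurrent shard operations can never take overlapping locks in
// opposite orders.
func acquireShardLocks(storageLocks []*locksutil.LockEntry, cacheKey string, numShards int) []*locksutil.LockEntry {
    needed := make(map[*locksutil.LockEntry]struct{}, numShards+1)
    needed[locksutil.LockForKey(storageLocks, cacheKey)] = struct{}{}
    for i := 0; i < numShards; i++ {
        needed[locksutil.LockForKey(storageLocks, cacheKey+fmt.Sprintf("%x", i))] = struct{}{}
    }

    // The slice defines one global order; locking in slice order keeps every
    // caller consistent, and duplicate locks are collapsed by the map.
    acquired := make([]*locksutil.LockEntry, 0, len(needed))
    for _, l := range storageLocks {
        if _, ok := needed[l]; ok {
            l.Lock()
            acquired = append(acquired, l)
        }
    }
    return acquired
}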

// Finally, update the original and persist
origBucketItemMap := bucket.ItemMap
bucket.ItemMap = nil
if err := s.storeBucket(ctx, bucket, false); err != nil {
Contributor commented:
This is safe against errors in storeBucket but I'm not sure it's crash-safe. What happens if we come back up after a failure and find both the shards and the original bucket present in storage?
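
One common approach is to make the split resolvable at load time; a rough sketch follows, assuming shards are always persisted before the parent is rewritten, so their presence means the split data is complete. The repair function and the shardsExist helper are hypothetical; only storeBucket and the ItemMap field come from the diff above.

// Hypothetical recovery sketch: if a crash left both the shard buckets and a
// parent bucket that still carries items, treat the shards as authoritative
// and finish the interrupted split by clearing and re-persisting the parent.
func (s *StoragePackerV2) repairAfterCrash(ctx context.Context, bucket *LockedBucket) error {
    shardsExist, err := s.shardsExist(ctx, bucket.Key) // hypothetical helper
    if err != nil {
        return err
    }
    if shardsExist && len(bucket.ItemMap) > 0 {
        // The shards were written but the parent was never rewritten; redo
        // only the final step of the split.
        bucket.ItemMap = nil
        return s.storeBucket(ctx, bucket, false)
    }
    return nil
}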

@pbernal modified the milestones: next-release, triaged on May 15, 2020
@VioletHynes added the hashicorp-contributed-pr label on Jul 6, 2023
Labels: core/storage, enhancement, hashicorp-contributed-pr