Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race? #21

Closed
richardartoul opened this issue Jan 21, 2024 · 4 comments · Fixed by #23
Closed

Data race? #21

richardartoul opened this issue Jan 21, 2024 · 4 comments · Fixed by #23

Comments

@richardartoul
Copy link

My test suite caught this:

Write at 0x00c02f86dee0 by goroutine 3037:
  github.com/scalalang2/golang-fifo/sieve.(*Sieve[go.shape.struct { BucketURL string; FileName string; Offset int; Length int },go.shape.interface { GoAsync(func() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error)); GoSync(func() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error)); IsDone() bool; Reject(error); Resolve(github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue]); ResolveOrReject(github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error); Wait() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error); WaitCtx(context.Context) (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error) }]).Get()
      /home/runner/go/pkg/mod/github.com/scalalang2/golang-fifo@v0.1.4/sieve/sieve.go:54 +0x1de
...


Previous write at 0x00c02f86dee0 by goroutine 3007:
  github.com/scalalang2/golang-fifo/sieve.(*Sieve[go.shape.struct { BucketURL string; FileName string; Offset int; Length int },go.shape.interface { GoAsync(func() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error)); GoSync(func() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error)); IsDone() bool; Reject(error); Resolve(github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue]); ResolveOrReject(github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error); Wait() (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error); WaitCtx(context.Context) (github.com/warpstreamlabs/warpstream/pkg/cache.TWrapper[github.com/warpstreamlabs/warpstream/pkg/filecache.FooterCacheValue], error) }]).Get()
      /home/runner/go/pkg/mod/github.com/scalalang2/golang-fifo@v0.1.4/sieve/sieve.go:54 +0x1de
func (s *Sieve[K, V]) Get(key K) (value V, ok bool) {
	s.lock.RLock()
	defer s.lock.RUnlock()
	if e, ok := s.items[key]; ok {
		e.Value.(*entry[K, V]).visited = true
		return e.Value.(*entry[K, V]).value, true
	}

	return
}

Looks like concurrent writes while holding an RLock?

@richardartoul
Copy link
Author

I could see this being an intentional performance thing if setting a boolean field compiles down to a single instruction 🤔 but it makes using this library in any codebase that runs tests with the race detector enabled impossible

@scalalang2
Copy link
Owner

scalalang2 commented Jan 21, 2024

@richardartoul Thank you for reporting.

I've researched one thing that can cause this.

  1. Go map is not addressable which means the pointed value may change during performing operations
  2. Changes in pointed value can occur when go map rebalances itself to ensure O(1) access.

I think that there are two possible solutions here

  • Simply change RLock to Lock to make all operations work sequentially
  • Adopt a concurrent-map implementation like this or sync.Map

Hot Fixes

This is a critical bug, I'll handle it with a hotfix and conduct further researchs for this.

@scalalang2
Copy link
Owner

scalalang2 commented Jan 21, 2024

I also want to hear @richardartoul opinions about this.

@richardartoul
Copy link
Author

Your hot fix looks good to me, but I don’t understand the two points you made above. The issue is just that there are two concurrent goroutines loading the same cache value at the same time and they both end up modifying the visited field because they’re not holding an exclusive lock.

another alternative would be to add a sub-lock inside the entry struct and acquire that before marking visited as true. For now though I’d probably merge your existing hot fix since the more complex solutions probably require detailed concurrent benchmarking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants