
Fix data race in ingester #2327

Merged
merged 1 commit into grafana:master on Jul 9, 2020
Conversation

@adityacs (Collaborator) commented Jul 9, 2020

What this PR does / why we need it:
Fixes a data race in the ingester.

Which issue(s) this PR fixes:
Fixes #2265

@codecov-commenter commented Jul 9, 2020

Codecov Report

Merging #2327 into master will increase coverage by 0.00%.
The diff coverage is 83.33%.


@@           Coverage Diff           @@
##           master    #2327   +/-   ##
=======================================
  Coverage   60.99%   61.00%           
=======================================
  Files         158      158           
  Lines       12751    12754    +3     
=======================================
+ Hits         7778     7781    +3     
- Misses       4390     4391    +1     
+ Partials      583      582    -1     
Impacted Files                            Coverage Δ
pkg/ingester/flush.go                     84.81% <83.33%> (+0.29%) ⬆️
pkg/querier/queryrange/downstreamer.go    95.87% <0.00%> (-2.07%) ⬇️
pkg/logql/evaluator.go                    92.54% <0.00%> (+0.40%) ⬆️

if err := c.Encode(); err != nil {
	return err
}
streamsMtx.Unlock()
Contributor:

I think you should do a defer here.

Contributor:

Can you explain who is racing for the stream data here? Querying?

Collaborator (Author):

I think we should not defer the Unlock. If we did, putting chunks to the store here would run while the lock is held, and other flush threads would not be able to call Put until the lock is released. That would make the IO sequential and eventually make all flush threads sequential.

if err := i.store.Put(ctx, wireChunks); err != nil {
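
To make the tradeoff concrete, here is a minimal sketch of the lock scoping being argued for. The types, the flushChunks signature, and main are hypothetical stand-ins; only streamsMtx, Encode, and Put come from the snippets quoted in this thread. The point is that the mutex is released before the store write, so a deferred Unlock would hold the lock across the slow Put and serialize all flush threads.

package main

import (
	"context"
	"sync"
)

// Hypothetical stand-ins for the real ingester types.
type chunkDesc struct{}

// Encode mutates chunk state in the real code, which is why it needs the lock.
func (c *chunkDesc) Encode() error { return nil }

type chunkStore struct{}

// Put is slow IO in the real code (writing chunks out to the store).
func (s *chunkStore) Put(ctx context.Context, cs []*chunkDesc) error { return nil }

func flushChunks(ctx context.Context, streamsMtx *sync.RWMutex, cs []*chunkDesc, store *chunkStore) error {
	streamsMtx.Lock()
	for _, c := range cs {
		if err := c.Encode(); err != nil {
			streamsMtx.Unlock() // the error path must unlock too; see the deadlock pointed out below
			return err
		}
	}
	// Unlock before Put: a deferred Unlock would hold the lock across the
	// store write, turning concurrent flush threads into sequential IO.
	streamsMtx.Unlock()
	return store.Put(ctx, cs)
}

func main() {
	var mtx sync.RWMutex
	_ = flushChunks(context.Background(), &mtx, []*chunkDesc{{}}, &chunkStore{})
}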

Contributor:

If you return an error you're deadlocked.
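
To spell out the deadlock: a self-contained sketch of the buggy shape quoted at the top of this thread, with hypothetical helpers. Running it prints the first error, then blocks forever on the second call, because the error path returns without unlocking.

package main

import (
	"errors"
	"fmt"
	"sync"
)

var streamsMtx sync.Mutex

// encode stands in for c.Encode(); fail selects the error path.
func encode(fail bool) error {
	if fail {
		return errors.New("encode failed")
	}
	return nil
}

func flush(fail bool) error {
	streamsMtx.Lock()
	if err := encode(fail); err != nil {
		return err // BUG: streamsMtx is still held after this return
	}
	streamsMtx.Unlock()
	return nil
}

func main() {
	fmt.Println(flush(true))  // returns the error but never unlocks
	fmt.Println(flush(false)) // blocks forever on streamsMtx.Lock()
}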

@adityacs (Collaborator, Author) commented Jul 9, 2020:

> Can you explain who is racing for the stream data here? Querying?

Yeah, it is querying, which accesses the stream chunks here:

for _, c := range s.chunks {

but this is read-protected with the streamsMtx lock here:

i.streamsMtx.RLock()

Since c.Encode is not write-locked, it causes the race.

Also, reading the first chunk of the stream and getting bounds on it causes the race:

firstTime, _ := stream.chunks[0].chunk.Bounds()

There are other places that could cause the race; all of those are already protected with streamsMtx, but c.Encode was not.
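
The race described above can be reproduced in miniature (hypothetical types; run with go run -race): a querier goroutine iterates the chunks under RLock while the flush path mutates chunk state via Encode without the write lock. An RLock only excludes writers that actually take Lock, so the unguarded Encode races with the readers; wrapping the Encode calls in Lock/Unlock, as the streamsMtx.Unlock() in the diff above suggests this PR does, removes the race.

package main

import "sync"

// Hypothetical miniature of the stream/chunk structures.
type chunk struct{ encoded bool }

// Encode writes chunk state, so it must be guarded by the write lock.
func (c *chunk) Encode() { c.encoded = true }

type stream struct {
	mtx    sync.RWMutex
	chunks []chunk
}

func main() {
	s := &stream{chunks: make([]chunk, 1024)}
	done := make(chan struct{})

	// Querier: reads chunk state under the read lock, like i.streamsMtx.RLock().
	go func() {
		defer close(done)
		s.mtx.RLock()
		for i := range s.chunks {
			_ = s.chunks[i].encoded
		}
		s.mtx.RUnlock()
	}()

	// Flush path WITHOUT the write lock: `go run -race` flags this loop.
	// The fix is to hold s.mtx.Lock()/Unlock() around the Encode calls.
	for i := range s.chunks {
		s.chunks[i].Encode()
	}
	<-done
}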

Contributor:

firstTime, _ := stream.chunks[0].chunk.Bounds(): is this one fixed, or was it already ok?

Collaborator (Author):

> If you return an error you're deadlocked.

Oh yeah, you are right. I didn't realize that.

Collaborator (Author):

> firstTime, _ := stream.chunks[0].chunk.Bounds(): is this one fixed, or was it already ok?

This one was already protected.

Contributor:

Great!

@cyriltovena (Contributor) left a comment:

LGTM

@slim-bean merged commit 01040c1 into grafana:master on Jul 9, 2020
Successfully merging this pull request may close these issues: Data races in ingester (#2265)
4 participants