remote-write: fix race condition between ApplyConfig and Notify #13135

Merged
merged 2 commits into from Nov 14, 2023

Collaborator

@machine424 machine424 commented Nov 13, 2023

Add a test to highlight a race condition between Storage.Notify() and Storage.ApplyConfig().

See #12747.

I'll push a fix once the test has failed in CI: DONE

Ready for review.

fixes #12747
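As a rough illustration of the kind of race being described, here is a minimal sketch, not the actual Prometheus test. The `store` type, its `queues` map, and the `hammer` helper are hypothetical stand-ins: one goroutine rewrites a map the way a config reload might, while another iterates it the way a notification might. With the mutex in place the run is safe; removing the locking would let `go run -race` flag the conflict, or let the runtime abort with a concurrent map access error.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the remote storage: a map of queues
// that ApplyConfig rewrites and Notify iterates.
type store struct {
	mtx    sync.Mutex
	queues map[string]int // queue name -> notifications received
}

// ApplyConfig adds missing queues and deletes stale ones (map writes).
func (s *store) ApplyConfig(names []string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	want := make(map[string]bool, len(names))
	for _, n := range names {
		want[n] = true
		if _, ok := s.queues[n]; !ok {
			s.queues[n] = 0
		}
	}
	for n := range s.queues {
		if !want[n] {
			delete(s.queues, n)
		}
	}
}

// Notify iterates the map; it must hold the same lock as ApplyConfig.
func (s *store) Notify() {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	for n := range s.queues {
		s.queues[n]++
	}
}

// hammer calls ApplyConfig and Notify concurrently for the given number of rounds.
func hammer(s *store, rounds int) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			s.ApplyConfig([]string{"a", fmt.Sprintf("q%d", i%3)})
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			s.Notify()
		}
	}()
	wg.Wait()
}

func main() {
	s := &store{queues: map[string]int{}}
	hammer(s, 1000)
	fmt.Println("queues after run:", len(s.queues))
}
```

Running the same program with the `Lock`/`Unlock` pairs commented out is a quick way to see why a test like this can fail only intermittently in CI.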

between Storage.Notify() and Storage.ApplyConfig()

see prometheus#12747

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
@machine424
Collaborator Author

@@ -77,6 +77,9 @@ func NewStorage(l log.Logger, reg prometheus.Registerer, stCallback startTimeCal
}

func (s *Storage) Notify() {
	s.mtx.Lock()
Collaborator Author

Taking s.rws.mtx.Lock() would be sufficient, since s.rws.ApplyConfig (called from Storage.ApplyConfig) is what actually conflicts with Notify, but I didn't see any other place where an inner lock is used like that.

Member

I think it would be better to take the lock inside rws. I don't see any other case where Storage touches the inner fields of rws.
So the Notify loop, including the lock, should live in a method on WriteStorage, and this method should just forward to it.
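The suggestion above could look roughly like the following sketch. It is a simplified illustration, not the merged code: `Storage`, `WriteStorage`, `rws`, and `mtx` come from the discussion, while the `watcher` type, the `queues` field, and the `ApplyConfig` signature are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher is a hypothetical stand-in for whatever each queue notifies.
type watcher struct{ notified int }

func (w *watcher) Notify() { w.notified++ }

// WriteStorage owns both the queues and the mutex that guards them.
type WriteStorage struct {
	mtx    sync.Mutex
	queues map[string]*watcher // assumed field, for illustration only
}

// ApplyConfig rebuilds the queue map while holding the lock.
func (rws *WriteStorage) ApplyConfig(names []string) {
	rws.mtx.Lock()
	defer rws.mtx.Unlock()
	next := make(map[string]*watcher, len(names))
	for _, n := range names {
		if w, ok := rws.queues[n]; ok {
			next[n] = w // keep the existing queue
		} else {
			next[n] = &watcher{}
		}
	}
	rws.queues = next
}

// Notify holds the same lock while iterating, so it cannot overlap ApplyConfig.
func (rws *WriteStorage) Notify() {
	rws.mtx.Lock()
	defer rws.mtx.Unlock()
	for _, w := range rws.queues {
		w.Notify()
	}
}

// Storage wraps WriteStorage and never touches its inner fields.
type Storage struct{ rws *WriteStorage }

// Notify just forwards; the locking lives in WriteStorage.
func (s *Storage) Notify() { s.rws.Notify() }

// ApplyConfig forwards as well.
func (s *Storage) ApplyConfig(names []string) { s.rws.ApplyConfig(names) }

func main() {
	st := &Storage{rws: &WriteStorage{queues: map[string]*watcher{}}}
	st.ApplyConfig([]string{"a", "b"})
	st.Notify()
	fmt.Println("notifications seen by a:", st.rws.queues["a"].notified)
}
```

Keeping the lock and the loop together in `WriteStorage` means the outer type does not need to know which inner fields the mutex protects.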

@bboreham
Member

…with Storage.ApplyConfig()

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
@bboreham bboreham changed the title from "remote/storage.go: add a test to highlight a race condition" to "remote-write: fix race condition between ApplyConfig and Notify" on Nov 14, 2023
Member

@bboreham bboreham left a comment

LGTM

@roidelapluie roidelapluie merged commit c44cab4 into prometheus:main Nov 14, 2023
24 checks passed
@cstyan
Member

cstyan commented Nov 16, 2023

@machine424 thank you!

@mknapphrt
Contributor

Are there plans to add this patch to the 2.45 LTS version? We've been dealing with sporadic crashes and were hoping this would eventually get included.

@cstyan
Member

cstyan commented Apr 15, 2024

I'm not aware of how the LTS releases are being updated. @roidelapluie @jesusvazquez, is there a backport process?

@mknapphrt
Contributor

Gentle bump on this @roidelapluie @jesusvazquez

@bboreham
Member

Prometheus LTS is described here: https://prometheus.io/docs/introduction/release-cycle/

The process to backport is the same as for a regular release. See for instance #13450, #13779.
However @roidelapluie has done all previous LTS releases.

@bboreham bboreham mentioned this pull request Apr 30, 2024
roidelapluie pushed a commit to roidelapluie/prometheus that referenced this pull request Apr 30, 2024
roidelapluie pushed a commit to roidelapluie/prometheus that referenced this pull request Apr 30, 2024
Development

Successfully merging this pull request may close these issues.

concurrent map iteration and map write in remote storage