Handle in-progress scrapes more gracefully on config reload #2336

Closed
SuperQ opened this Issue Jan 11, 2017 · 6 comments

SuperQ commented Jan 11, 2017

What did you do?
Reloaded (HUP) the Prometheus server to add new rules. In-flight scrapes get canceled with "context canceled", which makes the up metric bounce to 0.

What did you expect to see?
The reload is graceful: it waits for in-flight scrapes to complete or time out before continuing.

What did you see instead? Under which circumstances?
The Prometheus server stops scheduling new scrapes, waits for the in-flight scrapes to complete or time out, then reloads the config.


SuperQ commented Jan 11, 2017

Another possible option would be to have a reload-rules action that doesn't update the targets, but only reloads the rule evaluation.


brian-brazil commented Jan 11, 2017

> The Prometheus server stops scheduling new scrapes, waits for the in-flight scrapes to complete or time out, then reloads the config.

Stopping all processing for potentially minutes isn't something we should do imho.


fabxc commented Jan 13, 2017

> Another possible option would be to have a reload-rules action that doesn't update the targets, but only reloads the rule evaluation.

That doesn't need a separate action, just a check of whether the configuration changed or not. Sounds useful in general. Want to take a stab at implementing that?
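
As a sketch of that idea, the reload path could compare the old and new scrape sections and leave running scrapes alone when they are equal. The types below are simplified and hypothetical (they are not Prometheus's real config structs), and reflect.DeepEqual is just one possible way to do the comparison.

```go
package main

import (
	"fmt"
	"reflect"
)

// scrapeConfig and config are hypothetical, simplified types.
type scrapeConfig struct {
	JobName string
	Targets []string
}

type config struct {
	RuleFiles     []string
	ScrapeConfigs []scrapeConfig
}

// scrapeConfigChanged reports whether the scrape section differs between
// the current and the next configuration.
func scrapeConfigChanged(current, next config) bool {
	return !reflect.DeepEqual(current.ScrapeConfigs, next.ScrapeConfigs)
}

func main() {
	current := config{
		RuleFiles: []string{"a.rules"},
		ScrapeConfigs: []scrapeConfig{
			{JobName: "node", Targets: []string{"localhost:9100"}},
		},
	}

	// Only the rule files change, so in-flight scrapes could be left alone.
	next := current
	next.RuleFiles = []string{"a.rules", "b.rules"}

	fmt.Println(scrapeConfigChanged(current, next)) // prints false
}
```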


brian-brazil commented Jan 13, 2017

In the past, a HUP has been used to work around various cases of SD getting stuck. I'm not sure we should lose that.

brian-brazil changed the title from "Gracefully wait for scrapes to complete on config reload" to "Handle in-progress scrapes more gracefully on config reload" on Jul 14, 2017


brian-brazil commented Jul 14, 2017

Closing in favour of #2756

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators on Mar 23, 2019
