Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Attempt to retry retryable errors #79161

Merged
merged 4 commits into from
Dec 6, 2023

Conversation

gotjosh
Copy link
Contributor

@gotjosh gotjosh commented Dec 6, 2023

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition, we skip retrying. We have added a constant backoff of 1s in between retries.

Once this is merged, I'll make sure to update:

With what we should do next.

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
@gotjosh gotjosh added add to changelog product-approved Pull requests that are approved by product/managers and are allowed to be backported backport v9.4.x Mark PR for automatic backport to v9.4.x backport v9.5.x Bot will automatically open backport PR backport v10.0.x backport v10.1.x backport v10.2.x labels Dec 6, 2023
@grafana-delivery-bot grafana-delivery-bot bot added this to the 10.3.x milestone Dec 6, 2023
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
@gotjosh gotjosh marked this pull request as ready for review December 6, 2023 20:25
@gotjosh gotjosh requested a review from a team as a code owner December 6, 2023 20:25
@gotjosh gotjosh requested review from rwwiv, JacobsonMT, yuri-tceretian and grobinson-grafana and removed request for a team December 6, 2023 20:25
Copy link
Contributor

@yuri-tceretian yuri-tceretian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: gotjosh <josue.abreu@gmail.com>
@gotjosh gotjosh merged commit c631261 into main Dec 6, 2023
13 checks passed
@gotjosh gotjosh deleted the attempt-to-retry-errors-take-2 branch December 6, 2023 20:45

This comment was marked as resolved.

@grafana-delivery-bot grafana-delivery-bot bot added the backport-failed Failed to generate backport PR. Please resolve conflicts and create one manually. label Dec 6, 2023

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

grafana-delivery-bot bot pushed a commit that referenced this pull request Dec 6, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 6, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 6, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
Signed-off-by: gotjosh <josue.abreu@gmail.com>
gotjosh added a commit that referenced this pull request Dec 6, 2023
Alerting: Attempt to retry retryable errors (#79161)

* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------


(cherry picked from commit c631261)

Signed-off-by: gotjosh <josue.abreu@gmail.com>
gotjosh added a commit that referenced this pull request Dec 7, 2023
Alerting: Attempt to retry retryable errors (#79161)

* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)

Co-authored-by: gotjosh <josue.abreu@gmail.com>
gotjosh added a commit that referenced this pull request Dec 7, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 7, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
Signed-off-by: gotjosh <josue.abreu@gmail.com>
gotjosh added a commit that referenced this pull request Dec 7, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 7, 2023
* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 7, 2023
Alerting: Attempt to retry retryable errors (#79161)

* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
gotjosh added a commit that referenced this pull request Dec 7, 2023
Alerting: Attempt to retry retryable errors (#79161)

* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------


(cherry picked from commit c631261)

Signed-off-by: gotjosh <josue.abreu@gmail.com>
gotjosh added a commit that referenced this pull request Dec 7, 2023
Alerting: Attempt to retry retryable errors (#79161)

* Alerting: Attempt to retry retryable errors

Retrying has been broken for a good while now (at least since version 9.4) - this change attempts to re-introduce them in their simplest and safest form possible.

I first introduced #79095 to make sure we don't disrupt or put additional load on our customer's data sources with this change in a patch release. Paired with this change, retries can now work as expected.

There's two small differences between how retries work now and how they used to work in legacy alerting.

Retries only occur for valid alert definitions - if we suspect that that error comes from a malformed alert definition we skip retrying.
We have added a constant backoff of 1s in between retries.

---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
(cherry picked from commit c631261)
@aangelisc aangelisc modified the milestones: 10.3.x, 10.2.3 Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add to changelog area/backend backport v9.4.x Mark PR for automatic backport to v9.4.x backport v9.5.x Bot will automatically open backport PR backport v10.0.x backport v10.1.x backport v10.2.x backport-failed Failed to generate backport PR. Please resolve conflicts and create one manually. product-approved Pull requests that are approved by product/managers and are allowed to be backported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants