Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up job for old annotations. #8224

Closed
sanchitraizada opened this issue Apr 26, 2017 · 13 comments · Fixed by #26156
Closed

Clean up job for old annotations. #8224

sanchitraizada opened this issue Apr 26, 2017 · 13 comments · Fixed by #26156
Labels
area/alerting/notifications Issues when sending alert notifications area/alerting Grafana Alerting area/backend type/feature-request
Milestone

Comments

@sanchitraizada
Copy link
Contributor

Alert execution engine is one of the useful functionality that would potentially be used extensively. The "events" are persisted as annotations. This can potentially grow larger as the adoption increases. Do we see any potential issues with regards to retention/maintenance etc?

Just opening a placeholder issues to discuss and brainstorm on this.

@torkelo
Copy link
Member

torkelo commented Apr 26, 2017

yes, there needs to be some retention config & cleanup scheduling

@torkelo torkelo added area/alerting Grafana Alerting area/alerting/notifications Issues when sending alert notifications labels Apr 26, 2017
@bergquist
Copy link
Contributor

This should be implemented the same way as #9671
But instead of keeping X versions we should delete annotations older than X days.

But I don't think its as easy to find a good default value. So I suggest
this feature should be disabled by default and admins can choose to enable it.

@bergquist bergquist changed the title Data retention for State changes from Alert execution engine Clean up job for old annotations. Dec 18, 2017
@bergquist bergquist added this to Ready For Implementation in Issue preparations Dec 18, 2017
@torkelo
Copy link
Member

torkelo commented Dec 18, 2017

But instead of keeping X versions we should delete annotations older than X days.

That sounds like a good plan. Tricky with a good default value

@bergquist
Copy link
Contributor

Thinking a little bit more about this issue. We need to solve two problems.

  1. Cleaning up alert annotations. Which can be solved the same way as we solved dashboard versions.
  2. Cleaning up other annotations from the annotation table. Which can be solved be deleting all annotations Where type = '' and older than X days.

Both should be triggered from https://github.com/grafana/grafana/blob/master/pkg/services/cleanup/cleanup.go#L48

@FUSAKLA
Copy link
Contributor

FUSAKLA commented Mar 21, 2018

This would be extremely helpful since I just found out that my annotation table in MySQL has 160k entries.

most of them are those second mentioned without type

+-------+----------+
| type  | count(*) |
+-------+----------+
|       |   159692 |
| alert |     7682 |
+-------+----------+

Thank's for looking into this!

@kylebrandt
Copy link
Contributor

kylebrandt commented Mar 20, 2020

Notes on what I think we need for complete annotations cleanup:

Annotations Cleanup Feature:

  • AND/OR
    • Cleanup by End Time (age)
    • Cleanup by Total Annotation Count (limit)
  • Cleanup by Type (Human, Alert, API?)
  • Cleanup of Annotation tags?
Schedule Trigger (with job lock):
    Loop Annotation Types:
        If count of older than EndTime OR count greater than N;
            Loop:
                Delete up to batch size
                If oldest newer than EndTime AND count less than N;
                    break;
                sleep;
                if LoopCount > 1 {
                    log if count grows faster than batch size
                }

Job getting counts for cleanup can also record counts as metrics.

Other notes:

  • Too many total annotations should not delete Human created annotations without administrative configuration to do so.

@bergquist
Copy link
Contributor

bergquist commented Mar 22, 2020

@kylebrandt Good, write up.

I wonder if we can slim down the requirements for this feature to only delete any annotations if they are older the X (off by default). I can see many different configurable ways to delete annotations

  • Delete All
  • Delete annotations created by humans
  • Delete all annotations not created by humans
  • Delete alerts annotations
  • Delete all none alert annotations
  • Delete all annotations matching certain tags

I'm not sure we want to provide configuration for all those options. I know I contradict my previous comments but thinking more about this we need so many different ways to decide what to delete to make it a good feature.

Delete jobs like this is very scriptable in case someone have special requirements.

@marefr
Copy link
Member

marefr commented Mar 23, 2020

I wonder if we can slim down the requirements for this feature to only delete any annotations if they are older the X (off by default). I can see many different configurable ways to delete annotations

I tend to agree. Still, alert annotations are always created by the system and would make sense to be able to automatically delete those older than X. I think manually created annotations are being used to mark for example outages and could be important to keep those intact even though their older than X days.

My suggestion would be to make a difference between alert and non-alert annotations.

Delete jobs like this is very scriptable in case someone have special requirements.

Scripting in database yes, but currently our API's are not very convenient to use since you first need to figure out which annotation identifiers you want to delete. Ref https://grafana.com/docs/grafana/latest/http_api/annotations/#delete-annotation-by-id

@kylebrandt
Copy link
Contributor

I think greater than count in addition to time could be important for keeping hosted grafana stable.

@bergquist
Copy link
Contributor

The code complexity for dealing with all those filters will be fairly small.

I'm more worried about creating too many new settings that can confuse the user.

@bergquist
Copy link
Contributor

bergquist commented Jun 1, 2020

Based on what we said so far. I think that the cleanup job should support both max-age and max count but require the administrator to opt-in for each kind of annotation. Ex Alert/API/Human.

API meaning that there is no dashboard_id associated with the annotations.
Human meaning that the annotations have an associated dashboard_id.

I'm not sure those are good descriptions of what kind of annotation they would include in the cleanup but I don’t have a better suggestion.

The settings could look something like this:

[annotations]
max_annotations_to_keep = 50000 # 0 meaning disabled
max_age = 50d # 0 meaning disabled
prune_alert_annotations = true # false by default
prune_api_annotations = true # false by default
prune_human_made_annotations = true # false by default

I’m not sure how we should deal with tags since it’s a global table and not only related to annotations. To safely delete items from that table we need to check if it’s used in any other places. I think this problem we should solve separately.

@bergquist bergquist added this to the 7.1 milestone Jun 1, 2020
@marefr
Copy link
Member

marefr commented Jun 10, 2020

@bergquist given we have three kinds of annotations or two depending how you see it:

  • Annotations connected to a dashboard
  • Annotations connected to tag(s).
    • Under the impression if you have a dashboard with annotations by tag, adding a new annotation to a panel in that dashboard will not connect it to the dashboard, but I may be wrong?
  • Annotations connected to alerts

Was thinking something along these lines with a section per type.

[annotations.dashboard]
max_annotations_to_keep = 50000 # 0 meaning disabled
max_age = 50d # 0 meaning disabled

[annotations.tags]
max_annotations_to_keep = 50000 # 0 meaning disabled
max_age = 50d # 0 meaning disabled

[annotations.alerting]
max_annotations_to_keep = 50000 # 0 meaning disabled
max_age = 50d # 0 meaning disabled

Maybe make more sense to move the alerting annotation settings to alerting settings instead? e.g.

[alerting.annotations]
max_annotations_to_keep = 50000 # 0 meaning disabled
max_age = 50d # 0 meaning disabled
; maybe max annotations per alert to keep make more sense in alerting context? For example the state history in UI only shows a maximum of 50 annotations/state changes.
max_annotations_per_alert_to_keep = 10 # 0 meaning disabled

@bergquist
Copy link
Contributor

adding a new annotation to a panel in that dashboard will not connect it to the dashboard, but I may be wrong?

Adding annotations from a dashboard will associate it with the dashboard. Even if you use tags.
Annotations based on tags in other dashboards will still be able to pick them up.

Annotations without dashboard id can only be created using the API.

Otherwise, I like your suggestion for dividing it up.

Moving alert annotations to a separate section seems like a good idea. They should be stored somewhere else anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/alerting/notifications Issues when sending alert notifications area/alerting Grafana Alerting area/backend type/feature-request
Projects
No open projects
Issue preparations
  
Ready For Implementation
6 participants