cmd/bosun: (wip) link previous incidents and next incident #2323

Merged
merged 2 commits into from Oct 3, 2018

@kylebrandt
Member

kylebrandt commented Sep 28, 2018

Requires a migration, since every IncidentState object is updated.
With this change, when viewing an incident you can quickly navigate to previous incidents,
or to the next incident if a newer one exists for the same alert key.

@kylebrandt

Member

kylebrandt commented Sep 28, 2018

[Screenshot: selection_073]

@kylebrandt

Member

kylebrandt commented Sep 28, 2018

TODO:

  • Document new Template Vars available
  • Add previous incident links to UI in dashboard
  • Idea: Add a template func to fetch other incident states, so you can include things like the last close message, notes, etc. from the previous incident.
  • Make it so the rule page will fetch actual incident history if your testing number matches a real incident.
@kylebrandt

Member

kylebrandt commented Sep 28, 2018

example for the template func:

        <h3>Previous Ack/Close Reasons</h3>
        
        <table>
        <tr>
            <th>Action</th>
            <th>Incident</th>
            <th>Who</th>
            <th>When</th>
            <th>Why</th>
        </tr>
        {{- range $id := .PreviousIds -}}
            {{- $pi := $.GetIncidentState $id -}}
            {{- if notNil $pi -}}
                {{- range $action := $pi.Actions -}}
                    {{- $actionString := $action.Type | printf "%s" -}}
                    {{- $closed := eq $actionString "Closed" -}}
                    {{- $ackd := eq $actionString "Acknowledged" -}}
                    {{- if and (or $closed $ackd) (ne $action.User "bosun") }}
                        <tr>
                            <td>{{ $action.Type | printf "%s" }}</td>
                            <td><a target="_blank" href="https://bosun.ds.stackexchange.com/incident?id={{$pi.Id}}">#{{$pi.Id}}</a></td>
                            <td>{{ $action.User }}</td>
                            <td>{{ $action.Time.Format "2006-01-02 15:04" }}</td>
                            <td>{{ $action.Message }}</td>
                        </tr>
                    {{- end -}}
                {{- end -}}
            {{- else -}}
                <tr><td>{{ $.LastError }}</td><td></td><td></td><td></td><td></td></tr>
            {{- end -}}
        {{- end -}}
        </table>
@kylebrandt

Member

kylebrandt commented Sep 29, 2018

Fancier table with col/rowspan (note to self: if making into example check for nil on incident End time just in case):

        {{template "header" . }}
        <style>
            td, th {
                padding-right: 10px;
                padding-left: 2px;
                border: 1px solid black;
            }
        </style>
        <h3>Previous Incidents with Ack/Close/Note Actions</h3>
        
        <table>
        <thead>
            <tr>
                <th colspan="3">Incident</th>
                <th colspan="4">Actions</th>
            </tr>
            <tr>
                <th>Id</th>
                <th>Duration</th>
                <th>Event Count</th>
                <th>Who</th>
                <th>Action</th>
                <th>When</th>
                <th>Message</th>
            </tr>
        </thead>
        {{- range $id := .PreviousIds -}}
            {{- $pi := $.GetIncidentState $id -}}
            {{- if notNil $pi -}}
                {{- $filteredActions := makeSlice -}}
                {{- $incidentDuration := $pi.End.Sub $pi.Start -}}
                {{- range $action := $pi.Actions -}}
                    {{- $actionString := $action.Type.String -}}
                    {{- $closed := eq $actionString "Closed" -}}
                    {{- $ackd := eq $actionString "Acknowledged" -}}
                    {{- $note := eq $actionString "Note" -}}
                    {{- if or $closed $ackd $note }}
                        {{- $filteredActions = append $filteredActions $action -}}
                    {{- end -}}
                {{- end -}}
                
                <tr>
                    {{ $actionLen := len $filteredActions }}
                    <td {{ if gt $actionLen 1 -}} rowspan="{{- len $filteredActions }}" {{- end -}}>
                        <a target="_blank" href="https://bosun.ds.stackexchange.com/incident?id={{$pi.Id}}">#{{$pi.Id }}</a>    
                    </td>
                    <td {{ if gt $actionLen 1 -}} rowspan="{{- len $filteredActions }}" {{- end -}}>
                        {{ $incidentDuration.Truncate 1e9 }}
                    </td>
                    <td {{ if gt $actionLen 1 -}} rowspan="{{- len $filteredActions }}" {{- end -}}>
                        {{ len $pi.Events }}
                    </td>
                    {{- range $ia, $action := $filteredActions -}}
                        {{- if gt $ia 0 -}}</tr><tr>{{ end }}
                        <td>{{ $action.User }}</td>
                        <td>{{ $action.Type | printf "%s" }}</td>
                        <td>{{ $action.Time.Format "2006-01-02 15:04" }}</td>
                        <td>{{ $action.Message }}</td>
                    {{- end -}}
                </tr>
                
            {{- else -}}
                <tr><td colspan="7">{{ $.LastError }}</td></tr>
            {{- end -}}
        {{- end -}}
        </table>
slog.Infoln("Running population of previous incidents. This can take several minutes.")
// Hacky Work better?
ids, err := d.getAllIncidentIdsByKeys()

@captncraig

captncraig Oct 1, 2018

Contributor

Possibly collect Alert Keys up front, and run inner loop once per alert key, not once per incident.
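
One way to read this suggestion, as a rough sketch rather than the code that was merged: group the incidents by alert key first, then link each group in one pass. The `Incident` type, the `NextId` field name, and the ascending order of `PreviousIds` are assumptions made for illustration; only `PreviousIds` and the alert key concept come from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// Incident is a hypothetical, trimmed-down stand-in for bosun's IncidentState.
type Incident struct {
	Id          int64
	AlertKey    string
	PreviousIds []int64
	NextId      int64 // assumed field name; 0 means there is no newer incident
}

// linkByAlertKey groups incidents by alert key up front, then links each
// group in a single pass, so the grouping work happens once per alert key
// rather than once per incident.
func linkByAlertKey(incidents []*Incident) {
	byKey := make(map[string][]*Incident)
	for _, inc := range incidents {
		byKey[inc.AlertKey] = append(byKey[inc.AlertKey], inc)
	}
	for _, group := range byKey {
		// Order each group by incident id so "previous" means "lower id".
		sort.Slice(group, func(i, j int) bool { return group[i].Id < group[j].Id })
		for i, inc := range group {
			inc.PreviousIds = make([]int64, 0, i)
			for _, prev := range group[:i] {
				inc.PreviousIds = append(inc.PreviousIds, prev.Id)
			}
			inc.NextId = 0
			if i+1 < len(group) {
				inc.NextId = group[i+1].Id
			}
		}
	}
}

func main() {
	incidents := []*Incident{
		{Id: 3, AlertKey: "cpu{host=a}"},
		{Id: 1, AlertKey: "cpu{host=a}"},
		{Id: 2, AlertKey: "cpu{host=b}"},
	}
	linkByAlertKey(incidents)
	for _, inc := range incidents {
		fmt.Println(inc.Id, inc.PreviousIds, inc.NextId)
	}
}
```

This holds every incident in memory at once, which may or may not be acceptable for a large redis/ledis store.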

return err
}
previousIds := []int64{}
previousIds, err = d.State().GetAllIncidentIdsByAlertKey(incident.AlertKey)

@captncraig

captncraig Oct 1, 2018

Contributor

At least cache these results by alert key.
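
A minimal sketch of what caching by alert key could look like, assuming the lookup can be wrapped as a plain function. `idLookup` and `cachedLookup` are hypothetical names; the real `GetAllIncidentIdsByAlertKey` takes bosun's alert key type, not a string.

```go
package main

import "fmt"

// idLookup is a hypothetical stand-in for a data-access call such as
// GetAllIncidentIdsByAlertKey; the real signature may differ.
type idLookup func(alertKey string) ([]int64, error)

// cachedLookup wraps lookup so that each alert key reaches the backing
// store at most once. It is not safe for concurrent use, and callers must
// not modify the returned slice because it is shared with the cache.
func cachedLookup(lookup idLookup) idLookup {
	cache := make(map[string][]int64)
	return func(alertKey string) ([]int64, error) {
		if ids, ok := cache[alertKey]; ok {
			return ids, nil
		}
		ids, err := lookup(alertKey)
		if err != nil {
			return nil, err // errors are not cached, so a later retry can succeed
		}
		cache[alertKey] = ids
		return ids, nil
	}
}

func main() {
	calls := 0
	slow := func(alertKey string) ([]int64, error) {
		calls++ // count how often the backing store is hit
		return []int64{1, 2, 3}, nil
	}
	lookup := cachedLookup(slow)
	for i := 0; i < 3; i++ {
		if _, err := lookup("cpu{host=a}"); err != nil {
			fmt.Println("lookup failed:", err)
			return
		}
	}
	fmt.Println("backing store calls:", calls)
}
```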

}
for _, id := range previousIds {
if incident.Id > id {
incident.PreviousIds = append(incident.PreviousIds, id)

@captncraig

captncraig Oct 1, 2018

Contributor

Wait, we're storing all previous alert keys on every alert? Gross. Why?

@kylebrandt

kylebrandt Oct 1, 2018

Member

@captncraig To make it easier to either:

  1. Display more than just the previous incident in a view
  2. Iterate over all previous incidents in templates without having to follow a linked list.

@kylebrandt

kylebrandt Oct 1, 2018

Member

Also, to clarify: these are not the previous alert keys, just an array of incident ID numbers (int64).

@kylebrandt

kylebrandt Oct 1, 2018

Member

Another also :P Alert keys are the alert name plus the tagset (not all alerts under the name, unless the tagset is {} or of len 1). So the list generally should not be crazy long.

@captncraig

captncraig Oct 2, 2018

Contributor

I guess I'm just uncomfortable anytime I see potentially large, and also redundant, and also repetitive collections or lists.

I guess though I'd rather a template could show them all without having to recursively traverse the chain, which was gonna be the alternative I suggested. So ok.
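
For comparison, here is a sketch of the alternative discussed above, where each incident stores only its immediate predecessor and the history is rebuilt by walking the chain. The `linkedIncident` type, its `PreviousId` field, and the in-memory `store` map are hypothetical and only illustrate the trade-off.

```go
package main

import "fmt"

// linkedIncident is hypothetical: it models the alternative design where
// each incident stores only its immediate predecessor.
type linkedIncident struct {
	Id         int64
	PreviousId int64 // 0 means there is no earlier incident
}

// walkChain follows PreviousId links from start and returns every earlier
// incident id, newest first. It needs one store lookup per hop and stops on
// a missing incident or a cycle.
func walkChain(store map[int64]linkedIncident, start int64) []int64 {
	var ids []int64
	seen := map[int64]bool{start: true}
	cur, ok := store[start]
	for ok && cur.PreviousId != 0 && !seen[cur.PreviousId] {
		ids = append(ids, cur.PreviousId)
		seen[cur.PreviousId] = true
		cur, ok = store[cur.PreviousId]
	}
	return ids
}

func main() {
	store := map[int64]linkedIncident{
		1: {Id: 1},
		3: {Id: 3, PreviousId: 1},
		5: {Id: 5, PreviousId: 3},
	}
	fmt.Println(walkChain(store, 5))
}
```

With a stored `PreviousIds` slice, a template can range over the ids directly, at the cost of repeating the list on every incident.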

kylebrandt added some commits Oct 3, 2018

cmd/bosun: MIGRATION: Link previous incidents and next incident
Back up your redis db before this migration. It updates all incidents in redis/ledis, and bosun will not start until this one-time operation is complete. Migrating ~100k incidents took about two minutes with redis on a development workstation.

Add GetIncidentState template function, as well as PreviousIds field to
templates.
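
As a rough illustration of how a `PreviousIds` field and a `GetIncidentState` method can be used together from a Go template, here is a self-contained sketch built on the standard `text/template` package. The `templateContext` and `incidentState` types are hypothetical stand-ins, not bosun's real types, and a plain `if` replaces bosun's `notNil` helper.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// incidentState is a hypothetical, minimal stand-in for bosun's IncidentState.
type incidentState struct {
	Id      int64
	Subject string
}

// templateContext is a hypothetical stand-in for the template context.
type templateContext struct {
	PreviousIds []int64
	states      map[int64]*incidentState
}

// GetIncidentState returns the stored state, or nil if the id is unknown.
func (c *templateContext) GetIncidentState(id int64) *incidentState {
	return c.states[id]
}

// Inside range, dot is the current id, so the context is reached through $.
const tmplText = `{{range $id := .PreviousIds}}{{$pi := $.GetIncidentState $id}}{{if $pi}}#{{$pi.Id}} {{$pi.Subject}}{{else}}#{{$id}} not found{{end}};{{end}}`

// render executes the template against c and returns the result as a string.
func render(c *templateContext) (string, error) {
	t, err := template.New("history").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, c); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	c := &templateContext{
		PreviousIds: []int64{1, 2},
		states:      map[int64]*incidentState{1: {Id: 1, Subject: "high cpu"}},
	}
	out, err := render(c)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```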

@kylebrandt kylebrandt merged commit 10cf57e into master Oct 3, 2018

3 checks passed

bosun All checks Passed!
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed