Opsgenie team responders support #43281

Open
wants to merge 4 commits into
base: master


carloscastrojumo
Contributor

@carloscastrojumo carloscastrojumo commented Jun 20, 2024

Currently, when the Opsgenie integration creates an Opsgenie alert, responders can only be of type schedule.
However, there are situations where responders of type team are required, because a team takes multiple schedules into consideration, such as escalation schedules.

Instead of changing the teleport.dev/notify-services annotation to support teams by default, I propose adding a new annotation, teleport.dev/teams, that contains the respective teams. The responders payload will then be a merge of both annotations.

Teleport role:

kind: role
metadata:
  name: prod-write-teaccess-request
spec:
  allow:
    request:
      annotations:
        teleport.dev/schedules:
        - MySchedule
        teleport.dev/teams:
        - MyTeam
      roles:
      - prod-write-role

will be converted into the following Opsgenie payload:

{
  "message": "Access request from ...",
  "alias": "....",
  "description": "....",
  "responders": [
    {"name": "MySchedule", "type": "schedule"},
    {"name": "MyTeam", "type": "team"}
  ],
  "priority": "P2"
}
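The merge of the two annotations into one responders list can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the `Responder` type and the `buildResponders` helper are hypothetical names, and the ordering (schedules first, then teams) is an assumption taken from the example payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Annotation keys as they appear in the role example.
const (
	schedulesAnnotation = "teleport.dev/schedules"
	teamsAnnotation     = "teleport.dev/teams"
)

// Responder mirrors one entry of the "responders" array in the payload.
// Hypothetical type, introduced only for this sketch.
type Responder struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

// buildResponders (hypothetical helper) merges both annotations into a
// single list: schedules first, then teams.
func buildResponders(annotations map[string][]string) []Responder {
	schedules := annotations[schedulesAnnotation]
	teams := annotations[teamsAnnotation]

	responders := make([]Responder, 0, len(schedules)+len(teams))
	for _, name := range schedules {
		responders = append(responders, Responder{Name: name, Type: "schedule"})
	}
	for _, name := range teams {
		responders = append(responders, Responder{Name: name, Type: "team"})
	}
	return responders
}

func main() {
	annotations := map[string][]string{
		schedulesAnnotation: {"MySchedule"},
		teamsAnnotation:     {"MyTeam"},
	}
	out, err := json.Marshal(buildResponders(annotations))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A role that sets only one of the two annotations simply yields responders of that single type, since ranging over a missing map key iterates zero times.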

Update

While adding the test in the suite, I found a couple of issues:

  1. There was a redundant for loop that iterated over all the schedules and created an alert for each schedule/team, as the log excerpt below shows.
teleport-cluster-auth-5d9dbd496-vmrzd teleport {"ei":0,"event":"access_request.create","uid":"82ead604-b481-4e32-8e3e-f89fdc9e2bab","code":"T5000I","time":"2024-06-28T10:02:48.596Z","cluster_name":"teleport.test-hopeful-cobra.eu-dev.awsmmcn.private","user":"carlos.castro@example.com","user_kind":1,"expires":"2024-06-29T15:42:55Z","roles":["admin-server-access"],"id":"01905e4c-2b8e-72a5-9089-0a8b37c70d5b","state":"PENDING","annotations":{"teleport.dev/notify-services":["Primary - Access","Teleport JIT Testing_schedule"],"teleport.dev/schedules":["Teleport JIT Testing_schedule"]},"resource_ids":[{"cluster":"teleport.test-hopeful-cobra.eu-dev.awsmmcn.private","kind":"node","name":"3a2239b2-9abf-429b-b152-ba2933cd8e9a"},{"cluster":"teleport.test-hopeful-cobra.eu-dev.awsmmcn.private","kind":"node","name":"409ec5d0-f965-49a9-bbb0-0c5a4496503b"}],"max_duration":"2024-06-29T15:42:55Z"}

teleport-cluster-auth-5d9dbd496-slgqc teleport {"caller":"opsgenie/app.go:345","component":null,"level":"info","message":"Successfully created Opsgenie alert","opsgenie_alert_id":"5523ad71-4b13-42e1-a77f-1e8dea8286a1-1719568968975","opsgenie_service_name":"Primary - Access","request_id":"01905e4c-2b8e-72a5-9089-0a8b37c70d5b","request_op":"put","request_state":"PENDING","timestamp":"2024-06-28T10:02:49Z"}

teleport-cluster-auth-5d9dbd496-slgqc teleport {"caller":"opsgenie/app.go:345","component":null,"level":"info","message":"Successfully created Opsgenie alert","opsgenie_alert_id":"5523ad71-4b13-42e1-a77f-1e8dea8286a1-1719568968975","opsgenie_service_name":"Teleport JIT Testing_schedule","request_id":"01905e4c-2b8e-72a5-9089-0a8b37c70d5b","request_op":"put","request_state":"PENDING","timestamp":"2024-06-28T10:02:49Z"}

Since the alias is the same for all alerts, Opsgenie does not treat them as unique alerts; it increments a count instead of creating duplicates (the message below is from an alert activity log). I think that is also why this was not caught by the tests.

Alert is received with same alias. Message: "Access request from carlos.castro@example.com". Count is increased to 2

However, since the alert request already contains all the schedules in the responders field, there is no need to create multiple alerts.
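To illustrate why the loop was redundant, the sketch below models the old behavior (one request per schedule, each carrying the full responders list and the same alias) next to a single request. All type and function names here are hypothetical, and counting distinct aliases is only a stand-in for the deduplication described in the activity log, not Opsgenie's actual implementation.

```go
package main

import "fmt"

// Responder and Alert are hypothetical types for this sketch only.
type Responder struct {
	Name string
	Type string
}

type Alert struct {
	Alias      string
	Responders []Responder
}

// alertsPerResponder models the redundant loop: one request per responder,
// every request carrying the same alias and the complete responders list.
func alertsPerResponder(alias string, responders []Responder) []Alert {
	alerts := make([]Alert, 0, len(responders))
	for range responders {
		alerts = append(alerts, Alert{Alias: alias, Responders: responders})
	}
	return alerts
}

// singleAlert models the fixed behavior: one request with all responders.
func singleAlert(alias string, responders []Responder) Alert {
	return Alert{Alias: alias, Responders: responders}
}

// distinctAliases counts unique aliases among the requests, as a rough
// model of how many separate alerts would result.
func distinctAliases(alerts []Alert) int {
	seen := make(map[string]struct{})
	for _, a := range alerts {
		seen[a.Alias] = struct{}{}
	}
	return len(seen)
}

func main() {
	responders := []Responder{
		{Name: "Primary - Access", Type: "schedule"},
		{Name: "Teleport JIT Testing_schedule", Type: "schedule"},
	}
	old := alertsPerResponder("request-alias", responders)
	fmt.Println(len(old), "requests,", distinctAliases(old), "distinct alias")
	fmt.Println(len(singleAlert("request-alias", responders).Responders), "responders in one request")
}
```

Under these assumptions both approaches notify the same responders; the loop only adds extra requests that increase the count on the existing alert.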

  2. The alert payload was using the schedule ID, but since the annotations actually contain the name, the alert was created with no responders because the ID (which was really the name) did not exist. Using Name instead fixes the issue.
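The difference between the two payload shapes in the second issue can be sketched as follows. This assumes a responder entry may carry either an `id` or a `name` key; the `Responder` struct and the `marshalResponder` helper are hypothetical and not taken from the plugin's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Responder is a hypothetical struct for this sketch. With omitempty,
// only the identifier field that is actually set ends up in the JSON.
type Responder struct {
	ID   string `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
	Type string `json:"type"`
}

// marshalResponder (hypothetical helper) renders one responder entry.
func marshalResponder(r Responder) string {
	out, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// Before: the annotation value (a name) was sent in the ID field,
	// so no schedule with that ID could be matched.
	fmt.Println(marshalResponder(Responder{ID: "MySchedule", Type: "schedule"}))

	// After: the annotation value is sent in the Name field.
	fmt.Println(marshalResponder(Responder{Name: "MySchedule", Type: "schedule"}))
}
```

The first line printed uses the `id` key and the second uses the `name` key, which matches the responders entries in the example payload of this PR.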

@r0mant r0mant requested review from EdwardDowling and removed request for jimbishopp and zmb3 June 20, 2024 17:58
@r0mant
Collaborator

r0mant commented Jun 20, 2024

@carloscastrojumo Thank you for the contribution! We will review and verify this PR and include it in a release if it passes the team's review.
