OOM/OOD notifications documentation #16581

@stirby

We are adding native notifications that alert users when a workspace is running low on memory or disk, so they can act before an out-of-memory (OOM) or out-of-disk (OOD) condition disconnects the agent.

This notification requires more configuration than most opt-in alerts, so we should document it for users before it is released at the start of March.

Resource alerting notifications allow template admins to set "high water mark" thresholds for memory and volume usage in the template's Terraform. When a workspace created from that template exceeds a threshold, the owner of the workspace is notified. To enable OOM/OOD notifications on a template, use the resources_monitoring[1] block on the coder_agent[2] resource in our Terraform provider. OOD alerts cover one or more volumes that you specify; OOM alerts are reported per agent.

Here's an example configuration that warns the user when memory usage exceeds 90%, or when disk usage exceeds 80% on /volume1 or 95% on /volume2:

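# Assumes data "coder_provisioner" "dev" is declared elsewhere in the template.
resource "coder_agent" "main" {
  arch = data.coder_provisioner.dev.arch
  os   = data.coder_provisioner.dev.os

  resources_monitoring {
    # OOM alert: reported per agent when memory usage exceeds 90%.
    memory {
      enabled   = true
      threshold = 90
    }

    # OOD alerts: one volume block per monitored path.
    volume {
      path      = "/volume1"
      enabled   = true
      threshold = 80
    }
    volume {
      path      = "/volume2"
      enabled   = true
      threshold = 95
    }
  }
}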
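
When a template monitors several volumes, the repeated volume blocks could be generated from a single map instead of being written out by hand. The following is a minimal sketch and an alternative to the resource above, not an addition to it. It assumes Terraform's standard dynamic block behaves for the provider's volume block as it does for other nested blocks. The variable name volume_thresholds and its default values are hypothetical and only mirror the example above.

```hcl
# Hypothetical variable: maps each monitored path to its alert threshold (percent).
variable "volume_thresholds" {
  type = map(number)
  default = {
    "/volume1" = 80
    "/volume2" = 95
  }
}

# Assumes data "coder_provisioner" "dev" is declared elsewhere in the template.
resource "coder_agent" "main" {
  arch = data.coder_provisioner.dev.arch
  os   = data.coder_provisioner.dev.os

  resources_monitoring {
    memory {
      enabled   = true
      threshold = 90
    }

    # Emits one volume block per entry in var.volume_thresholds.
    dynamic "volume" {
      for_each = var.volume_thresholds
      content {
        path      = volume.key
        enabled   = true
        threshold = volume.value
      }
    }
  }
}
```

Keeping the thresholds in one map makes it easier to add or retune a monitored path without touching the agent block itself.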

[1] https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent#resources_monitoring-1
[2] https://registry.terraform.io/providers/coder/coder/latest/docs/resources/agent
