Skip to content

Latest commit

 

History

History
934 lines (775 loc) · 24.6 KB

agent.mdx

File metadata and controls

934 lines (775 loc) · 24.6 KB
layout page_title description
api
Agent - HTTP API
The /agent endpoints interact with the local Nomad agent to interact with members and servers.

Agent HTTP API

The /agent endpoints are used to interact with the local Nomad agent.

List Members

This endpoint queries the agent for the known peers in the gossip pool. This endpoint is only applicable to servers. Due to the nature of gossip, this is eventually consistent.

Method Path Produces
GET /agent/members application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO node:read

Sample Request

$ curl \
    https://localhost:4646/v1/agent/members

Sample Response

{
  "ServerName": "bacon-mac",
  "ServerRegion": "global",
  "ServerDC": "dc1",
  "Members": [
    {
      "Name": "bacon-mac.global",
      "Addr": "127.0.0.1",
      "Port": 4648,
      "Tags": {
        "mvn": "1",
        "build": "0.5.5dev",
        "port": "4647",
        "bootstrap": "1",
        "role": "nomad",
        "region": "global",
        "dc": "dc1",
        "vsn": "1"
      },
      "Status": "alive",
      "ProtocolMin": 1,
      "ProtocolMax": 5,
      "ProtocolCur": 2,
      "DelegateMin": 2,
      "DelegateMax": 4,
      "DelegateCur": 4
    }
  ]
}

List Servers

This endpoint lists the known server nodes. The servers endpoint is used to query an agent in client mode for its list of known servers. Client nodes register themselves with these server addresses so that they may dequeue work. The servers endpoint can be used to keep this configuration up to date if there are changes in the cluster.

Method Path Produces
GET /agent/servers application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Sample Request

$ curl \
    https://localhost:4646/v1/agent/servers

Sample Response

["127.0.0.1:4647"]

Update Servers

This endpoint updates the list of known servers to the provided list. This replaces all previous server addresses with the new list.

Method Path Produces
POST /agent/servers (empty body)

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:write

Parameters

  • address (string: <required>) - Specifies the list of addresses in the format ip:port. This is specified as a query string.

Sample Request

$ curl \
    --request POST \
    "https://localhost:4646/v1/agent/servers?address=1.2.3.4:4647&address=5.6.7.8:4647"

Query Self

This endpoint queries the state of the target agent (self).

Method Path Produces
GET /agent/self application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Sample Request

$ curl \
    https://localhost:4646/v1/agent/self

Sample Response

{
  "config": {
    "Addresses": {
      "HTTP": "127.0.0.1",
      "RPC": "127.0.0.1",
      "Serf": "127.0.0.1"
    },
    "AdvertiseAddrs": {
      "HTTP": "127.0.0.1:4646",
      "RPC": "127.0.0.1:4647",
      "Serf": "127.0.0.1:4648"
    },
    "BindAddr": "127.0.0.1",
    "Client": {
      "AllocDir": "",
      "ChrootEnv": {},
      "ClientMaxPort": 14512,
      "ClientMinPort": 14000,
      "DisableRemoteExec": false,
      "Enabled": true,
      "GCDiskUsageThreshold": 99,
      "GCInodeUsageThreshold": 99,
      "GCInterval": 600000000000,
      "MaxKillTimeout": "30s",
      "Meta": {},
      "NetworkInterface": "lo0",
      "NetworkSpeed": 0,
      "NodeClass": "",
      "Options": {
        "driver.docker.volumes": "true"
      },
      "Reserved": {
        "CPU": 0,
        "DiskMB": 0,
        "MemoryMB": 0,
        "ParsedReservedPorts": null,
        "ReservedPorts": ""
      },
      "Servers": null,
      "StateDir": ""
    },
    "Consul": {
      "Addr": "",
      "Auth": "",
      "AutoAdvertise": true,
      "CAFile": "",
      "CertFile": "",
      "ChecksUseAdvertise": false,
      "ClientAutoJoin": true,
      "ClientServiceName": "nomad-client",
      "EnableSSL": false,
      "KeyFile": "",
      "ServerAutoJoin": true,
      "ServerServiceName": "nomad",
      "Timeout": 5000000000,
      "Token": "",
      "VerifySSL": false
    },
    "DataDir": "",
    "Datacenter": "dc1",
    "DevMode": true,
    "DisableAnonymousSignature": true,
    "DisableUpdateCheck": false,
    "EnableDebug": true,
    "EnableSyslog": false,
    "Files": null,
    "HTTPAPIResponseHeaders": {},
    "LeaveOnInt": false,
    "LeaveOnTerm": false,
    "LogLevel": "DEBUG",
    "NodeName": "",
    "Ports": {
      "HTTP": 4646,
      "RPC": 4647,
      "Serf": 4648
    },
    "Region": "global",
    "Revision": "f551dcb83e3ac144c9dbb90583b6e82d234662e9",
    "Server": {
      "BootstrapExpect": 0,
      "DataDir": "",
      "Enabled": true,
      "EnabledSchedulers": null,
      "HeartbeatGrace": "",
      "NodeGCThreshold": "",
      "NumSchedulers": 0,
      "ProtocolVersion": 0,
      "RejoinAfterLeave": false,
      "RetryInterval": "30s",
      "RetryJoin": [],
      "RetryMaxAttempts": 0,
      "StartJoin": []
    },
    "SyslogFacility": "LOCAL0",
    "TLSConfig": {
      "CAFile": "",
      "CertFile": "",
      "EnableHTTP": false,
      "EnableRPC": false,
      "KeyFile": "",
      "VerifyServerHostname": false
    },
    "Telemetry": {
      "CirconusAPIApp": "",
      "CirconusAPIToken": "",
      "CirconusAPIURL": "",
      "CirconusBrokerID": "",
      "CirconusBrokerSelectTag": "",
      "CirconusCheckDisplayName": "",
      "CirconusCheckForceMetricActivation": "",
      "CirconusCheckID": "",
      "CirconusCheckInstanceID": "",
      "CirconusCheckSearchTag": "",
      "CirconusCheckSubmissionURL": "",
      "CirconusCheckTags": "",
      "CirconusSubmissionInterval": "",
      "CollectionInterval": "1s",
      "DataDogAddr": "",
      "DataDogTags": [],
      "DisableHostname": false,
      "PublishAllocationMetrics": false,
      "PublishNodeMetrics": false,
      "StatsdAddr": "",
      "StatsiteAddr": "",
      "UseNodeName": false
    },
    "Vault": {
      "Addr": "https://vault.service.consul:8200",
      "AllowUnauthenticated": true,
      "ConnectionRetryIntv": 30000000000,
      "Enabled": null,
      "Role": "",
      "TLSCaFile": "",
      "TLSCaPath": "",
      "TLSCertFile": "",
      "TLSKeyFile": "",
      "TLSServerName": "",
      "TLSSkipVerify": null,
      "TaskTokenTTL": "",
      "Token": "root"
    },
    "Version": "0.5.5",
    "VersionPrerelease": "dev"
  },
  "member": {
    "Addr": "127.0.0.1",
    "DelegateCur": 4,
    "DelegateMax": 4,
    "DelegateMin": 2,
    "Name": "bacon-mac.global",
    "Port": 4648,
    "ProtocolCur": 2,
    "ProtocolMax": 5,
    "ProtocolMin": 1,
    "Status": "alive",
    "Tags": {
      "role": "nomad",
      "region": "global",
      "dc": "dc1",
      "vsn": "1",
      "mvn": "1",
      "build": "0.5.5dev",
      "port": "4647",
      "bootstrap": "1"
    }
  },
  "stats": {
    "runtime": {
      "cpu_count": "8",
      "kernel.name": "darwin",
      "arch": "amd64",
      "version": "go1.8",
      "max_procs": "7",
      "goroutines": "79"
    },
    "nomad": {
      "server": "true",
      "leader": "true",
      "leader_addr": "127.0.0.1:4647",
      "bootstrap": "false",
      "known_regions": "1"
    },
    "raft": {
      "num_peers": "0",
      "fsm_pending": "0",
      "last_snapshot_index": "0",
      "last_log_term": "2",
      "commit_index": "144",
      "term": "2",
      "last_log_index": "144",
      "snapshot_version_max": "1",
      "latest_configuration_index": "1",
      "latest_configuration": "[{Suffrage:Voter ID:127.0.0.1:4647 Address:127.0.0.1:4647}]",
      "last_contact": "never",
      "applied_index": "144",
      "state": "Leader",
      "last_snapshot_term": "0"
    },
    "client": {
      "heartbeat_ttl": "17.79568937s",
      "node_id": "fb2170a8-257d-3c64-b14d-bc06cc94e34c",
      "known_servers": "127.0.0.1:4647",
      "num_allocations": "0",
      "last_heartbeat": "10.107423052s"
    },
    "serf": {
      "event_time": "1",
      "event_queue": "0",
      "encrypted": "false",
      "member_time": "1",
      "query_time": "1",
      "intent_queue": "0",
      "query_queue": "0",
      "members": "1",
      "failed": "0",
      "left": "0",
      "health_score": "0"
    }
  }
}

Join Agent

This endpoint introduces a new member to the gossip pool. This endpoint is only eligible for servers.

Method Path Produces
POST /agent/join application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO none

Parameters

  • address (string: <required>) - Specifies the address to join in the ip:port format. This is provided as a query parameter and may be specified multiple times to join multiple servers.

Sample Request

$ curl \
    --request POST \
    "https://localhost:4646/v1/agent/join?address=1.2.3.4&address=5.6.7.8"

Sample Response

{
  "error": "",
  "num_joined": 2
}

Force Leave Agent

This endpoint forces a member of the gossip pool from the "failed" state to the "left" state. This allows the consensus protocol to remove the peer and stop attempting replication. This is only applicable for servers.

Method Path Produces
POST /agent/force-leave application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:write

Parameters

  • node (string: <required>) - Specifies the name of the node to force leave.
  • prune (boolean: <optional>) - Removes failed or left server from the Serf member list immediately. If member is actually still alive, it will eventually rejoin the cluster again.

Sample Request

$ curl \
    --request POST \
    https://localhost:4646/v1/agent/force-leave?node=client-ab2e23dc&prune=true

Health

This endpoint returns whether or not the agent is healthy. When using Consul it is the endpoint Nomad will register for its own health checks.

When the agent is unhealthy 500 will be returned along with JSON response containing an error message.

Method Path Produces
GET /agent/health application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO none

Sample Request

$ curl \
    https://localhost:4646/v1/agent/health

Sample Response

{
  "client": {
    "message": "ok",
    "ok": true
  },
  "server": {
    "message": "ok",
    "ok": true
  }
}

Host

This endpoint returns data about the agent's host environment from the perspective of the agent. It is included in the archive produced by nomad operator debug. Known sensitive environment variables are shown as <redacted>, but the response may still contain sensitive information.

Method Path Produces
GET /agent/host application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Parameters

  • server_id (string: <optional>) - Specify the server name for targeting.

  • node_id (string: <optional>) - Specify the client node id for targeting.

Sample Request

$ curl \
    https://localhost:4646/v1/agent/host?node_id=4bb9aca7-d43b-43fc-d604-3a271ef0a6c0

Sample Response

{
  "AgentID": "4bb9aca7-d43b-43fc-d604-3a271ef0a6c0",
  "HostData": {
    "OS": "x86_64 ip-172-31-92-147 4.15.0-1007-aws Linux #7-Ubuntu SMP Tue Apr 24 10:56:17 UTC 2018",
    "Network": [
      {
        "DialPacket": "\"udp4\" \"\"",
        "DialStream": "\"tcp4\" \"\"",
        "ListenPacket": "\"udp4\" \"\"",
        "ListenStream": "\"tcp4\" \"\"",
        "address": "127.0.0.1",
        "binary": "01111111000000000000000000000001",
        "broadcast": "127.255.255.255",
        "first_usable": "127.0.0.1",
        "hex": "7f000001",
        "host": "127.0.0.1",
        "last_usable": "127.255.255.254",
        "mask_bits": "8",
        "netmask": "255.0.0.0",
        "network": "127.0.0.0",
        "octets": "127 0 0 1",
        "port": "0",
        "size": "16777216",
        "string": "127.0.0.1/8",
        "type": "IPv4",
        "uint32": "2130706433"
      }
    ],
    "ResolvConf": "nameserver 172.17.0.1\nnameserver 8.8.8.8\n",
    "Hosts": "127.0.0.1 localhost\n\n# The following lines are desirable for IPv6 capable hosts\n::1 ip6-localhost ip6-loopback\nfe00::0 ip6-localnet\nff00::0 ip6-mcastprefix\nff02::1 ip6-allnodes\nff02::2 ip6-allrouters\nff02::3 ip6-allhosts\n127.0.0.1 ip-172-31-71-163\n127.0.0.1 ip-172-31-92-147\n",
    "Environment": {
      "INVOCATION_ID": "b106b6ac67764c9b9f85c5cc5c3357e5",
      "JOURNAL_STREAM": "9:109527",
      "LANG": "C.UTF-8",
      "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    },
    "Disk": {
      "/": {
        "DiskMB": 7876,
        "UsedMB": 4287
      }
    }
  }
}

Stream Logs

This endpoint streams logs from the local agent until the connection is closed

Method Path Produces
GET /agent/monitor application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Parameters

  • log_level (string: "info") - Specifies a text string containing a log level to filter on, such as info. Possible values include trace, debug, info, warn, error

  • log_json (bool: false) - Specifies if the log format for streamed logs should be JSON.

  • log_include_location (bool: false) - Specifies if the logs streamed should include file and line information.

  • node_id (string: "a57b2adb-1a30-2dda-8df0-25abb0881952") - Specifies a text string containing a node-id to target for streaming.

  • server_id (string: "server1.global") - Specifies a text string containing a server name or "leader" to target a specific remote server or leader for streaming.

  • plain (bool: false) - Specifies if the response should be JSON or plaintext

Sample Request

$ curl \
    "https://localhost:4646/v1/agent/monitor?log_level=debug&server_id=leader"

$ curl \
    "https://localhost:4646/v1/agent/monitor?log_level=debug&node_id=a57b2adb-1a30-2dda-8df0-25abb0881952"

Sample Response

{
  "Offset": 0,
  "Data": "NTMxOTMyCjUzMTkzMwo1MzE5MzQKNTMx..."
  "FileEvent": "log"
}

Field Reference

The return value is a stream of frames. These frames contain the following fields:

  • Data - A base64 encoding of the bytes being streamed.

  • FileEvent - An event that could cause a change in the streams position. The possible value for this endpoint is "log".

  • Offset - Offset is the offset into the stream.

Agent Runtime Profiles

This endpoint is the equivalent of Go's /debug/pprof endpoint but is protected by ACLs and supports remote forwarding to a client node or server. See the Golang documentation for a list of available profiles.

Method Path Produces
GET /agent/pprof/cmdline text/plain
GET /agent/pprof/profile application/octet-stream
GET /agent/pprof/trace application/octet-stream
GET /agent/pprof/<pprof profile> application/octet-stream

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:write

Default Behavior

This endpoint is enabled whenever ACLs are enabled. Due to the potentially sensitive nature of data contained in profiles, as well as their significant performance impact, the agent/pprof endpoint is protected by a high level ACL: agent:write. For these reasons its recommended to leave enable_debug unset and only use the ACL-protected endpoints.

The following table explains when each endpoint is available:

Endpoint enable_debug ACLs Available?
/v1/agent/pprof unset n/a no
/v1/agent/pprof true n/a yes
/v1/agent/pprof false n/a no
/v1/agent/pprof unset off no
/v1/agent/pprof unset on yes
/v1/agent/pprof true off yes
/v1/agent/pprof false on yes

Parameters

  • node_id (string: "a57b2adb-1a30-2dda-8df0-25abb0881952") - Specifies a text string containing a Node ID to target for profiling.

  • server_id (string: "server1.global") - Specifies a text string containing a Server ID, name, or leader to target a specific remote server or leader for profiling.

  • seconds (int: 3) - Specifies the amount of time to run a profile or trace request for.

  • debug (int: 0) - Specifies if a given pprof profile should be returned as human readable plain text instead of the pprof binary format. Defaults to 0, setting to 1 enables human readable plain text. For goroutine profiles, setting to 2 extends the plain text format with additional information helpful for debugging stalled goroutines.

Sample Request

$ curl -O -J \
    --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \
    "https://localhost:4646/v1/agent/pprof/goroutine?server_id=leader"

$ go tool pprof goroutine

$ curl -O -J \
    --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \
    "https://localhost:4646/v1/agent/pprof/profile?seconds=5&node_id=a57b2adb-1a30-2dda-8df0-25abb0881952"

$ go tool pprof profile

$ curl -O -J \
    --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \
    "https://localhost:4646/v1/agent/pprof/trace?&seconds=5&server_id=server1.global"

go tool trace trace

Fetch all scheduler worker's status

The /agent/schedulers endpoint allow Nomad operators to inspect the state of a Nomad server agent's scheduler workers.

Method Path Produces
GET /agent/schedulers application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Parameters

This endpoint accepts no additional parameters.

Sample Request

$ curl \
    https://localhost:4646/v1/agent/schedulers

Sample Response

{
  "schedulers": [
    {
      "enabled_schedulers": [
        "service",
        "batch",
        "system",
        "sysbatch",
        "_core"
      ],
      "id": "5669d6fa-0def-7369-6558-a47c35fdc675",
      "started": "2021-12-21T19:25:00.911883Z",
      "status": "Paused",
      "workload_status": "Paused"
    },
    {
      "enabled_schedulers": [
        "service",
        "batch",
        "system",
        "sysbatch",
        "_core"
      ],
      "id": "c919709d-6d14-66bf-b425-80b8167a267e",
      "started": "2021-12-21T19:25:00.91189Z",
      "status": "Paused",
      "workload_status": "Paused"
    },
    {
      "enabled_schedulers": [
        "service",
        "batch",
        "system",
        "sysbatch",
        "_core"
      ],
      "id": "f5edb69a-6122-be8f-b32a-23cd8511dba5",
      "started": "2021-12-21T19:25:00.911961Z",
      "status": "Paused",
      "workload_status": "Paused"
    },
    {
      "enabled_schedulers": [
        "service",
        "batch",
        "system",
        "sysbatch",
        "_core"
      ],
      "id": "458816ae-83cf-0710-d8d4-35d2ad2e42d7",
      "started": "2021-12-21T19:25:00.912119Z",
      "status": "Started",
      "workload_status": "WaitingToDequeue"
    }
  ],
  "server_id": "server1.global"
}

Read scheduler worker configuration

This endpoint returns data about the agent's scheduler configuration from the perspective of the agent. This is only applicable for servers.

Method Path Produces
GET /agent/schedulers/config application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:read

Parameters

This endpoint accepts no additional parameters.

Sample Request

$ curl \
    https://localhost:4646/v1/agent/schedulers/config

Sample Response

{
  "enabled_schedulers": [
    "service",
    "batch",
    "system",
    "sysbatch",
    "_core"
  ],
  "num_schedulers": 8,
  "server_id": "server1.global"
}

Update scheduler worker configuration

This allows a Nomad operator to modify the server's running scheduler configuration, which will remain in effect until another update or until the node is restarted. For durable changes to this value, set the corresponding values—num_schedulers and enabled_schedulers—in the node's configuration file. The response contains the configuration after attempting to apply the provided values. This is only applicable for servers.

Method Path Produces
PUT /agent/schedulers/config application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries ACL Required
NO agent:write

Sample Payload

{
  "enabled_schedulers": [
    "service",
    "batch",
    "system",
    "sysbatch",
    "_core"
  ],
  "num_schedulers": 12
  "server_id": "server1.global"
}

Sample Request

$ curl \
    --request PUT \
    --data @payload.json \
    https://localhost:4646/v1/agent/schedulers/config

Sample Response

{
  "enabled_schedulers": [
    "service",
    "batch",
    "system",
    "sysbatch",
    "_core"
  ],
  "num_schedulers": 12,
  "server_id": "server1.global"
}