diff --git a/content/nomad/v0.11.x/content/api-docs/acl-policies.mdx b/content/nomad/v0.11.x/content/api-docs/acl-policies.mdx new file mode 100644 index 0000000000..9611f95bf1 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/acl-policies.mdx @@ -0,0 +1,168 @@ +--- +layout: api +page_title: ACL Policies - HTTP API +sidebar_title: ACL Policies +description: The /acl/policy endpoints are used to configure and manage ACL policies. +--- + +# ACL Policies HTTP API + +The `/acl/policies` and `/acl/policy/` endpoints are used to manage ACL policies. +For more details about ACLs, please see the [ACL Guide](https://learn.hashicorp.com/nomad?track=acls#operations-and-development). + +## List Policies + +This endpoint lists all ACL policies. This lists the policies that have been replicated +to the region, and may lag behind the authoritative region. + +| Method | Path | Produces | +| ------ | --------------- | ------------------ | +| `GET` | `/acl/policies` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------- | +| `YES` | `all` | `management` for all policies.
Output when given a non-management token will be limited to the policies on the token itself | + +### Parameters + +- `prefix` `(string: "")` - Specifies a string to filter ACL policies based on + a name prefix. This is specified as a query string parameter. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/acl/policies +``` + +```shell-session +$ curl \ + https://localhost:4646/v1/acl/policies?prefix=prod +``` + +### Sample Response + +```json +[ + { + "Name": "foo", + "Description": "", + "CreateIndex": 12, + "ModifyIndex": 13 + } +] +``` + +## Create or Update Policy + +This endpoint creates or updates an ACL Policy. This request is always forwarded to the +authoritative region. + +| Method | Path | Produces | +| ------ | -------------------------- | -------------- | +| `POST` | `/acl/policy/:policy_name` | `(empty body)` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `Name` `(string: )` - Specifies the name of the policy. + Creates the policy if the name does not exist, otherwise updates the existing policy. + +- `Description` `(string: )` - Specifies a human readable description. + +- `Rules` `(string: )` - Specifies the Policy rules in HCL or JSON format. + +### Sample Payload + +```json +{ + "Name": "my-policy", + "Description": "This is a great policy", + "Rules": "" +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/acl/policy/my-policy +``` + +## Read Policy + +This endpoint reads an ACL policy with the given name. This queries the policy that have been +replicated to the region, and may lag behind the authoritative region. 
+ +| Method | Path | Produces | +| ------ | -------------------------- | ------------------ | +| `GET` | `/acl/policy/:policy_name` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | ------------------------------------------- | +| `YES` | `all` | `management` or token with access to policy | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/acl/policy/foo +``` + +### Sample Response + +```json +{ + "Name": "foo", + "Rules": "", + "Description": "", + "CreateIndex": 12, + "ModifyIndex": 13 +} +``` + +## Delete Policy + +This endpoint deletes the named ACL policy. This request is always forwarded to the +authoritative region. + +| Method | Path | Produces | +| -------- | -------------------------- | -------------- | +| `DELETE` | `/acl/policy/:policy_name` | `(empty body)` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `policy_name` `(string: )` - Specifies the policy name to delete. + +### Sample Request + +```shell-session +$ curl \ + --request DELETE \ + https://localhost:4646/v1/acl/policy/foo +``` diff --git a/content/nomad/v0.11.x/content/api-docs/acl-tokens.mdx b/content/nomad/v0.11.x/content/api-docs/acl-tokens.mdx new file mode 100644 index 0000000000..9cca68460b --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/acl-tokens.mdx @@ -0,0 +1,344 @@ +--- +layout: api +page_title: ACL Tokens - HTTP API +sidebar_title: ACL Tokens +description: The /acl/token/ endpoints are used to configure and manage ACL tokens. 
+--- + +# ACL Tokens HTTP API + +The `/acl/bootstrap`, `/acl/tokens`, and `/acl/token/` endpoints are used to manage ACL tokens. +For more details about ACLs, please see the [ACL Guide](https://learn.hashicorp.com/nomad?track=acls#operations-and-development). + +## Bootstrap Token + +This endpoint is used to bootstrap the ACL system and provide the initial management token. +This request is always forwarded to the authoritative region. It can only be invoked once +until a [bootstrap reset](https://learn.hashicorp.com/nomad?track=acls#acls) is performed. + +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `POST` | `/acl/bootstrap` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + https://localhost:4646/v1/acl/bootstrap +``` + +### Sample Response + +```json +{ + "AccessorID": "b780e702-98ce-521f-2e5f-c6b87de05b24", + "SecretID": "3f4a0fcd-7c42-773c-25db-2d31ba0c05fe", + "Name": "Bootstrap Token", + "Type": "management", + "Policies": null, + "Global": true, + "CreateTime": "2017-08-23T22:47:14.695408057Z", + "CreateIndex": 7, + "ModifyIndex": 7 +} +``` + +## List Tokens + +This endpoint lists all ACL tokens. This lists the local tokens and the global +tokens which have been replicated to the region, and may lag behind the authoritative region. + +| Method | Path | Produces | +| ------ | ------------- | ------------------ | +| `GET` | `/acl/tokens` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). 
+ +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | ------------ | +| `YES` | `all` | `management` | + +### Parameters + +- `prefix` `(string: "")` - Specifies a string to filter ACL tokens based on an + accessor ID prefix. Because the value is decoded to bytes, the prefix must + have an even number of hexadecimal characters (0-9a-f). This is specified as + a query string parameter. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/acl/tokens +``` + +```shell-session +$ curl \ + --request POST \ + https://localhost:4646/v1/acl/tokens?prefix=3da2ed52 +``` + +### Sample Response + +```json +[ + { + "AccessorID": "b780e702-98ce-521f-2e5f-c6b87de05b24", + "Name": "Bootstrap Token", + "Type": "management", + "Policies": null, + "Global": true, + "CreateTime": "2017-08-23T22:47:14.695408057Z", + "CreateIndex": 7, + "ModifyIndex": 7 + } +] +``` + +## Create Token + +This endpoint creates an ACL Token. If the token is a global token, the request +is forwarded to the authoritative region. + +| Method | Path | Produces | +| ------ | ------------ | ------------------ | +| `POST` | `/acl/token` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `Name` `(string: )` - Specifies the human readable name of the token. + +- `Type` `(string: )` - Specifies the type of token. Must be either `client` or `management`. + +- `Policies` `(array: )` - Must be null or blank for `management` type tokens, otherwise must specify at least one policy for `client` type tokens. + +- `Global` `(bool: )` - If true, indicates this token should be replicated globally to all regions. Otherwise, this token is created local to the target region. 
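
The constraints above (token `Type`, the `Policies` rule for each type, and the boolean `Global` flag) can be checked before the request is sent. Below is a minimal Python sketch of such a check; the function name is hypothetical and this is not part of Nomad or any Nomad SDK:

```python
def validate_token_payload(payload):
    """Illustrative pre-flight check for a token-creation payload,
    following the parameter rules documented above."""
    if payload.get("Type") not in ("client", "management"):
        raise ValueError("Type must be either 'client' or 'management'")
    policies = payload.get("Policies")
    if payload["Type"] == "management" and policies:
        raise ValueError("management tokens must have null or blank Policies")
    if payload["Type"] == "client" and not policies:
        raise ValueError("client tokens must specify at least one policy")
    if not isinstance(payload.get("Global", False), bool):
        raise ValueError("Global must be a boolean")
    return payload
```

A payload that passes this check can then be posted to `/v1/acl/token` as shown in the sample request below.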
+ +### Sample Payload + +```json +{ + "Name": "Readonly token", + "Type": "client", + "Policies": ["readonly"], + "Global": false +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/acl/token +``` + +### Sample Response + +```json +{ + "AccessorID": "aa534e09-6a07-0a45-2295-a7f77063d429", + "SecretID": "8176afd3-772d-0b71-8f85-7fa5d903e9d4", + "Name": "Readonly token", + "Type": "client", + "Policies": ["readonly"], + "Global": false, + "CreateTime": "2017-08-23T23:25:41.429154233Z", + "CreateIndex": 52, + "ModifyIndex": 52 +} +``` + +## Update Token + +This endpoint updates an existing ACL Token. If the token is a global token, the request +is forwarded to the authoritative region. Note that a token cannot be switched from global +to local or visa versa. + +| Method | Path | Produces | +| ------ | ------------------------- | ------------------ | +| `POST` | `/acl/token/:accessor_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `AccessorID` `(string: )` - Specifies the token (by accessor) that is being updated. Must match payload body and request path. + +- `Name` `(string: )` - Specifies the human readable name of the token. + +- `Type` `(string: )` - Specifies the type of token. Must be either `client` or `management`. + +- `Policies` `(array: )` - Must be null or blank for `management` type tokens, otherwise must specify at least one policy for `client` type tokens. 
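
Because `AccessorID` must match both the payload body and the request path, a client can enforce that invariant when constructing the request. A small illustrative Python sketch (the helper name is hypothetical, not a Nomad API):

```python
def build_update_request(accessor_id, payload):
    """Build the update path and enforce that the payload's AccessorID
    matches the accessor in the request path, as the endpoint requires."""
    if payload.get("AccessorID") != accessor_id:
        raise ValueError("AccessorID in payload must match the request path")
    return "/v1/acl/token/%s" % accessor_id, payload
```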
+ +### Sample Payload + +```json +{ + "AccessorID": "aa534e09-6a07-0a45-2295-a7f77063d429", + "Name": "Read-write token", + "Type": "client", + "Policies": ["readwrite"] +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/acl/token/aa534e09-6a07-0a45-2295-a7f77063d429 +``` + +### Sample Response + +```json +{ + "AccessorID": "aa534e09-6a07-0a45-2295-a7f77063d429", + "SecretID": "8176afd3-772d-0b71-8f85-7fa5d903e9d4", + "Name": "Read-write token", + "Type": "client", + "Policies": ["readwrite"], + "Global": false, + "CreateTime": "2017-08-23T23:25:41.429154233Z", + "CreateIndex": 52, + "ModifyIndex": 64 +} +``` + +## Read Token + +This endpoint reads an ACL token with the given accessor. If the token is a global token +which has been replicated to the region it may lag behind the authoritative region. + +| Method | Path | Produces | +| ------ | ------------------------- | ------------------ | +| `GET` | `/acl/token/:accessor_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). 
+ +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | -------------------------------------------------- | +| `YES` | `all` | `management` or a SecretID matching the AccessorID | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/acl/token/aa534e09-6a07-0a45-2295-a7f77063d429 +``` + +### Sample Response + +```json +{ + "AccessorID": "aa534e09-6a07-0a45-2295-a7f77063d429", + "SecretID": "8176afd3-772d-0b71-8f85-7fa5d903e9d4", + "Name": "Read-write token", + "Type": "client", + "Policies": ["readwrite"], + "Global": false, + "CreateTime": "2017-08-23T23:25:41.429154233Z", + "CreateIndex": 52, + "ModifyIndex": 64 +} +``` + +## Read Self Token + +This endpoint reads the ACL token given by the passed SecretID. If the token is a global token +which has been replicated to the region it may lag behind the authoritative region. + +| Method | Path | Produces | +| ------ | ----------------- | ------------------ | +| `GET` | `/acl/token/self` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | ------------------- | +| `YES` | `all` | Any valid ACL token | + +### Sample Request + +```shell-session +$ curl \ + --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \ + https://localhost:4646/v1/acl/token/self +``` + +### Sample Response + +```json +{ + "AccessorID": "aa534e09-6a07-0a45-2295-a7f77063d429", + "SecretID": "8176afd3-772d-0b71-8f85-7fa5d903e9d4", + "Name": "Read-write token", + "Type": "client", + "Policies": ["readwrite"], + "Global": false, + "CreateTime": "2017-08-23T23:25:41.429154233Z", + "CreateIndex": 52, + "ModifyIndex": 64 +} +``` + +## Delete Token + +This endpoint deletes the ACL token by accessor. 
This request is forwarded to the +authoritative region for global tokens. + +| Method | Path | Produces | +| -------- | ------------------------- | -------------- | +| `DELETE` | `/acl/token/:accessor_id` | `(empty body)` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `accessor_id` `(string: )` - Specifies the ACL token accessor ID. + +### Sample Request + +```shell-session +$ curl \ + --request DELETE \ + https://localhost:4646/v1/acl/token/aa534e09-6a07-0a45-2295-a7f77063d429 +``` diff --git a/content/nomad/v0.11.x/content/api-docs/agent.mdx b/content/nomad/v0.11.x/content/api-docs/agent.mdx new file mode 100644 index 0000000000..0d46c58514 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/agent.mdx @@ -0,0 +1,644 @@ +--- +layout: api +page_title: Agent - HTTP API +sidebar_title: Agent +description: |- + The /agent endpoints interact with the local Nomad agent to interact with + members and servers. +--- + +# Agent HTTP API + +The `/agent` endpoints are used to interact with the local Nomad agent. + +## List Members + +This endpoint queries the agent for the known peers in the gossip pool. This +endpoint is only applicable to servers. Due to the nature of gossip, this is +eventually consistent. + +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `GET` | `/agent/members` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). 
+ +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `node:read` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/agent/members +``` + +### Sample Response + +```json +{ + "ServerName": "bacon-mac", + "ServerRegion": "global", + "ServerDC": "dc1", + "Members": [ + { + "Name": "bacon-mac.global", + "Addr": "127.0.0.1", + "Port": 4648, + "Tags": { + "mvn": "1", + "build": "0.5.5dev", + "port": "4647", + "bootstrap": "1", + "role": "nomad", + "region": "global", + "dc": "dc1", + "vsn": "1" + }, + "Status": "alive", + "ProtocolMin": 1, + "ProtocolMax": 5, + "ProtocolCur": 2, + "DelegateMin": 2, + "DelegateMax": 4, + "DelegateCur": 4 + } + ] +} +``` + +## List Servers + +This endpoint lists the known server nodes. The `servers` endpoint is used to +query an agent in client mode for its list of known servers. Client nodes +register themselves with these server addresses so that they may dequeue work. +The servers endpoint can be used to keep this configuration up to date if there +are changes in the cluster. + +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `GET` | `/agent/servers` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `agent:read` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/agent/servers +``` + +### Sample Response + +```json +["127.0.0.1:4647"] +``` + +## Update Servers + +This endpoint updates the list of known servers to the provided list. This +**replaces** all previous server addresses with the new list. 

| Method | Path | Produces |
| ------ | ---------------- | -------------- |
| `POST` | `/agent/servers` | `(empty body)` |

The table below shows this endpoint's support for
[blocking queries](/api-docs#blocking-queries) and
[required ACLs](/api-docs#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------- |
| `NO` | `agent:write` |

### Parameters

- `address` `(string: <required>)` - Specifies the list of addresses in the
  format `ip:port`. This is specified as a query string parameter and may be
  specified multiple times.

### Sample Request

```shell-session
$ curl \
    --request POST \
    "https://localhost:4646/v1/agent/servers?address=1.2.3.4:4647&address=5.6.7.8:4647"
```

## Query Self

This endpoint queries the state of the target agent (self).

| Method | Path | Produces |
| ------ | ------------- | ------------------ |
| `GET` | `/agent/self` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api-docs#blocking-queries) and
[required ACLs](/api-docs#acls).
+ +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `agent:read` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/agent/self +``` + +### Sample Response + +```json +{ + "config": { + "Addresses": { + "HTTP": "127.0.0.1", + "RPC": "127.0.0.1", + "Serf": "127.0.0.1" + }, + "AdvertiseAddrs": { + "HTTP": "127.0.0.1:4646", + "RPC": "127.0.0.1:4647", + "Serf": "127.0.0.1:4648" + }, + "BindAddr": "127.0.0.1", + "Client": { + "AllocDir": "", + "ChrootEnv": {}, + "ClientMaxPort": 14512, + "ClientMinPort": 14000, + "DisableRemoteExec": false, + "Enabled": true, + "GCDiskUsageThreshold": 99, + "GCInodeUsageThreshold": 99, + "GCInterval": 600000000000, + "MaxKillTimeout": "30s", + "Meta": {}, + "NetworkInterface": "lo0", + "NetworkSpeed": 0, + "NodeClass": "", + "Options": { + "driver.docker.volumes": "true" + }, + "Reserved": { + "CPU": 0, + "DiskMB": 0, + "MemoryMB": 0, + "ParsedReservedPorts": null, + "ReservedPorts": "" + }, + "Servers": null, + "StateDir": "" + }, + "Consul": { + "Addr": "", + "Auth": "", + "AutoAdvertise": true, + "CAFile": "", + "CertFile": "", + "ChecksUseAdvertise": false, + "ClientAutoJoin": true, + "ClientServiceName": "nomad-client", + "EnableSSL": false, + "KeyFile": "", + "ServerAutoJoin": true, + "ServerServiceName": "nomad", + "Timeout": 5000000000, + "Token": "", + "VerifySSL": false + }, + "DataDir": "", + "Datacenter": "dc1", + "DevMode": true, + "DisableAnonymousSignature": true, + "DisableUpdateCheck": false, + "EnableDebug": true, + "EnableSyslog": false, + "Files": null, + "HTTPAPIResponseHeaders": {}, + "LeaveOnInt": false, + "LeaveOnTerm": false, + "LogLevel": "DEBUG", + "NodeName": "", + "Ports": { + "HTTP": 4646, + "RPC": 4647, + "Serf": 4648 + }, + "Region": "global", + "Revision": "f551dcb83e3ac144c9dbb90583b6e82d234662e9", + "Server": { + "BootstrapExpect": 0, + "DataDir": "", + "Enabled": true, + "EnabledSchedulers": null, + "HeartbeatGrace": "", + 
"NodeGCThreshold": "", + "NumSchedulers": 0, + "ProtocolVersion": 0, + "RejoinAfterLeave": false, + "RetryInterval": "30s", + "RetryJoin": [], + "RetryMaxAttempts": 0, + "StartJoin": [] + }, + "SyslogFacility": "LOCAL0", + "TLSConfig": { + "CAFile": "", + "CertFile": "", + "EnableHTTP": false, + "EnableRPC": false, + "KeyFile": "", + "VerifyServerHostname": false + }, + "Telemetry": { + "CirconusAPIApp": "", + "CirconusAPIToken": "", + "CirconusAPIURL": "", + "CirconusBrokerID": "", + "CirconusBrokerSelectTag": "", + "CirconusCheckDisplayName": "", + "CirconusCheckForceMetricActivation": "", + "CirconusCheckID": "", + "CirconusCheckInstanceID": "", + "CirconusCheckSearchTag": "", + "CirconusCheckSubmissionURL": "", + "CirconusCheckTags": "", + "CirconusSubmissionInterval": "", + "CollectionInterval": "1s", + "DataDogAddr": "", + "DataDogTags": [], + "DisableHostname": false, + "PublishAllocationMetrics": false, + "PublishNodeMetrics": false, + "StatsdAddr": "", + "StatsiteAddr": "", + "UseNodeName": false + }, + "Vault": { + "Addr": "https://vault.service.consul:8200", + "AllowUnauthenticated": true, + "ConnectionRetryIntv": 30000000000, + "Enabled": null, + "Role": "", + "TLSCaFile": "", + "TLSCaPath": "", + "TLSCertFile": "", + "TLSKeyFile": "", + "TLSServerName": "", + "TLSSkipVerify": null, + "TaskTokenTTL": "", + "Token": "root" + }, + "Version": "0.5.5", + "VersionPrerelease": "dev" + }, + "member": { + "Addr": "127.0.0.1", + "DelegateCur": 4, + "DelegateMax": 4, + "DelegateMin": 2, + "Name": "bacon-mac.global", + "Port": 4648, + "ProtocolCur": 2, + "ProtocolMax": 5, + "ProtocolMin": 1, + "Status": "alive", + "Tags": { + "role": "nomad", + "region": "global", + "dc": "dc1", + "vsn": "1", + "mvn": "1", + "build": "0.5.5dev", + "port": "4647", + "bootstrap": "1" + } + }, + "stats": { + "runtime": { + "cpu_count": "8", + "kernel.name": "darwin", + "arch": "amd64", + "version": "go1.8", + "max_procs": "7", + "goroutines": "79" + }, + "nomad": { + "server": 
"true", + "leader": "true", + "leader_addr": "127.0.0.1:4647", + "bootstrap": "false", + "known_regions": "1" + }, + "raft": { + "num_peers": "0", + "fsm_pending": "0", + "last_snapshot_index": "0", + "last_log_term": "2", + "commit_index": "144", + "term": "2", + "last_log_index": "144", + "protocol_version_max": "3", + "snapshot_version_max": "1", + "latest_configuration_index": "1", + "latest_configuration": "[{Suffrage:Voter ID:127.0.0.1:4647 Address:127.0.0.1:4647}]", + "last_contact": "never", + "applied_index": "144", + "protocol_version": "1", + "protocol_version_min": "0", + "snapshot_version_min": "0", + "state": "Leader", + "last_snapshot_term": "0" + }, + "client": { + "heartbeat_ttl": "17.79568937s", + "node_id": "fb2170a8-257d-3c64-b14d-bc06cc94e34c", + "known_servers": "127.0.0.1:4647", + "num_allocations": "0", + "last_heartbeat": "10.107423052s" + }, + "serf": { + "event_time": "1", + "event_queue": "0", + "encrypted": "false", + "member_time": "1", + "query_time": "1", + "intent_queue": "0", + "query_queue": "0", + "members": "1", + "failed": "0", + "left": "0", + "health_score": "0" + } + } +} +``` + +## Join Agent + +This endpoint introduces a new member to the gossip pool. This endpoint is only +eligible for servers. + +| Method | Path | Produces | +| ------ | ------------- | ------------------ | +| `POST` | `/agent/join` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Parameters + +- `address` `(string: )` - Specifies the address to join in the + `ip:port` format. This is provided as a query parameter and may be specified + multiple times to join multiple servers. 
+ +### Sample Request + +```shell-session +$ curl \ + --request POST \ + https://localhost:4646/v1/agent/join?address=1.2.3.4&address=5.6.7.8 +``` + +### Sample Response + +```json +{ + "error": "", + "num_joined": 2 +} +``` + +## Force Leave Agent + +This endpoint forces a member of the gossip pool from the `"failed"` state to +the `"left"` state. This allows the consensus protocol to remove the peer and +stop attempting replication. This is only applicable for servers. + +| Method | Path | Produces | +| ------ | -------------------- | ------------------ | +| `POST` | `/agent/force-leave` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------- | +| `NO` | `agent:write` | + +### Parameters + +- `node` `(string: )` - Specifies the name of the node to force leave. + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + https://localhost:4646/v1/agent/force-leave?node=client-ab2e23dc +``` + +## Health + +This endpoint returns whether or not the agent is healthy. When using Consul it +is the endpoint Nomad will register for its own health checks. + +When the agent is unhealthy 500 will be returned along with JSON response +containing an error message. + +| Method | Path | Produces | +| ------ | --------------- | ------------------ | +| `GET` | `/agent/health` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). 

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO` | `none` |

### Sample Request

```shell-session
$ curl \
    https://localhost:4646/v1/agent/health
```

### Sample Response

```json
{
  "client": {
    "message": "ok",
    "ok": true
  },
  "server": {
    "message": "ok",
    "ok": true
  }
}
```

## Stream Logs

This endpoint streams logs from the local agent until the connection is closed.

| Method | Path | Produces |
| ------ | ---------------- | ------------------ |
| `GET` | `/agent/monitor` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api-docs#blocking-queries) and
[required ACLs](/api-docs#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO` | `agent:read` |

### Parameters

- `log_level` `(string: "info")` - Specifies a text string containing a log level
  to filter on, such as `info`. Possible values include `trace`, `debug`,
  `info`, `warn`, and `error`.

- `log_json` `(bool: false)` - Specifies if the log format for streamed logs
  should be JSON.

- `node_id` `(string: "a57b2adb-1a30-2dda-8df0-25abb0881952")` - Specifies a text
  string containing a node ID to target for streaming.

- `server_id` `(string: "server1.global")` - Specifies a text
  string containing a server name or "leader" to target a specific remote server
  or leader for streaming.

- `plain` `(bool: false)` - Specifies if the response should be JSON or
  plaintext.

### Sample Request

```shell-session
$ curl \
    "https://localhost:4646/v1/agent/monitor?log_level=debug&server_id=leader"

$ curl \
    "https://localhost:4646/v1/agent/monitor?log_level=debug&node_id=a57b2adb-1a30-2dda-8df0-25abb0881952"
```

### Sample Response

```json
{
  "Offset": 0,
  "Data": "NTMxOTMyCjUzMTkzMwo1MzE5MzQKNTMx...",
  "FileEvent": "log"
}
```

#### Field Reference

The return value is a stream of frames.
These frames contain the following +fields: + +- `Data` - A base64 encoding of the bytes being streamed. + +- `FileEvent` - An event that could cause a change in the streams position. The + possible value for this endpoint is "log". + +- `Offset` - Offset is the offset into the stream. + +## Agent Runtime Profiles + +This endpoint is the equivalent of Go's /debug/pprof endpoint but is protected +by ACLs and supports remote forwarding to a client node or server. See the +[Golang documentation](https://golang.org/pkg/runtime/pprof/#Profile) for a list of available profiles. + +| Method | Path | Produces | +| ------ | ------------------------------ | -------------------------- | +| `GET` | `/agent/pprof/cmdline` | `text/plain` | +| `GET` | `/agent/pprof/profile` | `application/octet-stream` | +| `GET` | `/agent/pprof/trace` | `application/octet-stream` | +| `GET` | `/agent/pprof/` | `application/octet-stream` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------- | +| `NO` | `agent:write` | + +### Default Behavior + +This endpoint is enabled whenever ACLs are enabled. Due to the potentially +sensitive nature of data contained in profiles, as well as their significant +performance impact, the agent/pprof endpoint is protected by a high level ACL: +`agent:write`. For these reasons its recommended to leave [`enable_debug`](/docs/configuration#enable_debug) +unset and only use the ACL-protected endpoints. 
+ +The following table explains when each endpoint is available: + +| Endpoint | `enable_debug` | ACLs | **Available?** | +| --------------- | -------------- | ---- | -------------- | +| /v1/agent/pprof | unset | n/a | no | +| /v1/agent/pprof | `true` | n/a | yes | +| /v1/agent/pprof | `false` | n/a | no | +| /v1/agent/pprof | unset | off | no | +| /v1/agent/pprof | unset | on | **yes** | +| /v1/agent/pprof | `true` | off | yes | +| /v1/agent/pprof | `false` | on | **yes** | + +### Parameters + +- `node_id` `(string: "a57b2adb-1a30-2dda-8df0-25abb0881952")` - Specifies a text + string containing a Node ID to target for profiling. + +- `server_id` `(string: "server1.global")` - Specifies a text + string containing a Server ID, name, or `leader` to target a specific remote + server or leader for profiling. + +- `seconds` `(int: 3)` - Specifies the amount of time to run a profile or trace + request for. + +- `debug` `(int: 1)` - Specifies if a given pprof profile should be returned as + human readable plain text instead of the pprof binary format. Defaults to 0, + setting to 1 enables human readable plain text. 
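
The availability table above reduces to a single condition: the endpoints are reachable when `enable_debug` is explicitly `true`, or whenever ACLs are enabled. A small illustrative Python sketch of that rule (not Nomad code; `None` stands for an unset `enable_debug`):

```python
def pprof_available(enable_debug, acls_enabled):
    """Mirror the availability table: explicit enable_debug=true always
    exposes the endpoints; otherwise ACLs must be enabled."""
    return enable_debug is True or acls_enabled
```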
+ +### Sample Request + +```shell-session +$ curl -O -J \ + --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \ + https://localhost:4646/v1/agent/pprof/goroutine?server_id=leader + +$ go tool pprof goroutine + +$ curl -O -J \ + --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \ + https://localhost:4646/v1/agent/profile?seconds=5&node_id=a57b2adb-1a30-2dda-8df0-25abb0881952 + +$ go tool pprof profile + +$ curl -O -J \ + --header "X-Nomad-Token: 8176afd3-772d-0b71-8f85-7fa5d903e9d4" \ + https://localhost:4646/v1/agent/trace?&seconds=5&server_id=server1.global + +go tool trace trace +``` diff --git a/content/nomad/v0.11.x/content/api-docs/allocations.mdx b/content/nomad/v0.11.x/content/api-docs/allocations.mdx new file mode 100644 index 0000000000..a128d60d9d --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/allocations.mdx @@ -0,0 +1,690 @@ +--- +layout: api +page_title: Allocations - HTTP API +sidebar_title: Allocations +description: The /allocation endpoints are used to query for and interact with allocations. +--- + +# Allocations HTTP API + +The `/allocation` endpoints are used to query for and interact with allocations. + +## List Allocations + +This endpoint lists all allocations. + +| Method | Path | Produces | +| ------ | ----------------- | ------------------ | +| `GET` | `/v1/allocations` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `prefix` `(string: "")`- Specifies a string to filter allocations based on an + ID prefix. Because the value is decoded to bytes, the prefix must have an + even number of hexadecimal characters (0-9a-f). This is specified as a query + string parameter. 
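
The even-length hexadecimal constraint on `prefix` can be checked client-side before issuing the request; here is a minimal Python sketch (the function name is hypothetical, not part of any Nomad SDK):

```python
HEX_CHARS = set("0123456789abcdef")

def valid_id_prefix(prefix):
    """A prefix is decoded to bytes server-side, so it must contain only
    0-9a-f and have an even number of characters."""
    return len(prefix) % 2 == 0 and all(c in HEX_CHARS for c in prefix)
```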
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/allocations +``` + +```shell-session +$ curl \ + https://localhost:4646/v1/allocations?prefix=a8198d79 +``` + +### Sample Response + +```json +[ + { + "ID": "a8198d79-cfdb-6593-a999-1e9adabcba2e", + "EvalID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Name": "example.cache[0]", + "NodeID": "fb2170a8-257d-3c64-b14d-bc06cc94e34c", + "PreviousAllocation": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "NextAllocation": "cd13d9b9-4f97-7184-c88b-7b451981616b", + "RescheduleTracker": { + "Events": [ + { + "PrevAllocID": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "PrevNodeID": "9230cd3b-3bda-9a3f-82f9-b2ea8dedb20e", + "RescheduleTime": 1517434161192946200, + "Delay": "5000000000" + } + ] + }, + "JobID": "example", + "TaskGroup": "cache", + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "FinishedAt": "0001-01-01T00:00:00Z", + "LastRestart": "0001-01-01T00:00:00Z", + "Restarts": 0, + "StartedAt": "2017-07-25T23:36:26.106431265Z", + "Failed": false, + "Events": [ + { + "Type": "Received", + "Time": 1495747371795703800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Driver", + "Time": 1495747371798867200, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + 
"TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "Downloading image redis:3.2" + }, + { + "Type": "Started", + "Time": 1495747379525667800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + } + ] + } + }, + "CreateIndex": 54, + "ModifyIndex": 57, + "CreateTime": 1495747371794276400, + "ModifyTime": 1495747371794276400 + } +] +``` + +## Read Allocation + +This endpoint reads information about a specific allocation. + +| Method | Path | Produces | +| ------ | -------------------------- | ------------------ | +| `GET` | `/v1/allocation/:alloc_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:alloc_id` `(string: )`- Specifies the UUID of the allocation. This + must be the full UUID, not the short 8-character one. This is specified as + part of the path. 
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/allocation/5456bd7a-9fc0-c0dd-6131-cbee77f57577 +``` + +### Sample Response + +```json +{ + "ID": "a8198d79-cfdb-6593-a999-1e9adabcba2e", + "EvalID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Name": "example.cache[0]", + "NodeID": "fb2170a8-257d-3c64-b14d-bc06cc94e34c", + "PreviousAllocation": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "NextAllocation": "cd13d9b9-4f97-7184-c88b-7b451981616b", + "RescheduleTracker": { + "Events": [ + { + "PrevAllocID": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "PrevNodeID": "9230cd3b-3bda-9a3f-82f9-b2ea8dedb20e", + "RescheduleTime": 1517434161192946200, + "Delay": "5000000000" + } + ] + }, + "JobID": "example", + "Job": { + "Region": "global", + "ID": "example", + "ParentID": "", + "Name": "example", + "Type": "service", + "Priority": 50, + "AllAtOnce": false, + "Datacenters": ["dc1"], + "Constraints": null, + "Affinities": null, + "TaskGroups": [ + { + "Name": "cache", + "Count": 1, + "Constraints": null, + "Affinities": null, + "RestartPolicy": { + "Attempts": 10, + "Interval": 300000000000, + "Delay": 25000000000, + "Mode": "delay" + }, + "Spreads": null, + "Tasks": [ + { + "Name": "redis", + "Driver": "docker", + "User": "", + "Config": { + "port_map": [ + { + "db": 6379 + } + ], + "image": "redis:3.2" + }, + "Env": null, + "Services": [ + { + "Name": "redis-cache", + "PortLabel": "db", + "Tags": ["global", "cache"], + "Checks": [ + { + "Name": "alive", + "Type": "tcp", + "Command": "", + "Args": null, + "Path": "", + "Protocol": "", + "PortLabel": "", + "Interval": 10000000000, + "Timeout": 2000000000, + "InitialStatus": "" + } + ] + } + ], + "Vault": null, + "Templates": null, + "Constraints": null, + "Affinities": null, + "Resources": { + "CPU": 500, + "MemoryMB": 10, + "DiskMB": 0, + "Networks": [ + { + "Device": "", + "CIDR": "", + "IP": "", + "MBits": 10, + "ReservedPorts": null, + "DynamicPorts": [ + { + "Label": "db", + "Value": 0 + } + ] + } 
+ ] + }, + "Spreads": null, + "DispatchPayload": null, + "Meta": null, + "KillTimeout": 5000000000, + "LogConfig": { + "MaxFiles": 10, + "MaxFileSizeMB": 10 + }, + "Artifacts": null, + "Leader": false + } + ], + "EphemeralDisk": { + "Sticky": false, + "SizeMB": 300, + "Migrate": false + }, + "Meta": null + } + ], + "Update": { + "Stagger": 10000000000, + "MaxParallel": 0 + }, + "Periodic": null, + "ParameterizedJob": null, + "Payload": null, + "Spreads": null, + "Meta": null, + "VaultToken": "", + "Status": "pending", + "StatusDescription": "", + "CreateIndex": 52, + "ModifyIndex": 52, + "JobModifyIndex": 52 + }, + "TaskGroup": "cache", + "Resources": { + "CPU": 500, + "MemoryMB": 10, + "DiskMB": 300, + "Networks": [ + { + "Device": "lo0", + "CIDR": "", + "IP": "127.0.0.1", + "MBits": 10, + "ReservedPorts": null, + "DynamicPorts": [ + { + "Label": "db", + "Value": 23116 + } + ] + } + ] + }, + "SharedResources": { + "CPU": 0, + "MemoryMB": 0, + "DiskMB": 300, + "Networks": null + }, + "TaskResources": { + "redis": { + "CPU": 500, + "MemoryMB": 10, + "DiskMB": 0, + "Networks": [ + { + "Device": "lo0", + "CIDR": "", + "IP": "127.0.0.1", + "MBits": 10, + "ReservedPorts": null, + "DynamicPorts": [ + { + "Label": "db", + "Value": 23116 + } + ] + } + ] + } + }, + "Metrics": { + "NodesEvaluated": 1, + "NodesFiltered": 0, + "NodesAvailable": { + "dc1": 1 + }, + "ClassFiltered": null, + "ConstraintFiltered": null, + "NodesExhausted": 0, + "ClassExhausted": null, + "DimensionExhausted": null, + "Scores": { + "fb2170a8-257d-3c64-b14d-bc06cc94e34c.binpack": 0.6205732522109244 + }, + "AllocationTime": 31729, + "CoalescedFailures": 0 + }, + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "Failed": false, + "FinishedAt": "0001-01-01T00:00:00Z", + "LastRestart": "0001-01-01T00:00:00Z", + "Restarts": 0, + "StartedAt": "2017-07-25T23:36:26.106431265Z", + "Events": [ + { 
+ "Type": "Received", + "Time": 1495747371795703800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Driver", + "Time": 1495747371798867200, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "Downloading image redis:3.2" + }, + { + "Type": "Started", + "Time": 1495747379525667800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + } + ] + } + }, + "PreviousAllocation": "", + "CreateIndex": 54, + "ModifyIndex": 57, + "AllocModifyIndex": 54, + "CreateTime": 1495747371794276400, + "ModifyTime": 1495747371794276400 +} +``` + +#### Field Reference + +- `TaskStates` - A map of tasks to their current state and the latest events + that have effected the state. `TaskState` objects contain the following + fields: + + - `State`: The task's current state. It can have one of the following + values: + + - `TaskStatePending` - The task is waiting to be run, either for the first + time or due to a restart. 
+
+    - `TaskStateRunning` - The task is currently running.
+
+    - `TaskStateDead` - The task is dead and will not run again.
+
+  - `StartedAt`: The time the task was last started. Can be updated through
+    restarts.
+
+  - `FinishedAt`: The time the task finished.
+
+  - `LastRestart`: The last time the task was restarted.
+
+  - `Restarts`: The number of times the task has restarted.
+
+  - `Events` - A list of events, each containing metadata about what occurred.
+    The latest 10 events are stored per task. Each event is timestamped (Unix
+    nanoseconds) and has one of the following types:
+
+    - `Setup Failure` - The task could not be started because there was a
+      failure setting up the task prior to it running.
+
+    - `Driver Failure` - The task could not be started due to a failure in the
+      driver.
+
+    - `Started` - The task was started, either for the first time or due to a
+      restart.
+
+    - `Terminated` - The task was started and exited.
+
+    - `Killing` - The task has been sent the kill signal.
+
+    - `Killed` - The task was killed by a user.
+
+    - `Received` - The task has been pulled by the client at the given timestamp.
+
+    - `Failed Validation` - The task was invalid and therefore did not run.
+
+    - `Restarting` - The task terminated and is being restarted.
+
+    - `Not Restarting` - The task has failed and is not being restarted because
+      it has exceeded its restart policy.
+
+    - `Downloading Artifacts` - The task is downloading the artifact(s)
+      specified in the task.
+
+    - `Failed Artifact Download` - Artifact(s) specified in the task failed to
+      download.
+
+    - `Restart Signaled` - The task was signaled to be restarted.
+
+    - `Signaling` - The task is being sent a signal.
+
+    - `Sibling Task Failed` - A task in the same task group failed.
+
+    - `Leader Task Dead` - The group's leader task is dead.
+
+    - `Driver` - A message from the driver.
+
+    - `Task Setup` - Task setup messages.
+
+    - `Building Task Directory` - Task is building its file system.
+ + Depending on the type the event will have applicable annotations. + +## Stop Allocation + +This endpoint stops and reschedules a specific allocation. + +| Method | Path | Produces | +| -------------- | ------------------------------- | ------------------ | +| `POST` / `PUT` | `/v1/allocation/:alloc_id/stop` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------- | +| `NO` | `namespace:alloc-lifecycle` | + +### Parameters + +- `:alloc_id` `(string: )`- Specifies the UUID of the allocation. This + must be the full UUID, not the short 8-character one. This is specified as + part of the path. + +### Sample Request + +```shell-session +$ curl -X POST \ + https://localhost:4646/v1/allocation/5456bd7a-9fc0-c0dd-6131-cbee77f57577/stop +``` + +### Sample Response + +```json +{ + "EvalID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Index": 54 +} +``` + +## Signal Allocation + +This endpoint sends a signal to an allocation or task. + +| Method | Path | Produces | +| -------------- | ---------------------------------------- | ------------------ | +| `POST` / `PUT` | `/v1/client/allocation/:alloc_id/signal` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------- | +| `NO` | `namespace:alloc-lifecycle` | + +### Parameters + +- `:alloc_id` `(string: )`- Specifies the UUID of the allocation. This + must be the full UUID, not the short 8-character one. This is specified as + part of the path. 
+ +### Sample Payload + +```json +{ + "Signal": "SIGUSR1", + "Task": "FOO" +} +``` + +### Sample Request + +```shell-session +$ curl -X POST -d '{"Signal": "SIGUSR1" }' \ + https://localhost:4646/v1/client/allocation/5456bd7a-9fc0-c0dd-6131-cbee77f57577/signal +``` + +### Sample Response + +```json +{} +``` + +## Restart Allocation + +This endpoint restarts an allocation or task in-place. + +| Method | Path | Produces | +| -------------- | ----------------------------------------- | ------------------ | +| `POST` / `PUT` | `/v1/client/allocation/:alloc_id/restart` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------- | +| `NO` | `namespace:alloc-lifecycle` | + +### Parameters + +- `:alloc_id` `(string: )`- Specifies the UUID of the allocation. This + must be the full UUID, not the short 8-character one. This is specified as + part of the path. + +### Sample Payload + +```json +{ + "Task": "FOO" +} +``` + +### Sample Request + +```shell-session +$ curl -X POST -d '{"Task": "redis" }' \ + https://localhost:4646/v1/client/allocation/5456bd7a-9fc0-c0dd-6131-cbee77f57577/restart +``` + +### Sample Response + +```json +{} +``` diff --git a/content/nomad/v0.11.x/content/api-docs/client.mdx b/content/nomad/v0.11.x/content/api-docs/client.mdx new file mode 100644 index 0000000000..e90601b6ca --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/client.mdx @@ -0,0 +1,618 @@ +--- +layout: api +page_title: Client - HTTP API +sidebar_title: Client +description: |- + The /client endpoints are used to access client statistics and inspect + allocations running on a particular client. +--- + +# Client HTTP API + +The `/client` endpoints are used to interact with the Nomad clients. + +Since Nomad 0.8.0, both a client and server can handle client endpoints. 
This is
+particularly useful when a direct connection to a client is not possible due
+to the network configuration. For high-volume access to the client endpoints,
+particularly endpoints streaming file contents, direct access to the node should
+be preferred as it avoids adding additional load to the servers.
+
+When accessing the endpoints via the server, if the desired node is ambiguous
+based on the URL, an additional `?node_id` query parameter must be provided to
+disambiguate.
+
+## Read Stats
+
+This endpoint queries the actual resources consumed on a node. The endpoint is
+hosted by the Nomad client, so requests must be made to the Nomad client whose
+resource usage metrics are of interest.
+
+| Method | Path | Produces |
+| ------ | --------------- | ------------------ |
+| `GET` | `/client/stats` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO` | `node:read` |
+
+### Parameters
+
+- `node_id` `(string: )` - Specifies the node to query. This is
+  required when the endpoint is being accessed via a server. This is specified
+  as a query string parameter. Note, this must be the _full_ node ID, not the
+  short 8-character one.
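The two access paths described above differ only in where the request is sent: directly to a client, or to a server with `?node_id` for disambiguation. A sketch of building each URL (the `stats_url` helper is hypothetical; the node ID is the full UUID from the sample data):

```python
def stats_url(agent_addr, node_id=None):
    """Client stats URL: direct to a client, or via a server with ?node_id=."""
    url = f"{agent_addr}/v1/client/stats"
    if node_id is not None:  # required when agent_addr points at a server
        url += f"?node_id={node_id}"
    return url

# Direct to the client whose metrics are of interest:
print(stats_url("https://client1.example.com:4646"))
# Via a server, disambiguating the target node:
print(stats_url("https://localhost:4646",
                node_id="fb2170a8-257d-3c64-b14d-bc06cc94e34c"))
```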
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/client/stats +``` + +### Sample Response + +```json +{ + "AllocDirStats": { + "Available": 142943150080, + "Device": "", + "InodesUsedPercent": 0.05312946180421879, + "Mountpoint": "", + "Size": 249783500800, + "Used": 106578206720, + "UsedPercent": 42.668233241448746 + }, + "CPU": [ + { + "CPU": "cpu0", + "Idle": 80, + "System": 11, + "Total": 20, + "User": 9 + }, + { + "CPU": "cpu1", + "Idle": 99, + "System": 0, + "Total": 1, + "User": 1 + }, + { + "CPU": "cpu2", + "Idle": 89, + "System": 7.000000000000001, + "Total": 11, + "User": 4 + }, + { + "CPU": "cpu3", + "Idle": 100, + "System": 0, + "Total": 0, + "User": 0 + }, + { + "CPU": "cpu4", + "Idle": 92.92929292929293, + "System": 4.040404040404041, + "Total": 7.07070707070707, + "User": 3.0303030303030303 + }, + { + "CPU": "cpu5", + "Idle": 99, + "System": 1, + "Total": 1, + "User": 0 + }, + { + "CPU": "cpu6", + "Idle": 92.07920792079209, + "System": 4.9504950495049505, + "Total": 7.920792079207921, + "User": 2.9702970297029703 + }, + { + "CPU": "cpu7", + "Idle": 99, + "System": 0, + "Total": 1, + "User": 1 + } + ], + "CPUTicksConsumed": 1126.8044804480448, + "DiskStats": [ + { + "Available": 142943150080, + "Device": "/dev/disk1", + "InodesUsedPercent": 0.05312946180421879, + "Mountpoint": "/", + "Size": 249783500800, + "Used": 106578206720, + "UsedPercent": 42.668233241448746 + } + ], + "Memory": { + "Available": 6232244224, + "Free": 470618112, + "Total": 17179869184, + "Used": 10947624960 + }, + "Timestamp": 1495743032992498200, + "Uptime": 193520 +} +``` + +## Read Allocation Statistics + +The client `allocation` endpoint is used to query the actual resources consumed +by an allocation. 
+ +| Method | Path | Produces | +| ------ | ------------------------------------ | ------------------ | +| `GET` | `/client/allocation/:alloc_id/stats` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `NO` | `namespace:read-job` | + +### Parameters + +- `:alloc_id` `(string: )` - Specifies the allocation ID to query. + This is specified as part of the URL. Note, this must be the _full_ allocation + ID, not the short 8-character one. This is specified as part of the path. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/client/allocation/5fc98185-17ff-26bc-a802-0c74fa471c99/stats +``` + +### Sample Response + +```json +{ + "ResourceUsage": { + "CpuStats": { + "Measured": ["Throttled Periods", "Throttled Time", "Percent"], + "Percent": 0.14159538847117795, + "SystemMode": 0, + "ThrottledPeriods": 0, + "ThrottledTime": 0, + "TotalTicks": 3.256693934837093, + "UserMode": 0 + }, + "MemoryStats": { + "Cache": 1744896, + "KernelMaxUsage": 0, + "KernelUsage": 0, + "MaxUsage": 4710400, + "Measured": ["RSS", "Cache", "Swap", "Max Usage"], + "RSS": 1486848, + "Swap": 0 + } + }, + "Tasks": { + "redis": { + "Pids": null, + "ResourceUsage": { + "CpuStats": { + "Measured": ["Throttled Periods", "Throttled Time", "Percent"], + "Percent": 0.14159538847117795, + "SystemMode": 0, + "ThrottledPeriods": 0, + "ThrottledTime": 0, + "TotalTicks": 3.256693934837093, + "UserMode": 0 + }, + "MemoryStats": { + "Cache": 1744896, + "KernelMaxUsage": 0, + "KernelUsage": 0, + "MaxUsage": 4710400, + "Measured": ["RSS", "Cache", "Swap", "Max Usage"], + "RSS": 1486848, + "Swap": 0 + } + }, + "Timestamp": 1495743243970720000 + } + }, + "Timestamp": 1495743243970720000 +} +``` + +## Read File + +This endpoint reads the contents of a file in an allocation directory. 
+ +| Method | Path | Produces | +| ------ | -------------------------- | ------------ | +| `GET` | `/client/fs/cat/:alloc_id` | `text/plain` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------------- | +| `NO` | `namespace:read-fs` | + +### Parameters + +- `:alloc_id` `(string: )` - Specifies the allocation ID to query. + This is specified as part of the URL. Note, this must be the _full_ allocation + ID, not the short 8-character one. This is specified as part of the path. + +- `path` `(string: "/")` - Specifies the path of the file to read, relative to + the root of the allocation directory. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/client/fs/cat/5fc98185-17ff-26bc-a802-0c74fa471c99 +``` + +```shell-session +$ curl \ + https://localhost:4646/v1/client/fs/cat/5fc98185-17ff-26bc-a802-0c74fa471c99?path=alloc/file.json +``` + +### Sample Response + +```text +(whatever was in the file...) +``` + +## Read File at Offset + +This endpoint reads the contents of a file in an allocation directory at a +particular offset and limit. + +| Method | Path | Produces | +| ------ | ----------------------------- | ------------ | +| `GET` | `/client/fs/readat/:alloc_id` | `text/plain` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------------- | +| `NO` | `namespace:read-fs` | + +### Parameters + +- `:alloc_id` `(string: )` - Specifies the allocation ID to query. + This is specified as part of the URL. Note, this must be the _full_ allocation + ID, not the short 8-character one. This is specified as part of the path. 
+
+- `path` `(string: "/")` - Specifies the path of the file to read, relative to
+  the root of the allocation directory.
+
+- `offset` `(int: )` - Specifies the byte offset from where content
+  will be read.
+
+- `limit` `(int: )` - Specifies the number of bytes to read from the
+  offset.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    "https://localhost:4646/v1/client/fs/readat/5fc98185-17ff-26bc-a802-0c74fa471c99?path=/alloc/foo&offset=1323&limit=19303"
+```
+
+### Sample Response
+
+```text
+(whatever was in the file, starting from offset, up to limit bytes...)
+```
+
+## Stream File
+
+This endpoint streams the contents of a file in an allocation directory.
+
+| Method | Path | Produces |
+| ------ | ----------------------------- | ------------ |
+| `GET` | `/client/fs/stream/:alloc_id` | `text/plain` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------------- |
+| `NO` | `namespace:read-fs` |
+
+### Parameters
+
+- `:alloc_id` `(string: )` - Specifies the allocation ID to query.
+  This is specified as part of the URL. Note, this must be the _full_ allocation
+  ID, not the short 8-character one. This is specified as part of the path.
+
+- `path` `(string: "/")` - Specifies the path of the file to read, relative to
+  the root of the allocation directory.
+
+- `follow` `(bool: true)` - Specifies whether to tail the file.
+
+- `offset` `(int: )` - Specifies the byte offset from where content
+  will be read.
+
+- `origin` `(string: "start|end")` - Applies the relative offset to either the
+  start or end of the file.
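One way to picture how `offset` and `origin` interact is to compute the effective byte position where streaming begins. The sketch below is an assumption about the clamping behavior, not a statement of the server's exact semantics:

```python
def stream_start(file_size, offset=0, origin="start"):
    """Effective byte position where streaming begins, given origin and offset.

    origin="start": offset counts forward from byte 0.
    origin="end":   offset counts backward from the end of the file.
    """
    if origin == "end":
        return max(file_size - offset, 0)  # assumed clamp at start of file
    return min(offset, file_size)          # assumed clamp at end of file

# Tail roughly the last 4 KiB of a 1 MiB log file:
print(stream_start(1_048_576, offset=4096, origin="end"))
```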
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/client/fs/stream/5fc98185-17ff-26bc-a802-0c74fa471c99?path=/alloc/logs/redis.log
+```
+
+### Sample Response
+
+```json
+({
+  "File": "alloc/logs/redis.log",
+  "Offset": 3604480,
+  "Data": "NTMxOTMyCjUzMTkzMwo1MzE5MzQKNTMx..."
+},
+{
+  "File": "alloc/logs/redis.log",
+  "FileEvent": "file deleted"
+})
+```
+
+#### Field Reference
+
+The return value is a stream of frames. These frames contain the following
+fields:
+
+- `Data` - A base64 encoding of the bytes being streamed.
+
+- `FileEvent` - An event that could cause a change in the stream's position. The
+  possible values are "file deleted" and "file truncated".
+
+- `Offset` - The offset into the stream.
+
+- `File` - The name of the file being streamed.
+
+## Stream Logs
+
+This endpoint streams a task's stderr/stdout logs.
+
+| Method | Path | Produces |
+| ------ | --------------------------- | ------------ |
+| `GET` | `/client/fs/logs/:alloc_id` | `text/plain` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | -------------------------------------------- |
+| `NO` | `namespace:read-logs` or `namespace:read-fs` |
+
+### Parameters
+
+- `:alloc_id` `(string: )` - Specifies the allocation ID to query.
+  This is specified as part of the URL. Note, this must be the _full_ allocation
+  ID, not the short 8-character one. This is specified as part of the path.
+
+- `task` `(string: )` - Specifies the name of the task inside the
+  allocation to stream logs from.
+
+- `follow` `(bool: false)` - Specifies whether to tail the logs.
+
+- `type` `(string: "stderr|stdout")` - Specifies which log stream to return,
+  either `stderr` or `stdout`.
+
+- `offset` `(int: 0)` - Specifies the offset to start streaming from.
+
+- `origin` `(string: "start|end")` - Specifies either "start" or "end" and
+  applies the offset relative to either the start or end of the logs
+  respectively. Defaults to "start".
+
+- `plain` `(bool: false)` - Return just the plain text without framing. This can
+  be useful when viewing logs in a browser.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/client/fs/logs/5fc98185-17ff-26bc-a802-0c74fa471c99
+```
+
+### Sample Response
+
+```json
+({
+  "File": "alloc/logs/redis.stdout.0",
+  "Offset": 3604480,
+  "Data": "NTMxOTMyCjUzMTkzMwo1MzE5MzQKNTMx..."
+},
+{
+  "File": "alloc/logs/redis.stdout.0",
+  "FileEvent": "file deleted"
+})
+```
+
+#### Field Reference
+
+The return value is a stream of frames. These frames contain the following
+fields:
+
+- `Data` - A base64 encoding of the bytes being streamed.
+
+- `FileEvent` - An event that could cause a change in the stream's position. The
+  possible values are "file deleted" and "file truncated".
+
+- `Offset` - The offset into the stream.
+
+- `File` - The name of the file being streamed.
+
+## List Files
+
+This endpoint lists files in an allocation directory.
+
+| Method | Path | Produces |
+| ------ | ------------------------- | ------------ |
+| `GET` | `/client/fs/ls/:alloc_id` | `text/plain` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------------- |
+| `NO` | `namespace:read-fs` |
+
+### Parameters
+
+- `:alloc_id` `(string: )` - Specifies the allocation ID to query.
+  This is specified as part of the URL. Note, this must be the _full_ allocation
+  ID, not the short 8-character one. This is specified as part of the path.
+
+- `path` `(string: "/")` - Specifies the path of the directory to list, relative
+  to the root of the allocation directory.
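Each entry in the listing carries an `IsDir` flag, so a caller can partition a response into subdirectories and regular files before recursing. A minimal sketch (the helper is hypothetical; the sample entries mirror the response fields, with `file.txt` added as an illustrative file entry):

```python
def split_listing(entries):
    """Partition a /client/fs/ls response into directory and file names."""
    dirs = [e["Name"] for e in entries if e["IsDir"]]
    files = [e["Name"] for e in entries if not e["IsDir"]]
    return dirs, files

listing = [
    {"Name": "alloc", "IsDir": True, "Size": 4096, "FileMode": "drwxrwxr-x"},
    {"Name": "redis", "IsDir": True, "Size": 4096, "FileMode": "drwxrwxr-x"},
    {"Name": "file.txt", "IsDir": False, "Size": 96, "FileMode": "-rw-rw-r--"},
]
dirs, files = split_listing(listing)
print(dirs, files)
```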
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/client/fs/ls/5fc98185-17ff-26bc-a802-0c74fa471c99 +``` + +### Sample Response + +```json +[ + { + "Name": "alloc", + "IsDir": true, + "Size": 4096, + "FileMode": "drwxrwxr-x", + "ModTime": "2016-03-15T15:40:00.414236712-07:00" + }, + { + "Name": "redis", + "IsDir": true, + "Size": 4096, + "FileMode": "drwxrwxr-x", + "ModTime": "2016-03-15T15:40:56.810238153-07:00" + } +] +``` + +## Stat File + +This endpoint stats a file in an allocation. + +| Method | Path | Produces | +| ------ | --------------------------- | ------------ | +| `GET` | `/client/fs/stat/:alloc_id` | `text/plain` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------------- | +| `NO` | `namespace:read-fs` | + +### Parameters + +- `:alloc_id` `(string: )` - Specifies the allocation ID to query. + This is specified as part of the URL. Note, this must be the _full_ allocation + ID, not the short 8-character one. This is specified as part of the path. + +- `path` `(string: "/")` - Specifies the path of the file to read, relative to + the root of the allocation directory. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/client/fs/stat/5fc98185-17ff-26bc-a802-0c74fa471c99 +``` + +### Sample Response + +```json +{ + "Name": "redis-syslog-collector.out", + "IsDir": false, + "Size": 96, + "FileMode": "-rw-rw-r--", + "ModTime": "2016-03-15T15:40:56.822238153-07:00" +} +``` + +## GC Allocation + +This endpoint forces a garbage collection of a particular, stopped allocation +on a node. 
+
+| Method | Path | Produces |
+| ------ | --------------------------------- | ------------------ |
+| `GET` | `/client/allocation/:alloc_id/gc` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ---------------------- |
+| `NO` | `namespace:submit-job` |
+
+### Parameters
+
+- `:alloc_id` `(string: )` - Specifies the allocation ID to query.
+  This is specified as part of the URL. Note, this must be the _full_ allocation
+  ID, not the short 8-character one. This is specified as part of the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/client/allocation/5fc98185-17ff-26bc-a802-0c74fa471c99/gc
+```
+
+## GC All Allocation
+
+This endpoint forces a garbage collection of all stopped allocations on a node.
+
+| Method | Path | Produces |
+| ------ | ------------ | ------------ |
+| `GET` | `/client/gc` | `text/plain` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO` | `node:write` |
+
+### Parameters
+
+- `node_id` `(string: )` - Specifies the node to target. This is
+  required when the endpoint is being accessed via a server. This is specified
+  as a query string parameter. Note, this must be the _full_ node ID, not the
+  short 8-character one.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/client/gc
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/deployments.mdx b/content/nomad/v0.11.x/content/api-docs/deployments.mdx
new file mode 100644
index 0000000000..6554454661
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/deployments.mdx
@@ -0,0 +1,493 @@
+---
+layout: api
+page_title: Deployments - HTTP API
+sidebar_title: Deployments
+description: The /deployment endpoints are used to query for and interact with deployments.
+---
+
+# Deployments HTTP API
+
+The `/deployment` endpoints are used to query for and interact with deployments.
+
+## List Deployments
+
+This endpoint lists all deployments.
+
+| Method | Path | Produces |
+| ------ | ----------------- | ------------------ |
+| `GET` | `/v1/deployments` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | -------------------- |
+| `YES` | `namespace:read-job` |
+
+### Parameters
+
+- `prefix` `(string: "")` - Specifies a string to filter deployments based on
+  an ID prefix. Because the value is decoded to bytes, the prefix must have an
+  even number of hexadecimal characters (0-9a-f). This is specified as a query
+  string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/deployments
+```
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/deployments?prefix=25ba81
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "70638f62-5c19-193e-30d6-f9d6e689ab8e",
+    "JobID": "example",
+    "JobVersion": 1,
+    "JobModifyIndex": 17,
+    "JobSpecModifyIndex": 17,
+    "JobCreateIndex": 7,
+    "TaskGroups": {
+      "cache": {
+        "Promoted": false,
+        "DesiredCanaries": 1,
+        "DesiredTotal": 3,
+        "PlacedAllocs": 1,
+        "HealthyAllocs": 0,
+        "UnhealthyAllocs": 0
+      }
+    },
+    "Status": "running",
+    "StatusDescription": "",
+    "CreateIndex": 19,
+    "ModifyIndex": 19
+  }
+]
+```
+
+## Read Deployment
+
+This endpoint reads information about a specific deployment by ID.
+
+| Method | Path                            | Produces           |
+| ------ | ------------------------------- | ------------------ |
+| `GET`  | `/v1/deployment/:deployment_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:deployment_id` `(string: <required>)` - Specifies the UUID of the deployment.
+  This must be the full UUID, not the short 8-character one. This is specified
+  as part of the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/deployment/70638f62-5c19-193e-30d6-f9d6e689ab8e
+```
+
+### Sample Response
+
+```json
+{
+  "ID": "70638f62-5c19-193e-30d6-f9d6e689ab8e",
+  "JobID": "example",
+  "JobVersion": 1,
+  "JobModifyIndex": 17,
+  "JobSpecModifyIndex": 17,
+  "JobCreateIndex": 7,
+  "TaskGroups": {
+    "cache": {
+      "Promoted": false,
+      "DesiredCanaries": 1,
+      "DesiredTotal": 3,
+      "PlacedAllocs": 1,
+      "HealthyAllocs": 0,
+      "UnhealthyAllocs": 0
+    }
+  },
+  "Status": "running",
+  "StatusDescription": "",
+  "CreateIndex": 19,
+  "ModifyIndex": 19
+}
+```
+
+## List Allocations for Deployment
+
+This endpoint lists the allocations created or modified for the given
+deployment.
+
+| Method | Path                                        | Produces           |
+| ------ | ------------------------------------------- | ------------------ |
+| `GET`  | `/v1/deployment/allocations/:deployment_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:deployment_id` `(string: <required>)` - Specifies the UUID of the deployment.
+  This must be the full UUID, not the short 8-character one. This is specified
+  as part of the path.
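The `TaskGroups` map returned by the deployment endpoints above can drive an automated promote decision. A sketch, assuming a group is ready once its healthy allocations reach the desired canary count — this rule and the helper are illustrative, not an official Nomad algorithm:

```python
def groups_ready_to_promote(task_groups: dict) -> list:
    """Return the names of unpromoted groups whose canaries look healthy."""
    ready = []
    for name, tg in task_groups.items():
        if tg["Promoted"]:
            continue  # this group's canaries were already promoted
        if tg["DesiredCanaries"] > 0 and tg["HealthyAllocs"] >= tg["DesiredCanaries"]:
            ready.append(name)
    return sorted(ready)
```

The returned names could feed the `Groups` field of the Promote Deployment payload.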
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/deployment/allocations/5456bd7a-9fc0-c0dd-6131-cbee77f57577 +``` + +### Sample Response + +```json +[ + { + "ID": "287b65cc-6c25-cea9-0332-e4a75ca2af98", + "EvalID": "9751cb74-1a0d-190e-d026-ad2bc666ad2c", + "Name": "example.cache[0]", + "NodeID": "cb1f6030-a220-4f92-57dc-7baaabdc3823", + "JobID": "example", + "TaskGroup": "cache", + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "Failed": false, + "StartedAt": "2017-06-29T22:29:41.52000268Z", + "FinishedAt": "0001-01-01T00:00:00Z", + "Events": [ + { + "Type": "Received", + "Time": 1498775380693307400, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Task Setup", + "Time": 1498775380693659000, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "Building Task Directory", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Started", + "Time": 1498775381508493800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + 
"VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + } + ] + } + }, + "DeploymentStatus": null, + "CreateIndex": 19, + "ModifyIndex": 22, + "CreateTime": 1498775380678486300, + "ModifyTime": 1498775380678486300 + } +] +``` + +## Fail Deployment + +This endpoint is used to mark a deployment as failed. This should be done to +force the scheduler to stop creating allocations as part of the deployment or to +cause a rollback to a previous job version. This endpoint only triggers a rollback +if the most recent stable version of the job has a different specification than +the job being reverted. + +| Method | Path | Produces | +| ------ | ------------------------------------ | ------------------ | +| `POST` | `/v1/deployment/fail/:deployment_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ---------------------- | +| `NO` | `namespace:submit-job` | + +### Parameters + +- `:deployment_id` `(string: )`- Specifies the UUID of the deployment. + This must be the full UUID, not the short 8-character one. This is specified + as part of the path. + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + https://localhost:4646/v1/deployment/fail/5456bd7a-9fc0-c0dd-6131-cbee77f57577 +``` + +### Sample Response + +```json +{ + "EvalID": "0d834913-58a0-81ac-6e33-e452d83a0c66", + "EvalCreateIndex": 20, + "DeploymentModifyIndex": 20, + "RevertedJobVersion": 1, + "Index": 20 +} +``` + +## Pause Deployment + +This endpoint is used to pause or unpause a deployment. This is done to pause +a rolling upgrade or resume it. 
+
+| Method | Path                                  | Produces           |
+| ------ | ------------------------------------- | ------------------ |
+| `POST` | `/v1/deployment/pause/:deployment_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `:deployment_id` `(string: <required>)` - Specifies the UUID of the deployment.
+  This must be the full UUID, not the short 8-character one. This is specified
+  as part of the path and in the JSON payload.
+
+- `Pause` `(bool: false)` - Specifies whether to pause or resume the deployment.
+
+### Sample Payload
+
+```javascript
+{
+  "DeploymentID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577",
+  "Pause": true
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    https://localhost:4646/v1/deployment/pause/5456bd7a-9fc0-c0dd-6131-cbee77f57577
+```
+
+### Sample Response
+
+```json
+{
+  "EvalID": "0d834913-58a0-81ac-6e33-e452d83a0c66",
+  "EvalCreateIndex": 20,
+  "DeploymentModifyIndex": 20,
+  "Index": 20
+}
+```
+
+## Promote Deployment
+
+This endpoint is used to promote task groups that have canaries for a
+deployment. This should be done when the placed canaries are healthy and the
+rolling upgrade of the remaining allocations should begin.
+
+| Method | Path                                    | Produces           |
+| ------ | --------------------------------------- | ------------------ |
+| `POST` | `/v1/deployment/promote/:deployment_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `:deployment_id` `(string: <required>)` - Specifies the UUID of the deployment.
+  This must be the full UUID, not the short 8-character one. This is specified
+  as part of the path and JSON payload.
+
+- `All` `(bool: false)` - Specifies whether all task groups should be promoted.
+
+- `Groups` `(array<string>: nil)` - Specifies a particular set of task groups
+  that should be promoted.
+
+### Sample Payload
+
+```javascript
+{
+  "DeploymentID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577",
+  "All": true
+}
+```
+
+```javascript
+{
+  "DeploymentID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577",
+  "Groups": ["web", "api-server"]
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    https://localhost:4646/v1/deployment/promote/5456bd7a-9fc0-c0dd-6131-cbee77f57577
+```
+
+### Sample Response
+
+```json
+{
+  "EvalID": "0d834913-58a0-81ac-6e33-e452d83a0c66",
+  "EvalCreateIndex": 20,
+  "DeploymentModifyIndex": 20,
+  "Index": 20
+}
+```
+
+## Set Allocation Health in Deployment
+
+This endpoint is used to manually set the health of an allocation that is part
+of a deployment. In some use cases, automatic detection of allocation health
+may not be desired. As such, those task groups can be marked with an upgrade
+policy that uses `health_check = "manual"`. Those allocations must have their
+health marked manually using this endpoint. Marking an allocation as healthy
+will allow the rolling upgrade to proceed. Marking it as failed will cause the
+deployment to fail. This endpoint only triggers a rollback if the most recent stable
+version of the job has a different specification than the job being reverted.
+
+| Method | Path                                              | Produces           |
+| ------ | ------------------------------------------------- | ------------------ |
+| `POST` | `/v1/deployment/allocation-health/:deployment_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `:deployment_id` `(string: <required>)` - Specifies the UUID of the deployment.
+  This must be the full UUID, not the short 8-character one. This is specified
+  as part of the path and the JSON payload.
+
+- `HealthyAllocationIDs` `(array<string>: nil)` - Specifies the set of
+  allocations that should be marked as healthy.
+
+- `UnhealthyAllocationIDs` `(array<string>: nil)` - Specifies the set of
+  allocations that should be marked as unhealthy.
+
+### Sample Payload
+
+```javascript
+{
+  "DeploymentID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577",
+  "HealthyAllocationIDs": [
+    "eb13bc8a-7300-56f3-14c0-d4ad115ec3f5",
+    "6584dad8-7ae3-360f-3069-0b4309711cc1"
+  ]
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    https://localhost:4646/v1/deployment/allocation-health/5456bd7a-9fc0-c0dd-6131-cbee77f57577
+```
+
+### Sample Response
+
+```json
+{
+  "EvalID": "0d834913-58a0-81ac-6e33-e452d83a0c66",
+  "EvalCreateIndex": 20,
+  "DeploymentModifyIndex": 20,
+  "RevertedJobVersion": 1,
+  "Index": 20
+}
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/evaluations.mdx b/content/nomad/v0.11.x/content/api-docs/evaluations.mdx
new file mode 100644
index 0000000000..02c576e2f5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/evaluations.mdx
@@ -0,0 +1,267 @@
+---
+layout: api
+page_title: Evaluations - HTTP API
+sidebar_title: Evaluations
+description: The /evaluation endpoints are used to query for and interact with evaluations.
+---
+
+# Evaluations HTTP API
+
+The `/evaluation` endpoints are used to query for and interact with evaluations.
+
+## List Evaluations
+
+This endpoint lists all evaluations.
+ +| Method | Path | Produces | +| ------ | ----------------- | ------------------ | +| `GET` | `/v1/evaluations` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `prefix` `(string: "")`- Specifies a string to filter evaluations based on an + ID prefix. Because the value is decoded to bytes, the prefix must have an + even number of hexadecimal characters (0-9a-f). This is specified as a query + string parameter. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/evaluations +``` + +```shell-session +$ curl \ + https://localhost:4646/v1/evaluations?prefix=25ba81 +``` + +### Sample Response + +```json +[ + { + "ID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Priority": 50, + "Type": "service", + "TriggeredBy": "job-register", + "JobID": "example", + "JobModifyIndex": 52, + "NodeID": "", + "NodeModifyIndex": 0, + "Status": "complete", + "StatusDescription": "", + "Wait": 0, + "NextEval": "", + "PreviousEval": "", + "BlockedEval": "", + "FailedTGAllocs": null, + "ClassEligibility": null, + "EscapedComputedClass": false, + "AnnotatePlan": false, + "SnapshotIndex": 53, + "QueuedAllocations": { + "cache": 0 + }, + "CreateIndex": 53, + "ModifyIndex": 55 + } +] +``` + +## Read Evaluation + +This endpoint reads information about a specific evaluation by ID. + +| Method | Path | Produces | +| ------ | ------------------------- | ------------------ | +| `GET` | `/v1/evaluation/:eval_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). 
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:eval_id` `(string: <required>)` - Specifies the UUID of the evaluation. This
+  must be the full UUID, not the short 8-character one. This is specified as
+  part of the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/evaluation/5456bd7a-9fc0-c0dd-6131-cbee77f57577
+```
+
+### Sample Response
+
+```json
+{
+  "ID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577",
+  "Priority": 50,
+  "Type": "service",
+  "TriggeredBy": "job-register",
+  "JobID": "example",
+  "JobModifyIndex": 52,
+  "NodeID": "",
+  "NodeModifyIndex": 0,
+  "Status": "complete",
+  "StatusDescription": "",
+  "Wait": 0,
+  "NextEval": "",
+  "PreviousEval": "",
+  "BlockedEval": "",
+  "FailedTGAllocs": null,
+  "ClassEligibility": null,
+  "EscapedComputedClass": false,
+  "AnnotatePlan": false,
+  "SnapshotIndex": 53,
+  "QueuedAllocations": {
+    "cache": 0
+  },
+  "CreateIndex": 53,
+  "ModifyIndex": 55
+}
+```
+
+## List Allocations for Evaluation
+
+This endpoint lists the allocations created or modified for the given
+evaluation.
+
+| Method | Path                                  | Produces           |
+| ------ | ------------------------------------- | ------------------ |
+| `GET`  | `/v1/evaluation/:eval_id/allocations` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:eval_id` `(string: <required>)` - Specifies the UUID of the evaluation. This
+  must be the full UUID, not the short 8-character one. This is specified as
+  part of the path.
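The `Time` values on the task events in allocation responses are Unix timestamps in nanoseconds. A small sketch for rendering them readably — the helper name is illustrative:

```python
from datetime import datetime, timezone

def event_time(ns: int) -> str:
    """Convert a task-event Time (Unix nanoseconds) to an ISO-8601 UTC string."""
    return datetime.fromtimestamp(ns / 1e9, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
```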
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/evaluation/5456bd7a-9fc0-c0dd-6131-cbee77f57577/allocations +``` + +### Sample Response + +```json +[ + { + "ID": "a8198d79-cfdb-6593-a999-1e9adabcba2e", + "EvalID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Name": "example.cache[0]", + "NodeID": "fb2170a8-257d-3c64-b14d-bc06cc94e34c", + "JobID": "example", + "TaskGroup": "cache", + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "Failed": false, + "Events": [ + { + "Type": "Received", + "Time": 1495747371795703800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Driver", + "Time": 1495747371798867200, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "Downloading image redis:3.2" + }, + { + "Type": "Started", + "Time": 1495747379525667800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + } + ] + } + 
},
+    "CreateIndex": 54,
+    "ModifyIndex": 57,
+    "CreateTime": 1495747371794276400
+  }
+]
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/index.mdx b/content/nomad/v0.11.x/content/api-docs/index.mdx
new file mode 100644
index 0000000000..e0f5a4b8d9
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/index.mdx
@@ -0,0 +1,211 @@
+---
+layout: api
+page_title: HTTP API
+sidebar_title: HTTP API Overview
+description: |-
+  Nomad exposes a RESTful HTTP API to control almost every aspect of the
+  Nomad agent.
+---
+
+# HTTP API
+
+The main interface to Nomad is a RESTful HTTP API. The API can query the current
+state of the system as well as modify the state of the system. The Nomad CLI
+actually invokes Nomad's HTTP API for many commands.
+
+## Version Prefix
+
+All API routes are prefixed with `/v1/`.
+
+This documentation is only for the v1 API.
+
+~> **Backwards compatibility:** At the current version, Nomad does not yet
+promise backwards compatibility even with the v1 prefix. We'll remove this
+warning when this policy changes. We expect to reach API stability by Nomad
+1.0.
+
+## Addressing & Ports
+
+Nomad binds to a specific set of addresses and ports. The HTTP API is served via
+the `http` address and port. This `address:port` must be accessible locally. If
+you bind to `127.0.0.1:4646`, the API is only available _from that host_. If you
+bind to a private internal IP, the API will be available from within that
+network. If you bind to a public IP, the API will be available from the public
+Internet (not recommended).
+
+The default port for the Nomad HTTP API is `4646`. This can be overridden via
+the Nomad configuration block. Here is an example curl request to query a Nomad
+server with the default configuration:
+
+```shell-session
+$ curl http://127.0.0.1:4646/v1/agent/members
+```
+
+The examples in this documentation use the
+standard URL `localhost:4646`. 
Be sure to replace this with your Nomad agent URL
+when using the examples.
+
+## Data Model and Layout
+
+There are five primary nouns in Nomad:
+
+- jobs
+- nodes
+- allocations
+- deployments
+- evaluations
+
+[![Nomad Data Model](/img/nomad-data-model.png)](/img/nomad-data-model.png)
+
+Jobs are submitted by users and represent a _desired state_. A job is a
+declarative description of tasks to run which are bounded by constraints and
+require resources. Jobs can also have affinities which are used to express placement
+preferences. Nodes are the servers in the cluster that tasks can be
+scheduled on. The mapping of tasks in a job to nodes is done using allocations.
+An allocation is used to declare that a set of tasks in a job should be run on a
+particular node. Scheduling is the process of determining the appropriate
+allocations and is done as part of an evaluation. Deployments are objects used to
+track a rolling update of allocations between two versions of a job.
+
+The API is modeled closely on the underlying data model. Use the links to the
+left for documentation about specific endpoints. There are also "Agent" APIs,
+used for administration, which interact with a specific agent rather than the
+broader cluster.
+
+## ACLs
+
+Several endpoints in Nomad use or require ACL tokens to operate. The tokens are used to authenticate the request and determine whether the request is allowed based on the associated authorizations. Tokens are specified per-request by using the `X-Nomad-Token` request header set to the `SecretID` of an ACL Token.
+
+For more details about ACLs, please see the [ACL Guide](https://learn.hashicorp.com/nomad?track=acls#operations-and-development).
+
+## Authentication
+
+When ACLs are enabled, a Nomad token should be provided to API requests using the `X-Nomad-Token` header. When using authentication, clients should communicate via TLS.
+ +Here is an example using curl: + +```shell-session +$ curl \ + --header "X-Nomad-Token: aa534e09-6a07-0a45-2295-a7f77063d429" \ + https://localhost:4646/v1/jobs +``` + +## Blocking Queries + +Many endpoints in Nomad support a feature known as "blocking queries". A +blocking query is used to wait for a potential change using long polling. Not +all endpoints support blocking, but each endpoint uniquely documents its support +for blocking queries in the documentation. + +Endpoints that support blocking queries return an HTTP header named +`X-Nomad-Index`. This is a unique identifier representing the current state of +the requested resource. On a new Nomad cluster the value of this index starts at 1. + +On subsequent requests for this resource, the client can set the `index` query +string parameter to the value of `X-Nomad-Index`, indicating that the client +wishes to wait for any changes subsequent to that index. + +When this is provided, the HTTP request will "hang" until a change in the system +occurs, or the maximum timeout is reached. A critical note is that the return of +a blocking request is **no guarantee** of a change. It is possible that the +timeout was reached or that there was an idempotent write that does not affect +the result of the query. + +In addition to `index`, endpoints that support blocking will also honor a `wait` +parameter specifying a maximum duration for the blocking request. This is +limited to 10 minutes. If not set, the wait time defaults to 5 minutes. This +value can be specified in the form of "10s" or "5m" (i.e., 10 seconds or 5 +minutes, respectively). A small random amount of additional wait time is added +to the supplied maximum `wait` time to spread out the wake up time of any +concurrent requests. This adds up to `wait / 16` additional time to the maximum +duration. + +## Consistency Modes + +Most of the read query endpoints support multiple levels of consistency. 
Since
+no policy will suit all clients' needs, these consistency modes allow the user
+to have the ultimate say in how to balance the trade-offs inherent in a
+distributed system.
+
+The two read modes are:
+
+- `default` - If not specified, the default is strongly consistent in almost all
+  cases. However, there is a small window in which a new leader may be elected
+  during which the old leader may service stale values. The trade-off is fast
+  reads but potentially stale values. The condition resulting in stale reads is
+  hard to trigger, and most clients should not need to worry about this case.
+  Also, note that this race condition only applies to reads, not writes.
+
+- `stale` - This mode allows any server to service the read regardless of
+  whether it is the leader. This means reads can be arbitrarily stale; however,
+  results are generally consistent to within 50 milliseconds of the leader. The
+  trade-off is very fast and scalable reads with a higher likelihood of stale
+  values. Since this mode allows reads without a leader, a cluster that is
+  unavailable will still be able to respond to queries.
+
+To switch these modes, use the `stale` query parameter on requests.
+
+To support bounding the acceptable staleness of data, responses provide the
+`X-Nomad-LastContact` header containing the time in milliseconds that a server
+was last contacted by the leader node. The `X-Nomad-KnownLeader` header also
+indicates if there is a known leader. These can be used by clients to gauge the
+staleness of a result and take appropriate action.
+
+## Cross-Region Requests
+
+By default, any request to the HTTP API is scoped to the region of the agent
+servicing the request. If the agent runs in "region1", the request
+will query the region "region1". A target region can be explicitly requested using
+the `?region` query parameter. The request will be transparently forwarded and
+serviced by a server in the requested region.
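The `stale`, `region`, and blocking-query `index`/`wait` parameters described above compose as ordinary query strings. A sketch of building such a URL — the helper is illustrative, not a Nomad client API:

```python
from urllib.parse import urlencode

def nomad_url(base, path, region=None, stale=False, index=None, wait=None):
    """Compose a Nomad API URL from the query parameters the docs describe."""
    params = {}
    if region:
        params["region"] = region  # cross-region request forwarding
    if stale:
        params["stale"] = ""       # presence of the key enables stale reads
    if index is not None:
        params["index"] = index    # block until state passes this index
        if wait:
            params["wait"] = wait  # cap the block, e.g. "10s" or "5m"
    query = urlencode(params)
    return base + path + ("?" + query if query else "")
```

A long-polling client would loop: issue the request with the last `X-Nomad-Index` it saw, and re-issue when the call returns (remembering that a return is no guarantee of a change).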
+
+## Compressed Responses
+
+The HTTP API will gzip the response if the HTTP request denotes that the client
+accepts gzip compression. This is achieved by passing the `Accept-Encoding`
+header:
+
+```shell-session
+$ curl \
+    --header "Accept-Encoding: gzip" \
+    https://localhost:4646/v1/...
+```
+
+## Formatted JSON Output
+
+By default, the output of all HTTP API requests is minimized JSON. If the client
+passes `pretty` on the query string, formatted JSON will be returned.
+
+In general, clients should prefer a client-side parser like `jq` instead of
+server-formatted data. Asking the server to format the data takes away
+processing cycles from more important tasks.
+
+```shell-session
+$ curl https://localhost:4646/v1/page?pretty
+```
+
+## HTTP Methods
+
+Nomad's API aims to be RESTful, although there are some exceptions. The API
+responds to the standard HTTP verbs GET, PUT, and DELETE. Each API method will
+clearly document the verb(s) it responds to and the generated response. The same
+path with different verbs may trigger different behavior. For example:
+
+```text
+PUT /v1/jobs
+GET /v1/jobs
+```
+
+Even though these share a path, the `PUT` operation creates a new job whereas
+the `GET` operation reads all jobs.
+
+## HTTP Response Codes
+
+Individual APIs will contain further documentation in the case that more
+specific response codes are returned, but all clients should handle the following:
+
+- 200 and 204 as success codes.
+- 400 indicates a validation failure and if a parameter is modified in the
+  request, it could potentially succeed.
+- 403 marks that the client isn't authorized to make the request.
+- 404 indicates an unknown resource.
+- 5xx means that the client should not expect the request to succeed if retried.
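One way to act on these codes is a small classifier; the mapping below is an interpretation of the guidance above, not Nomad source code:

```python
def classify(status: int) -> str:
    """Map an HTTP status from the Nomad API to a coarse client action."""
    if status in (200, 204):
        return "success"
    if status == 400:
        return "fix-request"       # validation failure: change a parameter
    if status == 403:
        return "authorize"         # missing or insufficient ACL token
    if status == 404:
        return "unknown-resource"
    if 500 <= status < 600:
        return "do-not-retry"      # retrying the same request will not help
    return "unexpected"
```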
diff --git a/content/nomad/v0.11.x/content/api-docs/jobs.mdx b/content/nomad/v0.11.x/content/api-docs/jobs.mdx
new file mode 100644
index 0000000000..1096087790
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/jobs.mdx
@@ -0,0 +1,1858 @@
+---
+layout: api
+page_title: Jobs - HTTP API
+sidebar_title: Jobs
+description: The /jobs endpoints are used to query for and interact with jobs.
+---
+
+# Jobs HTTP API
+
+The `/jobs` endpoints are used to query for and interact with jobs.
+
+## List Jobs
+
+This endpoint lists all known jobs registered with Nomad.
+
+| Method | Path       | Produces           |
+| ------ | ---------- | ------------------ |
+| `GET`  | `/v1/jobs` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required          |
+| ---------------- | --------------------- |
+| `YES`            | `namespace:list-jobs` |
+
+### Parameters
+
+- `prefix` `(string: "")` - Specifies a string to filter jobs based on
+  an index prefix. This is specified as a query string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl https://localhost:4646/v1/jobs
+```
+
+```shell-session
+$ curl https://localhost:4646/v1/jobs?prefix=team
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "example",
+    "ParentID": "",
+    "Name": "example",
+    "Type": "service",
+    "Priority": 50,
+    "Status": "pending",
+    "StatusDescription": "",
+    "JobSummary": {
+      "JobID": "example",
+      "Summary": {
+        "cache": {
+          "Queued": 1,
+          "Complete": 1,
+          "Failed": 0,
+          "Running": 0,
+          "Starting": 0,
+          "Lost": 0
+        }
+      },
+      "Children": {
+        "Pending": 0,
+        "Running": 0,
+        "Dead": 0
+      },
+      "CreateIndex": 52,
+      "ModifyIndex": 96
+    },
+    "CreateIndex": 52,
+    "ModifyIndex": 93,
+    "JobModifyIndex": 52
+  }
+]
+```
+
+## Create Job
+
+This endpoint creates (aka "registers") a new job in the system.
+ +| Method | Path | Produces | +| ------ | ---------- | ------------------ | +| `POST` | `/v1/jobs` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------------------------------------------------------------- | +| `NO` | `namespace:submit-job`
`namespace:sentinel-override` if `PolicyOverride` set | + +### Parameters + +- `Job` `(Job: )` - Specifies the JSON definition of the job. + +- `EnforceIndex` `(bool: false)` - If set, the job will only be registered if the + passed `JobModifyIndex` matches the current job's index. If the index is zero, + the register only occurs if the job is new. This paradigm allows check-and-set + style job updating. + +- `JobModifyIndex` `(int: 0)` - Specifies the `JobModifyIndex` to enforce the + current job is at. + +- `PolicyOverride` `(bool: false)` - If set, any soft mandatory Sentinel policies + will be overridden. This allows a job to be registered when it would be denied + by policy. + +### Sample Payload + +```json +{ + "Job": { + "ID": "example", + "Name": "example", + "Type": "service", + "Priority": 50, + "Datacenters": ["dc1"], + "TaskGroups": [ + { + "Name": "cache", + "Count": 1, + "Tasks": [ + { + "Name": "redis", + "Driver": "docker", + "User": "", + "Config": { + "image": "redis:3.2", + "port_map": [ + { + "db": 6379 + } + ] + }, + "Services": [ + { + "Id": "", + "Name": "redis-cache", + "Tags": ["global", "cache"], + "Meta": { + "meta": "for my service" + }, + "PortLabel": "db", + "AddressMode": "", + "Checks": [ + { + "Id": "", + "Name": "alive", + "Type": "tcp", + "Command": "", + "Args": null, + "Path": "", + "Protocol": "", + "PortLabel": "", + "Interval": 10000000000, + "Timeout": 2000000000, + "InitialStatus": "", + "TLSSkipVerify": false + } + ] + } + ], + "Resources": { + "CPU": 500, + "MemoryMB": 256, + "Networks": [ + { + "Device": "", + "CIDR": "", + "IP": "", + "MBits": 10, + "DynamicPorts": [ + { + "Label": "db", + "Value": 0 + } + ] + } + ] + }, + "Leader": false + } + ], + "RestartPolicy": { + "Interval": 300000000000, + "Attempts": 10, + "Delay": 25000000000, + "Mode": "delay" + }, + "ReschedulePolicy": { + "Attempts": 10, + "Delay": 30000000000, + "DelayFunction": "exponential", + "Interval": 36000000000000, + "MaxDelay": 3600000000000, + 
"Unlimited": false + }, + "EphemeralDisk": { + "SizeMB": 300 + } + } + ], + "Update": { + "MaxParallel": 1, + "MinHealthyTime": 10000000000, + "HealthyDeadline": 180000000000, + "AutoRevert": false, + "Canary": 0 + } + } +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/jobs +``` + +### Sample Response + +```json +{ + "EvalID": "", + "EvalCreateIndex": 0, + "JobModifyIndex": 109, + "Warnings": "", + "Index": 0, + "LastContact": 0, + "KnownLeader": false +} +``` + +## Parse Job + +This endpoint will parse a HCL jobspec and produce the equivalent JSON encoded +job. + +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `POST` | `/v1/jobs/parse` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Parameters + +- `JobHCL` `(string: )` - Specifies the HCL definition of the job + encoded in a JSON string. +- `Canonicalize` `(bool: false)` - Flag to enable setting any unset fields to + their default values. 
+
+### Sample Payload
+
+```json
+{
+  "JobHCL": "job \"example\" { type = \"service\" group \"cache\" {} }",
+  "Canonicalize": true
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    --data @payload.json \
+    https://localhost:4646/v1/jobs/parse
+```
+
+### Sample Response
+
+```json
+{
+  "AllAtOnce": false,
+  "Constraints": null,
+  "Affinities": null,
+  "CreateIndex": 0,
+  "Datacenters": null,
+  "ID": "example",
+  "JobModifyIndex": 0,
+  "Meta": null,
+  "Migrate": null,
+  "ModifyIndex": 0,
+  "Name": "example",
+  "Namespace": "default",
+  "ParameterizedJob": null,
+  "ParentID": "",
+  "Payload": null,
+  "Periodic": null,
+  "Priority": 50,
+  "Region": "global",
+  "Reschedule": null,
+  "Stable": false,
+  "Status": "",
+  "StatusDescription": "",
+  "Stop": false,
+  "SubmitTime": null,
+  "TaskGroups": null,
+  "Type": "service",
+  "Update": null,
+  "VaultToken": "",
+  "Version": 0
+}
+```
+
+## Read Job
+
+This endpoint reads information about a single job for its specification and
+status.
+
+| Method | Path              | Produces           |
+| ------ | ----------------- | ------------------ |
+| `GET`  | `/v1/job/:job_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
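Note that duration fields in the job object (`Interval`, `Timeout`, `KillTimeout`, `Splay`, and friends in the sample response below) are expressed in nanoseconds. A quick client-side conversion sketch, assuming `jq` is available; `check.json` is a hypothetical trimmed extract of a response:

```shell
# A trimmed check object from a job API response (values as in the sample below).
cat > check.json <<'EOF'
{ "Name": "alive", "Interval": 10000000000, "Timeout": 2000000000 }
EOF

# Divide by 1e9 to convert nanoseconds to seconds.
jq '{Name, IntervalSeconds: (.Interval / 1e9), TimeoutSeconds: (.Timeout / 1e9)}' check.json
```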
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job +``` + +### Sample Response + +```json +{ + "Region": "global", + "ID": "example", + "ParentID": "", + "Name": "example", + "Type": "batch", + "Priority": 50, + "AllAtOnce": false, + "Datacenters": ["dc1"], + "Constraints": [ + { + "LTarget": "${attr.kernel.name}", + "RTarget": "linux", + "Operand": "=" + } + ], + "TaskGroups": [ + { + "Name": "cache", + "Count": 1, + "Constraints": [ + { + "LTarget": "${attr.os.signals}", + "RTarget": "SIGUSR1", + "Operand": "set_contains" + } + ], + "Affinities": [ + { + "LTarget": "${meta.datacenter}", + "RTarget": "dc1", + "Operand": "=", + "Weight": 50 + } + ], + "RestartPolicy": { + "Attempts": 10, + "Interval": 300000000000, + "Delay": 25000000000, + "Mode": "delay" + }, + "Tasks": [ + { + "Name": "redis", + "Driver": "docker", + "User": "foo-user", + "Config": { + "image": "redis:latest", + "port_map": [ + { + "db": 6379 + } + ] + }, + "Env": { + "foo": "bar", + "baz": "pipe" + }, + "Services": [ + { + "Name": "cache-redis", + "PortLabel": "db", + "Tags": ["global", "cache"], + "Checks": [ + { + "Name": "alive", + "Type": "tcp", + "Command": "", + "Args": null, + "Path": "", + "Protocol": "", + "PortLabel": "", + "Interval": 10000000000, + "Timeout": 2000000000, + "InitialStatus": "" + } + ] + } + ], + "Vault": null, + "Templates": [ + { + "SourcePath": "local/config.conf.tpl", + "DestPath": "local/config.conf", + "EmbeddedTmpl": "", + "ChangeMode": "signal", + "ChangeSignal": "SIGUSR1", + "Splay": 5000000000, + "Perms": "" + } + ], + "Constraints": null, + "Affinities": null, + "Resources": { + "CPU": 500, + "MemoryMB": 256, + "DiskMB": 0, + "Networks": [ + { + "Device": "", + "CIDR": "", + "IP": "", + "MBits": 10, + "ReservedPorts": [ + { + "Label": "rpc", + "Value": 25566 + } + ], + "DynamicPorts": [ + { + "Label": "db", + "Value": 0 + } + ] + } + ] + }, + "DispatchPayload": { + "File": "config.json" + }, + "Meta": { + "foo": 
"bar", + "baz": "pipe" + }, + "KillTimeout": 5000000000, + "LogConfig": { + "MaxFiles": 10, + "MaxFileSizeMB": 10 + }, + "Artifacts": [ + { + "GetterSource": "http://foo.com/artifact.tar.gz", + "GetterOptions": { + "checksum": "md5:c4aa853ad2215426eb7d70a21922e794" + }, + "RelativeDest": "local/" + } + ], + "Leader": false + } + ], + "EphemeralDisk": { + "Sticky": false, + "SizeMB": 300, + "Migrate": false + }, + "Meta": { + "foo": "bar", + "baz": "pipe" + } + } + ], + "Update": { + "Stagger": 10000000000, + "MaxParallel": 1 + }, + "Periodic": { + "Enabled": true, + "Spec": "* * * * *", + "SpecType": "cron", + "ProhibitOverlap": true + }, + "ParameterizedJob": { + "Payload": "required", + "MetaRequired": ["foo"], + "MetaOptional": ["bar"] + }, + "Payload": null, + "Meta": { + "foo": "bar", + "baz": "pipe" + }, + "VaultToken": "", + "Status": "running", + "StatusDescription": "", + "CreateIndex": 7, + "ModifyIndex": 7, + "JobModifyIndex": 7 +} +``` + +## List Job Versions + +This endpoint reads information about all versions of a job. + +| Method | Path | Produces | +| ------ | -------------------------- | ------------------ | +| `GET` | `/v1/job/:job_id/versions` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. 
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job/versions +``` + +### Sample Response + +```json +[ + { + "Stop": false, + "Region": "global", + "ID": "example", + "ParentID": "", + "Name": "example", + "Type": "service", + "Priority": 50, + "AllAtOnce": false, + "Datacenters": ["dc1"], + "Constraints": null, + "Affinities": null, + "TaskGroups": [ + { + "Name": "cache", + "Count": 1, + "Update": { + "Stagger": 0, + "MaxParallel": 1, + "HealthCheck": "checks", + "MinHealthyTime": 10000000000, + "HealthyDeadline": 300000000000, + "AutoRevert": false, + "Canary": 0 + }, + "Constraints": null, + "Affinities": null, + "RestartPolicy": { + "Attempts": 10, + "Interval": 300000000000, + "Delay": 25000000000, + "Mode": "delay" + }, + "Spreads": [ + { + "Attribute": "${node.datacenter}", + "SpreadTarget": null, + "Weight": 100 + } + ], + "Tasks": [ + { + "Name": "redis", + "Driver": "docker", + "User": "", + "Config": { + "image": "redis:3.2", + "port_map": [ + { + "db": 6379 + } + ] + }, + "Env": null, + "Services": [ + { + "Name": "redis-cache", + "PortLabel": "db", + "Tags": ["global", "cache"], + "Checks": [ + { + "Name": "alive", + "Type": "tcp", + "Command": "", + "Args": null, + "Path": "", + "Protocol": "", + "PortLabel": "", + "Interval": 10000000000, + "Timeout": 2000000000, + "InitialStatus": "", + "TLSSkipVerify": false + } + ] + } + ], + "Vault": null, + "Templates": null, + "Constraints": null, + "Affinities": null, + "Spreads": null, + "Resources": { + "CPU": 500, + "MemoryMB": 256, + "DiskMB": 0, + "Networks": [ + { + "Device": "", + "CIDR": "", + "IP": "", + "MBits": 10, + "ReservedPorts": null, + "DynamicPorts": [ + { + "Label": "db", + "Value": 0 + } + ] + } + ] + }, + "DispatchPayload": null, + "Meta": null, + "KillTimeout": 5000000000, + "LogConfig": { + "MaxFiles": 10, + "MaxFileSizeMB": 10 + }, + "Artifacts": null, + "Leader": false + } + ], + "EphemeralDisk": { + "Sticky": false, + "SizeMB": 300, + "Migrate": 
false + }, + "Meta": null + } + ], + "Update": { + "Stagger": 10000000000, + "MaxParallel": 1, + "HealthCheck": "", + "MinHealthyTime": 0, + "HealthyDeadline": 0, + "AutoRevert": false, + "Canary": 0 + }, + "Periodic": null, + "ParameterizedJob": null, + "Payload": null, + "Meta": null, + "VaultToken": "", + "Spreads": null, + "Status": "pending", + "StatusDescription": "", + "Stable": false, + "Version": 0, + "CreateIndex": 7, + "ModifyIndex": 7, + "JobModifyIndex": 7 + } +] +``` + +## List Job Allocations + +This endpoint reads information about a single job's allocations. + +| Method | Path | Produces | +| ------ | ----------------------------- | ------------------ | +| `GET` | `/v1/job/:job_id/allocations` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. + +- `all` `(bool: false)` - Specifies whether the list of allocations should + include allocations from a previously registered job with the same ID. This is + possible if the job is deregistered and reregistered. 
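Responses from this endpoint can grow large for long-lived jobs, so it is common to filter them client-side. A sketch with `jq` (assumed available), using a hypothetical trimmed copy of the response saved as `allocs.json`:

```shell
# Hypothetical trimmed allocations response.
cat > allocs.json <<'EOF'
[
  { "ID": "ed344e0a", "ClientStatus": "running", "TaskGroup": "cache" },
  { "ID": "cd13d9b9", "ClientStatus": "failed",  "TaskGroup": "cache" }
]
EOF

# IDs of allocations that are currently running.
jq -r '.[] | select(.ClientStatus == "running") | .ID' allocs.json
```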
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job/allocations +``` + +### Sample Response + +```json +[ + { + "ID": "ed344e0a-7290-d117-41d3-a64f853ca3c2", + "EvalID": "a9c5effc-2242-51b2-f1fe-054ee11ab189", + "Name": "example.cache[0]", + "NodeID": "cb1f6030-a220-4f92-57dc-7baaabdc3823", + "PreviousAllocation": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "NextAllocation": "cd13d9b9-4f97-7184-c88b-7b451981616b", + "RescheduleTracker": { + "Events": [ + { + "PrevAllocID": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "PrevNodeID": "9230cd3b-3bda-9a3f-82f9-b2ea8dedb20e", + "RescheduleTime": 1517434161192946200, + "Delay": 5000000000 + } + ] + }, + "JobID": "example", + "TaskGroup": "cache", + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "Failed": false, + "StartedAt": "2017-05-25T23:41:23.240184101Z", + "FinishedAt": "0001-01-01T00:00:00Z", + "Events": [ + { + "Type": "Received", + "Time": 1495755675956923000, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Task Setup", + "Time": 1495755675957466400, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "Building Task Directory", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Driver", + "Time": 
1495755675970286800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "Downloading image redis:3.2" + }, + { + "Type": "Started", + "Time": 1495755683227522000, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + } + ] + } + }, + "CreateIndex": 9, + "ModifyIndex": 13, + "CreateTime": 1495755675944527600, + "ModifyTime": 1495755675944527600 + } +] +``` + +## List Job Evaluations + +This endpoint reads information about a single job's evaluations + +| Method | Path | Produces | +| ------ | ----------------------------- | ------------------ | +| `GET` | `/v1/job/:job_id/evaluations` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. 
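When automating deployments, the `QueuedAllocations` field is useful for detecting evaluations that could not place every allocation. A sketch with `jq` (assumed available), operating on a hypothetical trimmed copy of the response saved as `evals.json`:

```shell
# Hypothetical trimmed evaluations response; the second eval left work queued.
cat > evals.json <<'EOF'
[
  { "ID": "a9c5effc", "Status": "complete", "QueuedAllocations": { "cache": 0 } },
  { "ID": "b17a5dfc", "Status": "blocked",  "QueuedAllocations": { "cache": 2 } }
]
EOF

# Evaluations that could not place every allocation (any queued count > 0).
jq -r '.[] | select([.QueuedAllocations[]] | add > 0) | .ID' evals.json
```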
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/job/my-job/evaluations
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "a9c5effc-2242-51b2-f1fe-054ee11ab189",
+    "Priority": 50,
+    "Type": "service",
+    "TriggeredBy": "job-register",
+    "JobID": "example",
+    "JobModifyIndex": 7,
+    "NodeID": "",
+    "NodeModifyIndex": 0,
+    "Status": "complete",
+    "StatusDescription": "",
+    "Wait": 0,
+    "NextEval": "",
+    "PreviousEval": "",
+    "BlockedEval": "",
+    "FailedTGAllocs": null,
+    "ClassEligibility": null,
+    "EscapedComputedClass": false,
+    "AnnotatePlan": false,
+    "QueuedAllocations": {
+      "cache": 0
+    },
+    "SnapshotIndex": 8,
+    "CreateIndex": 8,
+    "ModifyIndex": 10
+  }
+]
+```
+
+## List Job Deployments
+
+This endpoint lists a single job's deployments.
+
+| Method | Path                          | Produces           |
+| ------ | ----------------------------- | ------------------ |
+| `GET`  | `/v1/job/:job_id/deployments` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `YES`            | `namespace:read-job` |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
+
+- `all` `(bool: false)` - Specifies whether the list of deployments should
+  include deployments from a previously registered job with the same ID. This is
+  possible if the job is deregistered and reregistered.
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job/deployments +``` + +### Sample Response + +```json +[ + { + "ID": "85ee4a9a-339f-a921-a9ef-0550d20b2c61", + "JobID": "my-job", + "JobVersion": 1, + "JobModifyIndex": 19, + "JobCreateIndex": 7, + "TaskGroups": { + "cache": { + "AutoRevert": true, + "Promoted": false, + "PlacedCanaries": [ + "d0ad0808-2765-abf6-1e15-79fb7fe5a416", + "38c70cd8-81f2-1489-a328-87bb29ec0e0f" + ], + "DesiredCanaries": 2, + "DesiredTotal": 3, + "PlacedAllocs": 2, + "HealthyAllocs": 2, + "UnhealthyAllocs": 0 + } + }, + "Status": "running", + "StatusDescription": "Deployment is running", + "CreateIndex": 21, + "ModifyIndex": 25 + }, + { + "ID": "fb6070fb-4a44-e255-4e6f-8213eba3871a", + "JobID": "my-job", + "JobVersion": 0, + "JobModifyIndex": 7, + "JobCreateIndex": 7, + "TaskGroups": { + "cache": { + "AutoRevert": true, + "Promoted": false, + "PlacedCanaries": null, + "DesiredCanaries": 0, + "DesiredTotal": 3, + "PlacedAllocs": 3, + "HealthyAllocs": 3, + "UnhealthyAllocs": 0 + } + }, + "Status": "successful", + "StatusDescription": "Deployment completed successfully", + "CreateIndex": 9, + "ModifyIndex": 17 + } +] +``` + +## Read Job's Most Recent Deployment + +This endpoint returns a single job's most recent deployment. + +| Method | Path | Produces | +| ------ | ---------------------------- | ------------------ | +| `GET` | `/v1/job/:job_id/deployment` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. 
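During a canary rollout, the per-task-group counters in this response can be checked to decide whether the deployment is ready to be promoted. A sketch with `jq` (assumed available), using a hypothetical trimmed copy of the response saved as `deployment.json`:

```shell
# Hypothetical trimmed deployment response for a canary rollout.
cat > deployment.json <<'EOF'
{
  "ID": "85ee4a9a",
  "TaskGroups": {
    "cache": { "DesiredCanaries": 2, "HealthyAllocs": 2, "Promoted": false }
  }
}
EOF

# true when every desired canary is healthy but the group is not yet promoted.
jq '.TaskGroups.cache | (.HealthyAllocs >= .DesiredCanaries) and (.Promoted | not)' deployment.json
```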
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job/deployment +``` + +### Sample Response + +```json +{ + "ID": "85ee4a9a-339f-a921-a9ef-0550d20b2c61", + "JobID": "my-job", + "JobVersion": 1, + "JobModifyIndex": 19, + "JobCreateIndex": 7, + "TaskGroups": { + "cache": { + "AutoRevert": true, + "Promoted": false, + "PlacedCanaries": [ + "d0ad0808-2765-abf6-1e15-79fb7fe5a416", + "38c70cd8-81f2-1489-a328-87bb29ec0e0f" + ], + "DesiredCanaries": 2, + "DesiredTotal": 3, + "PlacedAllocs": 2, + "HealthyAllocs": 2, + "UnhealthyAllocs": 0 + } + }, + "Status": "running", + "StatusDescription": "Deployment is running", + "CreateIndex": 21, + "ModifyIndex": 25 +} +``` + +## Read Job Summary + +This endpoint reads summary information about a job. + +| Method | Path | Produces | +| ------ | ------------------------- | ------------------ | +| `GET` | `/v1/job/:job_id/summary` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `YES` | `namespace:read-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/job/my-job/summary +``` + +### Sample Response + +```json +{ + "JobID": "example", + "Summary": { + "cache": { + "Queued": 0, + "Complete": 0, + "Failed": 0, + "Running": 1, + "Starting": 0, + "Lost": 0 + } + }, + "Children": { + "Pending": 0, + "Running": 0, + "Dead": 0 + }, + "CreateIndex": 7, + "ModifyIndex": 13 +} +``` + +## Update Existing Job + +This endpoint registers a new job or updates an existing job. 
+ +| Method | Path | Produces | +| ------ | ----------------- | ------------------ | +| `POST` | `/v1/job/:job_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------------------------------------------------------------- | +| `NO` | `namespace:submit-job`
`namespace:sentinel-override` if `PolicyOverride` set | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. + +- `Job` `(Job: )` - Specifies the JSON definition of the job. + +- `EnforceIndex` `(bool: false)` - If set, the job will only be registered if the + passed `JobModifyIndex` matches the current job's index. If the index is zero, + the register only occurs if the job is new. This paradigm allows check-and-set + style job updating. + +- `JobModifyIndex` `(int: 0)` - Specifies the `JobModifyIndex` to enforce the + current job is at. + +- `PolicyOverride` `(bool: false)` - If set, any soft mandatory Sentinel policies + will be overridden. This allows a job to be registered when it would be denied + by policy. + +### Sample Payload + +```javascript +{ + "Job": { + // ... + }, + "EnforceIndex": true, + "JobModifyIndex": 4 +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/job/my-job +``` + +### Sample Response + +```json +{ + "EvalID": "d092fdc0-e1fd-2536-67d8-43af8ca798ac", + "EvalCreateIndex": 35, + "JobModifyIndex": 34 +} +``` + +## Dispatch Job + +This endpoint dispatches a new instance of a parameterized job. + +| Method | Path | Produces | +| ------ | -------------------------- | ------------------ | +| `POST` | `/v1/job/:job_id/dispatch` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------------------ | +| `NO` | `namespace:dispatch-job` | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified + in the job file during submission). This is specified as part of the path. 
+
+- `Payload` `(string: "")` - Specifies a base64-encoded string containing the
+  payload. This is limited to 15 KB.
+
+- `Meta` `(meta: nil)` - Specifies arbitrary metadata to pass to
+  the job.
+
+### Sample Payload
+
+```json
+{
+  "Payload": "A28C3==",
+  "Meta": {
+    "key": "Value"
+  }
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    --data @payload.json \
+    https://localhost:4646/v1/job/my-job/dispatch
+```
+
+### Sample Response
+
+```json
+{
+  "Index": 13,
+  "JobCreateIndex": 12,
+  "EvalCreateIndex": 13,
+  "EvalID": "e5f55fac-bc69-119d-528a-1fc7ade5e02c",
+  "DispatchedJobID": "example/dispatch-1485408778-81644024"
+}
+```
+
+## Revert to older Job Version
+
+This endpoint reverts the job to an older version.
+
+| Method | Path                     | Produces           |
+| ------ | ------------------------ | ------------------ |
+| `POST` | `/v1/job/:job_id/revert` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `JobID` `(string: )` - Specifies the ID of the job (as specified
+  in the job file during submission). This is specified as part of the path.
+
+- `JobVersion` `(integer: 0)` - Specifies the job version to revert to.
+
+- `EnforcePriorVersion` `(integer: nil)` - Optional value specifying the current
+  job's version. This is checked and acts as a check-and-set value before
+  reverting to the specified job.
+
+- `ConsulToken` `(string: "")` - Optional value specifying the [consul token](/docs/commands/job/revert)
+  used for Consul [service identity policy authentication checking](/docs/configuration/consul#allow_unauthenticated).
+ +- `VaultToken` `(string: "")` - Optional value specifying the [vault token](/docs/commands/job/revert) + used for Vault [policy authentication checking](/docs/configuration/vault#allow_unauthenticated). + +### Sample Payload + +```json +{ + "JobID": "my-job", + "JobVersion": 2 +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/job/my-job/revert +``` + +### Sample Response + +```json +{ + "EvalID": "d092fdc0-e1fd-2536-67d8-43af8ca798ac", + "EvalCreateIndex": 35, + "JobModifyIndex": 34 +} +``` + +## Set Job Stability + +This endpoint sets the job's stability. + +| Method | Path | Produces | +| ------ | ------------------------ | ------------------ | +| `POST` | `/v1/job/:job_id/stable` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ---------------------- | +| `NO` | `namespace:submit-job` | + +### Parameters + +- `JobID` `(string: )` - Specifies the ID of the job (as specified + in the job file during submission). This is specified as part of the path. + +- `JobVersion` `(integer: 0)` - Specifies the job version to set the stability on. + +- `Stable` `(bool: false)` - Specifies whether the job should be marked as + stable or not. + +### Sample Payload + +```json +{ + "JobID": "my-job", + "JobVersion": 2, + "Stable": true +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/job/my-job/stable +``` + +### Sample Response + +```json +{ + "JobModifyIndex": 34 +} +``` + +## Create Job Evaluation + +This endpoint creates a new evaluation for the given job. This can be used to +force run the scheduling logic if necessary. Since Nomad 0.8.4, this endpoint +supports a JSON payload with additional options. 
Support for calling this endpoint
+without a JSON payload will be removed in Nomad 0.9.
+
+| Method | Path                       | Produces           |
+| ------ | -------------------------- | ------------------ |
+| `POST` | `/v1/job/:job_id/evaluate` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required         |
+| ---------------- | -------------------- |
+| `NO`             | `namespace:read-job` |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
+
+- `JobID` `(string: )` - Specifies the ID of the job in the JSON payload.
+
+- `EvalOptions` `()` - Specifies additional options to be used during the forced evaluation.
+  - `ForceReschedule` `(bool: false)` - If set, failed allocations of the job are rescheduled
+    immediately. This is useful for operators to force immediate placement even if the failed allocations are past
+    their reschedule limit, or are delayed by several hours because the allocation's reschedule policy has exponential delay.
+
+### Sample Payload
+
+```json
+{
+  "JobID": "my-job",
+  "EvalOptions": {
+    "ForceReschedule": true
+  }
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    --data @payload.json \
+    https://localhost:4646/v1/job/my-job/evaluate
+```
+
+### Sample Response
+
+```json
+{
+  "EvalID": "d092fdc0-e1fd-2536-67d8-43af8ca798ac",
+  "EvalCreateIndex": 35,
+  "JobModifyIndex": 34
+}
+```
+
+## Create Job Plan
+
+This endpoint invokes a dry-run of the scheduler for the job.
+
+| Method | Path                   | Produces           |
+| ------ | ---------------------- | ------------------ |
+| `POST` | `/v1/job/:job_id/plan` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+ +| Blocking Queries | ACL Required | +| ---------------- | --------------------------------------------------------------------------------- | +| `NO` | `namespace:submit-job`
`namespace:sentinel-override` if `PolicyOverride` set | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in +- the job file during submission). This is specified as part of the path. + +- `Job` `(string: )` - Specifies the JSON definition of the job. + +- `Diff` `(bool: false)` - Specifies whether the diff structure between the + submitted and server side version of the job should be included in the + response. + +- `PolicyOverride` `(bool: false)` - If set, any soft mandatory Sentinel policies + will be overridden. This allows a job to be registered when it would be denied + by policy. + +### Sample Payload + +```json +{ + "Job": "...", + "Diff": true, + "PolicyOverride": false +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/job/my-job/plan +``` + +### Sample Response + +```json +{ + "Index": 0, + "NextPeriodicLaunch": "0001-01-01T00:00:00Z", + "Warnings": "", + "Diff": { + "Type": "Added", + "TaskGroups": [ + { + "Updates": { + "create": 1 + }, + "Type": "Added", + "Tasks": [ + { + "Type": "Added", + "Objects": ["..."], + "Name": "redis", + "Fields": [ + { + "Type": "Added", + "Old": "", + "New": "docker", + "Name": "Driver", + "Annotations": null + }, + { + "Type": "Added", + "Old": "", + "New": "5000000000", + "Name": "KillTimeout", + "Annotations": null + } + ], + "Annotations": ["forces create"] + } + ], + "Objects": ["..."], + "Name": "cache", + "Fields": ["..."] + } + ], + "Objects": [ + { + "Type": "Added", + "Objects": null, + "Name": "Datacenters", + "Fields": ["..."] + }, + { + "Type": "Added", + "Objects": null, + "Name": "Constraint", + "Fields": ["..."] + }, + { + "Type": "Added", + "Objects": null, + "Name": "Update", + "Fields": ["..."] + } + ], + "ID": "example", + "Fields": ["..."] + }, + "CreatedEvals": [ + { + "ModifyIndex": 0, + "CreateIndex": 0, + "SnapshotIndex": 0, + "AnnotatePlan": false, + "EscapedComputedClass": 
false, + "NodeModifyIndex": 0, + "NodeID": "", + "JobModifyIndex": 0, + "JobID": "example", + "TriggeredBy": "job-register", + "Type": "batch", + "Priority": 50, + "ID": "312e6a6d-8d01-0daf-9105-14919a66dba3", + "Status": "blocked", + "StatusDescription": "created to place remaining allocations", + "Wait": 0, + "NextEval": "", + "PreviousEval": "80318ae4-7eda-e570-e59d-bc11df134817", + "BlockedEval": "", + "FailedTGAllocs": null, + "ClassEligibility": { + "v1:7968290453076422024": true + } + } + ], + "JobModifyIndex": 0, + "FailedTGAllocs": { + "cache": { + "CoalescedFailures": 3, + "AllocationTime": 46415, + "Scores": null, + "NodesEvaluated": 1, + "NodesFiltered": 0, + "NodesAvailable": { + "dc1": 1 + }, + "ClassFiltered": null, + "ConstraintFiltered": null, + "NodesExhausted": 1, + "ClassExhausted": null, + "DimensionExhausted": { + "cpu": 1 + } + } + }, + "Annotations": { + "DesiredTGUpdates": { + "cache": { + "DestructiveUpdate": 0, + "InPlaceUpdate": 0, + "Stop": 0, + "Migrate": 0, + "Place": 11, + "Ignore": 0 + } + } + } +} +``` + +#### Field Reference + +- `Diff` - A diff structure between the submitted job and the server side + version. The top-level object is a Job Diff which contains Task Group Diffs, + which in turn contain Task Diffs. Each of these objects then has Object and + Field Diff structures embedded. + +- `NextPeriodicLaunch` - If the job being planned is periodic, this field will + include the next launch time for the job. + +- `CreatedEvals` - A set of evaluations that were created as a result of the + dry-run. These evaluations can signify a follow-up rolling update evaluation + or a blocked evaluation. + +- `JobModifyIndex` - The `JobModifyIndex` of the server side version of this job. + +- `FailedTGAllocs` - A set of metrics to understand any allocation failures that + occurred for the Task Group. 
+
+- `Annotations` - Annotations include the `DesiredTGUpdates`, which tracks what
+  the scheduler would do given enough resources for each Task Group.
+
+## Force New Periodic Instance
+
+This endpoint forces a new instance of the periodic job. A new instance will be
+created even if it violates the job's
+[`prohibit_overlap`](/docs/job-specification/periodic#prohibit_overlap)
+settings. As such, this should only be used to immediately run a periodic job.
+
+| Method | Path                             | Produces           |
+| ------ | -------------------------------- | ------------------ |
+| `POST` | `/v1/job/:job_id/periodic/force` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    https://localhost:4646/v1/job/my-job/periodic/force
+```
+
+### Sample Response
+
+```json
+{
+  "EvalCreateIndex": 7,
+  "EvalID": "57983ddd-7fcf-3e3a-fd24-f699ccfb36f4"
+}
+```
+
+## Stop a Job
+
+This endpoint deregisters a job, and stops all allocations that are part of it.
+
+| Method   | Path              | Produces           |
+| -------- | ----------------- | ------------------ |
+| `DELETE` | `/v1/job/:job_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required           |
+| ---------------- | ---------------------- |
+| `NO`             | `namespace:submit-job` |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
+
+- `purge` `(bool: false)` - Specifies that the job should be stopped and purged
+  immediately. This means the job will not be queryable after being stopped. If
+  not set, the job will be purged by the garbage collector.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request DELETE \
+    https://localhost:4646/v1/job/my-job?purge=true
+```
+
+### Sample Response
+
+```json
+{
+  "EvalID": "d092fdc0-e1fd-2536-67d8-43af8ca798ac",
+  "EvalCreateIndex": 35,
+  "JobModifyIndex": 34
+}
+```
+
+## Read Job Scale Status Beta
+
+This endpoint reads scale information about a job.
+
+| Method | Path                    | Produces           |
+| ------ | ----------------------- | ------------------ |
+| `GET`  | `/v1/job/:job_id/scale` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required                                          |
+| ---------------- | ----------------------------------------------------- |
+| `YES`            | `namespace:read-job-scaling` or `namespace:read-job`  |
+
+### Parameters
+
+- `:job_id` `(string: )` - Specifies the ID of the job (as specified in
+  the job file during submission). This is specified as part of the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/job/my-job/scale
+```
+
+### Sample Response
+
+```json
+{
+  "JobCreateIndex": 10,
+  "JobID": "example",
+  "JobModifyIndex": 18,
+  "JobStopped": false,
+  "TaskGroups": {
+    "cache": {
+      "Desired": 1,
+      "Events": null,
+      "Healthy": 1,
+      "Placed": 1,
+      "Running": 0,
+      "Unhealthy": 0
+    }
+  }
+}
+```
+
+## Scale Task Group Beta
+
+This endpoint performs a scaling action against a job.
+Currently, this endpoint supports scaling the count for a task group.
+ +| Method | Path | Produces | +| ------ | ----------------------- | ------------------ | +| `POST` | `/v1/job/:job_id/scale` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ---------------------------------------------------------------------------------------------------------- | +| `NO` | `namespace:scale-job` or `namespace:submit-job`
`namespace:sentinel-override` if `PolicyOverride` set | + +### Parameters + +- `:job_id` `(string: )` - Specifies the ID of the job (as specified in + the job file during submission). This is specified as part of the path. + +- `Count` `(int: )` - Specifies the new task group count. + +- `Target` `(json: required)` - JSON map containing the target of the scaling operation. + Must contain a field `Group` with the name of the task group that is the target of this scaling action. + +- `Reason` `(string: )` - Description of the scale action, persisted as part of the scaling event. + Indicates information or reason for scaling; one of `Reason` or `Error` must be provided. + +- `Error` `(string: )` - Description of the scale action, persisted as part of the scaling event. + Indicates an error state preventing scaling; one of `Reason` or `Error` must be provided. + +- `Meta` `(json: )` - JSON block that is persisted as part of the scaling event. + +- `PolicyOverride` `(bool: false)` - If set, any soft mandatory Sentinel policies + will be overridden. This allows a job to be scaled when it would be denied + by policy. + +### Sample Payload + +```javascript +{ + "Count": 5, + "Meta": { + "metrics": [ + "cpu", + "memory" + ] + }, + "Reason": "metric did not satisfy SLA", + "Target": { + "Group": "cache" + } +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/job/example/scale +``` + +### Sample Response + +This is the same payload as returned by job update. +`EvalCreateIndex` and `EvalID` will only be present if the scaling operation resulted in the creation of an evaluation. 
+ +```json +{ + "EvalCreateIndex": 45, + "EvalID": "116f3ede-f6a5-f6e7-2d0e-1fda136390f0", + "Index": 45, + "JobModifyIndex": 44, + "KnownLeader": false, + "LastContact": 0, + "Warnings": "" +} +``` diff --git a/content/nomad/v0.11.x/content/api-docs/json-jobs.mdx b/content/nomad/v0.11.x/content/api-docs/json-jobs.mdx new file mode 100644 index 0000000000..36d454696d --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/json-jobs.mdx @@ -0,0 +1,1064 @@ +--- +layout: api +page_title: JSON Job Specification - HTTP API +sidebar_title: JSON Jobs +description: |- + Jobs can also be specified via the HTTP API using a JSON format. This guide + discusses the job specification in JSON format. +--- + +# JSON Job Specification + +This guide covers the JSON syntax for submitting jobs to Nomad. A useful command +for generating valid JSON versions of HCL jobs is: + +```shell-session +$ nomad job run -output my-job.nomad +``` + +## Syntax + +Below is the JSON representation of the job outputted by `$ nomad init`: + +```json +{ + "Job": { + "ID": "example", + "Name": "example", + "Type": "service", + "Priority": 50, + "Datacenters": ["dc1"], + "TaskGroups": [ + { + "Name": "cache", + "Count": 1, + "Migrate": { + "HealthCheck": "checks", + "HealthyDeadline": 300000000000, + "MaxParallel": 1, + "MinHealthyTime": 10000000000 + }, + "Tasks": [ + { + "Name": "redis", + "Driver": "docker", + "User": "", + "Config": { + "image": "redis:3.2", + "port_map": [ + { + "db": 6379 + } + ] + }, + "Services": [ + { + "Id": "", + "Name": "redis-cache", + "Tags": ["global", "cache"], + "Meta": { + "meta": "for my service" + }, + "PortLabel": "db", + "AddressMode": "", + "Checks": [ + { + "Id": "", + "Name": "alive", + "Type": "tcp", + "Command": "", + "Args": null, + "Header": {}, + "Method": "", + "Path": "", + "Protocol": "", + "PortLabel": "", + "Interval": 10000000000, + "Timeout": 2000000000, + "InitialStatus": "", + "TLSSkipVerify": false, + "CheckRestart": { + "Limit": 3, + "Grace": 
30000000000, + "IgnoreWarnings": false + } + } + ] + } + ], + "Resources": { + "CPU": 500, + "MemoryMB": 256, + "Networks": [ + { + "Device": "", + "CIDR": "", + "IP": "", + "MBits": 10, + "DynamicPorts": [ + { + "Label": "db", + "Value": 0 + } + ] + } + ] + }, + "Leader": false + } + ], + "RestartPolicy": { + "Interval": 1800000000000, + "Attempts": 2, + "Delay": 15000000000, + "Mode": "fail" + }, + "ReschedulePolicy": { + "Attempts": 10, + "Delay": 30000000000, + "DelayFunction": "exponential", + "Interval": 0, + "MaxDelay": 3600000000000, + "Unlimited": true + }, + "EphemeralDisk": { + "SizeMB": 300 + } + } + ], + "Update": { + "MaxParallel": 1, + "MinHealthyTime": 10000000000, + "HealthyDeadline": 180000000000, + "AutoRevert": false, + "Canary": 0 + } + } +} +``` + +The example JSON could be submitted as a job using the following: + +```shell-session +$ curl -XPUT -d @example.json http://127.0.0.1:4646/v1/job/example +{ + "EvalID": "5d6ded54-0b2a-8858-6583-be5f476dec9d", + "EvalCreateIndex": 12, + "JobModifyIndex": 11, + "Warnings": "", + "Index": 12, + "LastContact": 0, + "KnownLeader": false +} +``` + +## Syntax Reference + +Following is a syntax reference for the possible keys that are supported and +their default values if any for each type of object. + +### Job + +The `Job` object supports the following keys: + +- `AllAtOnce` - Controls whether the scheduler can make partial placements if + optimistic scheduling resulted in an oversubscribed node. This does not + control whether all allocations for the job, where all would be the desired + count for each task group, must be placed atomically. This should only be + used for special circumstances. Defaults to `false`. + +- `Constraints` - A list to define additional constraints where a job can be + run. See the constraint reference for more details. + +- `Affinities` - A list to define placement preferences on nodes where a job can be + run. See the affinity reference for more details. 
+ +- `Spreads` - A list to define allocation spread across attributes. See the spread reference + for more details. + +- `Datacenters` - A list of datacenters in the region which are eligible + for task placement. This must be provided, and does not have a default. + +- `TaskGroups` - A list to define additional task groups. See the task group + reference for more details. + +- `Meta` - Annotates the job with opaque metadata. + +- `Namespace` - The namespace to execute the job in, defaults to "default". + Values other than default are not allowed in non-Enterprise versions of Nomad. + +- `ParameterizedJob` - Specifies the job as a parameterized job such that it can + be dispatched against. The `ParameterizedJob` object supports the following + attributes: + + - `MetaOptional` - Specifies the set of metadata keys that may be provided + when dispatching against the job as a string array. + + - `MetaRequired` - Specifies the set of metadata keys that must be provided + when dispatching against the job as a string array. + + - `Payload` - Specifies the requirement of providing a payload when + dispatching against the parameterized job. The options for this field are + "optional", "required" and "forbidden". The default value is "optional". + +- `Payload` - The payload may not be set when submitting a job but may appear in + a dispatched job. The `Payload` will be a base64 encoded string containing the + payload that the job was dispatched with. The `payload` has a **maximum size + of 16 KiB**. + +- `Priority` - Specifies the job priority which is used to prioritize + scheduling and access to resources. Must be between 1 and 100 inclusively, + and defaults to 50. + +- `Region` - The region to run the job in, defaults to "global". + +- `Type` - Specifies the job type and switches which scheduler + is used. Nomad provides the `service`, `system` and `batch` schedulers, + and defaults to `service`. 
To learn more about each scheduler type, visit
+  [here](/docs/schedulers).
+
+- `Update` - Specifies an update strategy to be applied to all task groups
+  within the job. When specified both at the job level and the task group level,
+  the update blocks are merged with the task group's taking precedence. For more
+  details on the update stanza, please see below.
+
+- `Periodic` - `Periodic` allows the job to be scheduled at fixed times, dates
+  or intervals. The periodic expression is always evaluated in the UTC
+  timezone to ensure consistent evaluation when Nomad Servers span multiple
+  time zones. The `Periodic` object is optional and supports the following attributes:
+
+  - `Enabled` - `Enabled` determines whether the periodic job will spawn child
+    jobs.
+
+  - `TimeZone` - Specifies the time zone to evaluate the next launch interval
+    against. This is useful when wanting to account for daylight saving time in
+    various time zones. The time zone must be parsable by Golang's
+    [LoadLocation](https://golang.org/pkg/time/#LoadLocation). The default is
+    UTC.
+
+  - `SpecType` - `SpecType` determines how Nomad is going to interpret the
+    periodic expression. `cron` is the only supported `SpecType` currently.
+
+  - `Spec` - A cron expression configuring the interval the job is launched
+    at. Supports predefined expressions such as "@daily" and "@weekly". See
+    [here](https://github.com/gorhill/cronexpr#implementation) for full
+    documentation of supported cron specs and the predefined expressions.
+
+  - `ProhibitOverlap` - `ProhibitOverlap` can be set
+    to true to enforce that the periodic job doesn't spawn a new instance of the
+    job if any of the previous jobs are still running. It defaults to false.
+
+  An example `periodic` block:
+
+  ```json
+  {
+    "Periodic": {
+      "Spec": "*/15 * * * * *",
+      "TimeZone": "Europe/Berlin",
+      "SpecType": "cron",
+      "Enabled": true,
+      "ProhibitOverlap": true
+    }
+  }
+  ```
+
+- `ReschedulePolicy` - Specifies a reschedule policy to be applied to all task groups
+  within the job. When specified both at the job level and the task group level,
+  the reschedule blocks are merged, with the task group's taking precedence. For more
+  details on `ReschedulePolicy`, please see below.
+
+### Task Group
+
+`TaskGroups` is a list of `TaskGroup` objects, each of which supports the following
+attributes:
+
+- `Constraints` - This is a list of `Constraint` objects. See the constraint
+  reference for more details.
+
+- `Affinities` - This is a list of `Affinity` objects. See the affinity
+  reference for more details.
+
+- `Spreads` - This is a list of `Spread` objects. See the spread
+  reference for more details.
+
+- `Count` - Specifies the number of instances of the task group that should
+  be running. Must be non-negative, defaults to one.
+
+- `Meta` - A key-value map that annotates the task group with opaque metadata.
+
+- `Migrate` - Specifies a migration strategy to be applied during [node
+  drains][drain].
+
+  - `HealthCheck` - One of `checks` or `task_states`. Indicates how task health
+    should be determined: either via Consul health checks or whether the task
+    was able to run successfully.
+
+  - `HealthyDeadline` - Specifies the duration within which a task must become
+    healthy before it is considered unhealthy.
+
+  - `MaxParallel` - Specifies how many allocations may be migrated at once.
+
+  - `MinHealthyTime` - Specifies the duration a task must be considered healthy
+    before the migration is considered healthy.
+
+- `Name` - The name of the task group. Must be specified.
+
+- `RestartPolicy` - Specifies the restart policy to be applied to tasks in this group.
+  If omitted, a default policy for batch and non-batch jobs is used based on the
+  job type. See the [restart policy reference](#restart_policy) for more details.
+
+- `ReschedulePolicy` - Specifies the reschedule policy to be applied to tasks in this group.
+  If omitted, a default policy is used for batch and service jobs. System jobs are not eligible
+  for rescheduling. See the [reschedule policy reference](#reschedule_policy) for more details.
+
+- `Scaling` - Specifies the autoscaling policy for the task group. This is primarily for supporting
+  external autoscalers. See the [scaling policy reference](#scaling_policy) for more details.
+
+- `EphemeralDisk` - Specifies the group's ephemeral disk requirements. See the
+  [ephemeral disk reference](#ephemeral_disk) for more details.
+
+- `Update` - Specifies an update strategy to be applied to all task groups
+  within the job. When specified both at the job level and the task group level,
+  the update blocks are merged with the task group's taking precedence. For more
+  details on the update stanza, please see below.
+
+- `Tasks` - A list of `Task` objects that are part of the task group.
+
+### Task
+
+The `Task` object supports the following keys:
+
+- `Artifacts` - `Artifacts` is a list of `Artifact` objects which define
+  artifacts to be downloaded before the task is run. See the artifacts
+  reference for more details.
+
+- `Config` - A map of key-value configuration passed into the driver
+  to start the task. The details of configurations are specific to
+  each driver.
+
+- `Constraints` - This is a list of `Constraint` objects. See the constraint
+  reference for more details.
+
+- `Affinities` - This is a list of `Affinity` objects. See the affinity
+  reference for more details.
+
+- `Spreads` - This is a list of `Spread` objects. See the spread
+  reference for more details.
+
+- `DispatchPayload` - Configures the task to have access to dispatch payloads.
+
+  The `DispatchPayload` object supports the following attributes:
+
+  - `File` - Specifies the file name to write the content of dispatch payload
+    to. The file is written relative to the task's local directory.
+
+- `Driver` - Specifies the task driver that should be used to run the
+  task. See the [driver documentation](/docs/drivers) for what
+  is available. Examples include `docker`, `qemu`, `java`, and `exec`.
+
+- `Env` - A map of key-value pairs representing environment variables that
+  will be passed along to the running process. Nomad variables are
+  interpreted when set in the environment variable values. See the table of
+  interpreted variables [here](/docs/runtime/interpolation).
+
+  For example, the below environment map will be interpreted:
+
+  ```json
+  {
+    "Env": {
+      "NODE_CLASS": "${nomad.class}"
+    }
+  }
+  ```
+
+- `KillSignal` - Specifies a configurable kill signal for a task, where the
+  default is SIGINT. Note that this is only supported for drivers which accept
+  sending signals (currently `docker`, `exec`, `raw_exec`, and `java` drivers).
+
+- `KillTimeout` - `KillTimeout` is a time duration in nanoseconds. It can be
+  used to configure the time between signaling a task it will be killed and
+  actually killing it. Drivers first send a task the `SIGINT` signal and then
+  send `SIGTERM` if the task doesn't die after the `KillTimeout` duration has
+  elapsed. The default `KillTimeout` is 5 seconds.
+
+- `Leader` - Specifies whether the task is the leader task of the task group. If
+  set to true, when the leader task completes, all other tasks within the task
+  group will be gracefully shut down.
+
+- `LogConfig` - This allows configuring log rotation for the `stdout` and `stderr`
+  buffers of a Task. See the log rotation reference below for more details.
+
+- `Meta` - Annotates the task with opaque metadata.
+
+- `Name` - The name of the task. This field is required.
+
+- `Resources` - Provides the resource requirements of the task.
+  See the resources reference for more details.
+
+- `RestartPolicy` - Specifies the task-specific restart policy.
+  If omitted, the restart policy from the encapsulating task group is used. If both
+  are present, they are merged. See the [restart policy reference](#restart_policy)
+  for more details.
+
+- `Services` - `Services` is a list of `Service` objects. Nomad integrates with
+  Consul for service discovery. A `Service` object represents a routable and
+  discoverable service on the network. Nomad automatically registers a service
+  when its task is started and de-registers it when the task transitions to the
+  dead state.
+  [Click here](/docs/integrations/consul-integration#service-discovery) to learn more about
+  services. Below are the fields in the `Service` object:
+
+  - `Name`: An explicit name for the Service. Nomad will replace `${JOB}`,
+    `${TASKGROUP}` and `${TASK}` by the name of the job, task group or task,
+    respectively. `${BASE}` expands to the equivalent of
+    `${JOB}-${TASKGROUP}-${TASK}`, and is the default name for a Service.
+    Each service defined for a given task must have a distinct name, so if
+    a task has multiple services only one of them can use the default name
+    and the others must be explicitly named. Names must adhere to
+    [RFC-1123 §2.1](https://tools.ietf.org/html/rfc1123#section-2) and are
+    limited to alphanumeric and hyphen characters (i.e. `[a-z0-9\-]`), and be
+    less than 64 characters in length.
+
+  - `Tags`: A list of string tags associated with this Service. String
+    interpolation is supported in tags.
+
+  - `Meta`: A key-value map that annotates the Consul service with
+    user-defined metadata. String interpolation is supported in meta.
+
+  - `CanaryTags`: A list of string tags associated with this Service while it
+    is a canary. Once the canary is promoted, the registered tags will be
+    updated to the set defined in the `Tags` field. String interpolation is
+    supported in tags.
+ + - `CanaryMeta`: A key-value map that annotates this Service while it + is a canary. Once the canary is promoted, the registered meta will be + updated to the set defined in the `Meta` field or removed if the `Meta` + field is not set. String interpolation is supported in meta keys and + values. + + - `PortLabel`: `PortLabel` is an optional string and is used to associate + a port with the service. If specified, the port label must match one + defined in the resources block. This could be a label of either a + dynamic or a static port. + + - `AddressMode`: Specifies what address (host or driver-specific) this + service should advertise. This setting is supported in Docker since + Nomad 0.6 and rkt since Nomad 0.7. Valid options are: + + - `auto` - Allows the driver to determine whether the host or driver + address should be used. Defaults to `host` and only implemented by + Docker. If you use a Docker network plugin such as weave, Docker will + automatically use its address. + + - `driver` - Use the IP specified by the driver, and the port specified + in a port map. A numeric port may be specified since port maps aren't + required by all network plugins. Useful for advertising SDN and + overlay network addresses. Task will fail if driver network cannot be + determined. Only implemented for Docker and rkt. + + - `host` - Use the host IP and port. + + - `Checks`: `Checks` is an array of check objects. A check object defines a + health check associated with the service. Nomad supports the `script`, + `http` and `tcp` Consul Checks. Script checks are not supported for the + qemu driver since the Nomad client doesn't have access to the file system + of a task using the Qemu driver. + + - `Type`: This indicates the check types supported by Nomad. Valid + options are currently `script`, `http` and `tcp`. + + - `Name`: The name of the health check. + + - `AddressMode`: Same as `AddressMode` on `Service`. 
Unlike services,
+      checks do not have an `auto` address mode as there's no way for
+      Nomad to know which is the best address to use for checks. Consul
+      needs access to the address for any HTTP or TCP checks. Added in
+      Nomad 0.7.1. Unlike `PortLabel`, this setting is _not_ inherited
+      from the `Service`.
+
+    - `PortLabel`: Specifies the label of the port on which the check will
+      be performed. Note this is the _label_ of the port and not the port
+      number unless `AddressMode: "driver"`. The port label must match one
+      defined in the Network stanza. If a port value was declared on the
+      `Service`, this will inherit from that value if not supplied. If
+      supplied, this value takes precedence over the `Service.PortLabel`
+      value. This is useful for services which operate on multiple ports.
+      `http` and `tcp` checks require a port while `script` checks do not.
+      Checks will use the host IP and ports by default. In Nomad 0.7.1 or
+      later numeric ports may be used if `AddressMode: "driver"` is set on
+      the check.
+
+    - `Header`: Headers for HTTP checks. Should be an object where the values are an
+      array of values. Headers will be written once for each value.
+
+    - `Interval`: This indicates the frequency of the health checks that
+      Consul will perform.
+
+    - `Timeout`: This indicates how long Consul will wait for a health
+      check query to succeed.
+
+    - `Method`: The HTTP method to use for HTTP checks. Defaults to GET.
+
+    - `Path`: The path of the HTTP endpoint which Consul will query to check
+      the health of a service if the type of the check is `http`. Nomad
+      will add the IP of the service and the port; users are only required
+      to add the relative URL of the health check endpoint. Absolute paths
+      are not allowed.
+
+    - `Protocol`: This indicates the protocol for the HTTP checks. Valid
+      options are `http` and `https`. We default it to `http`.
+
+    - `Command`: This is the command that the Nomad client runs for
+      script-based health checks.
+
+    - `Args`: Additional arguments to the `command` for script-based health
+      checks.
+
+    - `TLSSkipVerify`: If true, Consul will not attempt to verify the
+      certificate when performing HTTPS checks. Requires Consul >= 0.7.2.
+
+    - `CheckRestart`: `CheckRestart` is an object which enables
+      restarting of tasks based upon Consul health checks.
+
+      - `Limit`: The number of unhealthy checks allowed before the
+        service is restarted. Defaults to `0` which disables health-based restarts.
+
+      - `Grace`: The duration to wait after a task starts or restarts
+        before unhealthy checks count against the limit. Defaults to "1s".
+
+      - `IgnoreWarnings`: Treat checks that are warning as passing.
+        Defaults to false which means warnings are considered unhealthy.
+
+- `ShutdownDelay` - Specifies the duration to wait when killing a task between
+  removing it from Consul and sending it a shutdown signal. Ideally services
+  would fail health checks once they receive a shutdown signal. Alternatively
+  `ShutdownDelay` may be set to give in-flight requests time to complete before
+  shutting down.
+
+- `Templates` - Specifies the set of [`Template`](#template) objects to render for the task.
+  Templates can be used to inject both static and dynamic configuration with
+  data populated from environment variables, Consul and Vault.
+
+- `User` - Set the user that will run the task. It defaults to the same user
+  the Nomad client is being run as. This can only be set on Linux platforms.
+
+### Resources
+
+The `Resources` object supports the following keys:
+
+- `CPU` - The CPU required in MHz.
+
+- `MemoryMB` - The memory required in MB.
+
+- `Networks` - A list of network objects.
+
+- `Devices` - A list of device objects.
+
+The Network object supports the following keys:
+
+- `MBits` - The number of MBits in bandwidth required.
+
+Nomad can allocate two types of ports to a task - Dynamic and Static/Reserved
+ports.
A network object allows the user to specify a list of `DynamicPorts` and
+`ReservedPorts`. Each object supports the following attributes:
+
+- `Value` - The port number for static ports. If the port is dynamic, then this
+  attribute is ignored.
+- `Label` - The label to annotate a port so that it can be referred to in the
+  service discovery block or environment variables.
+
+The Device object supports the following keys:
+
+- `Name` - Specifies the device required. The following inputs are valid:
+
+  - `<device_type>`: If a single value is given, it is assumed to be the device
+    type, such as "gpu", or "fpga".
+
+  - `<vendor>/<device_type>`: If two values are given separated by a `/`, the
+    given device type will be selected, constraining on the provided vendor.
+    Examples include "nvidia/gpu" or "amd/gpu".
+
+  - `<vendor>/<device_type>/<model>`: If three values are given separated by a `/`, the
+    given device type will be selected, constraining on the provided vendor, and
+    model name. Examples include "nvidia/gpu/1080ti" or "nvidia/gpu/2080ti".
+
+- `Count` - The count of devices being requested per task. Defaults to 1.
+
+- `Constraints` - A list to define constraints on which device can satisfy the
+  request. See the constraint reference for more details.
+
+- `Affinities` - A list to define preferences for which device should be
+  chosen. See the affinity reference for more details.
+
+
+
+### Ephemeral Disk
+
+The `EphemeralDisk` object supports the following keys:
+
+- `Migrate` - Specifies that the Nomad client should make a best-effort attempt
+  to migrate the data from a remote machine if placement cannot be made on the
+  original node. During data migration, the task will block starting until the
+  data migration has completed. Value is a boolean and the default is false.
+
+- `SizeMB` - Specifies the size of the ephemeral disk in MB. Default is 300.
+
+- `Sticky` - Specifies that Nomad should make a best-effort attempt to place the
+  updated allocation on the same machine.
This will move the `local/` and + `alloc/data` directories to the new allocation. Value is a boolean and the + default is false. + + + +### Reschedule Policy + +The `ReschedulePolicy` object supports the following keys: + +- `Attempts` - `Attempts` is the number of reschedule attempts allowed + in an `Interval`. + +- `Interval` - `Interval` is a time duration that is specified in nanoseconds. + The `Interval` is a sliding window within which at most `Attempts` number + of reschedule attempts are permitted. + +- `Delay` - A duration to wait before attempting rescheduling. It is specified in + nanoseconds. + +- `DelayFunction` - Specifies the function that is used to calculate subsequent reschedule delays. + The initial delay is specified by the `Delay` parameter. Allowed values for `DelayFunction` are listed below: + + - `constant` - The delay between reschedule attempts stays at the `Delay` value. + - `exponential` - The delay between reschedule attempts doubles. + - `fibonacci` - The delay between reschedule attempts is calculated by adding the two most recent + delays applied. For example if `Delay` is set to 5 seconds, the next five reschedule attempts will be + delayed by 5 seconds, 5 seconds, 10 seconds, 15 seconds, and 25 seconds respectively. + +- `MaxDelay` - `MaxDelay` is an upper bound on the delay beyond which it will not increase. This parameter is used when + `DelayFunction` is `exponential` or `fibonacci`, and is ignored when `constant` delay is used. + +- `Unlimited` - `Unlimited` enables unlimited reschedule attempts. If this is set to true + the `Attempts` and `Interval` fields are not used. + + + +### Restart Policy + +The `RestartPolicy` object supports the following keys: + +- `Attempts` - `Attempts` is the number of restarts allowed in an `Interval`. + +- `Interval` - `Interval` is a time duration that is specified in nanoseconds. 
+  The `Interval` begins when the first task starts and ensures that only
+  `Attempts` number of restarts happen within it. If more than `Attempts`
+  number of failures happen, behavior is controlled by `Mode`.
+
+- `Delay` - A duration to wait before restarting a task. It is specified in
+  nanoseconds. A random jitter of up to 25% is added to the delay.
+
+- `Mode` - `Mode` is given as a string and controls the behavior when the task
+  fails more than `Attempts` times in an `Interval`. Possible values are listed
+  below:
+
+  - `delay` - `delay` will delay the next restart until the next `Interval` is
+    reached.
+
+  - `fail` - `fail` will not restart the task again.
+
+### Update
+
+Specifies the task group update strategy. When omitted, rolling updates are
+disabled. The update stanza can be specified at the job or task group level.
+When specified at the job, the update stanza is inherited by all task groups.
+When specified in both the job and in a task group, the stanzas are merged with
+the task group's taking precedence. The `Update` object supports the following
+attributes:
+
+- `MaxParallel` - `MaxParallel` is given as an integer value and specifies
+  the number of tasks that can be updated at the same time.
+
+- `HealthCheck` - Specifies the mechanism by which allocation health is
+  determined. The potential values are:
+
+  - "checks" - Specifies that the allocation should be considered healthy when
+    all of its tasks are running and their associated checks are healthy,
+    and unhealthy if any of the tasks fail or not all checks become healthy.
+    This is a superset of "task_states" mode.
+
+  - "task_states" - Specifies that the allocation should be considered healthy
+    when all its tasks are running and unhealthy if tasks fail.
+
+  - "manual" - Specifies that Nomad should not automatically determine health
+    and that the operator will specify allocation health using the [HTTP
+    API](/api-docs/deployments#set-allocation-health-in-deployment).
+ +- `MinHealthyTime` - Specifies the minimum time the allocation must be in the + healthy state before it is marked as healthy and unblocks further allocations + from being updated. + +- `HealthyDeadline` - Specifies the deadline in which the allocation must be + marked as healthy after which the allocation is automatically transitioned to + unhealthy. + +- `ProgressDeadline` - Specifies the deadline in which an allocation must be + marked as healthy. The deadline begins when the first allocation for the + deployment is created and is reset whenever an allocation as part of the + deployment transitions to a healthy state. If no allocation transitions to the + healthy state before the progress deadline, the deployment is marked as + failed. If the `progress_deadline` is set to `0`, the first allocation to be + marked as unhealthy causes the deployment to fail. + +- `AutoRevert` - Specifies if the job should auto-revert to the last stable job + on deployment failure. A job is marked as stable if all the allocations as + part of its deployment were marked healthy. + +- `Canary` - Specifies that changes to the job that would result in destructive + updates should create the specified number of canaries without stopping any + previous allocations. Once the operator determines the canaries are healthy, + they can be promoted which unblocks a rolling update of the remaining + allocations at a rate of `max_parallel`. + +- `AutoPromote` - Specifies if the job should automatically promote to + the new deployment if all canaries become healthy. + +- `Stagger` - Specifies the delay between migrating allocations off nodes marked + for draining. 
+
+An example `Update` block:
+
+```json
+{
+  "Update": {
+    "MaxParallel": 3,
+    "HealthCheck": "checks",
+    "MinHealthyTime": 15000000000,
+    "HealthyDeadline": 180000000000,
+    "AutoRevert": false,
+    "AutoPromote": false,
+    "Canary": 1
+  }
+}
+```
+
+### Constraint
+
+The `Constraint` object supports the following keys:
+
+- `LTarget` - Specifies the attribute to examine for the
+  constraint. See the table of attributes [here](/docs/runtime/interpolation#interpreted_node_vars).
+
+- `RTarget` - Specifies the value to compare the attribute against.
+  This can be a literal value, another attribute or a regular expression if
+  the `Operand` is in "regexp" mode.
+
+- `Operand` - Specifies the test to be performed on the two targets. It takes on the
+  following values:
+
+  - `regexp` - Allows the `RTarget` to be a regular expression to be matched.
+
+  - `set_contains` - Allows the `RTarget` to be a comma-separated list of values
+    that should be contained in the LTarget's value.
+
+  - `distinct_hosts` - If set, the scheduler will not co-locate any task groups on the same
+    machine. This can be specified as a job constraint which applies the
+    constraint to all task groups in the job, or as a task group constraint which
+    scopes the effect to just that group. The constraint may not be
+    specified at the task level.
+
+    Placing the constraint at both the job level and at the task group level is
+    redundant since when placed at the job level, the constraint will be applied
+    to all task groups. When specified, `LTarget` and `RTarget` should be
+    omitted.
+
+  - `distinct_property` - If set, the scheduler selects nodes that have a
+    distinct value of the specified property. The `RTarget` specifies how
+    many allocations are allowed to share the value of a property. The
+    `RTarget` must be 1 or greater and if omitted, defaults to 1.
This can
    be specified as a job constraint, which applies the constraint to all
    task groups in the job, or as a task group constraint, which scopes the
    effect to just that group. The constraint may not be specified at the
    task level.

    Placing the constraint at both the job level and at the task group level is
    redundant, since when placed at the job level, the constraint will be applied
    to all task groups. When specified, `LTarget` should be the property
    that should be distinct and `RTarget` should be omitted.

  - Comparison Operators - `=`, `==`, `is`, `!=`, `not`, `>`, `>=`, `<`, `<=`. The
    ordering is compared lexically.

### Affinity

Affinities allow operators to express placement preferences. More details on how they work
are described in [affinities](/docs/job-specification/affinity).

The `Affinity` object supports the following keys:

- `LTarget` - Specifies the attribute to examine for the
  affinity. See the table of attributes [here](/docs/runtime/interpolation#interpreted_node_vars).

- `RTarget` - Specifies the value to compare the attribute against.
  This can be a literal value, another attribute, or a regular expression if
  the `Operand` is in "regexp" mode.

- `Operand` - Specifies the test to be performed on the two targets. It takes on the
  following values:

  - `regexp` - Allows the `RTarget` to be a regular expression to be matched.

  - `set_contains_all` - Allows the `RTarget` to be a comma-separated list of values,
    all of which should be contained in the LTarget's value.

  - `set_contains_any` - Allows the `RTarget` to be a comma-separated list of values,
    any of which should be contained in the LTarget's value.

  - Comparison Operators - `=`, `==`, `is`, `!=`, `not`, `>`, `>=`, `<`, `<=`. The
    ordering is compared lexically.

- `Weight` - A non-zero weight with valid values from -100 to 100. Used to express
  relative preference when there is more than one affinity. 
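An example `Affinity` block, expressing a preference for nodes in a hypothetical `us-west-1` datacenter (the attribute value is illustrative):

```json
{
  "Affinities": [
    {
      "LTarget": "${node.datacenter}",
      "RTarget": "us-west-1",
      "Operand": "=",
      "Weight": 50
    }
  ]
}
```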
### Log Rotation

The `LogConfig` object configures the log rotation policy for a task's `stdout` and
`stderr`. The `LogConfig` object supports the following attributes:

- `MaxFiles` - The maximum number of rotated files Nomad will retain for
  `stdout` and `stderr`, each tracked individually.

- `MaxFileSizeMB` - The size of each rotated file. The size is specified in
  `MB`.

If the amount of disk resource requested for the task is less than the total
amount of disk space needed to retain the rotated set of files, Nomad will return
a validation error when the job is submitted.

```json
{
  "LogConfig": {
    "MaxFiles": 3,
    "MaxFileSizeMB": 10
  }
}
```

In the above example, Nomad is asked to retain 3 rotated files for each of
`stderr` and `stdout`, with each file capped at 10 MB. The minimum disk space
required for the task would therefore be 60 MB.

### Artifact

Nomad downloads artifacts using
[`go-getter`](https://github.com/hashicorp/go-getter). The `go-getter` library
allows downloading of artifacts from various sources using a URL as the input
source. The key-value pairs given in the `options` block map directly to
parameters appended to the supplied `source` URL. These are then used by
`go-getter` to appropriately download the artifact. `go-getter` also has a CLI
tool that validates its URLs and can be used to check whether a Nomad
`artifact` is valid.

Nomad allows downloading `http`, `https`, and `S3` artifacts. If these artifacts
are archives (zip, tar.gz, bz2, etc.), they will be unarchived before the task
is started.

The `Artifact` object supports the following keys:

- `GetterSource` - The path to the artifact to download.

- `RelativeDest` - An optional path, relative to the root of the task's
  directory, into which the artifact is downloaded. If omitted, it defaults to
  `local/`.

- `GetterOptions` - A `map[string]string` block of options for `go-getter`. 
  Full documentation of supported options is available
  [here](https://github.com/hashicorp/go-getter/tree/ef5edd3d8f6f482b775199be2f3734fd20e04d4a#protocol-specific-options-1).
  An example is given below:

```json
{
  "GetterOptions": {
    "checksum": "md5:c4aa853ad2215426eb7d70a21922e794",

    "aws_access_key_id": "",
    "aws_access_key_secret": "",
    "aws_access_token": ""
  }
}
```

An example of downloading and unzipping an archive is as simple as:

```json
{
  "Artifacts": [
    {
      "GetterSource": "https://example.com/my.zip",
      "GetterOptions": {
        "checksum": "md5:7f4b3e3b4dd5150d4e5aaaa5efada4c3"
      }
    }
  ]
}
```

#### S3 examples

S3 has several different types of addressing; more detail can be found
[here](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro).

S3 region-specific endpoints can be found
[here](http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region).

Path-based style:

```json
{
  "Artifacts": [
    {
      "GetterSource": "https://my-bucket-example.s3-us-west-2.amazonaws.com/my_app.tar.gz"
    }
  ]
}
```

To override automatic detection in the URL, use the S3-specific syntax:

```json
{
  "Artifacts": [
    {
      "GetterSource": "s3::https://my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz"
    }
  ]
}
```

Virtual-hosted style:

```json
{
  "Artifacts": [
    {
      "GetterSource": "my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz"
    }
  ]
}
```

### Template

The `Template` block instantiates an instance of a template renderer. This
creates a convenient way to ship configuration files that are populated from
environment variables, Consul data, Vault secrets, or just general
configurations within a Nomad task.

Nomad utilizes a tool called [Consul Template][ct]. Since Nomad v0.5.3, the
template can reference [Nomad's runtime environment variables][env]. 
For a full
list of the API template functions, please refer to the [Consul Template
README][ct].

The `Template` object supports the following attributes:

- `ChangeMode` - Specifies the behavior Nomad should take if the rendered
  template changes. The default value is `"restart"`. The possible values are:

  - `"noop"` - take no action (continue running the task)
  - `"restart"` - restart the task
  - `"signal"` - send a configurable signal to the task

- `ChangeSignal` - Specifies the signal to send to the task as a string like
  "SIGUSR1" or "SIGINT". This option is required if the `ChangeMode` is
  `signal`.

- `DestPath` - Specifies the location where the resulting template should be
  rendered, relative to the task directory.

- `EmbeddedTmpl` - Specifies the raw template to execute. One of `SourcePath`
  or `EmbeddedTmpl` must be specified, but not both. This is useful for smaller
  templates, but we recommend using `SourcePath` for larger templates.

- `Envvars` - Specifies that the template should be read back as environment
  variables for the task.

- `LeftDelim` - Specifies the left delimiter to use in the template. The default
  is "{{". For some templates, it may be easier to use a different delimiter
  that does not conflict with the output file itself.

- `Perms` - Specifies the rendered template's permissions. File permissions are
  given as octal of the Unix file permissions `rwxrwxrwx`.

- `RightDelim` - Specifies the right delimiter to use in the template. The default
  is "}}". For some templates, it may be easier to use a different delimiter
  that does not conflict with the output file itself.

- `SourcePath` - Specifies the path to the template to be rendered. `SourcePath`
  is mutually exclusive with the `EmbeddedTmpl` attribute. The source can be fetched
  using an [`Artifact`](#artifact) resource. 
The template must exist on the
  machine prior to starting the task; it is not possible to reference a template
  inside of a Docker container, for example.

- `Splay` - Specifies a random amount of time to wait between 0 ms and the given
  splay value before invoking the change mode. Should be specified in
  nanoseconds.

- `VaultGrace` - [Deprecated](https://github.com/hashicorp/consul-template/issues/1268)

```json
{
  "Templates": [
    {
      "SourcePath": "local/config.conf.tpl",
      "DestPath": "local/config.conf",
      "EmbeddedTmpl": "",
      "ChangeMode": "signal",
      "ChangeSignal": "SIGUSR1",
      "Splay": 5000000000
    }
  ]
}
```

### Spread

Spread allows operators to target specific percentages of allocations based on
any node attribute or metadata. More details on how spread works are described
in [spread](/docs/job-specification/spread).

The `Spread` object supports the following keys:

- `Attribute` - Specifies the attribute to examine for the
  spread. See the [table of attributes](/docs/runtime/interpolation#interpreted_node_vars) for examples.

- `SpreadTarget` - Specifies a list of attribute values and percentages. This is
  an optional field; when left empty, Nomad will evenly spread allocations across
  values of the attribute.

  - `Value` - The value of a specific target attribute, like "dc1" for `${node.datacenter}`.
  - `Percent` - Desired percentage of allocations for this attribute value. The sum of
    all spread target percentages must add up to 100.

- `Weight` - A non-zero weight with valid values from -100 to 100. Used to express
  relative preference when there is more than one spread or affinity.

### Scaling

Scaling policies allow operators to attach autoscaling configuration to a task group. This information
can be queried by [external autoscalers](https://github.com/hashicorp/nomad-autoscaler).

The `Scaling` object supports the following keys:

- `Min` - The minimum allowable count for the task group. 
This is optional; if absent, the default + is the `Count` specified in the task group. Attempts to set the task group `Count` below `Min` will + result in a 400 error during job registration. + +- `Max` - The maximum allowable count for the task group. This is required if a scaling policy is provided. + This must be larger than `Min`. Attempts to set the task group `Count` above `Max` wil result in a 400 + error during job registration. + +- `Enabled` - An optional boolean (default: `true`). This indicates to the autoscaler that this + scaling policy should be ignored. It is intended to allow autoscaling to be temporarily disabled + for a task group. + +- `Policy` - An optional JSON block. This is opaque to Nomad; see the documentation for the external + autoscaler (e.g., [nomad-autoscaler](https://github.com/hashicorp/nomad-autoscaler)). + +[ct]: https://github.com/hashicorp/consul-template 'Consul Template by HashiCorp' +[drain]: /docs/commands/node/drain +[env]: /docs/runtime/environment 'Nomad Runtime Environment' diff --git a/content/nomad/v0.11.x/content/api-docs/libraries-and-sdks.mdx b/content/nomad/v0.11.x/content/api-docs/libraries-and-sdks.mdx new file mode 100644 index 0000000000..b708c4e1da --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/libraries-and-sdks.mdx @@ -0,0 +1,36 @@ +--- +layout: api +page_title: Libraries and SDKs - HTTP API +sidebar_title: Libraries & SDKs +description: |- + There are many third-party libraries for interacting with Nomad's HTTP API. + This page lists the HashiCorp and community-maintained Nomad HTTP API client + libraries. +--- + +# Client Libraries & SDKs + +The programming libraries listed on this page can be used to consume the API +more conveniently. Some are officially maintained while others are provided by +the community. 
+ +## Official Libraries + +- [`api`](https://github.com/hashicorp/nomad/tree/master/api) - Official Golang + client for the Nomad HTTP API + +- [`nomad-java-sdk`](https://github.com/hashicorp/nomad-java-sdk) - Official + Java client for the Nomad HTTP API. + +- [`nomad-ruby`](https://github.com/hashicorp/nomad-ruby) - Official Ruby client + for the Nomad HTTP API + +- [`nomad-scala-sdk`](https://github.com/hashicorp/nomad-scala-sdk) - Official + Scala client for the Nomad HTTP API. + +## Third-Party Libraries + +- [`python-nomad`](https://github.com/jrxfive/python-nomad) - Python client + for the Nomad HTTP API. + +_Want to see your library here? [Submit a Pull Request](https://github.com/hashicorp/nomad)._ diff --git a/content/nomad/v0.11.x/content/api-docs/metrics.mdx b/content/nomad/v0.11.x/content/api-docs/metrics.mdx new file mode 100644 index 0000000000..d2d5407732 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/metrics.mdx @@ -0,0 +1,97 @@ +--- +layout: api +page_title: Metrics - HTTP API +sidebar_title: Metrics +description: The /metrics endpoint is used to view metrics for Nomad +--- + +# Metrics HTTP API + +The `/metrics` endpoint returns metrics for the current Nomad process. + +| Method | Path | Produces | +| ------ | ------------- | ------------------ | +| `GET` | `/v1/metrics` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Parameters + +- `format` `(string: "")` - Specifies the metrics format to be other than the + JSON default. Currently, only `prometheus` is supported as an alternative + format. This is specified as a query string parameter. 
+ +### Sample Request + +```shell-session +$ curl https://localhost:4646/v1/metrics +``` + +```shell-session +$ curl https://localhost:4646/v1/metrics?format=prometheus +``` + +### Sample Response + +```json +{ + "Counters": [ + { + "Count": 11, + "Labels": {}, + "Max": 1.0, + "Mean": 1.0, + "Min": 1.0, + "Name": "nomad.nomad.rpc.query", + "Stddev": 0.0, + "Sum": 11.0 + } + ], + "Gauges": [ + { + "Labels": { + "node_id": "cd7c3e0c-0174-29dd-17ba-ea4609e0fd1f", + "datacenter": "dc1" + }, + "Name": "nomad.client.allocations.blocked", + "Value": 0.0 + }, + { + "Labels": { + "datacenter": "dc1", + "node_id": "cd7c3e0c-0174-29dd-17ba-ea4609e0fd1f" + }, + "Name": "nomad.client.allocations.migrating", + "Value": 0.0 + } + ], + "Samples": [ + { + "Count": 20, + "Labels": {}, + "Max": 0.03544100001454353, + "Mean": 0.023678050097078084, + "Min": 0.00956599973142147, + "Name": "nomad.memberlist.gossip", + "Stddev": 0.005445327799243976, + "Sum": 0.4735610019415617 + }, + { + "Count": 1, + "Labels": {}, + "Max": 0.0964059978723526, + "Mean": 0.0964059978723526, + "Min": 0.0964059978723526, + "Name": "nomad.nomad.client.update_status", + "Stddev": 0.0, + "Sum": 0.0964059978723526 + } + ] +} +``` diff --git a/content/nomad/v0.11.x/content/api-docs/namespaces.mdx b/content/nomad/v0.11.x/content/api-docs/namespaces.mdx new file mode 100644 index 0000000000..8173ddb14c --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/namespaces.mdx @@ -0,0 +1,187 @@ +--- +layout: api +page_title: Namespace - HTTP API +sidebar_title: Namespaces +description: The /namespace endpoints are used to query for and interact with namespaces. +--- + +# Namespace HTTP API + +The `/namespace` endpoints are used to query for and interact with namespaces. + +~> **Enterprise Only!** This API endpoint and functionality only exists in +Nomad Enterprise. This is not present in the open source version of Nomad. + +## List Namespaces + +This endpoint lists all namespaces. 
+ +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `GET` | `/v1/namespaces` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------------------------------------------------------------- | +| `YES` | `namespace:*`
Any capability on the namespace authorizes the endpoint | + +### Parameters + +- `prefix` `(string: "")`- Specifies a string to filter namespaces on based on + an index prefix. This is specified as a query string parameter. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/namespaces +``` + +```shell-session +$ curl \ + https://localhost:4646/v1/namespaces?prefix=prod +``` + +### Sample Response + +```json +[ + { + "CreateIndex": 31, + "Description": "Production API Servers", + "ModifyIndex": 31, + "Name": "api-prod", + "Quota": "" + }, + { + "CreateIndex": 5, + "Description": "Default shared namespace", + "ModifyIndex": 5, + "Name": "default", + "Quota": "" + } +] +``` + +## Read Namespace + +This endpoint reads information about a specific namespace. + +| Method | Path | Produces | +| ------ | -------------------------- | ------------------ | +| `GET` | `/v1/namespace/:namespace` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------------------------------------------------------------- | +| `YES` | `namespace:*`
Any capability on the namespace authorizes the endpoint | + +### Parameters + +- `:namespace` `(string: )`- Specifies the namespace to query. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/namespace/api-prod +``` + +### Sample Response + +```json +{ + "CreateIndex": 31, + "Description": "Production API Servers", + "Quota": "", + "Hash": "N8WvePwqkp6J354eLJMKyhvsFdPELAos0VuBfMoVKoU=", + "ModifyIndex": 31, + "Name": "api-prod" +} +``` + +## Create or Update Namespace + +This endpoint is used to create or update a namespace. + +| Method | Path | Produces | +| ------ | ------------------------------------------------- | ------------------ | +| `POST` | `/v1/namespace/:namespace`
`/v1/namespace` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api-docs#blocking-queries) and
[required ACLs](/api-docs#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO` | `management` |

### Parameters

- `Name` `(string: <required>)` - Specifies the namespace to create or
  update.

- `Description` `(string: "")` - Specifies an optional human-readable
  description of the namespace.

- `Quota` `(string: "")` - Specifies a quota to attach to the namespace.

### Sample Payload

```json
{
  "Name": "api-prod",
  "Description": "Production API Servers",
  "Quota": "prod-quota"
}
```

### Sample Request

```shell-session
$ curl \
    --request POST \
    --data @namespace.json \
    https://localhost:4646/v1/namespace/api-prod
```

```shell-session
$ curl \
    --request POST \
    --data @namespace.json \
    https://localhost:4646/v1/namespace
```

## Delete Namespace

This endpoint is used to delete a namespace.

| Method | Path | Produces |
| -------- | -------------------------- | ------------------ |
| `DELETE` | `/v1/namespace/:namespace` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api-docs#blocking-queries) and
[required ACLs](/api-docs#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO` | `management` |

### Parameters

- `:namespace` `(string: <required>)` - Specifies the namespace to delete. 
+ +### Sample Request + +```shell-session +$ curl \ + --request DELETE \ + https://localhost:4646/v1/namespace/api-prod +``` diff --git a/content/nomad/v0.11.x/content/api-docs/nodes.mdx b/content/nomad/v0.11.x/content/api-docs/nodes.mdx new file mode 100644 index 0000000000..3ff11dd0b7 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/nodes.mdx @@ -0,0 +1,1018 @@ +--- +layout: api +page_title: Nodes - HTTP API +sidebar_title: Nodes +description: The /node endpoints are used to query for and interact with client nodes. +--- + +# Nodes HTTP API + +The `/node` endpoints are used to query for and interact with client nodes. + +## List Nodes + +This endpoint lists all nodes registered with Nomad. + +| Method | Path | Produces | +| ------ | ----------- | ------------------ | +| `GET` | `/v1/nodes` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `YES` | `node:read` | + +### Parameters + +- `prefix` `(string: "")`- Specifies a string to filter nodes based on an ID + prefix. Because the value is decoded to bytes, the prefix must have an even + number of hexadecimal characters (0-9a-f). This is specified as a query + string parameter. 
+ +### Sample Request + +```shell-session +$ curl \ + http://localhost:4646/v1/nodes +``` + +```shell-session +$ curl \ + http://localhost:4646/v1/nodes?prefix=f7476465 +``` + +### Sample Response + +```json +[ + { + "Address": "10.138.0.5", + "CreateIndex": 6, + "Datacenter": "dc1", + "Drain": false, + "Drivers": { + "java": { + "Attributes": { + "driver.java.runtime": "OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-1~deb9u1-b12)", + "driver.java.vm": "OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)", + "driver.java.version": "openjdk version \"1.8.0_162" + }, + "Detected": true, + "HealthDescription": "", + "Healthy": true, + "UpdateTime": "2018-04-11T23:33:48.781948669Z" + }, + "qemu": { + "Attributes": null, + "Detected": false, + "HealthDescription": "", + "Healthy": false, + "UpdateTime": "2018-04-11T23:33:48.7819898Z" + }, + "rkt": { + "Attributes": { + "driver.rkt.appc.version": "0.8.11", + "driver.rkt.volumes.enabled": "1", + "driver.rkt.version": "1.29.0" + }, + "Detected": true, + "HealthDescription": "Driver rkt is detected: true", + "Healthy": true, + "UpdateTime": "2018-04-11T23:34:48.81079772Z" + }, + "docker": { + "Attributes": { + "driver.docker.bridge_ip": "172.17.0.1", + "driver.docker.version": "18.03.0-ce", + "driver.docker.volumes.enabled": "1" + }, + "Detected": true, + "HealthDescription": "Driver is available and responsive", + "Healthy": true, + "UpdateTime": "2018-04-11T23:34:48.713720323Z" + }, + "exec": { + "Attributes": {}, + "Detected": true, + "HealthDescription": "Driver exec is detected: true", + "Healthy": true, + "UpdateTime": "2018-04-11T23:34:48.711026521Z" + }, + "raw_exec": { + "Attributes": {}, + "Detected": true, + "HealthDescription": "", + "Healthy": true, + "UpdateTime": "2018-04-11T23:33:48.710448534Z" + } + }, + "ID": "f7476465-4d6e-c0de-26d0-e383c49be941", + "ModifyIndex": 2526, + "Name": "nomad-4", + "NodeClass": "", + "SchedulingEligibility": "eligible", + "Status": "ready", + "StatusDescription": 
"", + "Version": "0.8.0-rc1" + } +] +``` + +## Read Node + +This endpoint queries the status of a client node. + +| Method | Path | Produces | +| ------ | ------------------- | ------------------ | +| `GET` | `/v1/node/:node_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `YES` | `node:read` | + +### Parameters + +- `:node_id` `(string: )`- Specifies the ID of the node. This must be + the full UUID, not the short 8-character one. This is specified as part of the + path. + +### Sample Request + +```shell-session +$ curl \ + http://localhost:4646/v1/node/f7476465-4d6e-c0de-26d0-e383c49be941 +``` + +### Sample Response + +```json +{ + "Attributes": { + "consul.datacenter": "dc1", + "consul.revision": "d2adfc0bd", + "consul.server": "true", + "consul.version": "1.5.2", + "cpu.arch": "amd64", + "cpu.frequency": "4000", + "cpu.modelname": "Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz", + "cpu.numcores": "8", + "cpu.totalcompute": "32000", + "driver.docker": "1", + "driver.docker.bridge_ip": "172.17.0.1", + "driver.docker.os_type": "linux", + "driver.docker.runtimes": "runc", + "driver.docker.version": "18.09.6", + "driver.docker.volumes.enabled": "true", + "driver.mock": "true", + "driver.mock_driver": "1", + "driver.raw_exec": "1", + "kernel.name": "linux", + "kernel.version": "4.19.56", + "memory.totalbytes": "16571674624", + "nomad.advertise.address": "127.0.0.1:4646", + "nomad.revision": "30da2b8f6c3aa860113c9d313c695a05eff5bb97+CHANGES", + "nomad.version": "0.10.0-dev", + "os.name": "nixos", + "os.signals": "SIGTTOU,SIGTTIN,SIGSTOP,SIGSYS,SIGXCPU,SIGBUS,SIGKILL,SIGTERM,SIGIOT,SIGILL,SIGIO,SIGQUIT,SIGSEGV,SIGUSR1,SIGXFSZ,SIGCHLD,SIGUSR2,SIGURG,SIGFPE,SIGHUP,SIGINT,SIGPROF,SIGCONT,SIGALRM,SIGPIPE,SIGTRAP,SIGTSTP,SIGWINCH,SIGABRT", + "os.version": "\"19.03.173017.85f820d6e41 
(Koi)\"", + "unique.cgroup.mountpoint": "/sys/fs/cgroup", + "unique.consul.name": "mew", + "unique.hostname": "mew", + "unique.network.ip-address": "127.0.0.1", + "unique.storage.bytesfree": "8273698816", + "unique.storage.bytestotal": "8285835264", + "unique.storage.volume": "tmpfs" + }, + "ComputedClass": "v1:390058673753570317", + "CreateIndex": 6, + "Datacenter": "dc1", + "Drain": false, + "DrainStrategy": null, + "Drivers": { + "docker": { + "Attributes": { + "driver.docker": "true", + "driver.docker.bridge_ip": "172.17.0.1", + "driver.docker.os_type": "linux", + "driver.docker.runtimes": "runc", + "driver.docker.version": "18.09.6", + "driver.docker.volumes.enabled": "true" + }, + "Detected": true, + "HealthDescription": "Healthy", + "Healthy": true, + "UpdateTime": "2019-08-26T12:22:50.762716458+02:00" + }, + "exec": { + "Attributes": null, + "Detected": false, + "HealthDescription": "Driver must run as root", + "Healthy": false, + "UpdateTime": "2019-08-26T12:22:50.6873373+02:00" + }, + "java": { + "Attributes": null, + "Detected": false, + "HealthDescription": "Driver must run as root", + "Healthy": false, + "UpdateTime": "2019-08-26T12:22:50.687274359+02:00" + }, + "mock_driver": { + "Attributes": { + "driver.mock": "true" + }, + "Detected": true, + "HealthDescription": "Healthy", + "Healthy": true, + "UpdateTime": "2019-08-26T12:22:50.687978919+02:00" + }, + "qemu": { + "Attributes": null, + "Detected": false, + "HealthDescription": "", + "Healthy": false, + "UpdateTime": "2019-08-26T12:22:50.688023782+02:00" + }, + "raw_exec": { + "Attributes": { + "driver.raw_exec": "true" + }, + "Detected": true, + "HealthDescription": "Healthy", + "Healthy": true, + "UpdateTime": "2019-08-26T12:22:50.687733347+02:00" + }, + "rkt": { + "Attributes": null, + "Detected": false, + "HealthDescription": "Driver must run as root", + "Healthy": false, + "UpdateTime": "2019-08-26T12:22:50.68796043+02:00" + } + }, + "Events": [ + { + "CreateIndex": 0, + "Details": null, + 
"Message": "Node registered", + "Subsystem": "Cluster", + "Timestamp": "2019-08-26T12:22:50+02:00" + } + ], + "HTTPAddr": "127.0.0.1:4646", + "HostVolumes": { + "certificates": { + "Name": "certificates", + "Path": "/etc/ssl/certs", + "ReadOnly": true + }, + "prod-mysql-a": { + "Name": "prod-mysql-a", + "Path": "/data/mysql", + "ReadOnly": false + } + }, + "ID": "1ac61e33-a465-2ace-f63f-cffa1285e7eb", + "Links": { + "consul": "dc1.mew" + }, + "Meta": { + "connect.log_level": "info", + "connect.sidecar_image": "envoyproxy/envoy:v1.11.1" + }, + "ModifyIndex": 9, + "Name": "mew", + "NodeClass": "", + "NodeResources": { + "Cpu": { + "CpuShares": 32000 + }, + "Devices": null, + "Disk": { + "DiskMB": 7890 + }, + "Memory": { + "MemoryMB": 15803 + }, + "Networks": [ + { + "CIDR": "127.0.0.1/32", + "Device": "lo", + "DynamicPorts": null, + "IP": "127.0.0.1", + "MBits": 1000, + "Mode": "", + "ReservedPorts": null + }, + { + "CIDR": "::1/128", + "Device": "lo", + "DynamicPorts": null, + "IP": "::1", + "MBits": 1000, + "Mode": "", + "ReservedPorts": null + } + ] + }, + "Reserved": { + "CPU": 0, + "Devices": null, + "DiskMB": 0, + "IOPS": 0, + "MemoryMB": 0, + "Networks": null + }, + "ReservedResources": { + "Cpu": { + "CpuShares": 0 + }, + "Disk": { + "DiskMB": 0 + }, + "Memory": { + "MemoryMB": 0 + }, + "Networks": { + "ReservedHostPorts": "" + } + }, + "Resources": { + "CPU": 32000, + "Devices": null, + "DiskMB": 7890, + "IOPS": 0, + "MemoryMB": 15803, + "Networks": [ + { + "CIDR": "127.0.0.1/32", + "Device": "lo", + "DynamicPorts": null, + "IP": "127.0.0.1", + "MBits": 1000, + "Mode": "", + "ReservedPorts": null + }, + { + "CIDR": "::1/128", + "Device": "lo", + "DynamicPorts": null, + "IP": "::1", + "MBits": 1000, + "Mode": "", + "ReservedPorts": null + } + ] + }, + "SchedulingEligibility": "eligible", + "SecretID": "", + "Status": "ready", + "StatusDescription": "", + "StatusUpdatedAt": 1566814982, + "TLSEnabled": false +} +``` + +## List Node Allocations + +This endpoint 
lists all of the allocations for the given node. This can be used to +determine what allocations have been scheduled on the node, their current status, +and the values of dynamically assigned resources, like ports. + +| Method | Path | Produces | +| ------ | ------------------------------- | ------------------ | +| `GET` | `/v1/node/:node_id/allocations` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------------------------ | +| `YES` | `node:read,namespace:read-job` | + +### Parameters + +- `:node_id` `(string: )`- Specifies the UUID of the node. This must + be the full UUID, not the short 8-character one. This is specified as part of + the path. + +### Sample Request + +```shell-session +$ curl \ + http://localhost:4646/v1/node/e02b6169-83bd-9df6-69bd-832765f333eb/allocations +``` + +### Sample Response + +```json +[ + { + "AllocModifyIndex": 2555, + "ClientDescription": "", + "ClientStatus": "running", + "CreateIndex": 2555, + "CreateTime": 1523490066575461000, + "DeploymentID": "", + "DeploymentStatus": { + "Healthy": true, + "ModifyIndex": 0 + }, + "DesiredDescription": "", + "DesiredStatus": "run", + "DesiredTransition": { + "Migrate": null + }, + "EvalID": "5129bc74-9785-c39a-08da-bddc8aa778b1", + "FollowupEvalID": "", + "ID": "fefe81d0-08b2-4eca-fae6-6560cde46d31", + "Job": { + "AllAtOnce": false, + "Constraints": null, + "CreateIndex": 2553, + "Datacenters": ["dc1"], + "ID": "webapp", + "JobModifyIndex": 2553, + "Meta": null, + "ModifyIndex": 2554, + "Name": "webapp", + "Namespace": "default", + "ParameterizedJob": null, + "ParentID": "", + "Payload": null, + "Periodic": null, + "Priority": 50, + "Region": "global", + "Stable": false, + "Status": "pending", + "StatusDescription": "", + "Stop": false, + "SubmitTime": 1523490066563405000, + "TaskGroups": [ + { + "Constraints": 
null, + "Count": 9, + "EphemeralDisk": { + "Migrate": false, + "SizeMB": 300, + "Sticky": false + }, + "Meta": null, + "Migrate": { + "HealthCheck": "checks", + "HealthyDeadline": 300000000000, + "MaxParallel": 2, + "MinHealthyTime": 15000000000 + }, + "Name": "webapp", + "ReschedulePolicy": { + "Attempts": 0, + "Delay": 30000000000, + "DelayFunction": "exponential", + "Interval": 0, + "MaxDelay": 3600000000000, + "Unlimited": true + }, + "RestartPolicy": { + "Attempts": 2, + "Delay": 15000000000, + "Interval": 1800000000000, + "Mode": "fail" + }, + "Tasks": [ + { + "Artifacts": null, + "Config": { + "args": ["-text", "ok4"], + "image": "hashicorp/http-echo:0.2.3", + "port_map": [ + { + "http": 5678 + } + ] + }, + "Constraints": null, + "DispatchPayload": null, + "Driver": "docker", + "Env": null, + "KillSignal": "", + "KillTimeout": 5000000000, + "Leader": false, + "LogConfig": { + "MaxFileSizeMB": 10, + "MaxFiles": 10 + }, + "Meta": null, + "Name": "webapp", + "Resources": { + "CPU": 100, + "DiskMB": 0, + "MemoryMB": 300, + "Networks": [ + { + "CIDR": "", + "Device": "", + "DynamicPorts": [ + { + "Label": "http", + "Value": 0 + } + ], + "IP": "", + "MBits": 10, + "ReservedPorts": null + } + ] + }, + "Services": [ + { + "AddressMode": "auto", + "Checks": [ + { + "AddressMode": "", + "Args": null, + "CheckRestart": null, + "Command": "", + "Header": null, + "InitialStatus": "", + "Interval": 10000000000, + "Method": "", + "Name": "http-ok", + "Path": "/", + "PortLabel": "", + "Protocol": "", + "TLSSkipVerify": false, + "Timeout": 2000000000, + "Type": "http" + } + ], + "Name": "webapp", + "PortLabel": "http", + "Tags": null + } + ], + "ShutdownDelay": 0, + "Templates": null, + "User": "", + "Vault": null + } + ], + "Update": null + } + ], + "Type": "service", + "Update": { + "AutoRevert": false, + "Canary": 0, + "HealthCheck": "", + "HealthyDeadline": 0, + "MaxParallel": 0, + "MinHealthyTime": 0, + "Stagger": 0 + }, + "VaultToken": "", + "Version": 0 + }, + 
"JobID": "webapp", + "Metrics": { + "AllocationTime": 63337, + "ClassExhausted": null, + "ClassFiltered": null, + "CoalescedFailures": 0, + "ConstraintFiltered": null, + "DimensionExhausted": null, + "NodesAvailable": { + "dc1": 2 + }, + "NodesEvaluated": 2, + "NodesExhausted": 0, + "NodesFiltered": 0, + "QuotaExhausted": null, + "Scores": { + "46f1c6c4-a0e5-21f6-fd5c-d76c3d84e806.binpack": 2.6950883117541586, + "f7476465-4d6e-c0de-26d0-e383c49be941.binpack": 2.6950883117541586 + } + }, + "ModifyIndex": 2567, + "ModifyTime": 1523490089807324000, + "Name": "webapp.webapp[0]", + "Namespace": "default", + "NextAllocation": "", + "NodeID": "f7476465-4d6e-c0de-26d0-e383c49be941", + "PreviousAllocation": "", + "RescheduleTracker": null, + "Resources": { + "CPU": 100, + "DiskMB": 300, + "MemoryMB": 300, + "Networks": [ + { + "CIDR": "", + "Device": "eth0", + "DynamicPorts": [ + { + "Label": "http", + "Value": 25920 + } + ], + "IP": "10.138.0.5", + "MBits": 10, + "ReservedPorts": null + } + ] + }, + "SharedResources": { + "CPU": 0, + "DiskMB": 300, + "MemoryMB": 0, + "Networks": null + }, + "TaskGroup": "webapp", + "TaskResources": { + "webapp": { + "CPU": 100, + "DiskMB": 0, + "MemoryMB": 300, + "Networks": [ + { + "CIDR": "", + "Device": "eth0", + "DynamicPorts": [ + { + "Label": "http", + "Value": 25920 + } + ], + "IP": "10.138.0.5", + "MBits": 10, + "ReservedPorts": null + } + ] + } + }, + "TaskStates": { + "webapp": { + "Events": [ + { + "Details": {}, + "DiskLimit": 0, + "DisplayMessage": "Task received by client", + "DownloadError": "", + "DriverError": "", + "DriverMessage": "", + "ExitCode": 0, + "FailedSibling": "", + "FailsTask": false, + "GenericSource": "", + "KillError": "", + "KillReason": "", + "KillTimeout": 0, + "Message": "", + "RestartReason": "", + "SetupError": "", + "Signal": 0, + "StartDelay": 0, + "TaskSignal": "", + "TaskSignalReason": "", + "Time": 1523490066712543500, + "Type": "Received", + "ValidationError": "", + "VaultError": "" + }, + { + 
"Details": { + "message": "Building Task Directory" + }, + "DiskLimit": 0, + "DisplayMessage": "Building Task Directory", + "DownloadError": "", + "DriverError": "", + "DriverMessage": "", + "ExitCode": 0, + "FailedSibling": "", + "FailsTask": false, + "GenericSource": "", + "KillError": "", + "KillReason": "", + "KillTimeout": 0, + "Message": "Building Task Directory", + "RestartReason": "", + "SetupError": "", + "Signal": 0, + "StartDelay": 0, + "TaskSignal": "", + "TaskSignalReason": "", + "Time": 1523490066715208000, + "Type": "Task Setup", + "ValidationError": "", + "VaultError": "" + }, + { + "Details": {}, + "DiskLimit": 0, + "DisplayMessage": "Task started by client", + "DownloadError": "", + "DriverError": "", + "DriverMessage": "", + "ExitCode": 0, + "FailedSibling": "", + "FailsTask": false, + "GenericSource": "", + "KillError": "", + "KillReason": "", + "KillTimeout": 0, + "Message": "", + "RestartReason": "", + "SetupError": "", + "Signal": 0, + "StartDelay": 0, + "TaskSignal": "", + "TaskSignalReason": "", + "Time": 1523490068433051100, + "Type": "Started", + "ValidationError": "", + "VaultError": "" + } + ], + "Failed": false, + "FinishedAt": "0001-01-01T00:00:00Z", + "LastRestart": "0001-01-01T00:00:00Z", + "Restarts": 0, + "StartedAt": "2018-04-11T23:41:08.445128764Z", + "State": "running" + } + } + } +] +``` + +## Create Node Evaluation + +This endpoint creates a new evaluation for the given node. This can be used to +force a run of the scheduling logic. + +| Method | Path | Produces | +| ------ | ---------------------------- | ------------------ | +| `POST` | `/v1/node/:node_id/evaluate` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `node:write` | + +### Parameters + +- `:node_id` `(string: )`- Specifies the UUID of the node. 
This must
+  be the full UUID, not the short 8-character one. This is specified as part of
+  the path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    -XPOST \
+    http://localhost:4646/v1/node/fb2170a8-257d-3c64-b14d-bc06cc94e34c/evaluate
+```
+
+### Sample Response
+
+```json
+{
+  "EvalCreateIndex": 3671,
+  "EvalIDs": ["4dfc2db7-b481-c53b-3072-14479aa44be3"],
+  "HeartbeatTTL": 0,
+  "Index": 3671,
+  "KnownLeader": false,
+  "LastContact": 0,
+  "LeaderRPCAddr": "10.138.0.2:4647",
+  "NodeModifyIndex": 0,
+  "NumNodes": 3,
+  "Servers": [
+    {
+      "Datacenter": "dc1",
+      "RPCAdvertiseAddr": "10.138.0.2:4647",
+      "RPCMajorVersion": 1,
+      "RPCMinorVersion": 1
+    },
+    {
+      "Datacenter": "dc1",
+      "RPCAdvertiseAddr": "10.138.0.3:4647",
+      "RPCMajorVersion": 1,
+      "RPCMinorVersion": 1
+    },
+    {
+      "Datacenter": "dc1",
+      "RPCAdvertiseAddr": "10.138.0.4:4647",
+      "RPCMajorVersion": 1,
+      "RPCMinorVersion": 1
+    }
+  ]
+}
+```
+
+## Drain Node
+
+This endpoint toggles the drain mode of the node. When draining is enabled, no
+further allocations will be assigned to this node, and existing allocations will
+be migrated to new nodes. See the [Workload Migration
+Guide](https://learn.hashicorp.com/nomad/operating-nomad/node-draining) for suggested usage.
+
+| Method | Path                      | Produces           |
+| ------ | ------------------------- | ------------------ |
+| `POST` | `/v1/node/:node_id/drain` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `node:write` |
+
+### Parameters
+
+- `:node_id` `(string: )` - Specifies the UUID of the node. This must
+  be the full UUID, not the short 8-character one. This is specified as part of
+  the path.
+
+- `DrainSpec` `(object: )` - Specifies if drain mode should be
+  enabled. A missing or null value disables an existing drain.
+ + - `Deadline` `(int: )` - Specifies how long to wait in nanoseconds + for allocations to finish migrating before they are force stopped. This is + also how long batch jobs are given to complete before being migrated. + + - `IgnoreSystemJobs` `(bool: false)` - Specifies whether or not to stop system + jobs as part of a drain. By default system jobs will be stopped after all + other allocations have migrated or the deadline is reached. Setting this to + `true` means system jobs are always left running. + +- `MarkEligible` `(bool: false)` - Specifies whether to mark a node as eligible + for scheduling again when _disabling_ a drain. + +### Sample Payload + +```json +{ + "DrainSpec": { + "Deadline": 3600000000000, + "IgnoreSystemJobs": true + } +} +``` + +### Sample Request + +```shell-session +$ curl \ + -XPOST \ + --data @drain.json \ + http://localhost:4646/v1/node/fb2170a8-257d-3c64-b14d-bc06cc94e34c/drain +``` + +### Sample Response + +```json +{ + "EvalCreateIndex": 0, + "EvalIDs": null, + "Index": 3742, + "NodeModifyIndex": 3742 +} +``` + +## Purge Node + +This endpoint purges a node from the system. Nodes can still join the cluster if +they are alive. + +| Method | Path | Produces | +| ------ | ------------------------- | ------------------ | +| `POST` | `/v1/node/:node_id/purge` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `node:write` | + +### Parameters + +- `:node_id` `(string: )`- Specifies the UUID of the node. This must + be the full UUID, not the short 8-character one. This is specified as part of + the path. 
+ +### Sample Request + +```shell-session +$ curl \ + -XPOST http://localhost:4646/v1/node/f7476465-4d6e-c0de-26d0-e383c49be941/purge +``` + +### Sample Response + +```json +{ + "EvalCreateIndex": 3817, + "EvalIDs": ["71bad787-5ab1-9939-be02-4809441583cd"], + "HeartbeatTTL": 0, + "Index": 3816, + "KnownLeader": false, + "LastContact": 0, + "LeaderRPCAddr": "", + "NodeModifyIndex": 3816, + "NumNodes": 0, + "Servers": null +} +``` + +## Toggle Node Eligibility + +This endpoint toggles the scheduling eligibility of the node. + +| Method | Path | Produces | +| ------ | ------------------------------- | ------------------ | +| `POST` | `/v1/node/:node_id/eligibility` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `node:write` | + +### Parameters + +- `:node_id` `(string: )`- Specifies the UUID of the node. This must + be the full UUID, not the short 8-character one. This is specified as part of + the path. + +- `Eligibility` `(string: )` - Either `eligible` or `ineligible`. + +### Sample Payload + +```json +{ + "Eligibility": "ineligible" +} +``` + +### Sample Request + +```shell-session +$ curl \ + -XPOST \ + --data @eligibility.json \ + http://localhost:4646/v1/node/fb2170a8-257d-3c64-b14d-bc06cc94e34c/eligibility +``` + +### Sample Response + +```json +{ + "EvalCreateIndex": 0, + "EvalIDs": null, + "Index": 3742, + "NodeModifyIndex": 3742 +} +``` + +#### Field Reference + +- Events - A list of the last 10 node events for this node. A node event is a + high level concept of noteworthy events for a node. + + Each node event has the following fields: + + - `Message` - The specific message for the event, detailing what occurred. + + - `Subsystem` - The subsystem where the node event took place. Subsystems + include: + + - `Drain` - The Nomad server draining subsystem. 
+
+    - `Driver` - The Nomad client driver subsystem.
+
+    - `Heartbeat` - Either the Nomad client or server heartbeating subsystem.
+
+    - `Cluster` - The Nomad server cluster management subsystem.
+
+  - `Details` - Any further details about the event, formatted as a key/value
+    pair.
+
+  - `Timestamp` - Each node event has an ISO 8601 timestamp.
+
+  - `CreateIndex` - The Raft index at which the event was committed.
diff --git a/content/nomad/v0.11.x/content/api-docs/operator.mdx b/content/nomad/v0.11.x/content/api-docs/operator.mdx
new file mode 100644
index 0000000000..eb5fd9dea8
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/operator.mdx
@@ -0,0 +1,473 @@
+---
+layout: api
+page_title: Operator - HTTP API
+sidebar_title: Operator
+description: |-
+  The /operator endpoints provide cluster-level tools for Nomad operators, such
+  as interacting with the Raft subsystem.
+---
+
+# /v1/operator
+
+The `/operator` endpoint provides cluster-level tools for Nomad operators, such
+as interacting with the Raft subsystem.
+
+~> Use this interface with extreme caution, as improper use could lead to a
+Nomad outage and even loss of data.
+
+See the [Outage Recovery](https://learn.hashicorp.com/nomad/operating-nomad/outage) guide for some examples of how
+these capabilities are used. For a CLI to perform these operations manually,
+please see the documentation for the
+[`nomad operator`](/docs/commands/operator) command.
+
+## Read Raft Configuration
+
+This endpoint queries the current Raft peer configuration of the cluster.
+
+| Method | Path                              | Produces           |
+| ------ | --------------------------------- | ------------------ |
+| `GET`  | `/v1/operator/raft/configuration` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `management` |
+
+### Parameters
+
+- `stale` - Specifies if the cluster should respond without an active leader.
+  This is specified as a query string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/operator/raft/configuration
+```
+
+### Sample Response
+
+```json
+{
+  "Index": 1,
+  "Servers": [
+    {
+      "Address": "127.0.0.1:4647",
+      "ID": "127.0.0.1:4647",
+      "Leader": true,
+      "Node": "bacon-mac.global",
+      "RaftProtocol": 2,
+      "Voter": true
+    }
+  ]
+}
+```
+
+#### Field Reference
+
+- `Index` `(int)` - The `Index` value is the Raft index corresponding to this
+  configuration. The latest configuration may not yet be committed if changes
+  are in flight.
+
+- `Servers` `(array: Server)` - The returned `Servers` array has information
+  about the servers in the Raft peer configuration.
+
+  - `ID` `(string)` - The ID of the server. This is the same as the `Address`
+    but may be upgraded to a GUID in a future version of Nomad.
+
+  - `Node` `(string)` - The node name of the server, as known to Nomad, or
+    `"(unknown)"` if the node is stale and not known.
+
+  - `Address` `(string)` - The `ip:port` for the server.
+
+  - `Leader` `(bool)` - Either "true" or "false" depending on the server's
+    role in the Raft configuration.
+
+  - `Voter` `(bool)` - Either "true" or "false", indicating if the server has
+    a vote in the Raft configuration. Future versions of Nomad may add support
+    for non-voting servers.
+
+## Remove Raft Peer
+
+This endpoint removes a Nomad server with the given address from the Raft
+configuration. The return code signifies success or failure.
+
+| Method   | Path                     | Produces           |
+| -------- | ------------------------ | ------------------ |
+| `DELETE` | `/v1/operator/raft/peer` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `management` |
+
+### Parameters
+
+- `address` `(string: )` - Specifies the server to remove as
+  `ip:port`. This cannot be provided along with the `id` parameter.
+
+- `id` `(string: )` - Specifies the server to remove as
+  `id`. This cannot be provided along with the `address` parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request DELETE \
+    https://localhost:4646/v1/operator/raft/peer?address=1.2.3.4
+```
+
+## Read Autopilot Configuration
+
+This endpoint retrieves the latest Autopilot configuration of the cluster.
+
+| Method | Path                                   | Produces           |
+| ------ | -------------------------------------- | ------------------ |
+| `GET`  | `/v1/operator/autopilot/configuration` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required    |
+| ---------------- | --------------- |
+| `NO`             | `operator:read` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/operator/autopilot/configuration
+```
+
+### Sample Response
+
+```json
+{
+  "CleanupDeadServers": true,
+  "LastContactThreshold": "200ms",
+  "MaxTrailingLogs": 250,
+  "ServerStabilizationTime": "10s",
+  "EnableRedundancyZones": false,
+  "DisableUpgradeMigration": false,
+  "EnableCustomUpgrades": false,
+  "CreateIndex": 4,
+  "ModifyIndex": 4
+}
+```
+
+For more information about the Autopilot configuration options, see the
+[agent configuration section](/docs/configuration/autopilot).
+
+## Update Autopilot Configuration
+
+This endpoint updates the Autopilot configuration of the cluster.
+
+| Method | Path                                   | Produces           |
+| ------ | -------------------------------------- | ------------------ |
+| `PUT`  | `/v1/operator/autopilot/configuration` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required     |
+| ---------------- | ---------------- |
+| `NO`             | `operator:write` |
+
+### Parameters
+
+- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
+  only happen if the given index matches the `ModifyIndex` of the configuration
+  at the time of writing.
+
+### Sample Payload
+
+```json
+{
+  "CleanupDeadServers": true,
+  "LastContactThreshold": "200ms",
+  "MaxTrailingLogs": 250,
+  "ServerStabilizationTime": "10s",
+  "EnableRedundancyZones": false,
+  "DisableUpgradeMigration": false,
+  "EnableCustomUpgrades": false,
+  "CreateIndex": 4,
+  "ModifyIndex": 4
+}
+```
+
+- `CleanupDeadServers` `(bool: true)` - Specifies automatic removal of dead
+  server nodes periodically and whenever a new server is added to the cluster.
+
+- `LastContactThreshold` `(string: "200ms")` - Specifies the maximum amount of
+  time a server can go without contact from the leader before being considered
+  unhealthy. Must be a duration value such as `10s`.
+
+- `MaxTrailingLogs` `(int: 250)` - Specifies the maximum number of log entries
+  that a server can trail the leader by before being considered unhealthy.
+
+- `ServerStabilizationTime` `(string: "10s")` - Specifies the minimum amount of
+  time a server must be stable in the 'healthy' state before being added to the
+  cluster. Only takes effect if all servers are running Raft protocol version 3
+  or higher. Must be a duration value such as `30s`.
+
+- `EnableRedundancyZones` `(bool: false)` - (Enterprise-only) Specifies whether
+  to enable redundancy zones.
+
+- `DisableUpgradeMigration` `(bool: false)` - (Enterprise-only) Disables
+  Autopilot's upgrade migration strategy of waiting until enough
+  newer-versioned servers have been added to the cluster before promoting any
+  of them to voters.
+
+- `EnableCustomUpgrades` `(bool: false)` - (Enterprise-only) Specifies whether
+  to enable using custom upgrade versions when performing migrations.
+
+## Read Health
+
+This endpoint queries the Autopilot health of the cluster's servers.
+
+| Method | Path                            | Produces           |
+| ------ | ------------------------------- | ------------------ |
+| `GET`  | `/v1/operator/autopilot/health` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required    |
+| ---------------- | --------------- |
+| `NO`             | `operator:read` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/operator/autopilot/health
+```
+
+### Sample Response
+
+```json
+{
+  "Healthy": true,
+  "FailureTolerance": 0,
+  "Servers": [
+    {
+      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
+      "Name": "node1",
+      "Address": "127.0.0.1:8300",
+      "SerfStatus": "alive",
+      "Version": "0.8.0",
+      "Leader": true,
+      "LastContact": "0s",
+      "LastTerm": 2,
+      "LastIndex": 46,
+      "Healthy": true,
+      "Voter": true,
+      "StableSince": "2017-03-06T22:07:51Z"
+    },
+    {
+      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
+      "Name": "node2",
+      "Address": "127.0.0.1:8205",
+      "SerfStatus": "alive",
+      "Version": "0.8.0",
+      "Leader": false,
+      "LastContact": "27.291304ms",
+      "LastTerm": 2,
+      "LastIndex": 46,
+      "Healthy": true,
+      "Voter": false,
+      "StableSince": "2017-03-06T22:18:26Z"
+    }
+  ]
+}
+```
+
+- `Healthy` is whether all the servers are currently healthy.
+
+- `FailureTolerance` is the number of redundant healthy servers that could
+  fail without causing an outage (this would be 2 in a healthy cluster of 5
+  servers).
+ +- `Servers` holds detailed health information on each server: + + - `ID` is the Raft ID of the server. + + - `Name` is the node name of the server. + + - `Address` is the address of the server. + + - `SerfStatus` is the SerfHealth check status for the server. + + - `Version` is the Nomad version of the server. + + - `Leader` is whether this server is currently the leader. + + - `LastContact` is the time elapsed since this server's last contact with the leader. + + - `LastTerm` is the server's last known Raft leader term. + + - `LastIndex` is the index of the server's last committed Raft log entry. + + - `Healthy` is whether the server is healthy according to the current Autopilot configuration. + + - `Voter` is whether the server is a voting member of the Raft cluster. + + - `StableSince` is the time this server has been in its current `Healthy` state. + + The HTTP status code will indicate the health of the cluster. If `Healthy` is true, then a + status of 200 will be returned. If `Healthy` is false, then a status of 429 will be returned. + +## Read Scheduler Configuration + +This endpoint retrieves the latest Scheduler configuration. This API was introduced in +Nomad 0.9 and currently supports enabling/disabling preemption. More options may be added in +the future. + +| Method | Path | Produces | +| ------ | -------------------------------------- | ------------------ | +| `GET` | `/v1/operator/scheduler/configuration` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). 
+ +| Blocking Queries | ACL Required | +| ---------------- | --------------- | +| `NO` | `operator:read` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/operator/scheduler/configuration +``` + +### Sample Response + +```json +{ + "Index": 5, + "KnownLeader": true, + "LastContact": 0, + "SchedulerConfig": { + "CreateIndex": 5, + "ModifyIndex": 5, + "SchedulerAlgorithm": "spread", + "PreemptionConfig": { + "SystemSchedulerEnabled": true, + "BatchSchedulerEnabled": false, + "ServiceSchedulerEnabled": false + } + } +} +``` + +#### Field Reference + +- `Index` `(int)` - The `Index` value is the Raft commit index corresponding to this + configuration. + +- `SchedulerConfig` `(SchedulerConfig)` - The returned `SchedulerConfig` object has configuration + settings mentioned below. + + - `SchedulerAlgorithm` `(string: "binpack")` - Specifies whether scheduler binpacks or spreads allocations on available nodes. + + - `PreemptionConfig` `(PreemptionConfig)` - Options to enable preemption for various schedulers. + + - `SystemSchedulerEnabled` `(bool: true)` - Specifies whether preemption for system jobs is enabled. Note that + this defaults to true. + + - `BatchSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether preemption for batch jobs is enabled. Note that + this defaults to false and must be explicitly enabled. + + - `ServiceSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether preemption for service jobs is enabled. Note that + this defaults to false and must be explicitly enabled. + + - `CreateIndex` - The Raft index at which the config was created. + - `ModifyIndex` - The Raft index at which the config was modified. + +## Update Scheduler Configuration + +This endpoint updates the scheduler configuration of the cluster. 
+
+| Method        | Path                                   | Produces           |
+| ------------- | -------------------------------------- | ------------------ |
+| `PUT`, `POST` | `/v1/operator/scheduler/configuration` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required     |
+| ---------------- | ---------------- |
+| `NO`             | `operator:write` |
+
+### Bootstrap Configuration Element
+
+The [`default_scheduler_config`][] attribute of the server stanza will provide a
+starting value for this configuration. Once bootstrapped, the value in the
+server state is authoritative.
+
+### Parameters
+
+- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
+  only happen if the given index matches the `ModifyIndex` of the configuration
+  at the time of writing.
+
+### Sample Payload
+
+```json
+{
+  "SchedulerAlgorithm": "spread",
+  "PreemptionConfig": {
+    "SystemSchedulerEnabled": true,
+    "BatchSchedulerEnabled": false,
+    "ServiceSchedulerEnabled": true
+  }
+}
+```
+
+- `SchedulerAlgorithm` `(string: "binpack")` - Specifies whether the scheduler
+  binpacks or spreads allocations on available nodes. Possible values are
+  `"binpack"` and `"spread"`.
+
+- `PreemptionConfig` `(PreemptionConfig)` - Options to enable preemption for
+  various schedulers.
+
+  - `SystemSchedulerEnabled` `(bool: true)` - Specifies whether preemption for
+    system jobs is enabled. Note that if this is set to true, then system jobs
+    can preempt any other jobs.
+
+  - `BatchSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies
+    whether preemption for batch jobs is enabled. Note that if this is set to
+    true, then batch jobs can preempt any other jobs.
+
+  - `ServiceSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies
+    whether preemption for service jobs is enabled. Note that if this is set to
+    true, then service jobs can preempt any other jobs.
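+
+### Sample Request
+
+Assuming the payload above has been saved as `scheduler.json` (the file name is
+only an example), the configuration can be updated with:
+
+```shell-session
+$ curl \
+    --request PUT \
+    --data @scheduler.json \
+    https://localhost:4646/v1/operator/scheduler/configuration
+```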
+ +### Sample Response + +```json +{ + "Updated": false, + "Index": 15 +} +``` + +- `Updated` - Indicates that the configuration was updated when a `cas` value is + provided. For non-CAS requests, this field will be false even though the + update is applied. + +- `Index` - Current Raft index when the request was received. + +[`default_scheduler_config`]: /docs/configuration/server#default_scheduler_config diff --git a/content/nomad/v0.11.x/content/api-docs/plugins.mdx b/content/nomad/v0.11.x/content/api-docs/plugins.mdx new file mode 100644 index 0000000000..1d717995fd --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/plugins.mdx @@ -0,0 +1,141 @@ +--- +layout: api +page_title: Plugins - HTTP API +sidebar_title: Plugins +description: The `/plugin` endpoints are used to query for and interact with dynamic plugins. +--- + +# Plugins HTTP API + +The `/plugin` endpoints are used to query for and interact with +dynamic plugins. Currently only Container Storage Interface (CSI) +plugins are dynamic. + +## List Plugins + +This endpoint lists all dynamic plugins. + +| Method | Path | Produces | +| ------ | ------------- | ------------------ | +| `GET` | `/v1/plugins` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------- | +| `YES` | `namespace:csi-list-plugin` | + +### Parameters + +- `type` `(string: "")`- Specifies the type of plugin to + query. Currently only supports `csi`. This is specified as a query + string parameter. Returns an empty list if omitted. 
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/plugins?type=csi
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "example",
+    "Provider": "aws.ebs",
+    "ControllerRequired": true,
+    "ControllersHealthy": 2,
+    "ControllersExpected": 3,
+    "NodesHealthy": 14,
+    "NodesExpected": 16,
+    "CreateIndex": 52,
+    "ModifyIndex": 93
+  }
+]
+```
+
+## Read Plugin
+
+This endpoint returns details of a single plugin, including information about
+the plugin's job and client fingerprint data.
+
+| Method | Path                        | Produces           |
+| ------ | --------------------------- | ------------------ |
+| `GET`  | `/v1/plugin/csi/:plugin_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required                |
+| ---------------- | --------------------------- |
+| `YES`            | `namespace:csi-read-plugin` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/plugin/csi/example_plugin_id
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "example_plugin_id",
+    "Topologies": [
+      {"key": "val"},
+      {"key": "val2"}
+    ],
+    "Provider": "aws.ebs",
+    "Version": "1.0.1",
+    "ControllersRequired": true,
+    "ControllersHealthy": 1,
+    "Controllers": {
+      "example_node_id": {
+        "PluginID": "example_plugin_id",
+        "Provider": "aws.ebs",
+        "ProviderVersion": "1.0.1",
+        "AllocID": "alloc-id",
+        "Healthy": true,
+        "HealthDescription": "healthy",
+        "UpdateTime": "2020-01-31T00:00:00.000Z",
+        "RequiresControllerPlugin": true,
+        "RequiresTopologies": true,
+        "ControllerInfo": {
+          "SupportsReadOnlyAttach": true,
+          "SupportsAttachDetach": true,
+          "SupportsListVolumes": true,
+          "SupportsListVolumesAttachedNodes": false
+        }
+      }
+    },
+    "NodesHealthy": 1,
+    "Nodes": {
+      "example_node_id": {
+        "PluginID": "example_plugin_id",
+        "Provider": "aws.ebs",
+        "ProviderVersion": "1.0.1",
+        "AllocID": "alloc-id",
+        "Healthy": true,
+        "HealthDescription": "healthy",
+        "UpdateTime": "2020-01-30T00:00:00.000Z",
+        "RequiresControllerPlugin": true,
+        "RequiresTopologies": true,
+        "NodeInfo": {
+          "ID": "example_node_id",
+          "MaxVolumes": 51,
+          "AccessibleTopology": {
+            "key": "val2"
+          },
+          "RequiresNodeStageVolume": true
+        }
+      }
+    }
+  }
+]
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/quotas.mdx b/content/nomad/v0.11.x/content/api-docs/quotas.mdx
new file mode 100644
index 0000000000..a9e2febd42
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/quotas.mdx
@@ -0,0 +1,351 @@
+---
+layout: api
+page_title: Quotas - HTTP API
+sidebar_title: Quotas
+description: The /quota endpoints are used to query for and interact with quotas.
+---
+
+# Quota HTTP API
+
+The `/quota` endpoints are used to query for and interact with quotas.
+
+~> **Enterprise Only!** This API endpoint and functionality exist only in
+Nomad Enterprise. They are not present in the open source version of Nomad.
+
+## List Quota Specifications
+
+This endpoint lists all quota specifications.
+
+| Method | Path         | Produces           |
+| ------ | ------------ | ------------------ |
+| `GET`  | `/v1/quotas` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required                                                    |
+| ---------------- | --------------------------------------------------------------- |
+| `YES`            | `quota:read`
`namespace:*` if namespace has quota attached |
+
+### Parameters
+
+- `prefix` `(string: "")` - Specifies a string to filter quota specifications
+  based on a name prefix. This is specified as a query string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/quotas
+```
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/quotas?prefix=sha
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "CreateIndex": 8,
+    "Description": "Limit the shared default namespace",
+    "Hash": "SgDCH7L5ZDqNSi2NmJlqdvczt/Q6mjyVwVJC0XjWglQ=",
+    "Limits": [
+      {
+        "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=",
+        "Region": "global",
+        "RegionLimit": {
+          "CPU": 2500,
+          "DiskMB": 0,
+          "MemoryMB": 2000,
+          "Networks": [
+            {
+              "CIDR": "",
+              "Device": "",
+              "DynamicPorts": null,
+              "IP": "",
+              "MBits": 50,
+              "Mode": "",
+              "ReservedPorts": null
+            }
+          ]
+        }
+      }
+    ],
+    "ModifyIndex": 56,
+    "Name": "shared-quota"
+  }
+]
+```
+
+## Read Quota Specification
+
+This endpoint reads information about a specific quota specification.
+
+| Method | Path               | Produces           |
+| ------ | ------------------ | ------------------ |
+| `GET`  | `/v1/quota/:quota` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required                                                    |
+| ---------------- | --------------------------------------------------------------- |
+| `YES`            | `quota:read`
`namespace:*` if namespace has quota attached | + +### Parameters + +- `:quota` `(string: )`- Specifies the quota specification to query + where the identifier is the quota's name. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/quota/shared-quota +``` + +### Sample Response + +```json +{ + "CreateIndex": 8, + "Description": "Limit the shared default namespace", + "Hash": "SgDCH7L5ZDqNSi2NmJlqdvczt/Q6mjyVwVJC0XjWglQ=", + "Limits": [ + { + "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=", + "Region": "global", + "RegionLimit": { + "CPU": 2500, + "DiskMB": 0, + "MemoryMB": 2000, + "Networks": [ + { + "CIDR": "", + "Device": "", + "DynamicPorts": null, + "IP": "", + "MBits": 50, + "Mode": "", + "ReservedPorts": null + } + ] + } + } + ], + "ModifyIndex": 56, + "Name": "shared-quota" +} +``` + +## Create or Update Quota Specification + +This endpoint is used to create or update a quota specification. + +| Method | Path | Produces | +| ------ | ------------------------------------- | ------------------ | +| `POST` | `/v1/quota/:quota`
`/v1/quota` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------- | +| `NO` | `quota:write` | + +### Body + +The request body contains a valid, JSON quota specification. View the api +package to see the definition of a [`QuotaSpec` +object](https://github.com/hashicorp/nomad/blob/master/api/quota.go#L100-L131). + +### Sample Payload + +```javascript +{ + "Name": "shared-quota", + "Description": "Limit the shared default namespace", + "Limits": [ + { + "Region": "global", + "RegionLimit": { + "CPU": 2500, + "MemoryMB": 1000, + "Networks": [ + { + "Mbits": 50 + } + ] + } + } + ] +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @spec.json \ + https://localhost:4646/v1/quota/shared-quota +``` + +```shell-session +$ curl \ + --request POST \ + --data @spec.json \ + https://localhost:4646/v1/quota +``` + +## Delete Quota Specification + +This endpoint is used to delete a quota specification. + +| Method | Path | Produces | +| -------- | ------------------ | ------------------ | +| `DELETE` | `/v1/quota/:quota` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------- | +| `NO` | `quota:write` | + +### Parameters + +- `:quota` `(string: )`- Specifies the quota specification to delete + where the identifier is the quota's name. + +### Sample Request + +```shell-session +$ curl \ + --request DELETE \ + https://localhost:4646/v1/quota/shared-quota +``` + +## List Quota Usages + +This endpoint lists all quota usages. 
+ +| Method | Path | Produces | +| ------ | ------------------ | ------------------ | +| `GET` | `/v1/quota-usages` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | --------------------------------------------------------------- | +| `YES` | `quota:read`
`namespace:*` if namespace has quota attached |
+
+### Parameters
+
+- `prefix` `(string: "")` - Specifies a string to filter quota usages
+  based on a name prefix. This is specified as a query string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/quota-usages
+```
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/quota-usages?prefix=sha
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "Used": {
+      "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=": {
+        "Region": "global",
+        "RegionLimit": {
+          "CPU": 500,
+          "MemoryMB": 256,
+          "DiskMB": 0,
+          "Networks": null
+        },
+        "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU="
+      }
+    },
+    "Name": "default",
+    "CreateIndex": 8,
+    "ModifyIndex": 56
+  }
+]
+```
+
+## Read Quota Usage
+
+This endpoint reads information about a specific quota usage.
+
+| Method | Path                     | Produces           |
+| ------ | ------------------------ | ------------------ |
+| `GET`  | `/v1/quota/usage/:quota` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | --------------------------------------------------------------- |
+| `YES` | `quota:read`
`namespace:*` if namespace has quota attached |
+
+### Parameters
+
+- `:quota` `(string: <required>)` - Specifies the quota specification to query
+  where the identifier is the quota's name.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/quota/usage/shared-quota
+```
+
+### Sample Response
+
+```json
+{
+  "Used": {
+    "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=": {
+      "Region": "global",
+      "RegionLimit": {
+        "CPU": 500,
+        "MemoryMB": 256,
+        "DiskMB": 0,
+        "Networks": [
+          {
+            "CIDR": "",
+            "Device": "",
+            "DynamicPorts": null,
+            "IP": "",
+            "MBits": 50,
+            "Mode": "",
+            "ReservedPorts": null
+          }
+        ]
+      },
+      "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU="
+    }
+  },
+  "Name": "default",
+  "CreateIndex": 8,
+  "ModifyIndex": 56
+}
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/regions.mdx b/content/nomad/v0.11.x/content/api-docs/regions.mdx
new file mode 100644
index 0000000000..dd0559f09d
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/regions.mdx
@@ -0,0 +1,37 @@
+---
+layout: api
+page_title: Regions - HTTP API
+sidebar_title: Regions
+description: The /regions endpoints list all known regions.
+---
+
+# Regions HTTP API
+
+The `/regions` endpoints list all known regions.
+
+## List Regions
+
+| Method | Path       | Produces           |
+| ------ | ---------- | ------------------ |
+| `GET`  | `/regions` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+ +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/regions +``` + +### Sample Response + +```json +["region1", "region2"] +``` diff --git a/content/nomad/v0.11.x/content/api-docs/scaling-policies.mdx b/content/nomad/v0.11.x/content/api-docs/scaling-policies.mdx new file mode 100644 index 0000000000..626c6986f3 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/scaling-policies.mdx @@ -0,0 +1,103 @@ +--- +layout: api +page_title: Scaling Policies - HTTP API +sidebar_title: Scaling Policies Beta +description: The /scaling/policy endpoints are used to list and view scaling policies. +--- + +# Scaling Policies HTTP API + +The `/scaling/policies` and `/scaling/policy/` endpoints are used to list and view scaling policies. + +## List Scaling Policies Beta + +This endpoint returns the scaling policies from all jobs. + +| Method | Path | Produces | +| ------ | ------------------- | ------------------ | +| `GET` | `/scaling/policies` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | --------------------------------- | +| `YES` | `all` | `namespace:list-scaling-policies` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/scaling/policies +``` + +### Sample Response + +```json +[ + { + "CreateIndex": 10, + "Enabled": true, + "ID": "5e9f9ef2-5223-6d35-bac1-be0f3cb974ad", + "ModifyIndex": 10, + "Target": { + "Group": "cache", + "Job": "example", + "Namespace": "default" + } + } +] +``` + +## Read Scaling Policy Beta + +This endpoint reads a specific scaling policy. 
+ +| Method | Path | Produces | +| ------ | ---------------------------- | ------------------ | +| `GET` | `/scaling/policy/:policy_id` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | Consistency Modes | ACL Required | +| ---------------- | ----------------- | ------------------------------- | +| `YES` | `all` | `namespace:read-scaling-policy` | + +### Parameters + +- `:policy_id` `(string: )` - Specifies the ID of the scaling policy (as returned + by the scaling policy list endpoint). This is specified as part of the path. + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/scaling/policy/5e9f9ef2-5223-6d35-bac1-be0f3cb974ad +``` + +### Sample Response + +```json +{ + "CreateIndex": 10, + "Enabled": true, + "ID": "5e9f9ef2-5223-6d35-bac1-be0f3cb974ad", + "Max": 10, + "Min": 0, + "ModifyIndex": 10, + "Policy": { + "engage": true, + "foo": "bar", + "howdy": "doody", + "value": 6.0 + }, + "Target": { + "Group": "cache", + "Job": "example", + "Namespace": "default" + } +} +``` diff --git a/content/nomad/v0.11.x/content/api-docs/search.mdx b/content/nomad/v0.11.x/content/api-docs/search.mdx new file mode 100644 index 0000000000..cac8ff3821 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/search.mdx @@ -0,0 +1,120 @@ +--- +layout: api +page_title: Search - HTTP API +sidebar_title: Search +description: The /search endpoint is used to search for Nomad objects +--- + +# Search HTTP API + +The `/search` endpoint returns matches for a given prefix and context, where a +context can be jobs, allocations, evaluations, nodes, deployments, plugins, +or volumes. When using Nomad Enterprise, the allowed contexts include quotas +and namespaces. Additionally, a prefix can be searched for within every context. 
+ +| Method | Path | Produces | +| ------ | ------------ | ------------------ | +| `POST` | `/v1/search` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------------------- | +| `NO` | `node:read, namespace:read-jobs` | + +When ACLs are enabled, requests must have a token valid for `node:read` or +`namespace:read-jobs` roles. If the token is only valid for `node:read`, then +job related results will not be returned. If the token is only valid for +`namespace:read-jobs`, then node results will not be returned. + +### Parameters + +- `Prefix` `(string: )` - Specifies the identifier against which + matches will be found. For example, if the given prefix were "a", potential + matches might be "abcd", or "aabb". +- `Context` `(string: )` - Defines the scope in which a search for a + prefix operates. Contexts can be: "jobs", "evals", "allocs", "nodes", + "deployment", "plugins", "volumes" or "all", where "all" means every + context will be searched. + +### Sample Payload (for all contexts) + +```javascript +{ + "Prefix": "abc", + "Context": "all" +} +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @payload.json \ + https://localhost:4646/v1/search +``` + +### Sample Response + +```json +{ + "Matches": { + "allocs": null, + "deployment": null, + "evals": ["abc2fdc0-e1fd-2536-67d8-43af8ca798ac"], + "jobs": ["abcde"], + "nodes": null, + "plugins": null, + "volumes": null + }, + "Truncations": { + "allocs": "false", + "deployment": "false", + "evals": "false", + "jobs": "false", + "nodes": "false", + "plugins": "false", + "volumes": "false" + } +} +``` + +#### Field Reference + +- `Matches` - A map of contexts to matching arrays of identifiers. 
+
+- `Truncations` - A map of contexts to a truncation indicator. Search results
+  are capped at 20 per context; if more matches were found for a particular
+  context, its entry here will be `true`.
+
+### Sample Payload (for a specific context)
+
+```javascript
{
  "Prefix": "abc",
  "Context": "evals"
}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    --data @payload.json \
+    https://localhost:4646/v1/search
+```
+
+### Sample Response
+
+```json
+{
+  "Matches": {
+    "evals": ["abc2fdc0-e1fd-2536-67d8-43af8ca798ac"]
+  },
+  "Truncations": {
+    "evals": "false"
+  }
+}
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/sentinel-policies.mdx b/content/nomad/v0.11.x/content/api-docs/sentinel-policies.mdx
new file mode 100644
index 0000000000..455a050100
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/sentinel-policies.mdx
@@ -0,0 +1,178 @@
+---
+layout: api
+page_title: Sentinel Policies - HTTP API
+sidebar_title: Sentinel Policies
+description: >-
+  The /sentinel/policy/ endpoints are used to configure and manage Sentinel
+  policies.
+---
+
+# Sentinel Policies HTTP API
+
+The `/sentinel/policies` and `/sentinel/policy/` endpoints are used to manage Sentinel policies.
+For more details about Sentinel policies, please see the [Sentinel Policy Guide](https://learn.hashicorp.com/nomad/governance-and-policy/sentinel).
+
+Sentinel endpoints are only available when ACLs are enabled. For more details about ACLs, please see the [ACL Guide](https://learn.hashicorp.com/nomad?track=acls#operations-and-development).
+
+~> **Enterprise Only!** This API endpoint and functionality only exists in
+Nomad Enterprise. This is not present in the open source version of Nomad.
+
+## List Policies
+
+This endpoint lists all Sentinel policies. This lists the policies that have been replicated
+to the region, and may lag behind the authoritative region.
+
+| Method | Path                 | Produces           |
+| ------ | -------------------- | ------------------ |
+| `GET`  | `/sentinel/policies` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | Consistency Modes | ACL Required |
+| ---------------- | ----------------- | ------------ |
+| `YES`            | `all`             | `management` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/sentinel/policies
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "Name": "foo",
+    "Description": "test policy",
+    "Scope": "submit-job",
+    "EnforcementLevel": "advisory",
+    "Hash": "CIs8aNX5OfFvo4D7ihWcQSexEJpHp+Za+dHSncVx5+8=",
+    "CreateIndex": 8,
+    "ModifyIndex": 8
+  }
+]
+```
+
+## Create or Update Policy
+
+This endpoint creates or updates a Sentinel policy. This request is always forwarded to the
+authoritative region.
+
+| Method | Path                            | Produces       |
+| ------ | ------------------------------- | -------------- |
+| `POST` | `/sentinel/policy/:policy_name` | `(empty body)` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `management` |
+
+### Parameters
+
+- `Name` `(string: <required>)` - Specifies the name of the policy.
+  Creates the policy if the name does not exist, otherwise updates the existing policy.
+
+- `Description` `(string: <optional>)` - Specifies a human readable description.
+
+- `Scope` `(string: <required>)` - Specifies the scope of when this policy applies. Only `submit-job` is currently supported.
+
+- `EnforcementLevel` `(string: <required>)` - Specifies the enforcement level of the policy.
Can be `advisory`, which warns on failure,
+  `hard-mandatory`, which prevents an operation on failure, or `soft-mandatory`, which is like `hard-mandatory` but can be overridden.
+
+- `Policy` `(string: <required>)` - Specifies the Sentinel policy itself.
+
+### Sample Payload
+
+```json
+{
+  "Name": "my-policy",
+  "Description": "This is a great policy",
+  "Scope": "submit-job",
+  "EnforcementLevel": "advisory",
+  "Policy": "main = rule { true }"
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request POST \
+    --data @payload.json \
+    https://localhost:4646/v1/sentinel/policy/my-policy
+```
+
+## Read Policy
+
+This endpoint reads a Sentinel policy with the given name. This queries a policy that has been
+replicated to the region, and may lag behind the authoritative region.
+
+| Method | Path                            | Produces           |
+| ------ | ------------------------------- | ------------------ |
+| `GET`  | `/sentinel/policy/:policy_name` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries), [consistency modes](/api-docs#consistency-modes) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | Consistency Modes | ACL Required |
+| ---------------- | ----------------- | ------------ |
+| `YES`            | `all`             | `management` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/sentinel/policy/foo
+```
+
+### Sample Response
+
+```json
+{
+  "Name": "foo",
+  "Description": "test policy",
+  "Scope": "submit-job",
+  "EnforcementLevel": "advisory",
+  "Policy": "main = rule { true }\n",
+  "Hash": "CIs8aNX5OfFvo4D7ihWcQSexEJpHp+Za+dHSncVx5+8=",
+  "CreateIndex": 8,
+  "ModifyIndex": 8
+}
+```
+
+## Delete Policy
+
+This endpoint deletes the named Sentinel policy. This request is always forwarded to the
+authoritative region.
+ +| Method | Path | Produces | +| -------- | ------------------------------- | -------------- | +| `DELETE` | `/sentinel/policy/:policy_name` | `(empty body)` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `management` | + +### Parameters + +- `policy_name` `(string: )` - Specifies the policy name to delete. + +### Sample Request + +```shell-session +$ curl \ + --request DELETE \ + https://localhost:4646/v1/sentinel/policy/foo +``` diff --git a/content/nomad/v0.11.x/content/api-docs/status.mdx b/content/nomad/v0.11.x/content/api-docs/status.mdx new file mode 100644 index 0000000000..f56e328ba7 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/status.mdx @@ -0,0 +1,68 @@ +--- +layout: api +page_title: Status - HTTP API +sidebar_title: Status +description: The /status endpoints query the Nomad system status. +--- + +# Status HTTP API + +The `/status` endpoints query the Nomad system status. + +## Read Leader + +This endpoint returns the address of the current leader in the region. + +| Method | Path | Produces | +| ------ | ---------------- | ------------------ | +| `GET` | `/status/leader` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | ------------ | +| `NO` | `none` | + +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/status/leader +``` + +### Sample Response + +```json +"127.0.0.1:4647" +``` + +## List Peers + +This endpoint returns the set of raft peers in the region. 
+
+| Method | Path            | Produces           |
+| ------ | --------------- | ------------------ |
+| `GET`  | `/status/peers` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `none`       |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/status/peers
+```
+
+### Sample Response
+
+```json
+["127.0.0.1:4647"]
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/system.mdx b/content/nomad/v0.11.x/content/api-docs/system.mdx
new file mode 100644
index 0000000000..6a8d5e4e6b
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/system.mdx
@@ -0,0 +1,59 @@
+---
+layout: api
+page_title: System - HTTP API
+sidebar_title: System
+description: The /system endpoints are used for system maintenance.
+---
+
+# System HTTP API
+
+The `/system` endpoints are used for system maintenance and should not be
+necessary for most users.
+
+## Force GC
+
+This endpoint initiates a garbage collection of jobs, evaluations, allocations, and
+nodes. This is an asynchronous operation.
+
+| Method | Path            | Produces           |
+| ------ | --------------- | ------------------ |
+| `PUT`  | `/v1/system/gc` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `management` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request PUT \
+    https://localhost:4646/v1/system/gc
+```
+
+## Reconcile Summaries
+
+This endpoint reconciles the summaries of all registered jobs.
+
+| Method | Path                             | Produces           |
+| ------ | -------------------------------- | ------------------ |
+| `PUT`  | `/v1/system/reconcile/summaries` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ------------ |
+| `NO`             | `management` |
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/system/reconcile/summaries
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/ui.mdx b/content/nomad/v0.11.x/content/api-docs/ui.mdx
new file mode 100644
index 0000000000..98b21f70a3
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/ui.mdx
@@ -0,0 +1,264 @@
+---
+layout: api
+page_title: UI
+sidebar_title: UI
+description: The /ui namespace is used to access the Nomad web user interface.
+---
+
+# Nomad Web UI
+
+Starting in v0.7, the Nomad UI is accessible at `/ui`. It is not namespaced by version. A request to `/`
+will also redirect to `/ui`.
+
+## List Jobs
+
+This page lists all known jobs in a paginated, searchable, and sortable table.
+
+| Path       | Produces    |
+| ---------- | ----------- |
+| `/ui/jobs` | `text/html` |
+
+### Parameters
+
+- `namespace` `(string: "")` - Specifies the namespace all jobs should be a member
+  of. This is specified as a query string parameter. Namespaces are an enterprise feature.
+
+- `sort` `(string: "")` - Specifies the property the list of jobs should be sorted by.
+  This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+- `search` `(string: "")` - Specifies a regular expression used to filter the list of
+  visible jobs. This is specified as a query string parameter.
+
+- `page` `(int: 0)` - Specifies the page in the jobs list that should be visible.
This
+  is specified as a query string parameter.
+
+## Job Detail
+
+This page shows an overview of a specific job. Details include name, status, type,
+priority, allocation statuses, and task groups. Additionally, if there is a running
+deployment for the job, it will be shown on the overview. The exact information
+shown varies based on the type of job:
+
+- **Service Job** - Includes job metadata (name, status, priority, namespace), allocation
+  statuses, placement failures, active deployment, task groups, and evaluations.
+
+- **Batch Job** - Includes job metadata, allocation statuses, placement failures, task
+  groups, and evaluations.
+
+- **System Job** - Includes job metadata, allocation statuses, placement failures, task
+  groups, and evaluations.
+
+- **Periodic Job** - Includes job metadata, cron information, force launch action, children statuses,
+  and children list.
+
+- **Parameterized Job** - Includes job metadata, children statuses, and children list.
+
+- **Periodic Child** - Includes job metadata, link to parent job, allocation statuses, placement
+  failures, task groups, and evaluations.
+
+- **Parameterized Child** - Includes job metadata, link to parent job, allocation statuses,
+  placement failures, task groups, evaluations, and dispatch payload.
+
+| Path               | Produces    |
+| ------------------ | ----------- |
+| `/ui/jobs/:job_id` | `text/html` |
+
+### Parameters
+
+- `sort` `(string: "")` - Specifies the property the list of task groups should be
+  sorted by. This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+- `page` `(int: 0)` - Specifies the page in the task groups list that should be visible. This
+  is specified as a query string parameter.
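The UI parameters above are ordinary query-string parameters appended to the page path. As a sketch of how they compose into a URL (the job name `example` and sort property `name` are hypothetical values, not fields defined by the UI):

```python
from urllib.parse import urlencode

# Hypothetical values: a job named "example", task groups sorted by "name",
# descending, second page (pages are 0-indexed per the `page` default above).
base = "https://localhost:4646/ui/jobs/example"
params = {"sort": "name", "desc": "true", "page": 1}

url = f"{base}?{urlencode(params)}"
print(url)  # https://localhost:4646/ui/jobs/example?sort=name&desc=true&page=1
```

Opening that URL in a browser renders the Job Detail page with the list state pre-applied.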
+
+### Job Definition
+
+This page shows the definition of a job as pretty-printed, syntax-highlighted JSON.
+
+| Path                          | Produces    |
+| ----------------------------- | ----------- |
+| `/ui/jobs/:job_id/definition` | `text/html` |
+
+### Job Versions
+
+This page lists all available versions for a job in a timeline view. Each version in
+the timeline can be expanded to show a pretty-printed, syntax-highlighted diff between
+job versions.
+
+| Path                        | Produces    |
+| --------------------------- | ----------- |
+| `/ui/jobs/:job_id/versions` | `text/html` |
+
+### Job Deployments
+
+This page lists all available deployments for a job when the job has deployments. The
+deployments are listed in a timeline view. Each deployment shows pertinent information
+such as deployment ID, status, associated version, and submit time. Each deployment can
+also be expanded to show detailed information regarding canary placements, allocation
+placements, healthy and unhealthy allocations, as well as the current description for the
+status. A table of task groups is also present in the detail view, which shows allocation
+metrics by task group. Lastly, each expanded deployment lists all associated allocations
+in a table to drill into for task events.
+
+| Path                           | Produces    |
+| ------------------------------ | ----------- |
+| `/ui/jobs/:job_id/deployments` | `text/html` |
+
+## Task Group Detail
+
+This page shows an overview of a specific task group. Details include the number of
+tasks, the aggregated amount of reserved CPU, memory, and disk, all associated
+allocations broken down by status, and a list of allocations. The list of allocations
+includes details such as status, the node the allocation was placed on, and the
+current CPU and memory usage of the allocations.
+
+| Path                                | Produces    |
+| ----------------------------------- | ----------- |
+| `/ui/jobs/:job_id/:task_group_name` | `text/html` |
+
+### Parameters
+
+- `sort` `(string: "")` - Specifies the property the list of allocations should be sorted by.
+  This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+- `search` `(string: "")` - Specifies a regular expression used to filter the list of
+  visible allocations. This is specified as a query string parameter.
+
+- `page` `(int: 0)` - Specifies the page in the allocations list that should be visible. This
+  is specified as a query string parameter.
+
+## Allocation Detail
+
+This page shows details and events for an allocation. Details include the job the allocation
+belongs to, the node the allocation is placed on, a list of all tasks, and lists of task
+events per task. Each task in the task list includes the task name, state, last event, time,
+and addresses. Each task event in a task history list includes the time, type, and
+description of the event.
+
+| Path                        | Produces    |
+| --------------------------- | ----------- |
+| `/ui/allocations/:alloc_id` | `text/html` |
+
+### Parameters
+
+- `sort` `(string: "")` - Specifies the property the list of tasks should be sorted by.
+  This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+## Task Detail
+
+This page shows details and events for a specific task. Details include when the task started
+and stopped, all static and dynamic addresses, and all recent events.
+
+| Path                                   | Produces    |
+| -------------------------------------- | ----------- |
+| `/ui/allocations/:alloc_id/:task_name` | `text/html` |
+
+## Task Logs
+
+This page streams `stdout` and `stderr` logs for a task.
By default, `stdout` is tailed, but
+there are available actions to see the head of the log, pause and play streaming, and switch
+to `stderr`.
+
+| Path                                        | Produces    |
+| ------------------------------------------- | ----------- |
+| `/ui/allocations/:alloc_id/:task_name/logs` | `text/html` |
+
+## Nodes List
+
+This page lists all nodes in the Nomad cluster in a sortable, searchable, paginated
+table.
+
+| Path        | Produces    |
+| ----------- | ----------- |
+| `/ui/nodes` | `text/html` |
+
+### Parameters
+
+- `sort` `(string: "")` - Specifies the property the list of client nodes should be sorted by.
+  This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+- `search` `(string: "")` - Specifies a regular expression used to filter the list of
+  visible client nodes. This is specified as a query string parameter.
+
+- `page` `(int: 0)` - Specifies the page in the client nodes list that should be visible. This
+  is specified as a query string parameter.
+
+## Node Detail
+
+This page shows the details of a node, including the node name, status, full ID,
+address, port, datacenter, allocations, and attributes.
+
+| Path                 | Produces    |
+| -------------------- | ----------- |
+| `/ui/nodes/:node_id` | `text/html` |
+
+### Parameters
+
+- `sort` `(string: "")` - Specifies the property the list of allocations should be sorted by.
+  This is specified as a query string parameter.
+
+- `desc` `(boolean: false)` - Specifies whether the sort direction is descending
+  or ascending. This is specified as a query string parameter.
+
+- `search` `(string: "")` - Specifies a regular expression used to filter the list of
+  visible allocations. This is specified as a query string parameter.
+
+- `page` `(int: 0)` - Specifies the page in the allocations list that should be visible. This
+  is specified as a query string parameter.
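Because the `search` parameter on these list pages is a regular expression, its metacharacters must be percent-encoded when the filter is placed in the query string. A small sketch (the node-name pattern `^web-[0-9]+` is a hypothetical example):

```python
from urllib.parse import urlencode

# Hypothetical filter: client nodes whose name starts with "web-" plus digits.
# urlencode percent-encodes the regex metacharacters ^ [ ] + for the URL.
params = {"search": "^web-[0-9]+", "sort": "name", "desc": "true"}

query = urlencode(params)
print(f"https://localhost:4646/ui/nodes?{query}")
# https://localhost:4646/ui/nodes?search=%5Eweb-%5B0-9%5D%2B&sort=name&desc=true
```

An unencoded `+` in particular would otherwise be decoded as a space rather than the regex quantifier.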
+ +## Servers List + +This page lists all servers in the Nomad cluster in a sortable table. Details for each +server include the server status, address, port, datacenter, and whether or not it is +the leader. + +| Path | Produces | +| ------------- | ----------- | +| `/ui/servers` | `text/html` | + +### Parameters + +- `sort` `(string: "")` - Specifies the property the list of server agents should be sorted by. + This is specified as a query string parameter. + +- `desc` `(boolean: false)` - Specifies whether or not the sort direction is descending + or ascending. This is specified as a query string parameter. + +- `page` `(int: 0)` - Specifies the page in the server agents list that should be visible. This + is specified as a query string parameter. + +## Server Detail + +This page lists all tags associated with a server. + +| Path | Produces | +| ------------------------ | ----------- | +| `/ui/servers/:server_id` | `text/html` | + +## ACL Tokens + +This page lets you enter an ACL token (both accessor ID and secret ID) to use with the UI. +If the cluster does not have ACLs enabled, this page is unnecessary. If the cluster has an +anonymous policy that grants cluster-wide read access, this page is unnecessary. If the +anonymous policy only grants partial read access, then providing an ACL Token will +authenticate all future requests to allow read access to additional resources. + +| Path | Produces | +| --------------------- | ----------- | +| `/ui/settings/tokens` | `text/html` | diff --git a/content/nomad/v0.11.x/content/api-docs/validate.mdx b/content/nomad/v0.11.x/content/api-docs/validate.mdx new file mode 100644 index 0000000000..92f04330b5 --- /dev/null +++ b/content/nomad/v0.11.x/content/api-docs/validate.mdx @@ -0,0 +1,65 @@ +--- +layout: api +page_title: Validate - HTTP API +sidebar_title: Validate +description: |- + The /validate endpoints are used to validate object structs, fields, and + types. 
+--- + +# Validate HTTP API + +The `/validate` endpoints are used to validate object structs, fields, and +types. + +## Validate Job + +This endpoint validates a Nomad job file. The local Nomad agent forwards the +request to a server. In the event a server can't be reached the agent verifies +the job file locally but skips validating driver configurations. + +~> This endpoint accepts a **JSON job file**, not an HCL job file. + +| Method | Path | Produces | +| ------ | ------------------ | ------------------ | +| `POST` | `/v1/validate/job` | `application/json` | + +The table below shows this endpoint's support for +[blocking queries](/api-docs#blocking-queries) and +[required ACLs](/api-docs#acls). + +| Blocking Queries | ACL Required | +| ---------------- | -------------------- | +| `NO` | `namespace:read-job` | + +### Parameters + +There are no parameters, but the request _body_ contains the entire job file. + +### Sample Payload + +```text +(any valid nomad job IN JSON FORMAT) +``` + +### Sample Request + +```shell-session +$ curl \ + --request POST \ + --data @my-job.nomad \ + https://localhost:4646/v1/validate/job +``` + +### Sample Response + +```json +{ + "DriverConfigValidated": true, + "ValidationErrors": [ + "Task group cache validation failed: 1 error(s) occurred:\n\n* Task redis validation failed: 1 error(s) occurred:\n\n* 1 error(s) occurred:\n\n* minimum CPU value is 20; got 1" + ], + "Warnings": "1 warning(s):\n\n* Group \"cache\" has warnings: 1 error(s) occurred:\n\n* Update max parallel count is greater than task group count (13 > 1). 
A destructive change would result in the simultaneous replacement of all allocations.",
+  "Error": "1 error(s) occurred:\n\n* Task group cache validation failed: 1 error(s) occurred:\n\n* Task redis validation failed: 1 error(s) occurred:\n\n* 1 error(s) occurred:\n\n* minimum CPU value is 20; got 1"
+}
+```
diff --git a/content/nomad/v0.11.x/content/api-docs/volumes.mdx b/content/nomad/v0.11.x/content/api-docs/volumes.mdx
new file mode 100644
index 0000000000..b4d8a85573
--- /dev/null
+++ b/content/nomad/v0.11.x/content/api-docs/volumes.mdx
@@ -0,0 +1,328 @@
+---
+layout: api
+page_title: Volumes - HTTP API
+sidebar_title: Volumes
+description: The `/volume` endpoints are used to query for and interact with volumes.
+---
+
+# Volumes HTTP API
+
+The `/volume` endpoints are used to query for and interact with volumes.
+
+## List Volumes
+
+This endpoint lists all volumes.
+
+| Method | Path | Produces |
+| ------ | ------------- | ------------------ |
+| `GET` | `/v1/volumes` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | --------------------------- |
+| `YES` | `namespace:csi-list-volume` |
+
+### Parameters
+
+- `type` `(string: "")` - Specifies the type of volume to
+  query. Currently only supports `csi`. This is specified as a query
+  string parameter. Returns an empty list if omitted.
+
+- `node_id` `(string: "")` - Specifies a string to filter volumes
+  based on a Node ID prefix. Because the value is decoded to bytes,
+  the prefix must have an even number of hexadecimal characters
+  (0-9a-f). This is specified as a query string parameter.
+
+- `plugin_id` `(string: "")` - Specifies a string to filter volumes
+  based on a plugin ID prefix. Because the value is decoded to bytes,
+  the prefix must have an even number of hexadecimal characters
+  (0-9a-f).
This is specified as a query string parameter.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    https://localhost:4646/v1/volumes?type=csi&node_id=foo&plugin_id=plugin-id1
+```
+
+### Sample Response
+
+```json
+[
+  {
+    "ID": "volume-id1",
+    "ExternalID": "volume-id1",
+    "Namespace": "default",
+    "Name": "volume id1",
+    "Topologies": [
+      {
+        "foo": "bar"
+      }
+    ],
+    "AccessMode": "multi-node-single-writer",
+    "AttachmentMode": "file-system",
+    "Schedulable": true,
+    "PluginID": "plugin-id1",
+    "Provider": "ebs",
+    "ControllerRequired": true,
+    "ControllersHealthy": 3,
+    "ControllersExpected": 3,
+    "NodesHealthy": 15,
+    "NodesExpected": 18,
+    "ResourceExhausted": 0,
+    "CreateIndex": 42,
+    "ModifyIndex": 64
+  }
+]
+```
+
+## Read Volume
+
+This endpoint reads information about a specific volume.
+
+| Method | Path | Produces |
+| ------ | --------------------------- | ------------------ |
+| `GET` | `/v1/volume/csi/:volume_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | --------------------------- |
+| `YES` | `namespace:csi-read-volume` |
+
+### Parameters
+
+- `:volume_id` `(string: <required>)` - Specifies the ID of the
+  volume. This must be the full ID. This is specified as part of the
+  path.
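The even-length requirement on the `node_id` and `plugin_id` prefix filters above exists because Nomad decodes the prefix to bytes. A minimal client-side sketch (not part of any official Nomad library; the function name is an assumption) that validates the filters before building the List Volumes query string:

```python
from urllib.parse import urlencode

def volumes_query(vol_type="csi", node_id=None, plugin_id=None):
    """Build the query string for GET /v1/volumes, enforcing the
    even-length hexadecimal rule on ID prefix filters."""
    params = {"type": vol_type}
    for name, prefix in (("node_id", node_id), ("plugin_id", plugin_id)):
        if prefix is None:
            continue
        # The prefix is decoded to bytes server-side, so it must contain
        # an even number of hexadecimal characters (0-9a-f).
        if len(prefix) % 2 or any(c not in "0123456789abcdef" for c in prefix):
            raise ValueError(f"{name} must be an even-length hex prefix: {prefix!r}")
        params[name] = prefix
    return "/v1/volumes?" + urlencode(params)
```

For example, `volumes_query(node_id="fb2170a8")` yields `/v1/volumes?type=csi&node_id=fb2170a8`, while an odd-length prefix such as `"fb2"` raises before any request is made.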
+ +### Sample Request + +```shell-session +$ curl \ + https://localhost:4646/v1/volume/csi/volume-id1 +``` + +### Sample Response + +```json +{ + "ID": "volume-id1", + "Name": "volume id1", + "Namespace": "default", + "ExternalID": "volume-id1", + "Topologies": [{ "foo": "bar" }], + "AccessMode": "multi-node-single-writer", + "AttachmentMode": "file-system", + "Allocations": [ + { + "ID": "a8198d79-cfdb-6593-a999-1e9adabcba2e", + "EvalID": "5456bd7a-9fc0-c0dd-6131-cbee77f57577", + "Name": "example.cache[0]", + "NodeID": "fb2170a8-257d-3c64-b14d-bc06cc94e34c", + "PreviousAllocation": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "NextAllocation": "cd13d9b9-4f97-7184-c88b-7b451981616b", + "RescheduleTracker": { + "Events": [ + { + "PrevAllocID": "516d2753-0513-cfc7-57ac-2d6fac18b9dc", + "PrevNodeID": "9230cd3b-3bda-9a3f-82f9-b2ea8dedb20e", + "RescheduleTime": 1517434161192946200, + "Delay": "5000000000" + } + ] + }, + "JobID": "example", + "TaskGroup": "cache", + "DesiredStatus": "run", + "DesiredDescription": "", + "ClientStatus": "running", + "ClientDescription": "", + "TaskStates": { + "redis": { + "State": "running", + "FinishedAt": "0001-01-01T00:00:00Z", + "LastRestart": "0001-01-01T00:00:00Z", + "Restarts": 0, + "StartedAt": "2017-07-25T23:36:26.106431265Z", + "Failed": false, + "Events": [ + { + "Type": "Received", + "Time": 1495747371795703800, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + "KillReason": "", + "StartDelay": 0, + "DownloadError": "", + "ValidationError": "", + "DiskLimit": 0, + "FailedSibling": "", + "VaultError": "", + "TaskSignalReason": "", + "TaskSignal": "", + "DriverMessage": "" + }, + { + "Type": "Driver", + "Time": 1495747371798867200, + "FailsTask": false, + "RestartReason": "", + "SetupError": "", + "DriverError": "", + "ExitCode": 0, + "Signal": 0, + "Message": "", + "KillTimeout": 0, + "KillError": "", + 
              "KillReason": "",
+              "StartDelay": 0,
+              "DownloadError": "",
+              "ValidationError": "",
+              "DiskLimit": 0,
+              "FailedSibling": "",
+              "VaultError": "",
+              "TaskSignalReason": "",
+              "TaskSignal": "",
+              "DriverMessage": "Downloading image redis:3.2"
+            },
+            {
+              "Type": "Started",
+              "Time": 1495747379525667800,
+              "FailsTask": false,
+              "RestartReason": "",
+              "SetupError": "",
+              "DriverError": "",
+              "ExitCode": 0,
+              "Signal": 0,
+              "Message": "",
+              "KillTimeout": 0,
+              "KillError": "",
+              "KillReason": "",
+              "StartDelay": 0,
+              "DownloadError": "",
+              "ValidationError": "",
+              "DiskLimit": 0,
+              "FailedSibling": "",
+              "VaultError": "",
+              "TaskSignalReason": "",
+              "TaskSignal": "",
+              "DriverMessage": ""
+            }
+          ]
+        }
+      },
+      "CreateIndex": 54,
+      "ModifyIndex": 57,
+      "CreateTime": 1495747371794276400,
+      "ModifyTime": 1495747371794276400
+    }
+  ],
+  "Schedulable": true,
+  "PluginID": "plugin-id1",
+  "Provider": "ebs",
+  "Version": "1.0.1",
+  "ControllerRequired": true,
+  "ControllersHealthy": 3,
+  "ControllersExpected": 3,
+  "NodesHealthy": 15,
+  "NodesExpected": 18,
+  "ResourceExhausted": 0,
+  "CreateIndex": 42,
+  "ModifyIndex": 64
+}
+```
+
+## Register Volume
+
+This endpoint registers an external volume with Nomad. It is an error
+to register an existing volume.
+
+| Method | Path | Produces |
+| ------ | --------------------------- | ------------------ |
+| `PUT` | `/v1/volume/csi/:volume_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ---------------------------- |
+| `NO` | `namespace:csi-write-volume` |
+
+### Parameters
+
+- `:volume_id` `(string: <required>)` - Specifies the ID of the
+  volume. This must be the full ID. This is specified as part of the
+  path.
+
+### Sample Payload
+
+The payload must include a JSON document that describes the volume's
+parameters.
+
+```json
+{
+  "ID": "volume-id1",
+  "Name": "volume id1",
+  "Namespace": "default",
+  "ExternalID": "volume-id1",
+  "Topologies": [{ "foo": "bar" }],
+  "AccessMode": "multi-node-single-writer",
+  "AttachmentMode": "file-system",
+  "PluginID": "plugin-id1"
+}
+```
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request PUT \
+    --data @payload.json \
+    https://localhost:4646/v1/volume/csi/volume-id1
+```
+
+## Delete Volume
+
+This endpoint deregisters an external volume with Nomad. It is an
+error to deregister a volume that is in use.
+
+| Method | Path | Produces |
+| -------- | --------------------------- | ------------------ |
+| `DELETE` | `/v1/volume/csi/:volume_id` | `application/json` |
+
+The table below shows this endpoint's support for
+[blocking queries](/api-docs#blocking-queries) and
+[required ACLs](/api-docs#acls).
+
+| Blocking Queries | ACL Required |
+| ---------------- | ---------------------------- |
+| `NO` | `namespace:csi-write-volume` |
+
+### Parameters
+
+- `:volume_id` `(string: <required>)` - Specifies the ID of the
+  volume. This must be the full ID. This is specified as part of the
+  path.
+
+### Sample Request
+
+```shell-session
+$ curl \
+    --request DELETE \
+    https://localhost:4646/v1/volume/csi/volume-id1
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/bootstrap.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/bootstrap.mdx
new file mode 100644
index 0000000000..836c720e09
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/bootstrap.mdx
@@ -0,0 +1,40 @@
+---
+layout: docs
+page_title: 'Commands: acl bootstrap'
+sidebar_title: bootstrap
+description: |
+  The bootstrap command is used to create the initial ACL token.
+---
+
+# Command: acl bootstrap
+
+The `acl bootstrap` command is used to bootstrap the initial ACL token.
+
+## Usage
+
+```plaintext
+nomad acl bootstrap [options]
+```
+
+The `acl bootstrap` command requires no arguments.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Bootstrap the initial token:
+
+```shell-session
+$ nomad acl bootstrap
+Accessor ID  = 5b7fd453-d3f7-6814-81dc-fcfe6daedea5
+Secret ID    = 9184ec35-65d4-9258-61e3-0c066d0a45c5
+Name         = Bootstrap Token
+Type         = management
+Global       = true
+Policies     = n/a
+Create Time  = 2017-09-11 17:38:10.999089612 +0000 UTC
+Create Index = 7
+Modify Index = 7
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/index.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/index.mdx
new file mode 100644
index 0000000000..9f83e24f31
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/index.mdx
@@ -0,0 +1,43 @@
+---
+layout: docs
+page_title: 'Commands: acl'
+sidebar_title: acl
+description: |
+  The acl command is used to interact with ACL policies and tokens.
+---
+
+# Command: acl
+
+The `acl` command is used to interact with ACL policies and tokens. Learn more
+about using Nomad's ACL system in the [Secure Nomad with Access Control
+guide][secure-guide].
+
+## Usage
+
+Usage: `nomad acl <subcommand> [options]`
+
+Run `nomad acl <subcommand> -h` for help on that subcommand.
The following
+subcommands are available:
+
+- [`acl bootstrap`][bootstrap] - Bootstrap the initial ACL token
+- [`acl policy apply`][policyapply] - Create or update ACL policies
+- [`acl policy delete`][policydelete] - Delete an existing ACL policy
+- [`acl policy info`][policyinfo] - Fetch information on an existing ACL policy
+- [`acl policy list`][policylist] - List available ACL policies
+- [`acl token create`][tokencreate] - Create a new ACL token
+- [`acl token delete`][tokendelete] - Delete an existing ACL token
+- [`acl token info`][tokeninfo] - Get info on an existing ACL token
+- [`acl token self`][tokenself] - Get info on the currently set ACL token
+- [`acl token update`][tokenupdate] - Update an existing ACL token
+
+[bootstrap]: /docs/commands/acl/bootstrap
+[policyapply]: /docs/commands/acl/policy-apply
+[policydelete]: /docs/commands/acl/policy-delete
+[policyinfo]: /docs/commands/acl/policy-info
+[policylist]: /docs/commands/acl/policy-list
+[tokencreate]: /docs/commands/acl/token-create
+[tokenupdate]: /docs/commands/acl/token-update
+[tokendelete]: /docs/commands/acl/token-delete
+[tokeninfo]: /docs/commands/acl/token-info
+[tokenself]: /docs/commands/acl/token-self
+[secure-guide]: https://learn.hashicorp.com/nomad/acls/fundamentals
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/policy-apply.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/policy-apply.mdx
new file mode 100644
index 0000000000..3f7c8d7cc3
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/policy-apply.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: 'Commands: acl policy apply'
+sidebar_title: policy apply
+description: |
+  The policy apply command is used to create or update ACL policies.
+---
+
+# Command: acl policy apply
+
+The `acl policy apply` command is used to create or update ACL policies.
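This CLI command maps onto the Create or Update Policy HTTP endpoint (`POST /v1/acl/policy/:policy_name`) described in the API docs. A rough standard-library sketch of the request the CLI effectively issues — the address, function name, and rule string here are placeholder values:

```python
import json
from urllib.request import Request

def policy_apply_request(name, rules, description="", addr="http://127.0.0.1:4646"):
    """Build the POST request equivalent of `nomad acl policy apply`."""
    payload = {"Name": name, "Description": description, "Rules": rules}
    return Request(
        f"{addr}/v1/acl/policy/{name}",     # policy name appears in the path
        data=json.dumps(payload).encode(),  # and again in the JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = policy_apply_request("my-policy", 'namespace "default" { policy = "read" }')
```

Actually sending `req` (for example with `urllib.request.urlopen`) requires a management token, per the endpoint's ACL requirements.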
+
+## Usage
+
+```plaintext
+nomad acl policy apply [options] <name> <path>
+```
+
+The `acl policy apply` command requires two arguments: the policy name and the
+path to a policy file. The policy can be read from stdin by setting the path to "-".
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Apply Options
+
+- `-description`: Sets the human readable description for the ACL policy.
+
+## Examples
+
+Create a new ACL Policy:
+
+```shell-session
+$ nomad acl policy apply my-policy my-policy.json
+Successfully wrote 'my-policy' ACL policy!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/policy-delete.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/policy-delete.mdx
new file mode 100644
index 0000000000..95d88b9710
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/policy-delete.mdx
@@ -0,0 +1,32 @@
+---
+layout: docs
+page_title: 'Commands: acl policy delete'
+sidebar_title: policy delete
+description: |
+  The policy delete command is used to delete an existing ACL policy.
+---
+
+# Command: acl policy delete
+
+The `acl policy delete` command is used to delete an existing ACL policy.
+
+## Usage
+
+```plaintext
+nomad acl policy delete <name>
+```
+
+The `acl policy delete` command requires the policy name as an argument.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Delete an ACL Policy:
+
+```shell-session
+$ nomad acl policy delete my-policy
+Successfully deleted 'my-policy' ACL policy!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/policy-info.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/policy-info.mdx
new file mode 100644
index 0000000000..d2bf26165f
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/policy-info.mdx
@@ -0,0 +1,42 @@
+---
+layout: docs
+page_title: 'Commands: acl policy info'
+sidebar_title: policy info
+description: >
+  The policy info command is used to fetch information on an existing ACL
+  policy.
+---
+
+# Command: acl policy info
+
+The `acl policy info` command is used to fetch information on an existing ACL
+policy.
+
+## Usage
+
+```plaintext
+nomad acl policy info <name>
+```
+
+The `acl policy info` command requires the policy name.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Fetch information on an existing ACL Policy:
+
+```shell-session
+$ nomad acl policy info my-policy
+Name        = my-policy
+Description =
+Rules       = {
+  "Name": "my-policy",
+  "Description": "This is a great policy",
+  "Rules": "list_jobs"
+}
+CreateIndex = 749
+ModifyIndex = 758
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/policy-list.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/policy-list.mdx
new file mode 100644
index 0000000000..417e797b73
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/policy-list.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: 'Commands: acl policy list'
+sidebar_title: policy list
+description: |
+  The policy list command is used to list available ACL policies.
+---
+
+# Command: acl policy list
+
+The `acl policy list` command is used to list available ACL policies.
+
+## Usage
+
+```plaintext
+nomad acl policy list
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Options
+
+- `-json` : Output the policies in their JSON format.
+- `-t` : Format and display the policies using a Go template.
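The `-json` output mirrors the List Policies API response (objects with `Name`, `Description`, `CreateIndex`, and `ModifyIndex` fields), so the default table view can be rebuilt from it. A small sketch assuming only that response shape (the function name is hypothetical):

```python
def render_policy_table(policies):
    """Render policy stubs in the two-column layout the CLI prints."""
    rows = [("Name", "Description")] + [
        (p["Name"], p.get("Description", "")) for p in policies
    ]
    width = max(len(name) for name, _ in rows) + 2  # pad the Name column
    return "\n".join(f"{name:<{width}}{desc}".rstrip() for name, desc in rows)

table = render_policy_table([
    {"Name": "policy-1", "Description": "The first policy"},
    {"Name": "policy-2", "Description": "The second policy"},
])
```

Here `table` reproduces the same column layout as the example listing on this page.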
+
+## Examples
+
+List all ACL policies:
+
+```shell-session
+$ nomad acl policy list
+Name      Description
+policy-1  The first policy
+policy-2  The second policy
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-create.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-create.mdx
new file mode 100644
index 0000000000..678b9756b8
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-create.mdx
@@ -0,0 +1,53 @@
+---
+layout: docs
+page_title: 'Commands: acl token create'
+sidebar_title: token create
+description: |
+  The token create command is used to create new ACL tokens.
+---
+
+# Command: acl token create
+
+The `acl token create` command is used to create new ACL tokens.
+
+## Usage
+
+```plaintext
+nomad acl token create [options]
+```
+
+The `acl token create` command requires no arguments.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Create Options
+
+- `-name`: Sets the human readable name for the ACL token.
+
+- `-type`: Sets the type of token. Must be one of "client" (default) or
+  "management".
+
+- `-global`: Toggles the global mode of the token. Global tokens are replicated
+  to all regions. Defaults to false.
+
+- `-policy`: Specifies a policy to associate with the token. Can be specified
+  multiple times, but only with client type tokens.
+
+## Examples
+
+Create a new ACL token:
+
+```shell-session
+$ nomad acl token create -name="my token" -policy=foo -policy=bar
+Accessor ID  = d532c40a-30f1-695c-19e5-c35b882b0efd
+Secret ID    = 85310d07-9afa-ef53-0933-0c043cd673c7
+Name         = my token
+Type         = client
+Global       = false
+Policies     = [foo bar]
+Create Time  = 2017-09-15 05:04:41.814954949 +0000 UTC
+Create Index = 8
+Modify Index = 8
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-delete.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-delete.mdx
new file mode 100644
index 0000000000..6bd65565aa
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-delete.mdx
@@ -0,0 +1,33 @@
+---
+layout: docs
+page_title: 'Commands: acl token delete'
+sidebar_title: token delete
+description: |
+  The token delete command is used to delete existing ACL tokens.
+---
+
+# Command: acl token delete
+
+The `acl token delete` command is used to delete existing ACL tokens.
+
+## Usage
+
+```plaintext
+nomad acl token delete <token_accessor_id>
+```
+
+The `acl token delete` command requires an existing token's accessor ID.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Delete an existing ACL token:
+
+```shell-session
+$ nomad acl token delete d532c40a-30f1-695c-19e5-c35b882b0efd
+
+Token d532c40a-30f1-695c-19e5-c35b882b0efd successfully deleted
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-info.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-info.mdx
new file mode 100644
index 0000000000..6adcb7d5b8
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-info.mdx
@@ -0,0 +1,41 @@
+---
+layout: docs
+page_title: 'Commands: acl token info'
+sidebar_title: token info
+description: >
+  The token info command is used to fetch information about an existing ACL
+  token.
+---
+
+# Command: acl token info
+
+The `acl token info` command is used to fetch information about an existing ACL token.
+
+## Usage
+
+```plaintext
+nomad acl token info <token_accessor_id>
+```
+
+The `acl token info` command requires an existing token's accessor ID.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Fetch information about an existing ACL token:
+
+```shell-session
+$ nomad acl token info d532c40a-30f1-695c-19e5-c35b882b0efd
+Accessor ID  = d532c40a-30f1-695c-19e5-c35b882b0efd
+Secret ID    = 85310d07-9afa-ef53-0933-0c043cd673c7
+Name         = my token
+Type         = client
+Global       = false
+Policies     = [foo bar]
+Create Time  = 2017-09-15 05:04:41.814954949 +0000 UTC
+Create Index = 8
+Modify Index = 8
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-list.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-list.mdx
new file mode 100644
index 0000000000..1599643556
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-list.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: 'Commands: acl token list'
+sidebar_title: token list
+description: |
+  The token list command is used to list existing ACL tokens.
+---
+
+# Command: acl token list
+
+The `acl token list` command is used to list existing ACL tokens.
+
+## Usage
+
+```plaintext
+nomad acl token list
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Options
+
+- `-json` : Output the tokens in their JSON format.
+- `-t` : Format and display the tokens using a Go template.
+ +## Examples + +List all ACL tokens: + +```shell-session +$ nomad acl token list +Name Type Global Accessor ID +Bootstrap Token management true 32b61154-47f1-3694-1430-a5544bafcd3e + client false fcf2bf84-a257-8f39-9d16-a954ed25b5be +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-self.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-self.mdx new file mode 100644 index 0000000000..a3544208d1 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-self.mdx @@ -0,0 +1,42 @@ +--- +layout: docs +page_title: 'Commands: acl token self' +sidebar_title: token self +description: > + The token self command is used to fetch information about the currently set + ACL token. +--- + +# Command: acl token self + +The `acl token self` command is used to fetch information about the currently +set ACL token. + +## Usage + +```plaintext +nomad acl token self +``` + +## General Options + +@include 'general_options.mdx' + +## Examples + +Fetch information about an existing ACL token: + +```shell-session +$ export NOMAD_TOKEN=85310d07-9afa-ef53-0933-0c043cd673c7 + +$ nomad acl token self +Accessor ID = d532c40a-30f1-695c-19e5-c35b882b0efd +Secret ID = 85310d07-9afa-ef53-0933-0c043cd673c7 +Name = my token +Type = client +Global = false +Policies = [foo bar] +Create Time = 2017-09-15 05:04:41.814954949 +0000 UTC +Create Index = 8 +Modify Index = 8 +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/acl/token-update.mdx b/content/nomad/v0.11.x/content/docs/commands/acl/token-update.mdx new file mode 100644 index 0000000000..6f5004ff68 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/acl/token-update.mdx @@ -0,0 +1,53 @@ +--- +layout: docs +page_title: 'Commands: acl token update' +sidebar_title: token update +description: | + The token update command is used to update existing ACL tokens. +--- + +# Command: acl token update + +The `acl token update` command is used to update existing ACL tokens. 
+
+## Usage
+
+```plaintext
+nomad acl token update [options] <token_accessor_id>
+```
+
+The `acl token update` command requires an existing token's accessor ID.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Update Options
+
+- `-name`: Sets the human readable name for the ACL token.
+
+- `-type`: Sets the type of token. Must be one of "client" (default) or
+  "management".
+
+- `-global`: Toggles the global mode of the token. Global tokens are replicated
+  to all regions. Defaults to false.
+
+- `-policy`: Specifies a policy to associate with the token. Can be specified
+  multiple times, but only with client type tokens.
+
+## Examples
+
+Update an existing ACL token:
+
+```shell-session
+$ nomad acl token update -name="my updated token" -policy=foo -policy=bar d532c40a-30f1-695c-19e5-c35b882b0efd
+Accessor ID  = d532c40a-30f1-695c-19e5-c35b882b0efd
+Secret ID    = 85310d07-9afa-ef53-0933-0c043cd673c7
+Name         = my updated token
+Type         = client
+Global       = false
+Policies     = [foo bar]
+Create Time  = 2017-09-15 05:04:41.814954949 +0000 UTC
+Create Index = 8
+Modify Index = 8
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/agent-info.mdx b/content/nomad/v0.11.x/content/docs/commands/agent-info.mdx
new file mode 100644
index 0000000000..ddbbe6ac48
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/agent-info.mdx
@@ -0,0 +1,80 @@
+---
+layout: docs
+page_title: 'Commands: agent-info'
+sidebar_title: agent-info
+description: |
+  Display information and status of a running agent.
+---
+
+# Command: agent-info
+
+The `agent-info` command dumps metrics and status information of a running
+agent. The information displayed pertains to the specific agent the CLI
+is connected to. This is useful for troubleshooting and performance monitoring.
+
+## Usage
+
+```plaintext
+nomad agent-info [options]
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Output
+
+Depending on the agent queried, information from different subsystems is
+returned.
These subsystems are described below: + +- client - Status of the local Nomad client +- nomad - Status of the local Nomad server +- serf - Gossip protocol metrics and information +- raft - Status information about the Raft consensus protocol +- runtime - Various metrics from the runtime environment + +## Examples + +```shell-session +$ nomad agent-info +raft + commit_index = 0 + fsm_pending = 0 + last_contact = never + last_snapshot_term = 0 + state = Follower + term = 0 + applied_index = 0 + last_log_index = 0 + last_log_term = 0 + last_snapshot_index = 0 + num_peers = 0 +runtime + cpu_count = 4 + goroutines = 43 + kernel.name = darwin + max_procs = 4 + version = go1.5 + arch = amd64 +serf + intent_queue = 0 + member_time = 1 + query_queue = 0 + event_time = 1 + event_queue = 0 + failed = 0 + left = 0 + members = 1 + query_time = 1 + encrypted = false +client + heartbeat_ttl = 0 + known_servers = 0 + last_heartbeat = 9223372036854775807 + num_allocations = 0 +nomad + bootstrap = false + known_regions = 1 + leader = false + server = true +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/agent.mdx b/content/nomad/v0.11.x/content/docs/commands/agent.mdx new file mode 100644 index 0000000000..6970eb8213 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/agent.mdx @@ -0,0 +1,225 @@ +--- +layout: docs +page_title: 'Commands: agent' +sidebar_title: agent +description: | + The agent command is the main entrypoint to running a Nomad client or server. +--- + +# Command: agent + +The agent command is the heart of Nomad: it runs the agent that handles client +or server functionality, including exposing interfaces for client consumption +and running jobs. + +Due to the power and flexibility of this command, the Nomad agent is documented +in its own section. See the [Nomad Agent] guide and the [Configuration] +documentation section for more information on how to use this command and the +options it has. 
+
+## Command-line Options
+
+A subset of the available Nomad agent configuration can optionally be passed in
+via CLI arguments. The `agent` command accepts the following arguments:
+
+- `-alloc-dir=`: Equivalent to the Client [alloc_dir] config
+  option.
+
+- `-acl-enabled`: Equivalent to the ACL [enabled] config option.
+
+- `-acl-replication-token`: Equivalent to the ACL [replication_token] config
+  option.
+
+- `-bind=<address>`: Equivalent to the [bind_addr] config option.
+
+- `-bootstrap-expect=`: Equivalent to the
+  [bootstrap_expect] config option.
+
+- `-client`: Enable client mode on the local agent.
+
+- `-config=`: Specifies the path to a configuration file or a directory of
+  configuration files to load. Can be specified multiple times.
+
+- `-consul-address=`: Equivalent to the [address] config option.
+
+- `-consul-auth=`: Equivalent to the [auth] config option.
+
+- `-consul-auto-advertise`: Equivalent to the [auto_advertise] config option.
+
+- `-consul-ca-file=`: Equivalent to the [ca_file] config option.
+
+- `-consul-cert-file=`: Equivalent to the [cert_file] config option.
+
+- `-consul-checks-use-advertise`: Equivalent to the [checks_use_advertise]
+  config option.
+
+- `-consul-client-auto-join`: Equivalent to the [client_auto_join] config
+  option.
+
+- `-consul-client-service-name=`: Equivalent to the [client_service_name]
+  config option.
+
+- `-consul-client-http-check-name=`: Equivalent to the
+  [client_http_check_name] config option.
+
+- `-consul-key-file=`: Equivalent to the [key_file] config option.
+
+- `-consul-server-service-name=`: Equivalent to the [server_service_name]
+  config option.
+
+- `-consul-server-http-check-name=`: Equivalent to the
+  [server_http_check_name] config option.
+
+- `-consul-server-serf-check-name=`: Equivalent to the
+  [server_serf_check_name] config option.
+
+- `-consul-server-rpc-check-name=`: Equivalent to the
+  [server_rpc_check_name] config option.
+
+- `-consul-server-auto-join`: Equivalent to the [server_auto_join] config
+  option.
+
+- `-consul-ssl`: Equivalent to the [ssl] config option.
+
+- `-consul-token=`: Equivalent to the [token] config option.
+
+- `-consul-verify-ssl`: Equivalent to the [verify_ssl] config option.
+
+- `-data-dir=`: Equivalent to the [data_dir] config option.
+
+- `-dc=`: Equivalent to the [datacenter] config option.
+
+- `-dev`: Start the agent in development mode.
This enables a pre-configured
+  dual-role agent (client + server) which is useful for developing or testing
+  Nomad. No other configuration is required to start the agent in this mode,
+  but you may pass an optional comma-separated list of mode configurations:
+
+- `-dev-connect`: Start the agent in development mode, but bind to a public
+  network interface rather than localhost for using Consul Connect. This mode
+  is supported only on Linux as root.
+
+- `-encrypt`: Set the Serf encryption key. See the [Encryption Overview] for
+  more details.
+
+- `-join=<address>`: Address of another agent to join upon starting up. This can
+  be specified multiple times to specify multiple agents to join.
+
+- `-log-level=`: Equivalent to the [log_level] config option.
+
+- `-log-json`: Equivalent to the [log_json] config option.
+
+- `-meta=`: Equivalent to the Client [meta] config option.
+
+- `-network-interface=`: Equivalent to the Client
+  [network_interface] config option.
+
+- `-network-speed=`: Equivalent to the Client
+  [network_speed] config option.
+
+- `-node=`: Equivalent to the [name] config option.
+
+- `-node-class=`: Equivalent to the Client [node_class]
+  config option.
+
+- `-plugin-dir=`: Equivalent to the [plugin_dir] config option.
+
+- `-region=`: Equivalent to the [region] config option.
+
+- `-rejoin`: Equivalent to the [rejoin_after_leave] config option.
+
+- `-retry-interval`: Equivalent to the [retry_interval] config option.
+
+- `-retry-join`: Similar to `-join` but allows retrying a join if the first
+  attempt fails.
+
+  ```shell-session
+  $ nomad agent -retry-join "127.0.0.1:4648"
+  ```
+
+  `retry-join` can be defined as a command line flag only for servers. Clients
+  can configure `retry-join` only in configuration files.
+
+- `-retry-max`: Similar to the [retry_max] config option.
+
+- `-server`: Enable server mode on the local agent.
+
+- `-servers=`: Equivalent to the Client [servers] config
+  option.
+
+- `-state-dir=`: Equivalent to the Client [state_dir] config
+  option.
+
+- `-vault-enabled`: Whether to enable or disable Vault integration.
+
+- `-vault-address=`: The address to communicate with Vault.
+
+- `-vault-token=`: The Vault token used to derive tokens. Only needs to
+  be set on Servers. Overrides the Vault token read from the VAULT_TOKEN
+  environment variable.
+
+- `-vault-create-from-role=`: The role name to create tokens for tasks
+  from.
+
+- `-vault-ca-file=`: Path to a PEM-encoded CA cert file used to verify the
+  Vault server SSL certificate.
+
+- `-vault-ca-path=`: Path to a directory of PEM-encoded CA cert files used
+  to verify the Vault server SSL certificate.
+
+- `-vault-cert-file=`: The path to the certificate for Vault communication.
+
+- `-vault-key-file=`: The path to the private key for Vault communication.
+
+- `-vault-namespace=`: The Vault namespace used for the integration.
+  Required for servers and clients. Overrides the Vault namespace read from the
+  VAULT_NAMESPACE environment variable.
+
+- `-vault-tls-skip-verify`: A boolean that determines whether to skip SSL
+  certificate verification.
+
+- `-vault-tls-server-name=`: Used to set the SNI host when connecting to
+  Vault over TLS.
+
+[address]: /docs/configuration/consul#address
+[alloc_dir]: /docs/configuration/client/#alloc_dir
+[auth]: /docs/configuration/consul#auth
+[auto_advertise]: /docs/configuration/consul#auto_advertise
+[bind_addr]: /docs/configuration/#bind_addr
+[bootstrap_expect]: /docs/configuration/server/#bootstrap_expect
+[ca_file]: /docs/configuration/consul#ca_file
+[cert_file]: /docs/configuration/consul#cert_file
+[checks_use_advertise]: /docs/configuration/consul#checks_use_advertise
+[client_auto_join]: /docs/configuration/consul#client_auto_join
+[client_http_check_name]: /docs/configuration/consul#client_http_check_name
+[client_service_name]: /docs/configuration/consul#client_service_name
+[configuration]: /docs/configuration
+[data_dir]: /docs/configuration#data_dir
+[datacenter]: /docs/configuration/#datacenter
+[enabled]: /docs/configuration/acl#enabled
+[encryption overview]: https://learn.hashicorp.com/nomad/transport-security/gossip-encryption
+[key_file]: /docs/configuration/consul#key_file
+[log_json]: /docs/configuration#log_json
+[log_level]: /docs/configuration#log_level
+[meta]: /docs/configuration/client/#meta
+[name]: /docs/configuration/#name
+[network_interface]: /docs/configuration/client/#network_interface
+[network_speed]:
/docs/configuration/client/#network_speed
+[node_class]: /docs/configuration/client/#node_class
+[nomad agent]: /docs/install/production/nomad-agent
+[plugin_dir]: /docs/configuration#plugin_dir
+[region]: /docs/configuration/#region
+[rejoin_after_leave]: /docs/configuration/server/#rejoin_after_leave
+[replication_token]: /docs/configuration/acl#replication_token
+[retry_interval]: /docs/configuration/server/#retry_interval
+[retry_max]: /docs/configuration/server/#retry_max
+[server_auto_join]: /docs/configuration/consul#server_auto_join
+[server_http_check_name]: /docs/configuration/consul#server_http_check_name
+[server_rpc_check_name]: /docs/configuration/consul#server_rpc_check_name
+[server_serf_check_name]: /docs/configuration/consul#server_serf_check_name
+[server_service_name]: /docs/configuration/consul#server_service_name
+[servers]: /docs/configuration/client/#servers
+[ssl]: /docs/configuration/consul#ssl
+[state_dir]: /docs/configuration/client/#state_dir
+[token]: /docs/configuration/consul#token
+[verify_ssl]: /docs/configuration/consul#verify_ssl
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/exec.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/exec.mdx
new file mode 100644
index 0000000000..6b84e32bc5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/exec.mdx
@@ -0,0 +1,136 @@
+---
+layout: docs
+page_title: 'Commands: alloc exec'
+sidebar_title: exec
+description: |
+  Runs a command in a container.
+---
+
+# Command: alloc exec
+
+**Alias: `nomad exec`**
+
+The `alloc exec` command runs a command in a running allocation.
+
+## Usage
+
+```plaintext
+nomad alloc exec [options] <allocation> <command> [<args>...]
+```
+
+The `nomad exec` command can be used to run commands inside a running
+task/allocation.
+
+Use cases include inspecting container state and debugging a failed application
+without needing SSH access into the node that's running the allocation.
+
+This command executes the command in the given task in the allocation.
If the
+allocation is only running a single task, the task name can be omitted.
+Optionally, the `-job` option may be used in which case a random allocation from
+the given job will be chosen.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Exec Options
+
+- `-task`: Sets the task to exec the command in.
+
+- `-job`: Use a random allocation from the specified job ID.
+
+- `-i`: Pass stdin to the container, defaults to true. Pass `-i=false` to
+  disable explicitly.
+
+- `-t`: Allocate a pseudo-tty, defaults to true if stdin is detected to be a tty
+  session. Pass `-t=false` to disable explicitly.
+
+- `-e <escape_char>`: Sets the escape character for sessions with a pty
+  (default: '~'). The escape character is only recognized at the beginning of a
+  line. The escape character followed by a dot ('.') closes the connection.
+  Setting the character to 'none' disables any escapes and makes the session
+  fully transparent.
+
+## Examples
+
+To start an interactive debugging session in a particular alloc, invoke the exec
+command with your desired shell available inside the task:
+
+```shell-session
+$ nomad alloc exec eb17e557 /bin/bash
+root@eb17e557:/data# # now run any debugging commands inside container
+root@eb17e557:/data# # ps -ef
+```
+
+To run a command and stream results without starting an interactive shell, you
+can pass the command and its arguments to exec directly:
+
+```shell-session
+# run commands without starting an interactive session
+$ nomad alloc exec eb17e557 cat /etc/resolv.conf
+...
+```
+
+When passing command arguments to be evaluated in the task, you may need to
+ensure that your host shell doesn't interpolate values before invoking the
+`exec` command. For example, the following command would return the environment
+variable on the operator shell rather than the task container:
+
+```shell-session
+$ nomad alloc exec eb17e557 echo $NOMAD_ALLOC_ID # wrong
+...
+```
+
+Here, we must start a shell in the task to interpolate `$NOMAD_ALLOC_ID`,
+either by quoting the command or by using the [heredoc syntax][heredoc]:
+
+```shell-session
+# by quoting the argument
+$ nomad alloc exec eb17e557 /bin/sh -c 'echo $NOMAD_ALLOC_ID'
+eb17e557-443e-4c51-c049-5bba7a9850bc
+
+$ # by using heredoc and passing command in stdin
+$ nomad alloc exec eb17e557 /bin/sh <<'EOF'
+> echo $NOMAD_ALLOC_ID
+> EOF
+eb17e557-443e-4c51-c049-5bba7a9850bc
+```
+
+This technique also applies when running a shell pipeline without streaming
+intermediate command output across the network:
+
+```shell-session
+# e.g. find the most frequent lines in some output
+$ nomad alloc exec eb17e557 /bin/sh -c 'cat /output | sort | uniq -c | sort -rn | head -n 5'
+...
+```
+
+## Using Job ID instead of Allocation ID
+
+Setting the `-job` flag causes a random allocation of the specified job to be
+selected.
+
+```plaintext
+nomad alloc exec -job <job-id> <command> [<args>...]
+```
+
+Choosing a specific allocation is useful for debugging issues with a specific
+instance of a service. For other operations, using the `-job` flag may be more
+convenient than looking up an allocation ID to use.
+
+## Disabling remote execution
+
+`alloc exec` is enabled by default to aid with debugging. Operators can disable
+the feature by setting the [`disable_remote_exec` client config
+option][disable_remote_exec_flag] on all clients, or on a subset of clients that
+run sensitive workloads.
+
+## Exec targeting a specific task
+
+When trying to `alloc exec` for a job that has more than one task associated
+with it, you may want to target a specific task.
+
+```shell-session
+# open a shell session in one of your allocation's tasks
+$ nomad alloc exec -i -t -task mytask a1827f93 /bin/bash
+a1827f93$
+```
+
+[heredoc]: http://tldp.org/LDP/abs/html/here-docs.html
+[disable_remote_exec_flag]: /docs/configuration/client#disable_remote_exec
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/fs.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/fs.mdx
new file mode 100644
index 0000000000..25a404924d
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/fs.mdx
@@ -0,0 +1,110 @@
+---
+layout: docs
+page_title: 'Commands: alloc fs'
+sidebar_title: fs
+description: |
+  Introspect an allocation directory on a Nomad client
+---
+
+# Command: alloc fs
+
+**Alias: `nomad fs`**
+
+The `alloc fs` command allows a user to navigate an allocation directory on a
+Nomad client. The following functionalities are available: `cat`, `tail`, `ls`
+and `stat`.
+
+- `cat`: If the target path is a file, Nomad will `cat` the file.
+
+- `tail`: If the target path is a file and the `-tail` flag is specified, Nomad
+  will `tail` the file.
+
+- `ls`: If the target path is a directory, Nomad displays the names of files and
+  directories and their associated information.
+
+- `stat`: If the `-stat` flag is used, Nomad will display information about a
+  file.
+
+## Usage
+
+```plaintext
+nomad alloc fs [options] <allocation> <path>
+```
+
+This command accepts a single allocation ID (unless the `-job` flag is
+specified, in which case an allocation is chosen from the given job) and a path.
+The path is relative to the root of the allocation directory. The path is
+optional and it defaults to `/` of the allocation directory.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Fs Options
+
+- `-H`: Machine friendly output.
+
+- `-verbose`: Display verbose output.
+
+- `-job`: Use a random allocation from the specified job, preferring a running
+  allocation.
+
+- `-stat`: Show stat information instead of displaying the file, or listing the
+  directory.
+
+- `-f`: Causes the output to not stop when the end of the file is reached, but
+  rather to wait for additional output.
+
+- `-tail`: Show the file's contents with offsets relative to the end of the
+  file. If no offset is given, `-n` is defaulted to 10.
+
+- `-n`: Sets the tail location in best-efforted number of lines relative to the
+  end of the file.
+
+- `-c`: Sets the tail location in number of bytes relative to the end of the file.
+
+## Examples
+
+```shell-session
+$ nomad alloc fs eb17e557
+Mode        Size  Modified Time        Name
+drwxrwxr-x  4096  28 Jan 16 05:39 UTC  alloc/
+drwxrwxr-x  4096  28 Jan 16 05:39 UTC  redis/
+-rw-rw-r--  0     28 Jan 16 05:39 UTC  redis_exit_status
+
+
+$ nomad alloc fs eb17e557 redis/local
+Mode        Size  Modified Time        Name
+-rw-rw-rw-  0     28 Jan 16 05:39 UTC  redis.stderr
+-rw-rw-rw-  17    28 Jan 16 05:39 UTC  redis.stdout
+
+
+$ nomad alloc fs -stat eb17e557 redis/local/redis.stdout
+Mode        Size  Modified Time        Name
+-rw-rw-rw-  17    28 Jan 16 05:39 UTC  redis.stdout
+
+
+$ nomad alloc fs eb17e557 redis/local/redis.stdout
+foobar
+baz
+
+$ nomad alloc fs -tail -f -n 3 eb17e557 redis/local/redis.stdout
+foobar
+baz
+bam
+
+```
+
+## Using Job ID instead of Allocation ID
+
+Setting the `-job` flag causes a random allocation of the specified job to be
+selected. Nomad will prefer to select a running allocation ID for the job, but
+if no running allocations for the job are found, Nomad will use a dead
+allocation.
+
+```plaintext
+nomad alloc fs -job <job-id> <path>
+```
+
+This can be useful for debugging a job that has multiple allocations, and it is
+not required to observe a specific allocation.
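+
+For example, assuming the allocation shown above belongs to a job named
+`example` (a hypothetical name used for illustration), a session might look
+like the following, with output matching the earlier listing:
+
+```shell-session
+$ nomad alloc fs -job example redis/local
+Mode        Size  Modified Time        Name
+-rw-rw-rw-  0     28 Jan 16 05:39 UTC  redis.stderr
+-rw-rw-rw-  17    28 Jan 16 05:39 UTC  redis.stdout
+```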
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/index.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/index.mdx
new file mode 100644
index 0000000000..d6f2c4e73e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/index.mdx
@@ -0,0 +1,34 @@
+---
+layout: docs
+page_title: 'Commands: alloc'
+sidebar_title: alloc
+description: |
+  The alloc command is used to interact with allocations.
+---
+
+# Command: alloc
+
+The `alloc` command is used to interact with allocations.
+
+## Usage
+
+Usage: `nomad alloc <subcommand> [options]`
+
+Run `nomad alloc <subcommand> -h` for help on that subcommand. The following
+subcommands are available:
+
+- [`alloc exec`][exec] - Run a command in a running allocation
+- [`alloc fs`][fs] - Inspect the contents of an allocation directory
+- [`alloc logs`][logs] - Streams the logs of a task
+- [`alloc restart`][restart] - Restart a running allocation or task
+- [`alloc signal`][signal] - Signal a running allocation
+- [`alloc status`][status] - Display allocation status information and metadata
+- [`alloc stop`][stop] - Stop and reschedule a running allocation
+
+[exec]: /docs/commands/alloc/exec 'Run a command in a running allocation'
+[fs]: /docs/commands/alloc/fs 'Inspect the contents of an allocation directory'
+[logs]: /docs/commands/alloc/logs 'Streams the logs of a task'
+[restart]: /docs/commands/alloc/restart 'Restart a running allocation or task'
+[signal]: /docs/commands/alloc/signal 'Signal a running allocation'
+[status]: /docs/commands/alloc/status 'Display allocation status information and metadata'
+[stop]: /docs/commands/alloc/stop 'Stop and reschedule a running allocation'
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/logs.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/logs.mdx
new file mode 100644
index 0000000000..62ffbb746f
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/logs.mdx
@@ -0,0 +1,91 @@
+---
+layout: docs
+page_title: 'Commands: alloc logs'
+sidebar_title: logs
+description: |
+  Stream the logs of a task.
+---
+
+# Command: alloc logs
+
+**Alias: `nomad logs`**
+
+The `alloc logs` command displays the log of a given task.
+
+## Usage
+
+```plaintext
+nomad alloc logs [options] <allocation> <task>
+```
+
+This command streams the logs of the given task in the allocation. If the
+allocation is only running a single task, the task name can be omitted.
+Optionally, the `-job` option may be used in which case a random allocation from
+the given job will be chosen.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Logs Options
+
+- `-stderr`: Display stderr logs.
+
+- `-verbose`: Display verbose output.
+
+- `-job`: Use a random allocation from the specified job, preferring a running
+  allocation.
+
+- `-f`: Causes the output to not stop when the end of the logs is reached, but
+  rather to wait for additional output.
+
+- `-tail`: Show the logs' contents with offsets relative to the end of the
+  logs. If no offset is given, `-n` is defaulted to 10.
+
+- `-n`: Sets the tail location in best-efforted number of lines relative to the
+  end of the logs.
+
+- `-c`: Sets the tail location in number of bytes relative to the end of the
+  logs.
+
+## Examples
+
+```shell-session
+$ nomad alloc logs eb17e557 redis
+foobar
+baz
+bam
+
+$ nomad alloc logs -stderr eb17e557 redis
+[ERR]: foo
+[ERR]: bar
+
+$ nomad alloc logs -job example
+[ERR]: foo
+[ERR]: bar
+
+$ nomad alloc logs -tail -n 2 eb17e557 redis
+foobar
+baz
+
+$ nomad alloc logs -tail -f -n 3 eb17e557 redis
+foobar
+baz
+bam
+
+```
+
+## Using Job ID instead of Allocation ID
+
+Setting the `-job` flag causes a random allocation of the specified job to be
+selected. Nomad will prefer to select a running allocation ID for the job, but
+if no running allocations for the job are found, Nomad will use a dead
+allocation.
+
+```plaintext
+nomad alloc logs -job <job-id> <task>
+```
+
+Choosing a specific allocation is useful for debugging issues with a specific
+instance of a service.
For other operations, using the `-job` flag may be more
+convenient than looking up an allocation ID to use.
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/restart.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/restart.mdx
new file mode 100644
index 0000000000..3f35090410
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/restart.mdx
@@ -0,0 +1,41 @@
+---
+layout: docs
+page_title: 'Commands: alloc restart'
+sidebar_title: restart
+description: |
+  Restart a running allocation or task
+---
+
+# Command: alloc restart
+
+The `alloc restart` command allows a user to perform an in-place restart of an
+entire allocation or individual task.
+
+## Usage
+
+```plaintext
+nomad alloc restart [options] <allocation> <task>
+```
+
+This command accepts a single allocation ID and a task name. The task name must
+be part of the allocation and the task must be currently running. The task name
+is optional and if omitted every task in the allocation will be restarted.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Restart Options
+
+- `-verbose`: Display verbose output.
+
+## Examples
+
+```shell-session
+$ nomad alloc restart eb17e557
+
+$ nomad alloc restart eb17e557 foo
+Could not find task named: foo, found:
+* test
+
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/signal.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/signal.mdx
new file mode 100644
index 0000000000..061c14b955
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/signal.mdx
@@ -0,0 +1,42 @@
+---
+layout: docs
+page_title: 'Commands: alloc signal'
+sidebar_title: signal
+description: |
+  Signal a running allocation or task
+---
+
+# Command: alloc signal
+
+The `alloc signal` command allows a user to perform an in-place signal of an
+entire allocation or individual task.
+
+## Usage
+
+```plaintext
+nomad alloc signal [options] <allocation> <task>
+```
+
+This command accepts a single allocation ID and a task name.
The task name must
+be part of the allocation and the task must be currently running. The task name
+is optional and if omitted every task in the allocation will be signaled.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Signal Options
+
+- `-s`: Signal to send to the tasks. Valid options depend on the driver.
+- `-verbose`: Display verbose output.
+
+## Examples
+
+```shell-session
+$ nomad alloc signal eb17e557
+
+$ nomad alloc signal eb17e557 foo
+Could not find task named: foo, found:
+* test
+
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/status.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/status.mdx
new file mode 100644
index 0000000000..27a103b800
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/alloc/status.mdx
@@ -0,0 +1,184 @@
+---
+layout: docs
+page_title: 'Commands: alloc status'
+sidebar_title: status
+description: |
+  Display status and metadata about existing allocations and their tasks.
+---
+
+# Command: alloc status
+
+The `alloc status` command displays status information and metadata
+about an existing allocation and its tasks. It can be useful while
+debugging to reveal the underlying reasons for scheduling decisions or
+failures, as well as the current state of its tasks. As of Nomad
+0.7.1, alloc status also shows allocation modification time in
+addition to create time. As of Nomad 0.8, alloc status shows
+information about reschedule attempts. As of Nomad 0.11, alloc status
+shows volume claims when a job claims volumes.
+
+## Usage
+
+```plaintext
+nomad alloc status [options] <allocation>
+```
+
+An allocation ID or prefix must be provided. If there is an exact match, the
+full details of the allocation will be displayed. Otherwise, a list of matching
+allocations and information will be displayed.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Alloc Status Options
+
+- `-short`: Display short output. Shows only the most recent task event.
+- `-verbose`: Show full information. +- `-json` : Output the allocation in its JSON format. +- `-t` : Format and display the allocation using a Go template. + +## Examples + +Short status of an alloc: + +```shell-session +$ nomad alloc status --short 0af996ed +ID = 0af996ed +Eval ID = be9bde98 +Name = example.cache[0] +Node ID = 43c0b14e +Job ID = example +Job Version = 0 +Client Status = running +Client Description = +Desired Status = run +Desired Description = +Created At = 07/25/17 16:12:48 UTC +Deployment ID = 0c83a3b1 +Deployment Health = healthy + +Tasks +Name State Last Event Time +redis running Started 07/25/17 16:12:48 UTC +web running Started 07/25/17 16:12:49 UTC +``` + +Full status of an alloc, which shows one of the tasks dying and then being restarted: + +```shell-session +$ nomad alloc status 0af996ed +ID = 0af996ed +Eval ID = be9bde98 +Name = example.cache[0] +Node ID = 43c0b14e +Job ID = example +Job Version = 0 +Client Status = running +Client Description = +Desired Status = run +Desired Description = +Created = 5m ago +Modified = 5m ago +Deployment ID = 0c83a3b1 +Deployment Health = healthy +Replacement Alloc ID = 0bc894ca +Reschedule Attempts = 1/3 + +Task "redis" is "running" +Task Resources +CPU Memory Disk Addresses +1/500 MHz 6.3 MiB/256 MiB 300 MiB db: 127.0.0.1:27908 + +CSI Volumes: +ID Plugin Provider Schedulable Mount Options +vol-4150af42 ebs0 aws.ebs true + +Task Events: +Started At = 07/25/17 16:12:48 UTC +Finished At = N/A +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +Time Type Description +07/25/17 16:12:48 UTC Started Task started by client +07/25/17 16:12:48 UTC Task Setup Building Task Directory +07/25/17 16:12:48 UTC Received Task received by client + +Task "web" is "running" +Task Resources +CPU Memory Disk Addresses +1/500 MHz 6.3 MiB/256 MiB 300 MiB db: 127.0.0.1:30572 + +Task Events: +Started At = 07/25/17 16:12:49 UTC +Finished At = N/A +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +07/25/17 
16:12:49 UTC Started Task started by client +07/25/17 16:12:48 UTC Task Setup Building Task Directory +07/25/17 16:12:48 UTC Received Task received by client +``` + +Verbose status can also be accessed: + +```shell-session +$ nomad alloc status -verbose 0af996ed +ID = 0af996ed-aff4-8ddb-a566-e55ebf8969c9 +Eval ID = be9bde98-0490-1beb-ced0-012d10ddf22e +Name = example.cache[0] +Node ID = 43c0b14e-7f96-e432-a7da-06605257ce0c +Job ID = example +Job Version = 0 +Client Status = running +Client Description = +Desired Status = run +Desired Description = +Created = 07/25/17 16:12:48 UTC +Modified = 07/25/17 16:12:48 UTC +Deployment ID = 0c83a3b1-8a7b-136b-0e11-8383dc6c9276 +Deployment Health = healthy +Reschedule Eligibility = 2m from now +Evaluated Nodes = 1 +Filtered Nodes = 0 +Exhausted Nodes = 0 +Allocation Time = 38.474µs +Failures = 0 + +Task "redis" is "running" +Task Resources +CPU Memory Disk Addresses +1/500 MHz 6.3 MiB/256 MiB 300 MiB db: 127.0.0.1:27908 + +Task Events: +Started At = 07/25/17 16:12:48 UTC +Finished At = N/A +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +Time Type Description +07/25/17 16:12:48 UTC Started Task started by client +07/25/17 16:12:48 UTC Task Setup Building Task Directory +07/25/17 16:12:48 UTC Received Task received by client + +Task "web" is "running" +Task Resources +CPU Memory Disk Addresses +1/500 MHz 6.3 MiB/256 MiB 300 MiB db: 127.0.0.1:30572 + +Task Events: +Started At = 07/25/17 16:12:49 UTC +Finished At = N/A +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +Time Type Description +07/25/17 16:12:49 UTC Started Task started by client +07/25/17 16:12:48 UTC Task Setup Building Task Directory +07/25/17 16:12:48 UTC Received Task received by client +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/alloc/stop.mdx b/content/nomad/v0.11.x/content/docs/commands/alloc/stop.mdx new file mode 100644 index 0000000000..646eb53285 --- /dev/null +++ 
b/content/nomad/v0.11.x/content/docs/commands/alloc/stop.mdx
@@ -0,0 +1,57 @@
+---
+layout: docs
+page_title: 'Commands: alloc stop'
+sidebar_title: stop
+description: |
+  Stop and reschedule a running allocation
+---
+
+# Command: alloc stop
+
+The `alloc stop` command allows a user to stop and reschedule a running
+allocation.
+
+## Usage
+
+```plaintext
+nomad alloc stop [options] <allocation>
+```
+
+The `alloc stop` command requires a single argument, specifying the alloc ID or
+prefix to stop. If there is an exact match based on the provided alloc ID or
+prefix, then the alloc will be stopped and rescheduled. Otherwise, a list of
+matching allocs and information will be displayed.
+
+Stop will issue a request to stop and reschedule the allocation. An interactive
+monitoring session will display log lines as the allocation completes shutting
+down. It is safe to exit the monitor early with ctrl-c.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Stop Options
+
+- `-detach`: Return immediately instead of entering monitor mode. After the
+  stop command is submitted, a new evaluation ID is printed to the
+  screen, which can be used to examine the rescheduling evaluation using the
+  [eval status] command.
+
+- `-verbose`: Display verbose output.
+
+## Examples
+
+```shell-session
+$ nomad alloc stop c1488bb5
+==> Monitoring evaluation "26172081"
+    Evaluation triggered by job "example"
+    Allocation "4dcb1c98" created: node "b4dc52b9", group "cache"
+    Evaluation within deployment: "c0c594d0"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "26172081" finished with status "complete"
+
+$ nomad alloc stop -detach eb17e557
+8a91f0f3-9d6b-ac83-479a-5aa186ab7795
+```
+
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/fail.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/fail.mdx
new file mode 100644
index 0000000000..b1eaa2c9d2
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/fail.mdx
@@ -0,0 +1,63 @@
+---
+layout: docs
+page_title: 'Commands: deployment fail'
+sidebar_title: fail
+description: |
+  The deployment fail command is used to manually fail a deployment.
+---
+
+# Command: deployment fail
+
+The `deployment fail` command is used to mark a deployment as failed. Failing a
+deployment will stop the placement of new allocations as part of a rolling
+deployment and, if the job is configured to auto revert, the job will attempt to
+roll back to a stable version.
+
+## Usage
+
+```plaintext
+nomad deployment fail [options] <deployment id>
+```
+
+The `deployment fail` command requires a single argument, a deployment ID or
+prefix.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Fail Options
+
+- `-detach`: Return immediately instead of monitoring. A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-verbose`: Show full information.
+
+## Examples
+
+Manually mark an ongoing deployment as failed:
+
+```shell-session
+$ nomad deployment fail 8990cfbc
+Deployment "8990cfbc-28c0-cb28-ca31-856cf691b987" failed
+
+==> Monitoring evaluation "a2d97ad5"
+    Evaluation triggered by job "example"
+    Evaluation within deployment: "8990cfbc"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "a2d97ad5" finished with status "complete"
+
+$ nomad deployment status 8990cfbc
+ID          = 8990cfbc
+Job ID      = example
+Job Version = 2
+Status      = failed
+Description = Deployment marked as failed
+
+Deployed
+Task Group  Desired  Placed  Healthy  Unhealthy
+cache       3        2       1        0
+```
+
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/index.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/index.mdx
new file mode 100644
index 0000000000..d5eb0bb24c
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/index.mdx
@@ -0,0 +1,32 @@
+---
+layout: docs
+page_title: 'Commands: deployment'
+sidebar_title: deployment
+description: |
+  The deployment command is used to interact with deployments.
+---
+
+# Command: deployment
+
+The `deployment` command is used to interact with deployments.
+
+## Usage
+
+Usage: `nomad deployment <subcommand> [options]`
+
+Run `nomad deployment <subcommand> -h` for help on that subcommand.
The following
+subcommands are available:
+
+- [`deployment fail`][fail] - Manually fail a deployment
+- [`deployment list`][list] - List all deployments
+- [`deployment pause`][pause] - Pause a deployment
+- [`deployment promote`][promote] - Promote canaries in a deployment
+- [`deployment resume`][resume] - Resume a paused deployment
+- [`deployment status`][status] - Display the status of a deployment
+
+[fail]: /docs/commands/deployment/fail 'Manually fail a deployment'
+[list]: /docs/commands/deployment/list 'List all deployments'
+[pause]: /docs/commands/deployment/pause 'Pause a deployment'
+[promote]: /docs/commands/deployment/promote 'Promote canaries in a deployment'
+[resume]: /docs/commands/deployment/resume 'Resume a paused deployment'
+[status]: /docs/commands/deployment/status 'Display the status of a deployment'
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/list.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/list.mdx
new file mode 100644
index 0000000000..9930700c5b
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/list.mdx
@@ -0,0 +1,41 @@
+---
+layout: docs
+page_title: 'Commands: deployment list'
+sidebar_title: list
+description: |
+  The deployment list command is used to list deployments.
+---
+
+# Command: deployment list
+
+The `deployment list` command is used to list all deployments.
+
+## Usage
+
+```plaintext
+nomad deployment list [options]
+```
+
+The `deployment list` command requires no arguments.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Options
+
+- `-json` : Output the deployments in their JSON format.
+- `-t` : Format and display the deployments using a Go template.
+- `-verbose`: Show full information.
+
+## Examples
+
+List all tracked deployments:
+
+```shell-session
+$ nomad deployment list
+ID        Job ID   Job Version  Status      Description
+8990cfbc  example  2            failed      Deployment marked as failed
+62eb607c  example  1            successful  Deployment completed successfully
+5f271fe2  example  0            successful  Deployment completed successfully
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/pause.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/pause.mdx
new file mode 100644
index 0000000000..569d233b87
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/pause.mdx
@@ -0,0 +1,40 @@
+---
+layout: docs
+page_title: 'Commands: deployment pause'
+sidebar_title: pause
+description: >
+  The deployment pause command is used to pause a deployment and disallow new
+  placements.
+---
+
+# Command: deployment pause
+
+The `deployment pause` command is used to pause a deployment. Pausing a
+deployment will pause the placement of new allocations as part of a rolling
+deployment.
+
+## Usage
+
+```plaintext
+nomad deployment pause [options] <deployment id>
+```
+
+The `deployment pause` command requires a single argument, a deployment ID or
+prefix.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Pause Options
+
+- `-verbose`: Show full information.
+
+## Examples
+
+Manually pause a deployment:
+
+```shell-session
+$ nomad deployment pause 2f14ba55
+Deployment "2f14ba55-acfb-cb31-821c-facf1b9b0830" paused
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/promote.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/promote.mdx
new file mode 100644
index 0000000000..1ee5ed8e9e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/promote.mdx
@@ -0,0 +1,224 @@
+---
+layout: docs
+page_title: 'Commands: deployment promote'
+sidebar_title: promote
+description: |
+  The deployment promote command is used to promote canaries in a deployment.
+---
+
+# Command: deployment promote
+
+The `deployment promote` command is used to promote task groups in a deployment.
+Promotion should occur when the deployment has placed canaries for a task group
+and those canaries have been deemed healthy. When a task group is promoted, the
+rolling upgrade of the remaining allocations is unblocked. If the canaries are
+found to be unhealthy, the deployment may be failed using the "nomad deployment
+fail" command, failed forward by submitting a new version of the job, or failed
+backwards by reverting to an older version using the [`job revert`] command.
+
+## Usage
+
+```plaintext
+nomad deployment promote [options] <deployment id>
+```
+
+The `deployment promote` command requires a single argument, a deployment ID or
+prefix. When run without specifying any groups to promote, the promote command
+promotes all task groups. The `-group` flag can be specified multiple times to
+select particular groups to promote.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Promote Options
+
+- `-group`: Group may be specified many times and is used to promote that
+  particular group. If no specific groups are specified, all groups are
+  promoted.
+
+- `-detach`: Return immediately instead of monitoring. A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-verbose`: Show full information.
+
+## Examples
+
+Promote canaries in all groups:
+
+```shell-session
+# Have two task groups waiting for promotion.
+$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:35:05 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 3 0 0 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = 9fa81f27 +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web false 2 1 1 0 0 +cache false 2 1 1 0 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +091377e5 a8dcce2d web 1 run running 07/25/17 18:35:05 UTC +d2b13584 a8dcce2d cache 1 run running 07/25/17 18:35:05 UTC +4bb185b7 a8dcce2d web 0 run running 07/25/17 18:31:34 UTC +9b6811ee a8dcce2d cache 0 run running 07/25/17 18:31:34 UTC +e0a2441b a8dcce2d cache 0 run running 07/25/17 18:31:34 UTC +f2409f7d a8dcce2d web 0 run running 07/25/17 18:31:34 UTC + +# Promote all groups within the deployment +$ nomad deployment promote 9fa81f27 +==> Monitoring evaluation "6c6e64ae" + Evaluation triggered by job "example" + Evaluation within deployment: "9fa81f27" + Allocation "8fa21654" created: node "a8dcce2d", group "web" + Allocation "9f6727a6" created: node "a8dcce2d", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "6c6e64ae" finished with status "complete" + +# Inspect the status and see both groups have been promoted. 
+$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:35:05 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 2 0 2 0 +web 0 0 2 0 2 0 + +Latest Deployment +ID = 9fa81f27 +Status = successful +Description = Deployment completed successfully + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web true 2 1 2 2 0 +cache true 2 1 2 2 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +8fa21654 a8dcce2d web 1 run running 07/25/17 18:35:21 UTC +9f6727a6 a8dcce2d cache 1 run running 07/25/17 18:35:21 UTC +091377e5 a8dcce2d web 1 run running 07/25/17 18:35:05 UTC +d2b13584 a8dcce2d cache 1 run running 07/25/17 18:35:05 UTC +4bb185b7 a8dcce2d web 0 stop complete 07/25/17 18:31:34 UTC +9b6811ee a8dcce2d cache 0 stop complete 07/25/17 18:31:34 UTC +e0a2441b a8dcce2d cache 0 stop complete 07/25/17 18:31:34 UTC +f2409f7d a8dcce2d web 0 stop complete 07/25/17 18:31:34 UTC +``` + +Promote canaries in a particular group: + +```shell-session +# Have two task groups waiting for promotion. 
+$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:37:14 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 3 0 0 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = a6b87a6c +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +cache false 2 1 1 1 0 +web false 2 1 1 1 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +3071ab8f 6240eed6 web 1 run running 07/25/17 18:37:14 UTC +eeeed13b 6240eed6 cache 1 run running 07/25/17 18:37:14 UTC +0ee7800c 6240eed6 cache 0 run running 07/25/17 18:37:08 UTC +a714a926 6240eed6 cache 0 run running 07/25/17 18:37:08 UTC +cee52788 6240eed6 web 0 run running 07/25/17 18:37:08 UTC +ee8f972e 6240eed6 web 0 run running 07/25/17 18:37:08 UTC + +# Promote only the cache canaries +$ nomad deployment promote -group cache a6b87a6c +==> Monitoring evaluation "37383564" + Evaluation triggered by job "example" + Evaluation within deployment: "a6b87a6c" + Allocation "bbddf5c3" created: node "6240eed6", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "37383564" finished with status "complete" + +# Inspect the status and see that only the cache canaries are promoted +$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:37:14 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 2 0 2 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = a6b87a6c +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web false 2 1 1 1 0 +cache true 2 1 2 2 0 + +Allocations +ID 
Node ID Task Group Version Desired Status Created At
+bbddf5c3 6240eed6 cache 1 run running 07/25/17 18:37:40 UTC
+eeeed13b 6240eed6 cache 1 run running 07/25/17 18:37:14 UTC
+3071ab8f 6240eed6 web 1 run running 07/25/17 18:37:14 UTC
+a714a926 6240eed6 cache 0 stop complete 07/25/17 18:37:08 UTC
+cee52788 6240eed6 web 0 run running 07/25/17 18:37:08 UTC
+ee8f972e 6240eed6 web 0 run running 07/25/17 18:37:08 UTC
+0ee7800c 6240eed6 cache 0 stop complete 07/25/17 18:37:08 UTC
+```
+
+[`job revert`]: /docs/commands/job/revert
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/resume.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/resume.mdx
new file mode 100644
index 0000000000..582137151d
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/resume.mdx
@@ -0,0 +1,54 @@
+---
+layout: docs
+page_title: 'Commands: deployment resume'
+sidebar_title: resume
+description: |
+  The deployment resume command is used to resume a paused deployment.
+---
+
+# Command: deployment resume
+
+The `deployment resume` command is used to unpause a paused deployment.
+Resuming a deployment will resume the placement of new allocations as part of
+the rolling deployment.
+
+## Usage
+
+```plaintext
+nomad deployment resume [options] <deployment id>
+```
+
+The `deployment resume` command requires a single argument, a deployment ID or
+prefix.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Resume Options
+
+- `-detach`: Return immediately instead of monitoring. A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-verbose`: Show full information.
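+
+When scripting against deployments, `-detach` pairs naturally with [eval
+status]: the command prints the new evaluation ID and returns at once, and the
+evaluation can be inspected later. A minimal sketch (the deployment ID is
+illustrative, reused from the example below):
+
+```shell-session
+$ nomad deployment resume -detach c848972e
+$ nomad eval status <returned evaluation ID>
+```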
+
+## Examples
+
+Manually resume a deployment:
+
+```shell-session
+$ nomad deployment resume c848972e
+Deployment "c848972e-dcd3-7354-e0d2-39d86642cdb1" resumed
+
+==> Monitoring evaluation "5e266d42"
+    Evaluation triggered by job "example"
+    Evaluation within deployment: "c848972e"
+    Allocation "00208424" created: node "6240eed6", group "web"
+    Allocation "68c72edf" created: node "6240eed6", group "cache"
+    Allocation "00208424" status changed: "pending" -> "running"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "5e266d42" finished with status "complete"
+```
+
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/deployment/status.mdx b/content/nomad/v0.11.x/content/docs/commands/deployment/status.mdx
new file mode 100644
index 0000000000..c854211c18
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/deployment/status.mdx
@@ -0,0 +1,66 @@
+---
+layout: docs
+page_title: 'Commands: deployment status'
+sidebar_title: status
+description: |
+  The deployment status command is used to display the status of a deployment.
+---
+
+# Command: deployment status
+
+The `deployment status` command is used to display the status of a deployment.
+The status will display the number of desired changes as well as the currently
+applied changes.
+
+## Usage
+
+```plaintext
+nomad deployment status [options] <deployment id>
+```
+
+The `deployment status` command requires a single argument, a deployment ID or
+prefix.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Status Options
+
+- `-json` : Output the deployment in its JSON format.
+- `-t` : Format and display the deployment using a Go template.
+- `-verbose`: Show full information.
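+
+For scripting, the `-t` flag renders a Go template against the deployment
+object instead of the human-readable table. A minimal sketch, assuming the
+deployment exposes a `Status` field as in the JSON output (the ID is
+illustrative):
+
+```shell-session
+$ nomad deployment status -t '{{ .Status }}' 06ca68a2
+```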
+
+## Examples
+
+Inspect the status of a complete deployment:
+
+```shell-session
+$ nomad deployment status 06ca68a2
+ID = 06ca68a2
+Job ID = example
+Job Version = 0
+Status = successful
+Description = Deployment completed successfully
+
+Deployed
+Task Group Desired Placed Healthy Unhealthy
+cache 2 2 2 0
+web 2 2 2 0
+```
+
+Inspect the status of a deployment that is waiting for canary promotion:
+
+```shell-session
+$ nomad deployment status 0b
+ID = 0b23b149
+Job ID = example
+Job Version = 1
+Status = running
+Description = Deployment is running but requires manual promotion
+
+Deployed
+Task Group Promoted Desired Canaries Placed Healthy Unhealthy
+cache false 2 1 1 0 0
+web N/A 2 0 2 2 0
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/eval-status.mdx b/content/nomad/v0.11.x/content/docs/commands/eval-status.mdx
new file mode 100644
index 0000000000..eb7aaeaf76
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/eval-status.mdx
@@ -0,0 +1,84 @@
+---
+layout: docs
+page_title: 'Commands: eval status'
+sidebar_title: eval status
+description: >
+  The eval status command is used to see the status and potential failed
+  allocations of an evaluation.
+---
+
+# Command: eval status
+
+The `eval status` command is used to display information about an existing
+evaluation. In the case that an evaluation could not place all the requested
+allocations, this command can be used to determine the failure reasons.
+
+Optionally, it can also be invoked in a monitor mode to track an outstanding
+evaluation. In this mode, logs will be output describing state changes to the
+evaluation or its associated allocations. The monitor will exit when the
+evaluation reaches a terminal state.
+
+## Usage
+
+```plaintext
+nomad eval status [options] <evaluation>
+```
+
+An evaluation ID or prefix must be provided. If there is an exact match, the
+status will be shown. Otherwise, a list of matching evaluations and
+information will be displayed.
+
+If the `-monitor` flag is passed, an interactive monitoring session will be
+started in the terminal. It is safe to exit the monitor at any time using
+ctrl+c. The command will exit when the given evaluation reaches a terminal
+state (completed or failed). Exit code 0 is returned on successful evaluation
+with no scheduling problems. If there are job placement issues encountered
+(unsatisfiable constraints, resource exhaustion, etc), then the exit code will
+be 2. Any other errors, including client connection issues or internal errors,
+are indicated by exit code 1.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Eval Status Options
+
+- `-monitor`: Monitor an outstanding evaluation.
+- `-verbose`: Show full information.
+- `-json` : Output the evaluation in its JSON format.
+- `-t` : Format and display the evaluation using a Go template.
+
+## Examples
+
+Show the status of an evaluation that has placement failures:
+
+```shell-session
+$ nomad eval status 2ae0e6a5
+ID = 2ae0e6a5
+Status = complete
+Status Description = complete
+Type = service
+TriggeredBy = job-register
+Job ID = example
+Priority = 50
+Placement Failures = true
+
+==> Failed Placements
+Task Group "cache" (failed to place 1 allocation):
+  * Class "foo" filtered 1 nodes
+  * Constraint "${attr.kernel.name} = windows" filtered 1 nodes
+
+Evaluation "67493a64" waiting for additional capacity to place remainder
+```
+
+Monitor an existing evaluation:
+
+```shell-session
+$ nomad eval status -monitor 8262bc83
+==> Monitoring evaluation "8262bc83"
+    Allocation "bd6bd0de" created: node "6f299da5", group "group1"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "8262bc83" finished with status "complete"
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/index.mdx b/content/nomad/v0.11.x/content/docs/commands/index.mdx
new file mode 100644
index 0000000000..657686fa0f
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/index.mdx
@@ -0,0
+1,64 @@ +--- +layout: docs +page_title: Commands (CLI) +sidebar_title: Commands (CLI) +description: > + Nomad can be controlled via a command-line interface. This page documents all + the commands Nomad accepts. +--- + +# Nomad Commands (CLI) + +Nomad is controlled via a very easy to use command-line interface (CLI). +Nomad is only a single command-line application: `nomad`, which +takes a subcommand such as "agent" or "status". The complete list of +subcommands is in the navigation to the left. + +The Nomad CLI is a well-behaved command line application. In erroneous cases, +a non-zero exit status will be returned. It also responds to `-h` and `--help` +as you would most likely expect. + +To view a list of the available commands at any time, just run Nomad +with no arguments. To get help for any specific subcommand, run the subcommand +with the `-h` argument. + +Each command has been conveniently documented on this website. Links to each +command can be found on the left. + +## Autocomplete + +Nomad's CLI supports command autocomplete. Autocomplete can be installed or +uninstalled by running the following on bash or zsh shells: + +```shell-session +$ nomad -autocomplete-install +$ nomad -autocomplete-uninstall +``` + +## Command Contexts + +Nomad's CLI commands have implied contexts in their naming convention. Because +the CLI is most commonly used to manipulate or query jobs, you can assume that +any given command is working in that context unless the command name implies +otherwise. + +For example, the `nomad job run` command is used to run a new job, the `nomad status` command queries information about existing jobs, etc. Conversely, +commands with a prefix in their name likely operate in a different context. +Examples include the `nomad agent-info` or `nomad node drain` commands, +which operate in the agent or node contexts respectively. 
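+
+For example, the following pair contrasts a job-context command with a
+node-context one (the node ID is illustrative):
+
+```shell-session
+$ nomad status example        # job context: status of the job "example"
+$ nomad node status 6240eed6  # node context: status of a client node
+```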
+
+### Remote Usage
+
+The Nomad CLI may be used to interact with a remote Nomad cluster, even when the
+local machine does not have a running Nomad agent. To do so, set the
+`NOMAD_ADDR` environment variable or use the `-address=` flag when running
+commands.
+
+```shell-session
+$ NOMAD_ADDR=https://remote-address:4646 nomad status
+$ nomad status -address=https://remote-address:4646
+```
+
+The provided address must be reachable from your local machine. There are a
+variety of ways to accomplish this (VPN, SSH tunnel, etc.). If the port is
+exposed to the public internet, it is highly recommended to configure TLS.
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/deployments.mdx b/content/nomad/v0.11.x/content/docs/commands/job/deployments.mdx
new file mode 100644
index 0000000000..d130f56966
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/deployments.mdx
@@ -0,0 +1,49 @@
+---
+layout: docs
+page_title: 'Commands: job deployments'
+sidebar_title: deployments
+description: |
+  The deployments command is used to list deployments for a job.
+---
+
+# Command: job deployments
+
+The `job deployments` command is used to display the deployments for a
+particular job.
+
+## Usage
+
+```plaintext
+nomad job deployments [options] <job>
+```
+
+The `job deployments` command requires a single argument, the job ID or an ID
+prefix of a job to display the list of deployments for.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Deployment Options
+
+- `-latest`: Display the latest deployment only.
+
+- `-json` : Output the deployment in its JSON format.
+
+- `-t` : Format and display the deployment using a Go template.
+
+- `-verbose`: Show full information.
+
+- `-all`: Display all deployments matching the job ID, even those from an
+  older instance of the job.
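+
+When only the most recent deployment matters (for instance in a CI gate),
+`-latest` narrows the output to a single row. A minimal sketch, assuming a job
+named "example":
+
+```shell-session
+$ nomad job deployments -latest example
+```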
+
+## Examples
+
+List the deployments for a particular job:
+
+```shell-session
+$ nomad job deployments example
+ID Job ID Job Version Status Description
+0b23b149 example 1 running Deployment is running but requires manual promotion
+06ca68a2 example 0 successful Deployment completed successfully
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/dispatch.mdx b/content/nomad/v0.11.x/content/docs/commands/job/dispatch.mdx
new file mode 100644
index 0000000000..926f9eb444
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/dispatch.mdx
@@ -0,0 +1,111 @@
+---
+layout: docs
+page_title: 'Commands: job dispatch'
+sidebar_title: dispatch
+description: |
+  The dispatch command is used to create an instance of a parameterized job.
+---
+
+# Command: job dispatch
+
+The `job dispatch` command is used to create new instances of a [parameterized
+job]. The parameterized job captures a job's configuration and runtime
+requirements in a generic way and `dispatch` is used to provide the input for
+the job to run against. A parameterized job is similar to a function definition,
+and dispatch is used to invoke the function.
+
+Each time a job is dispatched, a unique job ID is generated. This allows a
+caller to track the status of the job, much like a future or promise in some
+programming languages.
+
+## Usage
+
+```plaintext
+nomad job dispatch [options] <parameterized job> [input source]
+```
+
+Dispatch creates an instance of a parameterized job. A data payload to the
+dispatched instance can be provided via stdin by using "-" for the input source
+or by specifying a path to a file. Metadata can be supplied by using the `-meta`
+flag one or more times.
+
+The payload has a **size limit of 16KiB**.
+
+Upon successful creation, the dispatched job ID will be printed and the
+triggered evaluation will be monitored. This can be disabled by supplying the
+`-detach` flag.
+
+On successful job submission and scheduling, exit code 0 will be returned.
If +there are job placement issues encountered (unsatisfiable constraints, resource +exhaustion, etc), then the exit code will be 2. Any other errors, including +client connection issues or internal errors, are indicated by exit code 1. + +## General Options + +@include 'general_options.mdx' + +## Dispatch Options + +- `-meta`: Meta takes a key/value pair separated by "=". The metadata key will + be merged into the job's metadata. The job may define a default value for the + key which is overridden when dispatching. The flag can be provided more than + once to inject multiple metadata key/value pairs. Arbitrary keys are not + allowed. The parameterized job must allow the key to be merged. + +- `-detach`: Return immediately instead of monitoring. A new evaluation ID + will be output, which can be used to examine the evaluation using the + [eval status] command + +- `-verbose`: Show full information. + +## Examples + +Dispatch against a parameterized job with the ID "video-encode" and +passing in a configuration payload via stdin: + +```shell-session +$ cat << EOF | nomad job dispatch video-encode - +{ + "s3-input": "https://video-bucket.s3-us-west-1.amazonaws.com/cb31dabb1", + "s3-output": "https://video-bucket.s3-us-west-1.amazonaws.com/a149adbe3", + "input-codec": "mp4", + "output-codec": "webm", + "quality": "1080p" +} +EOF +Dispatched Job ID = video-encode/dispatch-1485379325-cb38d00d +Evaluation ID = 31199841 + +==> Monitoring evaluation "31199841" + Evaluation triggered by job "example/dispatch-1485379325-cb38d00d" + Allocation "8254b85f" created: node "82ff9c50", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "31199841" finished with status "complete" +``` + +Dispatch against a parameterized job with the ID "video-encode" and +passing in a configuration payload via a file: + +```shell-session +$ nomad job dispatch video-encode video-config.json +Dispatched Job ID = video-encode/dispatch-1485379325-cb38d00d +Evaluation ID = 
31199841 + +==> Monitoring evaluation "31199841" + Evaluation triggered by job "example/dispatch-1485379325-cb38d00d" + Allocation "8254b85f" created: node "82ff9c50", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "31199841" finished with status "complete" +``` + +Dispatch against a parameterized job with the ID "video-encode" using the detach +flag: + +```shell-session +$ nomad job dispatch -detach video-encode video-config.json +Dispatched Job ID = example/dispatch-1485380684-c37b3dba +Evaluation ID = d9034c4e +``` + +[eval status]: /docs/commands/eval-status +[parameterized job]: /docs/job-specification/parameterized 'Nomad parameterized Job Specification' diff --git a/content/nomad/v0.11.x/content/docs/commands/job/eval.mdx b/content/nomad/v0.11.x/content/docs/commands/job/eval.mdx new file mode 100644 index 0000000000..a4b77976aa --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/job/eval.mdx @@ -0,0 +1,73 @@ +--- +layout: docs +page_title: 'Commands: job eval' +sidebar_title: eval +description: | + The job eval command is used to force an evaluation of a job +--- + +# Command: job eval + +The `job eval` command is used to force an evaluation of a job, given the job +ID. + +## Usage + +```plaintext +nomad job eval [options] +``` + +The `job eval` command requires a single argument, specifying the job ID to +evaluate. If there is an exact match based on the provided job ID, then the job +will be evaluated, forcing a scheduler run. + +## General Options + +@include 'general_options.mdx' + +## Eval Options + +- `-force-reschedule`: `force-reschedule` is used to force placement of failed + allocations. If this is set, failed allocations that are past their reschedule + limit, and those that are scheduled to be replaced at a future time are placed + immediately. This option only places failed allocations if the task group has + rescheduling enabled. + +- `-detach`: Return immediately instead of monitoring. 
A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-verbose`: Show full information.
+
+## Examples
+
+Evaluate the job with ID "job1":
+
+```shell-session
+$ nomad job eval job1
+==> Monitoring evaluation "0f3bc0f3"
+    Evaluation triggered by job "test"
+    Evaluation within deployment: "51baf5c8"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "0f3bc0f3" finished with status "complete"
+```
+
+Evaluate the job with ID "job1" and return immediately:
+
+```shell-session
+$ nomad job eval -detach job1
+Created eval ID: "4947e728"
+```
+
+Evaluate the job with ID "job1", and reschedule any eligible failed allocations:
+
+```shell-session
+$ nomad job eval -force-reschedule job1
+==> Monitoring evaluation "0f3bc0f3"
+    Evaluation triggered by job "test"
+    Evaluation within deployment: "51baf5c8"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "0f3bc0f3" finished with status "complete"
+```
+
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/history.mdx b/content/nomad/v0.11.x/content/docs/commands/job/history.mdx
new file mode 100644
index 0000000000..e6ba1c89e5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/history.mdx
@@ -0,0 +1,80 @@
+---
+layout: docs
+page_title: 'Commands: job history'
+sidebar_title: history
+description: |
+  The history command is used to display all tracked versions of a job.
+---
+
+# Command: job history
+
+The `job history` command is used to display the known versions of a particular
+job. The command can display the diff between job versions and can be useful for
+understanding the changes that occurred to the job as well as deciding which
+job version to revert to.
+
+## Usage
+
+```plaintext
+nomad job history [options] <job>
+```
+
+The `job history` command requires a single argument, the job ID or an ID prefix
+of a job to display the history for.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## History Options
+
+- `-p`: Display the differences between each job and its predecessor.
+- `-full`: Display the full job definition for each version.
+- `-version`: Display only the history for the given version.
+- `-json` : Output the job versions in their JSON format.
+- `-t` : Format and display the job versions using a Go template.
+
+## Examples
+
+Display the history showing differences between versions:
+
+```shell-session
+$ nomad job history -p e
+Version = 2
+Stable = false
+Submit Date = 07/25/17 20:35:43 UTC
+Diff =
++/- Job: "example"
++/- Task Group: "cache"
+  +/- Task: "redis"
+    +/- Resources {
+          CPU: "500"
+          DiskMB: "0"
+      +/- MemoryMB: "256" => "512"
+        }
+
+Version = 1
+Stable = false
+Submit Date = 07/25/17 20:35:31 UTC
+Diff =
++/- Job: "example"
++/- Task Group: "cache"
+  +/- Count: "1" => "3"
+      Task: "redis"
+
+Version = 0
+Stable = false
+Submit Date = 07/25/17 20:35:28 UTC
+```
+
+Display the memory ask across submitted job versions:
+
+```shell-session
+$ nomad job history -t "{{range .}}\
+v{{.Version}}: {{with index .TaskGroups 0}}{{with index .Tasks 0}}{{.Resources.MemoryMB}}{{end}}{{end}}\
+
+{{end}}" example
+v2: 512
+v1: 256
+v0: 256
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/index.mdx b/content/nomad/v0.11.x/content/docs/commands/job/index.mdx
new file mode 100644
index 0000000000..8026c0fb74
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/index.mdx
@@ -0,0 +1,34 @@
+---
+layout: docs
+page_title: 'Commands: job'
+sidebar_title: job
+description: |
+  The job command is used to interact with jobs.
+---
+
+# Command: job
+
+The `job` command is used to interact with jobs.
+
+## Usage
+
+Usage: `nomad job <subcommand> [options]`
+
+Run `nomad job <subcommand> -h` for help on that subcommand.
The following +subcommands are available: + +- [`job deployments`][deployments] - List deployments for a job +- [`job dispatch`][dispatch] - Dispatch an instance of a parameterized job +- [`job eval`][eval] - Force an evaluation for a job +- [`job history`][history] - Display all tracked versions of a job +- [`job promote`][promote] - Promote a job's canaries +- [`job revert`][revert] - Revert to a prior version of the job +- [`job status`][status] - Display status information about a job + +[deployments]: /docs/commands/job/deployments 'List deployments for a job' +[dispatch]: /docs/commands/job/dispatch 'Dispatch an instance of a parameterized job' +[eval]: /docs/commands/job/eval 'Force an evaluation for a job' +[history]: /docs/commands/job/history 'Display all tracked versions of a job' +[promote]: /docs/commands/job/promote "Promote a job's canaries" +[revert]: /docs/commands/job/revert 'Revert to a prior version of the job' +[status]: /docs/commands/job/status 'Display status information about a job' diff --git a/content/nomad/v0.11.x/content/docs/commands/job/init.mdx b/content/nomad/v0.11.x/content/docs/commands/job/init.mdx new file mode 100644 index 0000000000..217f29cb45 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/job/init.mdx @@ -0,0 +1,44 @@ +--- +layout: docs +page_title: 'Commands: job init' +sidebar_title: init +description: | + The job init command is used to generate a skeleton jobspec template. +--- + +# Command: job init + +**Alias: `nomad init`** + +The `job init` command creates an example [job specification][jobspec] in the +current directory that demonstrates some common configurations for tasks, task +groups, runtime constraints, and resource allocation. + +## Usage + +```plaintext +nomad job init [options] [filename] +``` + +You may optionally supply a filename for the example job to be written to. The +default filename for the generated file is "example.nomad". 
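+
+For instance, a minimal skeleton can be written to a custom filename (the name
+"batch.nomad" is illustrative):
+
+```shell-session
+$ nomad job init -short batch.nomad
+```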
+
+Please refer to the [jobspec] and [drivers] pages to learn how to customize the
+template.
+
+## Init Options
+
+- `-short`: If set, a minimal jobspec without comments is emitted.
+- `-connect`: If set, the jobspec includes Consul Connect integration.
+
+## Examples
+
+Generate an example job file:
+
+```shell-session
+$ nomad job init
+Example job file written to example.nomad
+```
+
+[jobspec]: /docs/job-specification 'Nomad Job Specification'
+[drivers]: /docs/drivers 'Nomad Task Drivers documentation'
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/inspect.mdx b/content/nomad/v0.11.x/content/docs/commands/job/inspect.mdx
new file mode 100644
index 0000000000..9b0e151c4e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/inspect.mdx
@@ -0,0 +1,156 @@
+---
+layout: docs
+page_title: 'Commands: job inspect'
+sidebar_title: inspect
+description: |
+  The job inspect command is used to inspect a submitted job.
+---
+
+# Command: job inspect
+
+**Alias: `nomad inspect`**
+
+The `job inspect` command is used to inspect the content of a submitted job.
+
+## Usage
+
+```plaintext
+nomad job inspect [options] <job>
+```
+
+The `job inspect` command requires a single argument, a submitted job's name, and
+will retrieve the JSON version of the job. This JSON is valid to be submitted to
+the [Job HTTP API]. This command is useful to inspect what version of a job
+Nomad is running.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Inspect Options
+
+- `-version`: Display only the job at the given job version.
+- `-json` : Output the job in its JSON format.
+- `-t` : Format and display the job using a Go template.
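+
+Because Nomad tracks multiple versions of a job, `-version` can be used to
+inspect an older submission. A minimal sketch (the version number and job name
+are illustrative):
+
+```shell-session
+$ nomad job inspect -version 0 redis
+```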
+
+## Examples
+
+Inspect a submitted job:
+
+```shell-session
+$ nomad job inspect redis
+{
+  "Job": {
+    "Region": "global",
+    "ID": "redis",
+    "Name": "redis",
+    "Type": "service",
+    "Priority": 50,
+    "AllAtOnce": false,
+    "Datacenters": [
+      "dc1"
+    ],
+    "Constraints": [
+      {
+        "LTarget": "${attr.kernel.name}",
+        "RTarget": "linux",
+        "Operand": "="
+      }
+    ],
+    "TaskGroups": [
+      {
+        "Name": "cache",
+        "Count": 1,
+        "Constraints": null,
+        "Tasks": [
+          {
+            "Name": "redis",
+            "Driver": "docker",
+            "User": "",
+            "Config": {
+              "image": "redis:latest",
+              "port_map": [
+                {
+                  "db": 6379
+                }
+              ]
+            },
+            "Constraints": null,
+            "Env": null,
+            "Services": [
+              {
+                "Id": "",
+                "Name": "cache-redis",
+                "Tags": [
+                  "global",
+                  "cache"
+                ],
+                "PortLabel": "db",
+                "Checks": [
+                  {
+                    "Id": "",
+                    "Name": "alive",
+                    "Type": "tcp",
+                    "Command": "",
+                    "Args": null,
+                    "Path": "",
+                    "Protocol": "",
+                    "Interval": 10000000000,
+                    "Timeout": 2000000000
+                  }
+                ]
+              }
+            ],
+            "Resources": {
+              "CPU": 500,
+              "MemoryMB": 256,
+              "DiskMB": 300,
+              "Networks": [
+                {
+                  "Public": false,
+                  "CIDR": "",
+                  "ReservedPorts": null,
+                  "DynamicPorts": [
+                    {
+                      "Label": "db",
+                      "Value": 0
+                    }
+                  ],
+                  "IP": "",
+                  "MBits": 10
+                }
+              ]
+            },
+            "Meta": null,
+            "KillTimeout": 5000000000,
+            "LogConfig": {
+              "MaxFiles": 10,
+              "MaxFileSizeMB": 10
+            },
+            "Artifacts": null
+          }
+        ],
+        "RestartPolicy": {
+          "Interval": 300000000000,
+          "Attempts": 10,
+          "Delay": 25000000000,
+          "Mode": "delay"
+        },
+        "Meta": null
+      }
+    ],
+    "Update": {
+      "Stagger": 10000000000,
+      "MaxParallel": 1
+    },
+    "Periodic": null,
+    "Meta": null,
+    "Status": "running",
+    "StatusDescription": "",
+    "CreateIndex": 5,
+    "ModifyIndex": 7
+  }
+}
+```
+
+[job http api]: /api-docs/jobs
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/periodic-force.mdx b/content/nomad/v0.11.x/content/docs/commands/job/periodic-force.mdx
new file mode 100644
index 0000000000..86b1fce684
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/periodic-force.mdx
@@ -0,0
+1,66 @@ +--- +layout: docs +page_title: 'Commands: job periodic force' +sidebar_title: periodic force +description: > + The job periodic force command is used to force the evaluation of a periodic + job. +--- + +# Command: job periodic force + +The `job periodic force` command is used to [force the evaluation] of a +[periodic job]. + +## Usage + +```plaintext +nomad job periodic force [options] +``` + +The `job periodic force` command requires a single argument, specifying the ID +of the job. This job must be a periodic job. This is used to immediately run a +periodic job, even if it violates the job's `prohibit_overlap` setting. + +By default, on successful job submission the command will enter an interactive +monitor and display log information detailing the scheduling decisions and +placement information for the forced evaluation. The monitor will exit after +scheduling has finished or failed. + +## General Options + +@include 'general_options.mdx' + +## Run Options + +- `-detach`: Return immediately instead of monitoring. A new evaluation ID + will be output, which can be used to examine the evaluation using the + [eval status] command. + +- `-verbose`: Show full information. 
+ +## Examples + +Force the evaluation of the job `example`, monitoring placement: + +```shell-session +$ nomad job periodic force example +==> Monitoring evaluation "54b2d6d9" + Evaluation triggered by job "example/periodic-1555094493" + Allocation "637aee17" created: node "a35ab8fc", group "cache" + Allocation "637aee17" status changed: "pending" -> "running" (Tasks are running) + Evaluation status changed: "pending" -> "complete" +==> Evaluation "54b2d6d9" finished with status "complete" +``` + +Force the evaluation of the job `example` and return immediately: + +```shell-session +$ nomad job periodic force -detach example +Force periodic successful +Evaluation ID: 0865fbf3-30de-5f53-0811-821e73e63178 +``` + +[eval status]: /docs/commands/eval-status +[force the evaluation]: /api-docs/jobs#force-new-periodic-instance +[periodic job]: /docs/job-specification/periodic diff --git a/content/nomad/v0.11.x/content/docs/commands/job/plan.mdx b/content/nomad/v0.11.x/content/docs/commands/job/plan.mdx new file mode 100644 index 0000000000..46fdf0209f --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/job/plan.mdx @@ -0,0 +1,205 @@ +--- +layout: docs +page_title: 'Commands: job plan' +sidebar_title: plan +description: | + The job plan command is used to dry-run a job update to determine its effects. +--- + +# Command: job plan + +**Alias: `nomad plan`** + +The `job plan` command can be used to invoke the scheduler in a dry-run mode +with new jobs or when updating existing jobs to determine what would happen if +the job is submitted. Job files must conform to the [job specification] format. + +## Usage + +```plaintext +nomad job plan [options] +``` + +The `job plan` command requires a single argument, specifying the path to a file +containing an HCL [job specification]. This file will be read and the resulting +parsed job will be validated. If the supplied path is "-", the job file is read +from STDIN. 
Otherwise it is read from the file at the supplied path or
+downloaded and read from the URL specified. Nomad downloads the job file using
+[`go-getter`] and supports `go-getter` syntax.
+
+Plan invokes a dry-run of the scheduler to determine the effects of submitting
+either a new or updated version of a job. The plan will not result in any
+changes to the cluster but gives insight into whether the job could be run
+successfully and how it would affect existing allocations.
+
+A job modify index is returned with the plan. This value can be used when
+submitting the job using [`nomad job run -check-index`], which will check that
+the job was not modified between the plan and run commands before invoking the
+scheduler.
+
+A structured diff between the local and remote job is displayed to
+give insight into what the scheduler will attempt to do and why.
+
+If the job specifies a region, it overrides the `-region` flag and the
+`NOMAD_REGION` environment variable, and the job's region is used.
+
+Plan will return one of the following exit codes:
+
+- 0: No allocations created or destroyed.
+- 1: Allocations created or destroyed.
+- 255: Error determining plan results.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Plan Options
+
+- `-diff`: Determines whether the diff between the remote job and planned job is
+  shown. Defaults to true.
+
+- `-policy-override`: Sets the flag to force override any soft mandatory
+  Sentinel policies.
+
+- `-verbose`: Increase diff verbosity.
+
+## Examples
+
+Plan a new job that has not been previously submitted:
+
+```shell-session
+$ nomad job plan example.nomad
++ Job: "example"
++ Task Group: "cache" (1 create)
+  + Task: "redis" (forces create)
+
+Scheduler dry-run:
+- All tasks successfully allocated.
+
+Job Modify Index: 0
+To submit the job with version verification run:
+
+nomad job run -check-index 0 example.nomad
+
+When running the job with the check-index flag, the job will only be run if the
+job modify index given matches the server-side version. If the index has
+changed, another user has modified the job and the plan's results are
+potentially invalid.
+```
+
+Increase the count of an existing job without sufficient cluster capacity:
+
+```shell-session
+$ nomad job plan example.nomad
++/- Job: "example"
++/- Task Group: "cache" (7 create, 1 in-place update)
+  +/- Count: "1" => "8" (forces create)
+      Task: "redis"
+
+Scheduler dry-run:
+- WARNING: Failed to place all allocations.
+  Task Group "cache" (failed to place 3 allocations):
+    * Resources exhausted on 1 nodes
+    * Dimension "cpu" exhausted on 1 nodes
+
+Job Modify Index: 15
+To submit the job with version verification run:
+
+nomad job run -check-index 15 example.nomad
+
+When running the job with the check-index flag, the job will only be run if the
+job modify index given matches the server-side version. If the index has
+changed, another user has modified the job and the plan's results are
+potentially invalid.
+```
+
+Update an existing job such that it would cause a rolling update:
+
+```shell-session
+$ nomad job plan example.nomad
++/- Job: "example"
++/- Task Group: "cache" (3 create/destroy update)
+  +/- Task: "redis" (forces create/destroy update)
+    +/- Config {
+      +/- image: "redis:2.8" => "redis:3.2"
+          port_map[0][db]: "6379"
+    }
+
+Scheduler dry-run:
+- All tasks successfully allocated.
+- Rolling update, next evaluation will be in 10s.
+
+Job Modify Index: 7
+To submit the job with version verification run:
+
+nomad job run -check-index 7 example.nomad
+
+When running the job with the check-index flag, the job will only be run if the
+job modify index given matches the server-side version.
If the index has +changed, another user has modified the job and the plan's results are +potentially invalid. +``` + +Add a task to the task group using verbose mode: + +```shell-session +$ nomad job plan -verbose example.nomad ++/- Job: "example" ++/- Task Group: "cache" (3 create/destroy update) + + Task: "my-website" (forces create/destroy update) + + Driver: "docker" + + KillTimeout: "5000000000" + + Config { + + image: "node:6.2" + + port_map[0][web]: "80" + } + + Resources { + + CPU: "500" + + DiskMB: "300" + + MemoryMB: "256" + + Network { + + MBits: "10" + + Dynamic Port { + + Label: "web" + } + } + } + + LogConfig { + + MaxFileSizeMB: "10" + + MaxFiles: "10" + } + + Service { + + Name: "website" + + PortLabel: "web" + + Check { + Command: "" + + Interval: "10000000000" + + Name: "alive" + Path: "" + Protocol: "" + + Timeout: "2000000000" + + Type: "tcp" + } + } + Task: "redis" + +Scheduler dry-run: +- All tasks successfully allocated. +- Rolling update, next evaluation will be in 10s. + +Job Modify Index: 7 +To submit the job with version verification run: + +nomad job run -check-index 7 example.nomad + +When running the job with the check-index flag, the job will only be run if the +job modify index given matches the server-side version. If the index has +changed, another user has modified the job and the plan's results are +potentially invalid. 
+```
+
+[job specification]: /docs/job-specification
+[`go-getter`]: https://github.com/hashicorp/go-getter
+[`nomad job run -check-index`]: /docs/commands/job/run#check-index
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/promote.mdx b/content/nomad/v0.11.x/content/docs/commands/job/promote.mdx
new file mode 100644
index 0000000000..dbdde983a6
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/promote.mdx
@@ -0,0 +1,224 @@
+---
+layout: docs
+page_title: 'Commands: job promote'
+sidebar_title: promote
+description: |
+  The promote command is used to promote a job's canaries.
+---
+
+# Command: job promote
+
+The `job promote` command is used to promote task groups in the most recent
+deployment for the given job. Promotion should occur when the deployment has
+placed canaries for a task group and those canaries have been deemed healthy.
+When a task group is promoted, the rolling upgrade of the remaining allocations
+is unblocked. If the canaries are found to be unhealthy, the deployment may be
+failed using the `nomad deployment fail` command, the job may be failed forward
+by submitting a new version, or failed backward by reverting to an older
+version using the [job revert] command.
+
+## Usage
+
+```plaintext
+nomad job promote [options]
+```
+
+The `job promote` command requires a single argument, a job ID or
+prefix. When run without specifying any groups to promote, the promote command
+promotes all task groups. The `-group` flag can be specified multiple times to
+select particular groups to promote.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Promote Options
+
+- `-group`: Group may be specified many times and is used to promote that
+  particular group. If no specific groups are specified, all groups are
+  promoted.
+
+- `-detach`: Return immediately instead of monitoring.
A new evaluation ID + will be output, which can be used to examine the evaluation using the + [eval status] command. + +- `-verbose`: Show full information. + +## Examples + +Promote canaries in all groups: + +```shell-session +# Have two task groups waiting for promotion. +$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:35:05 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 3 0 0 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = 9fa81f27 +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web false 2 1 1 0 0 +cache false 2 1 1 0 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +091377e5 a8dcce2d web 1 run running 07/25/17 18:35:05 UTC +d2b13584 a8dcce2d cache 1 run running 07/25/17 18:35:05 UTC +4bb185b7 a8dcce2d web 0 run running 07/25/17 18:31:34 UTC +9b6811ee a8dcce2d cache 0 run running 07/25/17 18:31:34 UTC +e0a2441b a8dcce2d cache 0 run running 07/25/17 18:31:34 UTC +f2409f7d a8dcce2d web 0 run running 07/25/17 18:31:34 UTC + +# Promote all groups +$ nomad job promote example +==> Monitoring evaluation "6c6e64ae" + Evaluation triggered by job "example" + Evaluation within deployment: "9fa81f27" + Allocation "8fa21654" created: node "a8dcce2d", group "web" + Allocation "9f6727a6" created: node "a8dcce2d", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "6c6e64ae" finished with status "complete" + +# Inspect the status and see both groups have been promoted. 
+$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:35:05 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 2 0 2 0 +web 0 0 2 0 2 0 + +Latest Deployment +ID = 9fa81f27 +Status = successful +Description = Deployment completed successfully + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web true 2 1 2 2 0 +cache true 2 1 2 2 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +8fa21654 a8dcce2d web 1 run running 07/25/17 18:35:21 UTC +9f6727a6 a8dcce2d cache 1 run running 07/25/17 18:35:21 UTC +091377e5 a8dcce2d web 1 run running 07/25/17 18:35:05 UTC +d2b13584 a8dcce2d cache 1 run running 07/25/17 18:35:05 UTC +4bb185b7 a8dcce2d web 0 stop complete 07/25/17 18:31:34 UTC +9b6811ee a8dcce2d cache 0 stop complete 07/25/17 18:31:34 UTC +e0a2441b a8dcce2d cache 0 stop complete 07/25/17 18:31:34 UTC +f2409f7d a8dcce2d web 0 stop complete 07/25/17 18:31:34 UTC +``` + +Promote canaries in a particular group: + +```shell-session +# Have two task groups waiting for promotion. 
+$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:37:14 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 3 0 0 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = a6b87a6c +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +cache false 2 1 1 1 0 +web false 2 1 1 1 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +3071ab8f 6240eed6 web 1 run running 07/25/17 18:37:14 UTC +eeeed13b 6240eed6 cache 1 run running 07/25/17 18:37:14 UTC +0ee7800c 6240eed6 cache 0 run running 07/25/17 18:37:08 UTC +a714a926 6240eed6 cache 0 run running 07/25/17 18:37:08 UTC +cee52788 6240eed6 web 0 run running 07/25/17 18:37:08 UTC +ee8f972e 6240eed6 web 0 run running 07/25/17 18:37:08 UTC + +# Promote only the cache canaries +$ nomad job promote -group=cache example +==> Monitoring evaluation "37383564" + Evaluation triggered by job "example" + Evaluation within deployment: "a6b87a6c" + Allocation "bbddf5c3" created: node "6240eed6", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "37383564" finished with status "complete" + +# Inspect the status and see that only the cache canaries are promoted +$ nomad status example +ID = example +Name = example +Submit Date = 07/25/17 18:37:14 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 2 0 2 0 +web 0 0 3 0 0 0 + +Latest Deployment +ID = a6b87a6c +Status = running +Description = Deployment is running but requires manual promotion + +Deployed +Task Group Promoted Desired Canaries Placed Healthy Unhealthy +web false 2 1 1 1 0 +cache true 2 1 2 2 0 + +Allocations +ID Node ID 
Task Group Version Desired Status Created At
+bbddf5c3 6240eed6 cache 1 run running 07/25/17 18:37:40 UTC
+eeeed13b 6240eed6 cache 1 run running 07/25/17 18:37:14 UTC
+3071ab8f 6240eed6 web 1 run running 07/25/17 18:37:14 UTC
+a714a926 6240eed6 cache 0 stop complete 07/25/17 18:37:08 UTC
+cee52788 6240eed6 web 0 run running 07/25/17 18:37:08 UTC
+ee8f972e 6240eed6 web 0 run running 07/25/17 18:37:08 UTC
+0ee7800c 6240eed6 cache 0 stop complete 07/25/17 18:37:08 UTC
+```
+
+[job revert]: /docs/commands/job/revert
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/revert.mdx b/content/nomad/v0.11.x/content/docs/commands/job/revert.mdx
new file mode 100644
index 0000000000..aafe64a0ca
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/revert.mdx
@@ -0,0 +1,122 @@
+---
+layout: docs
+page_title: 'Commands: job revert'
+sidebar_title: revert
+description: |
+  The revert command is used to revert to a prior version of the job.
+---
+
+# Command: job revert
+
+The `job revert` command is used to revert a job to a prior version. The
+available versions to revert to can be found using the [`job history`]
+command.
+
+The revert command will use a Consul token with the following preference:
+first the `-consul-token` flag, then the `$CONSUL_HTTP_TOKEN` environment variable.
+Because the Consul token used to [run] the targeted job version was not
+persisted, it must be provided to revert if the targeted job version includes
+Consul Connect enabled services and the Nomad servers were configured to require
+[consul service identity] authentication.
+
+The revert command will use a Vault token with the following preference:
+first the `-vault-token` flag, then the `$VAULT_TOKEN` environment variable.
+Because the Vault token used to [run] the targeted job version was not
+persisted, it must be provided to revert if the targeted job version includes
+Vault policies and the Nomad servers were configured to require [vault policy]
+authentication.
+
+## Usage
+
+```plaintext
+nomad job revert [options]
+```
+
+The `job revert` command requires two inputs: the job ID and the version of that
+job to revert to.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Revert Options
+
+- `-detach`: Return immediately instead of monitoring. A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-consul-token`: If set, the passed Consul token is sent along with the revert
+  request to the Nomad servers. This overrides the token found in the
+  \$CONSUL_HTTP_TOKEN environment variable.
+
+- `-vault-token`: If set, the passed Vault token is sent along with the revert
+  request to the Nomad servers. This overrides the token found in the
+  \$VAULT_TOKEN environment variable.
+
+- `-verbose`: Show full information.
+ +## Examples + +Revert to an older version of a job: + +```shell-session +$ nomad job history -p example +Version = 1 +Stable = false +Submit Date = 07/25/17 21:27:30 UTC +Diff = ++/- Job: "example" ++/- Task Group: "cache" + +/- Task: "redis" + +/- Config { + +/- image: "redis:3.2" => "redis:4.0" + port_map[0][db]: "6379" + } + +Version = 0 +Stable = false +Submit Date = 07/25/17 21:27:18 UTC + +$ nomad job revert example 0 +==> Monitoring evaluation "faff5c30" + Evaluation triggered by job "example" + Evaluation within deployment: "e17c8592" + Allocation "4ed0ca3b" modified: node "e8a2243d", group "cache" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "faff5c30" finished with status "complete" + +$ nomad job history -p example +Version = 2 +Stable = true +Submit Date = 07/25/17 21:27:43 UTC +Diff = ++/- Job: "example" ++/- Task Group: "cache" + +/- Task: "redis" + +/- Config { + +/- image: "redis:4.0" => "redis:3.2" + port_map[0][db]: "6379" + } + +Version = 1 +Stable = false +Submit Date = 07/25/17 21:27:30 UTC +Diff = ++/- Job: "example" ++/- Task Group: "cache" + +/- Task: "redis" + +/- Config { + +/- image: "redis:3.2" => "redis:4.0" + port_map[0][db]: "6379" + } + +Version = 0 +Stable = false +Submit Date = 07/25/17 21:27:18 UTC +``` + +[`job history`]: /docs/commands/job/history +[eval status]: /docs/commands/eval-status +[consul service identity]: /docs/configuration/consul#allow_unauthenticated +[vault policy]: /docs/configuration/vault#allow_unauthenticated +[run]: /docs/commands/job/run diff --git a/content/nomad/v0.11.x/content/docs/commands/job/run.mdx b/content/nomad/v0.11.x/content/docs/commands/job/run.mdx new file mode 100644 index 0000000000..4e039c3b5d --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/job/run.mdx @@ -0,0 +1,141 @@ +--- +layout: docs +page_title: 'Commands: job run' +sidebar_title: run +description: | + The job run command is used to run a new job. 
+---
+
+# Command: job run
+
+**Alias: `nomad run`**
+
+The `job run` command is used to submit new jobs to Nomad or to update existing
+jobs. Job files must conform to the [job specification] format.
+
+## Usage
+
+```plaintext
+nomad job run [options]
+```
+
+The `job run` command requires a single argument, specifying the path to a file
+containing a valid [job specification]. This file will be read and the job will
+be submitted to Nomad for scheduling. If the supplied path is "-", the job file
+is read from STDIN. Otherwise it is read from the file at the supplied path or
+downloaded and read from the URL specified. Nomad downloads the job file using
+[`go-getter`] and supports `go-getter` syntax.
+
+By default, on successful job submission the run command will enter an
+interactive monitor and display log information detailing the scheduling
+decisions and placement information for the provided job. The monitor will
+exit after scheduling has finished or failed.
+
+On successful job submission and scheduling, exit code 0 will be returned. If
+there are job placement issues encountered (unsatisfiable constraints, resource
+exhaustion, etc.), then the exit code will be 2. Any other errors, including
+client connection issues or internal errors, are indicated by exit code 1.
+
+If the job specifies a region, it overrides the `-region` flag and the
+`$NOMAD_REGION` environment variable, and the job's region is used.
+
+The run command will set the `consul_token` of the job based on the following
+precedence, going from highest to lowest: the `-consul-token` flag, the
+`$CONSUL_HTTP_TOKEN` environment variable, and finally the value in the job file.
+
+The run command will set the `vault_token` of the job based on the following
+precedence, going from highest to lowest: the `-vault-token` flag, the
+`$VAULT_TOKEN` environment variable, and finally the value in the job file.
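
The same three-level precedence applies to both tokens. As an illustrative
sketch, `resolve_vault_token` below is a hypothetical helper mirroring that
logic, not part of the Nomad CLI:

```shell
# Mirrors the documented precedence for vault_token:
# -vault-token flag, then $VAULT_TOKEN, then the value in the job file.
resolve_vault_token() {
  flag_token="$1" # value passed via -vault-token (may be empty)
  file_token="$2" # vault_token from the job file (may be empty)
  if [ -n "$flag_token" ]; then
    echo "$flag_token"
  elif [ -n "$VAULT_TOKEN" ]; then
    echo "$VAULT_TOKEN"
  else
    echo "$file_token"
  fi
}
```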
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Run Options
+
+- `-check-index`: If set, the job is only registered or
+  updated if the passed job modify index matches the server-side version.
+  If a check-index value of zero is passed, the job is only registered if it does
+  not yet exist. If a non-zero value is passed, it ensures that the job is being
+  updated from a known state. The use of this flag is most common in conjunction
+  with the [`job plan` command].
+
+- `-detach`: Return immediately instead of monitoring. A new evaluation ID
+  will be output, which can be used to examine the evaluation using the
+  [eval status] command.
+
+- `-output`: Output the JSON that would be submitted to the HTTP API without
+  submitting the job.
+
+- `-policy-override`: Sets the flag to force override any soft mandatory
+  Sentinel policies.
+
+- `-consul-token`: If set, the passed Consul token is stored in the job before
+  sending to the Nomad servers. This allows passing the Consul token without
+  storing it in the job file. This overrides the token found in the \$CONSUL_HTTP_TOKEN
+  environment variable and that found in the job.
+
+- `-vault-token`: If set, the passed Vault token is stored in the job before
+  sending to the Nomad servers. This allows passing the Vault token without
+  storing it in the job file. This overrides the token found in the \$VAULT_TOKEN
+  environment variable and that found in the job.
+
+- `-verbose`: Show full information.
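
The distinct exit codes make `job run` straightforward to script. As a sketch,
`report_run_result` below is a hypothetical helper, not part of Nomad:

```shell
# Maps the documented `nomad job run` exit codes to messages:
# 0 = scheduled, 2 = placement issues, anything else = error.
report_run_result() {
  case "$1" in
    0) echo "job scheduled successfully" ;;
    2) echo "placement issues; inspect with nomad job status" ;;
    *) echo "job submission failed" >&2; return 1 ;;
  esac
}

# Typical use against a real cluster:
#   nomad job run job1.nomad; report_run_result $?
```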
+
+## Examples
+
+Schedule the job contained in the file `job1.nomad`, monitoring placement:
+
+```shell-session
+$ nomad job run job1.nomad
+==> Monitoring evaluation "52dee78a"
+    Evaluation triggered by job "example"
+    Evaluation within deployment: "62eb607c"
+    Allocation "5e0b39f0" created: node "3e84d3d2", group "group1"
+    Allocation "5e0b39f0" status changed: "pending" -> "running"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "52dee78a" finished with status "complete"
+```
+
+Update the job using `check-index`:
+
+```shell-session
+$ nomad job run -check-index 5 example.nomad
+Enforcing job modify index 5: job exists with conflicting job modify index: 6
+Job not updated
+
+$ nomad job run -check-index 6 example.nomad
+==> Monitoring evaluation "5ef16dff"
+    Evaluation triggered by job "example"
+    Evaluation within deployment: "62eb607c"
+    Allocation "6ec7d16f" modified: node "6e1f9bf6", group "cache"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "5ef16dff" finished with status "complete"
+```
+
+Schedule the job contained in `job1.nomad` and return immediately:
+
+```shell-session
+$ nomad job run -detach job1.nomad
+4947e728
+```
+
+Schedule a job which cannot be successfully placed.
This results in a scheduling +failure and the specifics of the placement are printed: + +```shell-session +$ nomad job run failing.nomad +==> Monitoring evaluation "2ae0e6a5" + Evaluation triggered by job "example" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "2ae0e6a5" finished with status "complete" but failed to place all allocations: + Task Group "cache" (failed to place 1 allocation): + * Class "foo" filtered 1 nodes + * Constraint "${attr.kernel.name} = linux" filtered 1 nodes + Evaluation "67493a64" waiting for additional capacity to place remainder +``` + +[`go-getter`]: https://github.com/hashicorp/go-getter +[`job plan` command]: /docs/commands/job/plan +[eval status]: /docs/commands/eval-status +[job specification]: /docs/job-specification diff --git a/content/nomad/v0.11.x/content/docs/commands/job/status.mdx b/content/nomad/v0.11.x/content/docs/commands/job/status.mdx new file mode 100644 index 0000000000..490d2a4e6d --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/job/status.mdx @@ -0,0 +1,252 @@ +--- +layout: docs +page_title: 'Commands: job status' +sidebar_title: status +description: | + Display information and status of jobs. +--- + +# Command: job status + +The `job status` command displays status information for a job. + +## Usage + +```plaintext +nomad job status [options] [job] +``` + +This command accepts an optional job ID or prefix as the sole argument. If there +is an exact match based on the provided job ID or prefix, then information about +the specific job is queried and displayed. Otherwise, a list of matching jobs +and information will be displayed. + +If the ID is omitted, the command lists out all of the existing jobs and a few +of the most useful status fields for each. As of Nomad 0.7.1, alloc status also +shows allocation modification time in addition to create time. 
When the
+`-verbose` flag is not set, allocation creation and modify times are shown in a
+shortened relative time format like `5m ago`.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Status Options
+
+- `-all-allocs`: Display all allocations matching the job ID, even those from an
+  older instance of the job.
+
+- `-evals`: Display the evaluations associated with the job.
+
+- `-short`: Display short output. Used only when a single job is being queried.
+  Drops verbose node allocation data from the output.
+
+- `-verbose`: Show full information. Allocation create and modify times are
+  shown in `yyyy/mm/dd hh:mm:ss` format.
+
+## Examples
+
+List of all jobs:
+
+```shell-session
+$ nomad job status
+ID Type Priority Status Submit Date
+job1 service 80 running 07/25/17 15:47:11 UTC
+job2 batch 40 complete 07/24/17 19:22:11 UTC
+job3 service 50 dead (stopped) 07/22/17 16:34:48 UTC
+```
+
+Short view of a specific job:
+
+```shell-session
+$ nomad job status -short job1
+ID = job1
+Name = Test Job
+Submit Date = 07/25/17 15:47:11 UTC
+Type = service
+Priority = 3
+Datacenters = dc1,dc2,dc3
+Status = pending
+Periodic = false
+Parameterized = false
+```
+
+Full status information of a job:
+
+```shell-session
+$ nomad job status example
+ID = example
+Name = example
+Submit Date = 07/25/17 15:53:04 UTC
+Type = service
+Priority = 50
+Datacenters = dc1
+Status = running
+Periodic = false
+Parameterized = false
+
+Summary
+Task Group Queued Starting Running Failed Complete Lost
+cache 0 0 1 0 0 0
+
+Latest Deployment
+ID = 6294be0c
+Status = successful
+Description = Deployment completed successfully
+
+Deployed
+Task Group Desired Placed Healthy Unhealthy
+cache 1 1 1 0
+
+Allocations
+ID Node ID Task Group Version Desired Status Created Modified
+478ce836 5ed166e8 cache 0 run running 5m ago 5m ago
+```
+
+Full status information of a periodic job:
+
+```shell-session
+$ nomad job status example
+ID = example
+Name = example
+Submit Date = 07/25/17
15:59:52 UTC +Type = batch +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = true +Parameterized = false +Next Periodic Launch = 07/25/17 16:00:30 UTC (5s from now) + +Children Job Summary +Pending Running Dead +0 3 0 + +Previously Launched Jobs +ID Status +example/periodic-1500998400 running +example/periodic-1500998410 running +example/periodic-1500998420 running +``` + +Full status information of a parameterized job: + +```shell-session +$ nomad job status example +ID = example +Name = example +Submit Date = 07/25/17 15:59:52 UTC +Type = batch +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = true + +Parameterized Job +Payload = required +Required Metadata = foo +Optional Metadata = bar + +Parameterized Job Summary +Pending Running Dead +0 2 0 + +Dispatched Jobs +ID Status +example/dispatch-1485411496-58f24d2d running +example/dispatch-1485411499-fa2ee40e running +``` + +Full status information of a job with placement failures: + +```shell-session +$ nomad job status example +ID = example +Name = example +Submit Date = 07/25/17 15:55:27 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 1 0 4 0 0 0 + +Placement Failure +Task Group "cache": + * Resources exhausted on 1 nodes + * Dimension "cpu" exhausted on 1 nodes + +Latest Deployment +ID = bb4b2fb1 +Status = running +Description = Deployment is running + +Deployed +Task Group Desired Placed Healthy Unhealthy +cache 5 4 4 0 + +Allocations +ID Node ID Task Group Version Desired Status Created Modified +048c1e9e 3f38ecb4 cache 0 run running 5m ago 5m ago +250f9dec 3f38ecb4 cache 0 run running 5m ago 5m ago +2eb772a1 3f38ecb4 cache 0 run running 5m ago 5m ago +a17b7d3d 3f38ecb4 cache 0 run running 5m ago 5m ago +``` + +Full status information showing evaluations with a placement failure. 
The in-progress
+evaluation denotes that Nomad is blocked waiting for resources to
+become available so that it can place the remaining allocations.
+
+```shell-session
+$ nomad job status -evals example
+ID = example
+Name = example
+Submit Date = 07/25/17 15:55:27 UTC
+Type = service
+Priority = 50
+Datacenters = dc1
+Status = running
+Periodic = false
+Parameterized = false
+
+Summary
+Task Group Queued Starting Running Failed Complete Lost
+cache 1 0 4 0 0 0
+
+Evaluations
+ID Priority Triggered By Status Placement Failures
+e44a39e8 50 deployment-watcher canceled false
+97018573 50 deployment-watcher complete true
+d5a7300c 50 deployment-watcher canceled false
+f05a4495 50 deployment-watcher complete true
+e3f3bdb4 50 deployment-watcher canceled false
+b5f08700 50 deployment-watcher complete true
+73bb867a 50 job-register blocked N/A - In Progress
+85052989 50 job-register complete true
+
+Placement Failure
+Task Group "cache":
+  * Resources exhausted on 1 nodes
+  * Dimension "cpu" exhausted on 1 nodes
+
+Latest Deployment
+ID = bb4b2fb1
+Status = running
+Description = Deployment is running
+
+Deployed
+Task Group Desired Placed Healthy Unhealthy
+cache 5 4 4 0
+
+Allocations
+ID Node ID Task Group Version Desired Status Created Modified
+048c1e9e 3f38ecb4 cache 0 run running 07/25/17 15:55:27 UTC 07/25/17 15:55:27 UTC
+250f9dec 3f38ecb4 cache 0 run running 07/25/17 15:55:27 UTC 07/25/17 15:55:27 UTC
+2eb772a1 3f38ecb4 cache 0 run running 07/25/17 15:55:27 UTC 07/25/17 15:55:27 UTC
+a17b7d3d 3f38ecb4 cache 0 run running 07/25/17 15:55:27 UTC 07/25/17 15:55:27 UTC
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/stop.mdx b/content/nomad/v0.11.x/content/docs/commands/job/stop.mdx
new file mode 100644
index 0000000000..f31abde940
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/stop.mdx
@@ -0,0 +1,67 @@
+---
+layout: docs
+page_title: 'Commands: job stop'
+sidebar_title: stop
+description: |
+  The job stop command is
used to stop a running job. +--- + +# Command: job stop + +**Alias: `nomad stop`** + +The `job stop` command is used to stop a running job and signals the scheduler +to cancel all of the running allocations. + +## Usage + +```plaintext +nomad job stop [options] +``` + +The `job stop` command requires a single argument, specifying the job ID or +prefix to cancel. If there is an exact match based on the provided job ID or +prefix, then the job will be cancelled. Otherwise, a list of matching jobs and +information will be displayed. + +Stop will issue a request to deregister the matched job and then invoke an +interactive monitor that exits automatically once the scheduler has processed +the request. It is safe to exit the monitor early using ctrl+c. + +## General Options + +@include 'general_options.mdx' + +## Stop Options + +- `-detach`: Return immediately instead of entering monitor mode. After the + deregister command is submitted, a new evaluation ID is printed to the screen, + which can be used to examine the evaluation using the [eval status] command. + +- `-verbose`: Show full information. + +- `-yes`: Automatic yes to prompts. + +- `-purge`: Purge is used to stop the job and purge it from the system. If not + set, the job will still be queryable and will be purged by the garbage + collector. 
+
+## Examples
+
+Stop the job with ID "job1":
+
+```shell-session
+$ nomad job stop job1
+==> Monitoring evaluation "43bfe672"
+    Evaluation status changed: "pending" -> "complete"
+==> Evaluation "43bfe672" finished with status "complete"
+```
+
+Stop the job with ID "job1" and return immediately:
+
+```shell-session
+$ nomad job stop -detach job1
+507d26cb
+```
+
+[eval status]: /docs/commands/eval-status
diff --git a/content/nomad/v0.11.x/content/docs/commands/job/validate.mdx b/content/nomad/v0.11.x/content/docs/commands/job/validate.mdx
new file mode 100644
index 0000000000..0339a00d14
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/job/validate.mdx
@@ -0,0 +1,62 @@
+---
+layout: docs
+page_title: 'Commands: job validate'
+sidebar_title: validate
+description: >
+  The job validate command is used to check a job specification for syntax
+  errors and validation problems.
+---
+
+# Command: job validate
+
+**Alias: `nomad validate`**
+
+The `job validate` command is used to check an HCL [job specification] for any
+syntax errors or validation problems.
+
+## Usage
+
+```plaintext
+nomad job validate
+```
+
+The `job validate` command requires a single argument, specifying the path to a
+file containing an HCL [job specification]. This file will be read and the job
+checked for any problems. If the supplied path is "-", the job file is read from
+STDIN. Otherwise it is read from the file at the supplied path or downloaded and
+read from the URL specified. Nomad downloads the job file using [`go-getter`] and
+supports `go-getter` syntax.
+
+On successful validation, exit code 0 will be returned; otherwise, an exit code
+of 1 indicates an error.
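
Because of this exit-code behavior, validation gates cleanly in scripts and CI.
The `validate_all` helper below is an illustrative sketch, not part of Nomad;
it assumes `nomad` is on `PATH`:

```shell
# Validate each job file in turn; stop at the first invalid one.
validate_all() {
  for f in "$@"; do
    if nomad job validate "$f"; then
      echo "ok: $f"
    else
      echo "invalid: $f" >&2
      return 1
    fi
  done
}

# Example: validate_all jobs/*.nomad
```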
+
+## Examples
+
+Validate a job with invalid syntax:
+
+```shell-session
+$ nomad job validate example.nomad
+Job validation errors:
+1 error(s) occurred:
+
+* group "cache" -> task "redis" -> config: 1 error(s) occurred:
+
+* field "image" is required
+```
+
+Validate a job that has a configuration that causes warnings:
+
+```shell-session
+$ nomad job validate example.nomad
+Job Warnings:
+1 warning(s):
+
+* Group "cache" has warnings: 1 error(s) occurred:
+
+* Update max parallel count is greater than task group count (6 > 3). A destructive change would result in the simultaneous replacement of all allocations.
+
+Job validation successful
+```
+
+[`go-getter`]: https://github.com/hashicorp/go-getter
+[job specification]: /docs/job-specification
diff --git a/content/nomad/v0.11.x/content/docs/commands/monitor.mdx b/content/nomad/v0.11.x/content/docs/commands/monitor.mdx
new file mode 100644
index 0000000000..3fa44e165d
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/monitor.mdx
@@ -0,0 +1,56 @@
+---
+layout: docs
+page_title: 'Commands: monitor'
+sidebar_title: monitor
+description: |
+  Stream the logs of a running nomad agent.
+---
+
+# Command: monitor
+
+The `nomad monitor` command streams log messages for a given agent.
+
+## Usage
+
+```plaintext
+nomad monitor [options]
+```
+
+The `nomad monitor` command can be used to stream the logs of a
+running Nomad agent. Monitor will follow logs and exit when
+interrupted or when the remote agent quits.
+
+The power of the monitor command is that it allows you to run
+the agent at a relatively high log level (such as "warn"),
+but still access debug-level logs and watch them when necessary.
+The monitor command also allows you to specify a single client node id to follow.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Monitor Options
+
+- `-log-level`: The log level to use for log streaming. Defaults to `info`.
+  Possible values include `trace`, `debug`, `info`, `warn`, and `error`.
+
+- `-node-id`: Specifies the client node-id to stream logs from. If no
+  node-id is given, the nomad server from the -address flag will be used.
+
+- `-server-id`: Specifies the nomad server id to stream logs from. Accepts
+  server names from `nomad server members` and also a special `leader` option
+  which will target the current leader.
+
+- `-json`: Stream logs in JSON format.
+
+## Examples
+
+```shell-session
+$ nomad monitor -log-level=DEBUG -node-id=a57b2adb-1a30-2dda-8df0-25abb0881952
+2019-11-04T12:22:08.528-0500 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=server duration=1.445739ms
+2019-11-04T12:22:09.892-0500 [DEBUG] nomad: memberlist: Stream connection from=127.0.0.1:53628
+
+$ nomad monitor -log-level=DEBUG -json=true
+{"@level":"debug","@message":"request complete"...}
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/apply.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/apply.mdx
new file mode 100644
index 0000000000..e3417386b8
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/namespace/apply.mdx
@@ -0,0 +1,42 @@
+---
+layout: docs
+page_title: 'Commands: namespace apply'
+sidebar_title: apply
+description: |
+  The namespace apply command is used to create or update a namespace.
+---
+
+# Command: namespace apply
+
+The `namespace apply` command is used to create or update a namespace.
+
+~> Namespace commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad namespace apply [options]
+```
+
+The `namespace apply` command requires the name of the namespace to be created
+or updated.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Apply Options
+
+- `-quota` : An optional quota to apply to the namespace.
+
+- `-description` : An optional human-readable description for the namespace.
+
+## Examples
+
+Create a namespace with a quota:
+
+```shell-session
+$ nomad namespace apply -description "Prod API servers" -quota prod api-prod
+Successfully applied namespace "api-prod"!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/delete.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/delete.mdx
new file mode 100644
index 0000000000..d25485ecdf
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/namespace/delete.mdx
@@ -0,0 +1,35 @@
+---
+layout: docs
+page_title: 'Commands: namespace delete'
+sidebar_title: delete
+description: |
+  The namespace delete command is used to delete a namespace.
+---
+
+# Command: namespace delete
+
+The `namespace delete` command is used to delete a namespace.
+
+~> Namespace commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad namespace delete [options]
+```
+
+The `namespace delete` command requires the name of the namespace to be deleted.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Delete a namespace:
+
+```shell-session
+$ nomad namespace delete api-prod
+Successfully deleted namespace "api-prod"!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/index.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/index.mdx
new file mode 100644
index 0000000000..7a8b064ae6
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/namespace/index.mdx
@@ -0,0 +1,33 @@
+---
+layout: docs
+page_title: 'Commands: namespace'
+sidebar_title: namespace
+description: |
+  The namespace command is used to interact with namespaces.
+---
+
+# Command: namespace
+
+The `namespace` command is used to interact with namespaces.
+
+~> Namespace commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+Usage: `nomad namespace [options]`
+
+Run `nomad namespace -h` for help on that subcommand.
The following +subcommands are available: + +- [`namespace apply`][apply] - Create or update a namespace +- [`namespace delete`][delete] - Delete a namespace +- [`namespace inspect`][inspect] - Inspect a namespace +- [`namespace list`][list] - List available namespaces +- [`namespace status`][status] - Display a namespace's status + +[apply]: /docs/commands/namespace/apply 'Create or update a namespace' +[delete]: /docs/commands/namespace/delete 'Delete a namespace' +[inspect]: /docs/commands/namespace/inspect 'Inspect a namespace' +[list]: /docs/commands/namespace/list 'List available namespaces' +[status]: /docs/commands/namespace/status "Display a namespace's status" diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/inspect.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/inspect.mdx new file mode 100644 index 0000000000..b859061848 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/namespace/inspect.mdx @@ -0,0 +1,45 @@ +--- +layout: docs +page_title: 'Commands: namespace inspect' +sidebar_title: inspect +description: > + The namespace inspect command is used to view raw information about a + particular namespace. +--- + +# Command: namespace inspect + +The `namespace inspect` command is used to view raw information about a particular +namespace. + +~> Namespace commands are new in Nomad 0.7 and are only available with Nomad +Enterprise. + +## Usage + +```plaintext +nomad namespace inspect [options] +``` + +## General Options + +@include 'general_options.mdx' + +## Inspect Options + +- `-t` : Format and display the namespace using a Go template. 
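+
+As a sketch of the `-t` flag, a single field can be extracted with a Go
+template (the `.Description` field name here is an assumption based on the
+JSON sample on this page):
+
+```shell-session
+$ nomad namespace inspect -t '{{ .Description }}' default
+```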
+
+## Examples
+
+Inspect a namespace:
+
+```shell-session
+$ nomad namespace inspect default
+{
+    "CreateIndex": 5,
+    "Description": "Default shared namespace",
+    "ModifyIndex": 38,
+    "Name": "default",
+    "Quota": "shared-default-quota"
+}
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/list.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/list.mdx
new file mode 100644
index 0000000000..dc03ce9a0b
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/namespace/list.mdx
@@ -0,0 +1,46 @@
+---
+layout: docs
+page_title: 'Commands: namespace list'
+sidebar_title: list
+description: |
+  The namespace list command is used to list namespaces.
+---
+
+# Command: namespace list
+
+The `namespace list` command is used to list available namespaces.
+
+~> Namespace commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad namespace list [options]
+```
+
+The `namespace list` command requires no arguments.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Options
+
+- `-json` : Output the namespaces in their JSON format.
+
+- `-t` : Format and display the namespaces using a Go template.
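+
+For machine-readable output, the `-json` flag can be combined with a tool such
+as `jq` (a sketch; the `.[].Name` filter assumes the output is a JSON array of
+namespace objects like the one shown by `namespace inspect`):
+
+```shell-session
+$ nomad namespace list -json | jq '.[].Name'
+```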
+ +## Examples + +List all namespaces: + +```shell-session +$ nomad namespace list +Name Description +default Default shared namespace +api-prod Production instances of backend API servers +api-qa QA instances of backend API servers +web-prod Production instances of webservers +web-qa QA instances of webservers +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/namespace/status.mdx b/content/nomad/v0.11.x/content/docs/commands/namespace/status.mdx new file mode 100644 index 0000000000..3ee41fbefe --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/namespace/status.mdx @@ -0,0 +1,41 @@ +--- +layout: docs +page_title: 'Commands: namespace status' +sidebar_title: status +description: > + The namespace status command is used to view the status of a particular + namespace. +--- + +# Command: namespace status + +The `namespace status` command is used to view the status of a particular +namespace. + +~> Namespace commands are new in Nomad 0.7 and are only available with Nomad +Enterprise. + +## Usage + +```plaintext +nomad namespace status [options] +``` + +## General Options + +@include 'general_options.mdx' + +## Examples + +View the status of a namespace: + +```shell-session +$ nomad namespace status default +Name = default +Description = Default shared namespace +Quota = shared-default-quota + +Quota Limits +Region CPU Usage Memory Usage +global 500 / 2500 256 / 2000 +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/node/config.mdx b/content/nomad/v0.11.x/content/docs/commands/node/config.mdx new file mode 100644 index 0000000000..7477749b67 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/node/config.mdx @@ -0,0 +1,59 @@ +--- +layout: docs +page_title: 'Commands: node config' +sidebar_title: config +description: | + The node config command is used to view or modify client configuration. +--- + +# Command: node config + +The `node config` command is used to view or modify client configuration +details. 
This command only works on client nodes, and can be used to update +the running client configurations it supports. + +## Usage + +```plaintext +nomad node config [options] +``` + +The arguments behave differently depending on the flags given. See each flag's +description below for specific usage information and requirements. + +## General Options + +@include 'general_options.mdx' + +## Node Config Options + +- `-servers`: List the client's known servers. Client nodes do not participate + in the gossip pool, and instead register with these servers periodically over + the network. The initial value of this list may come from configuration files + using the [`servers`] configuration option in the client block. + +- `-update-servers`: Updates the client's server list using the provided + arguments. Multiple server addresses may be passed using multiple arguments. + When updating the servers list, you must specify ALL of the server nodes you + wish to configure. The set is updated atomically. It is an error to specify + this flag without any server addresses. If you do _not_ specify a port for each + server address, the default port `4647` will be used. + +## Examples + +Query the currently known servers: + +```shell-session +$ nomad node config -servers +server1:4647 +server2:4647 +``` + +Update the list of servers: + +```shell-session +$ nomad node config -update-servers server1:4647 server2:4647 server3:4647 server4 + +``` + +[`servers`]: /docs/configuration/client#servers diff --git a/content/nomad/v0.11.x/content/docs/commands/node/drain.mdx b/content/nomad/v0.11.x/content/docs/commands/node/drain.mdx new file mode 100644 index 0000000000..1b8e3cf434 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/node/drain.mdx @@ -0,0 +1,137 @@ +--- +layout: docs +page_title: 'Commands: node drain' +sidebar_title: drain +description: | + The node drain command is used to configure a node's drain strategy. 
+---
+
+# Command: node drain
+
+The `node drain` command is used to toggle drain mode on a given node. Drain
+mode prevents any new tasks from being allocated to the node, and begins
+migrating all existing allocations away. Allocations will be migrated according
+to their [`migrate`][migrate] stanza until the drain's deadline is reached.
+
+By default the `node drain` command blocks until a node is done draining and
+all allocations have terminated. Canceling the `node drain` command _will not_
+cancel the drain. Drains may be canceled by using the `-disable` parameter
+below.
+
+When draining more than one node at a time, it is recommended you first disable
+[scheduling eligibility][eligibility] on all nodes that will be drained. For
+example, if you are decommissioning an entire class of nodes, first run `node
+eligibility -disable` on all of their node IDs, and then run `node drain
+-enable`. This will ensure allocations drained from the first node are not
+placed on another node about to be drained.
+
+The [node status] command complements this nicely by providing the current
+drain status of a given node.
+
+See the [Workload Migration guide] for detailed examples of node draining.
+
+## Usage
+
+```plaintext
+nomad node drain [options]
+```
+
+A `-self` flag can be used to drain the local node. If this is not supplied, a
+node ID or prefix must be provided. If there is an exact match, the drain mode
+will be adjusted for that node. Otherwise, a list of matching nodes and
+information will be displayed.
+
+It is also required to pass one of `-enable` or `-disable`, depending on which
+operation is desired.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Drain Options
+
+- `-enable`: Enable node drain mode.
+
+- `-disable`: Disable node drain mode.
+
+- `-deadline`: Set the deadline by which all allocations must be moved off the
+  node. Remaining allocations after the deadline are force removed from the
+  node. Defaults to 1 hour.
+
+- `-detach`: Return immediately instead of entering monitor mode.
+
+- `-monitor`: Enter monitor mode directly without modifying the drain status.
+
+- `-force`: Force remove allocations off the node immediately.
+
+- `-no-deadline`: No deadline allows the allocations to drain off the node
+  without being force stopped after a certain deadline.
+
+- `-ignore-system`: Ignore system allows the drain to complete without stopping
+  system job allocations. By default system jobs are stopped last.
+
+- `-keep-ineligible`: Keep ineligible will maintain the node's scheduling
+  ineligibility even if the drain is being disabled. This is useful when an
+  existing drain is being canceled but additional scheduling on the node is not
+  desired.
+
+- `-self`: Drain the local node.
+
+- `-yes`: Automatic yes to prompts.
+
+## Examples
+
+Enable drain mode on node with ID prefix "f4e8a9e5":
+
+```shell-session
+$ nomad node drain -enable f4e8a9e5
+Are you sure you want to enable drain mode for node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e"?
[y/N] y +2018-03-30T23:13:16Z: Ctrl-C to stop monitoring: will not cancel the node drain +2018-03-30T23:13:16Z: Node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" drain strategy set +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" marked for migration +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" draining +2018-03-30T23:13:17Z: Alloc "1877230b-64d3-a7dd-9c31-dc5ad3c93e9a" status running -> complete +2018-03-30T23:13:29Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" marked for migration +2018-03-30T23:13:29Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" draining +2018-03-30T23:13:30Z: Alloc "3fce5308-818c-369e-0bb7-f61f0a1be9ed" status running -> complete +2018-03-30T23:13:41Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" marked for migration +2018-03-30T23:13:41Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" draining +2018-03-30T23:13:41Z: Node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" has marked all allocations for migration +2018-03-30T23:13:42Z: Alloc "9a98c5aa-a719-2f34-ecfc-0e6268b5d537" status running -> complete +2018-03-30T23:13:42Z: All allocations on node "f4e8a9e5-30d8-3536-1e6f-cda5c869c35e" have stopped. +``` + +Enable drain mode on the local node: + +```shell-session +$ nomad node drain -enable -self +... +``` + +Enable drain mode but do not stop system jobs: + +```shell-session +$ nomad node drain -enable -ignore-system 4d2ba53b +... +``` + +Disable drain mode but keep the node ineligible for scheduling. Useful for +inspecting the current state of a misbehaving node without Nomad trying to +start or migrate allocations: + +```shell-session +$ nomad node drain -disable -keep-ineligible 4d2ba53b +... +``` + +Enable drain mode and detach from monitoring, then reattach later: + +```shell-session +$ nomad node drain -enable -detach -self +... +$ nomad node drain -self -monitor +... 
+```
+
+[eligibility]: /docs/commands/node/eligibility
+[migrate]: /docs/job-specification/migrate
+[node status]: /docs/commands/node/status
+[workload migration guide]: https://learn.hashicorp.com/nomad/operating-nomad/node-draining
diff --git a/content/nomad/v0.11.x/content/docs/commands/node/eligibility.mdx b/content/nomad/v0.11.x/content/docs/commands/node/eligibility.mdx
new file mode 100644
index 0000000000..d82dbbbd6e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/node/eligibility.mdx
@@ -0,0 +1,70 @@
+---
+layout: docs
+page_title: 'Commands: node eligibility'
+sidebar_title: eligibility
+description: >
+  The node eligibility command is used to configure a node's scheduling
+  eligibility.
+---
+
+# Command: node eligibility
+
+The `node eligibility` command is used to toggle scheduling eligibility for a
+given node. By default nodes are eligible for scheduling, meaning they can
+receive placements and run new allocations. Nodes that have their scheduling
+eligibility disabled are ineligible for new placements.
+
+The [`node drain`][drain] command automatically disables eligibility. Disabling
+a drain restores eligibility by default.
+
+Disabling scheduling eligibility is useful when draining a set of nodes: first
+disable eligibility on each node that will be drained, then drain each node.
+If you just drain each node, allocations may get rescheduled multiple times as
+they get placed on nodes about to be drained.
+
+Disabling scheduling eligibility may also be useful when investigating poorly
+behaved nodes. It allows operators to investigate the current state of a node
+without the risk of additional work being assigned to it.
+
+## Usage
+
+```plaintext
+nomad node eligibility [options]
+```
+
+A `-self` flag can be used to toggle eligibility of the local node. If this is
+not supplied, a node ID or prefix must be provided. If there is an exact match,
+the eligibility will be adjusted for that node.
Otherwise, a list of matching +nodes and information will be displayed. + +It is also required to pass one of `-enable` or `-disable`, depending on which +operation is desired. + +## General Options + +@include 'general_options.mdx' + +## Eligibility Options + +- `-enable`: Enable scheduling eligibility. +- `-disable`: Disable scheduling eligibility. +- `-self`: Set eligibility for the local node. +- `-yes`: Automatic yes to prompts. + +## Examples + +Enable scheduling eligibility on node with ID prefix "574545c5": + +```shell-session +$ nomad node eligibility -enable 574545c5 +Node "574545c5-c2d7-e352-d505-5e2cb9fe169f" scheduling eligibility set: eligible for scheduling +``` + +Disable scheduling eligibility on the local node: + +```shell-session +$ nomad node eligibility -disable -self +Node "574545c5-c2d7-e352-d505-5e2cb9fe169f" scheduling eligibility set: ineligible for scheduling +``` + +[drain]: /docs/commands/node/drain diff --git a/content/nomad/v0.11.x/content/docs/commands/node/index.mdx b/content/nomad/v0.11.x/content/docs/commands/node/index.mdx new file mode 100644 index 0000000000..d5563d4cb8 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/node/index.mdx @@ -0,0 +1,32 @@ +--- +layout: docs +page_title: 'Commands: node' +sidebar_title: node +description: | + The node command is used to interact with nodes. +--- + +# Command: node + +The `node` command is used to interact with nodes. + +## Usage + +Usage: `nomad node [options]` + +Run `nomad node -h` for help on that subcommand. 
The following +subcommands are available: + +- [`node config`][config] - View or modify client configuration details + +- [`node drain`][drain] - Set drain mode on a given node + +- [`node eligibility`][eligibility] - Toggle scheduling eligibility on a given + node + +- [`node status`][status] - Display status information about nodes + +[config]: /docs/commands/node/config 'View or modify client configuration details' +[drain]: /docs/commands/node/drain 'Set drain mode on a given node' +[eligibility]: /docs/commands/node/eligibility 'Toggle scheduling eligibility on a given node' +[status]: /docs/commands/node/status 'Display status information about nodes' diff --git a/content/nomad/v0.11.x/content/docs/commands/node/status.mdx b/content/nomad/v0.11.x/content/docs/commands/node/status.mdx new file mode 100644 index 0000000000..15d38804b8 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/node/status.mdx @@ -0,0 +1,364 @@ +--- +layout: docs +page_title: 'Commands: node status' +sidebar_title: status +description: | + The node status command is used to display information about nodes. +--- + +# Command: node status + +The `node status` command is used to display information about client nodes. A +node must first be registered with the servers before it will be visible in this +output. + +## Usage + +```plaintext +nomad node status [options] [node] +``` + +If no node ID is passed, then the command will enter "list mode" and dump a +high-level list of all known nodes. This list output contains less information +but is a good way to get a bird's-eye view of things. + +If there is an exact match based on the provided node ID or prefix, then that +particular node will be queried, and detailed information will be displayed, +including resource usage statistics. Otherwise, a list of matching nodes and +information will be displayed. If running the command on a Nomad Client, the +`-self` flag is useful to quickly access the status of the local node. 
+ +## General Options + +@include 'general_options.mdx' + +## Status Options + +- `-self`: Query the status of the local node. + +- `-stats`: Display detailed resource usage statistics. + +- `-allocs`: When a specific node is not being queried, shows the number of + running allocations per node. + +- `-short`: Display short output. Used only when querying a single node. + +- `-verbose`: Show full information. + +- `-json` : Output the node in its JSON format. + +- `-t` : Format and display node using a Go template. + +## Examples + +List view: + +```shell-session +$ nomad node status +ID DC Name Class Drain Eligibility Status +a72dfba2 dc1 node1 false eligible ready +1f3f03ea dc1 node2 false eligible ready +``` + +List view, with running allocations: + +```shell-session +$ nomad node status -allocs +ID DC Name Class Drain Eligibility Status Running Allocs +4d2ba53b dc1 node1 false eligible ready 1 +34dfba32 dc1 node2 false eligible ready 3 +``` + +Single-node view in short mode: + +```shell-session +$ nomad node status -short 1f3f03ea +ID = c754da1f +Name = nomad +Class = +DC = dc1 +Drain = false +Status = ready +Uptime = 17h2m25s + +Allocations +ID Eval ID Job ID Task Group Desired Status Client Status +0b8b9e37 8bf94335 example cache run running +``` + +Full output for a single node: + +```shell-session +$ nomad node status 1f3f03ea +ID = c754da1f +Name = nomad-server01 +Class = +DC = dc1 +Drain = false +Status = ready +Uptime = 17h42m50s + +Drivers +Driver Detected Healthy +docker false false +exec true true +java true true +qemu true true +raw_exec true true +rkt true true + +Node Events +Time Subsystem Message +2018-03-29T17:24:42Z Driver: docker Driver docker is not detected +2018-03-29T17:23:42Z Cluster Node registered + +Allocated Resources +CPU Memory Disk +500/2600 MHz 256 MiB/2.0 GiB 300 MiB/32 GiB + +Allocation Resource Utilization +CPU Memory +430/2600 MHz 199 MiB/2.0 GiB + +Host Resource Utilization +CPU Memory Disk +513/3000 MHz 551 MiB/2.4 GiB 4.2 
GiB/52 GiB
+
+Allocations
+ID        Eval ID   Job ID   Task Group  Desired Status  Client Status
+7bff7214  b3a6b9d2  example  cache       run             running
+```
+
+Using `-self` when on a Nomad Client:
+
+```shell-session
+$ nomad node status -self
+ID     = c754da1f
+Name   = nomad-client01
+Class  =
+DC     = dc1
+Drain  = false
+Status = ready
+Uptime = 17h7m41s
+
+Drivers
+Driver    Detected  Healthy
+docker    false     false
+exec      true      true
+java      true      true
+qemu      true      true
+raw_exec  true      true
+rkt       true      true
+
+Node Events
+Time                  Subsystem       Message
+2018-03-29T17:24:42Z  Driver: docker  Driver docker is not detected
+2018-03-29T17:23:42Z  Cluster         Node registered
+
+Allocated Resources
+CPU            Memory           Disk
+2500/2600 MHz  1.3 GiB/2.0 GiB  1.5 GiB/32 GiB
+
+Allocation Resource Utilization
+CPU            Memory
+2200/2600 MHz  1.7 GiB/2.0 GiB
+
+Host Resource Utilization
+CPU            Memory           Disk
+2430/3000 MHz  1.8 GiB/2.4 GiB  6.5 GiB/40 GiB
+
+Allocations
+ID        Eval ID   Job ID   Task Group  Desired Status  Client Status
+0b8b9e37  8bf94335  example  cache       run             running
+b206088c  8bf94335  example  cache       run             running
+b82f58b6  8bf94335  example  cache       run             running
+ed3665f5  8bf94335  example  cache       run             running
+24cfd201  8bf94335  example  cache       run             running
+```
+
+You will note that in the above examples, the **Allocations** output contains
+columns labeled **Desired Status** and **Client Status**.
+
+Desired Status represents the goal of the scheduler on the allocation with
+the following valid statuses:
+
+- _run_: The allocation should run
+- _stop_: The allocation should stop
+
+Client Status represents the emergent state of the allocation and includes
+the following:
+
+- _pending_: The allocation is pending and will be running
+
+- _running_: The allocation is currently running
+
+- _complete_: The allocation was running and completed successfully
+
+- _failed_: The allocation was running and completed with a non-zero exit code
+
+- _lost_: The node that was running the allocation has failed or has been
+  partitioned
+
+Using `-stats` to see detailed resource usage information on the node:
+
+```shell-session
+$ nomad node status -stats c754da1f
+ID     = c754da1f
+Name   = nomad-client01
+Class  =
+DC     = dc1
+Drain  = false
+Status = ready
+Uptime = 17h7m41s
+
+Drivers
+Driver    Detected  Healthy
+docker    false     false
+exec      true      true
+java      true      true
+qemu      true      true
+raw_exec  true      true
+rkt       true      true
+
+Node Events
+Time                  Subsystem       Message
+2018-03-29T17:24:42Z  Driver: docker  Driver docker is not detected
+2018-03-29T17:23:42Z  Cluster         Node registered
+
+Allocated Resources
+CPU            Memory           Disk
+2500/2600 MHz  1.3 GiB/2.0 GiB  1.5 GiB/32 GiB
+
+Allocation Resource Utilization
+CPU            Memory
+2200/2600 MHz  1.7 GiB/2.0 GiB
+
+Host Resource Utilization
+CPU            Memory           Disk
+2430/3000 MHz  1.8 GiB/2.4 GiB  3.9 GiB/40 GiB
+
+CPU Stats
+CPU    = cpu0
+User   = 96.94%
+System = 1.02%
+Idle   = 2.04%
+
+CPU    = cpu1
+User   = 97.92%
+System = 2.08%
+Idle   = 0.00%
+
+Memory Stats
+Total     = 2.4 GiB
+Available = 612 MiB
+Used      = 1.8 GiB
+Free      = 312 MiB
+
+Disk Stats
+Device         = /dev/mapper/ubuntu--14--vg-root
+MountPoint     = /
+Size           = 38 GiB
+Used           = 3.9 GiB
+Available      = 32 GiB
+Used Percent   = 10.31%
+Inodes Percent = 3.85%
+
+Device         = /dev/sda1
+MountPoint     = /boot
+Size           = 235 MiB
+Used           = 45 MiB
+Available      = 178 MiB
+Used Percent   = 19.17%
+Inodes Percent = 0.48%
+
+Allocations
+ID        Eval ID   Job ID   Task Group  Desired Status  Client Status
+0b8b9e37  8bf94335  example  cache       run             running
+b206088c  8bf94335  example  cache       run             running
+b82f58b6  8bf94335  example  cache       run             running
+ed3665f5  8bf94335  example  cache       run             running
+24cfd201  8bf94335  example  cache       run             running
+```
+
+To view verbose information about the node:
+
+```shell-session
+$ nomad node status -verbose c754da1f
+ID     = c754da1f-6337-b86d-47dc-2ef4c71aca14
+Name   = nomad
+Class  =
+DC     = dc1
+Drain  = false
+Status = ready
+Uptime = 17h7m41s
+
+Host Volumes
+Name  ReadOnly  Source
+
+CSI Volumes
+ID        Name  Plugin ID  Schedulable  Access Mode         Mount Options
+402f2c83  vol   plug       true         single-node-writer
+
+Drivers
+Driver    Detected  Healthy  Message                        Time
+docker    false     false    Driver docker is not detected  2018-03-29T17:24:42Z
+exec      true      true                                    2018-03-29T17:23:42Z
+java      true      true                                    2018-03-29T17:23:41Z
+qemu      true      true                                    2018-03-29T17:23:41Z
+raw_exec  true      true                                    2018-03-29T17:23:42Z
+rkt       true      true                                    2018-03-29T17:23:42Z
+
+Node Events
+Time                  Subsystem       Message                        Details
+2018-03-29T17:24:42Z  Driver: docker  Driver docker is not detected  driver: docker,
+2018-03-29T17:23:42Z  Cluster         Node registered
+
+Allocated Resources
+CPU            Memory           Disk
+2500/2600 MHz  1.3 GiB/2.0 GiB  1.5 GiB/32 GiB
+
+Allocation Resource Utilization
+CPU            Memory
+2200/2600 MHz  1.7 GiB/2.0 GiB
+
+Host Resource Utilization
+CPU           Memory           Disk
+230/3000 MHz  121 MiB/2.4 GiB  6.5 GiB/40 GiB
+
+Allocations
+ID                                    Eval ID                               Job ID   Task Group  Desired Status  Client Status
+3d743cff-8d57-18c3-2260-a41d3f6c5204  2fb686da-b2b0-f8c2-5d57-2be5600435bd  example  cache       run             complete
+
+Attributes
+arch                      = amd64
+cpu.frequency             = 1300.000000
+cpu.modelname             = Intel(R) Core(TM) M-5Y71 CPU @ 1.20GHz
+cpu.numcores              = 2
+cpu.totalcompute          = 2600.000000
+driver.docker             = 1
+driver.docker.version     = 1.10.3
+driver.exec               = 1
+driver.java               = 1
+driver.java.runtime       = OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.2)
+driver.java.version       = 1.7.0_95
+driver.java.vm            = OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
+driver.qemu               = 1
+driver.qemu.version       = 2.0.0
+driver.raw_exec           = 1
+driver.rkt                = 1
+driver.rkt.appc.version   = 0.7.4
+driver.rkt.version        = 1.2.0
+hostname                  = nomad
+kernel.name               = linux
+kernel.version            = 3.19.0-25-generic
+memory.totalbytes         = 2094473216
+nomad.revision            = '270da7a60ccbf39eeeadc4064a59ca06bf9ac6fc+CHANGES'
+nomad.version             = 0.3.2dev
+os.name                   = ubuntu
+os.version                = 14.04
+unique.cgroup.mountpoint  = /sys/fs/cgroup
+unique.network.ip-address = 127.0.0.1
+unique.storage.bytesfree  = 36044333056
+unique.storage.bytestotal = 41092214784
+unique.storage.volume     = /dev/mapper/ubuntu--14--vg-root
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-get-config.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-get-config.mdx
new file mode 100644
index 0000000000..5464e951f3
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-get-config.mdx
@@ -0,0 +1,66 @@
+---
+layout: docs
+page_title: 'Commands: operator autopilot get-config'
+sidebar_title: autopilot get-config
+description: |
+  Display the current Autopilot configuration.
+---
+
+# Command: operator autopilot get-config
+
+The Autopilot operator command is used to view the current Autopilot
+configuration. See the [Autopilot Guide] for more information about Autopilot.
+
+## Usage
+
+```plaintext
+nomad operator autopilot get-config [options]
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+The output looks like this:
+
+```shell-session
+$ nomad operator autopilot get-config
+CleanupDeadServers = true
+LastContactThreshold = 200ms
+MaxTrailingLogs = 250
+ServerStabilizationTime = 10s
+RedundancyZoneTag = ""
+DisableUpgradeMigration = false
+UpgradeVersionTag = ""
+```
+
+- `CleanupDeadServers` - Specifies automatic removal of dead
+  server nodes periodically and whenever a new server is added to the cluster.
+
+- `LastContactThreshold` - Specifies the maximum amount of
+  time a server can go without contact from the leader before being considered
+  unhealthy. Must be a duration value such as `10s`.
+
+- `MaxTrailingLogs` - Specifies the maximum number of log entries
+  that a server can trail the leader by before being considered unhealthy.
+
+- `ServerStabilizationTime` - Specifies the minimum amount of
+  time a server must be stable in the 'healthy' state before being added to the
+  cluster. Only takes effect if all servers are running Raft protocol version 3
+  or higher. Must be a duration value such as `30s`.
+
+- `RedundancyZoneTag` - Controls the node-meta key to use when
+  Autopilot is separating servers into zones for redundancy. Only one server in
+  each zone can be a voting member at one time. If left blank, this feature will
+  be disabled.
+
+- `DisableUpgradeMigration` - Disables Autopilot's upgrade migration strategy
+  in Nomad Enterprise, which waits until enough newer-versioned servers have
+  been added to the cluster before promoting any of them to voters.
+
+- `UpgradeVersionTag` - Controls the node-meta key to use for
+  version info when performing upgrade migrations. If left blank, the Nomad
+  version will be used.
+
+[autopilot guide]: https://learn.hashicorp.com/nomad/operating-nomad/autopilot
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-set-config.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-set-config.mdx
new file mode 100644
index 0000000000..d9f87562a2
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/autopilot-set-config.mdx
@@ -0,0 +1,63 @@
+---
+layout: docs
+page_title: 'Commands: operator autopilot set-config'
+sidebar_title: autopilot set-config
+description: |
+  Modify the current Autopilot configuration.
+---
+
+# Command: operator autopilot set-config
+
+The Autopilot operator command is used to set the current Autopilot
+configuration.
See the [Autopilot Guide] for more information about Autopilot.
+
+## Usage
+
+```plaintext
+nomad operator autopilot set-config [options]
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Set Config Options
+
+- `-cleanup-dead-servers` - Specifies whether to enable automatic removal of
+  dead servers upon the successful joining of new servers to the cluster. Must be
+  one of `[true|false]`.
+
+- `-last-contact-threshold` - Controls the maximum amount of time a server can
+  go without contact from the leader before being considered unhealthy. Must be a
+  duration value such as `200ms`.
+
+- `-max-trailing-logs` - Controls the maximum number of log entries that a
+  server can trail the leader by before being considered unhealthy.
+
+- `-server-stabilization-time` - Controls the minimum amount of time a server
+  must be stable in the 'healthy' state before being added to the cluster. Only
+  takes effect if all servers are running Raft protocol version 3 or higher. Must
+  be a duration value such as `10s`.
+
+- `-disable-upgrade-migration` - (Enterprise-only) Controls whether Nomad will
+  avoid promoting new servers until it can perform a migration. Must be one of
+  `[true|false]`.
+
+- `-redundancy-zone-tag` - (Enterprise-only) Controls the [`redundancy_zone`]
+  used for separating servers into different redundancy zones.
+
+- `-upgrade-version-tag` - (Enterprise-only) Controls the [`upgrade_version`] to
+  use for version info when performing upgrade migrations. If left blank, the
+  Nomad version will be used.
+
+The output looks like this:
+
+```plaintext
+Configuration updated!
+```
+
+The return code will indicate success or failure.
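+
+For example, a run that disables dead server cleanup might look like the
+following. The flag value here is illustrative; `Configuration updated!` is the
+standard success output:
+
+```shell-session
+$ nomad operator autopilot set-config -cleanup-dead-servers=false
+Configuration updated!
+```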
+
+[`redundancy_zone`]: /docs/configuration/server#redundancy_zone
+[`upgrade_version`]: /docs/configuration/server#upgrade_version
+[autopilot guide]: https://learn.hashicorp.com/nomad/operating-nomad/autopilot
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/index.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/index.mdx
new file mode 100644
index 0000000000..b64b2d6113
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/index.mdx
@@ -0,0 +1,51 @@
+---
+layout: docs
+page_title: 'Commands: operator'
+sidebar_title: operator
+description: |
+  The operator command provides cluster-level tools for Nomad operators.
+---
+
+# Command: operator
+
+The `operator` command provides cluster-level tools for Nomad operators, such
+as interacting with the Raft subsystem. This was added in Nomad 0.5.5.
+
+~> Use this command with extreme caution, as improper use could lead to a Nomad
+outage and even loss of data.
+
+See the [Outage Recovery guide] for some examples of how this command is
+used. For an API to perform these operations programmatically, please see the
+documentation for the [Operator] endpoint.
+
+## Usage
+
+Usage: `nomad operator <subcommand> [options]`
+
+Run `nomad operator <subcommand>` with no arguments for help on that subcommand.
+The following subcommands are available:
+
+- [`operator autopilot get-config`][get-config] - Display the current Autopilot
+  configuration
+
+- [`operator autopilot set-config`][set-config] - Modify the current Autopilot
+  configuration
+
+- [`operator keygen`][keygen] - Generate a new encryption key
+
+- [`operator keyring`][keyring] - Manage gossip layer encryption keys
+
+- [`operator raft list-peers`][list] - Display the current Raft peer
+  configuration
+
+- [`operator raft remove-peer`][remove] - Remove a Nomad server from the Raft
+  configuration
+
+[get-config]: /docs/commands/operator/autopilot-get-config 'Autopilot Get Config command'
+[keygen]: /docs/commands/operator/keygen 'Generates a new encryption key'
+[keyring]: /docs/commands/operator/keyring 'Manages gossip layer encryption keys'
+[list]: /docs/commands/operator/raft-list-peers 'Raft List Peers command'
+[operator]: /api-docs/operator 'Operator API documentation'
+[outage recovery guide]: https://learn.hashicorp.com/nomad/operating-nomad/outage
+[remove]: /docs/commands/operator/raft-remove-peer 'Raft Remove Peer command'
+[set-config]: /docs/commands/operator/autopilot-set-config 'Autopilot Set Config command'
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/keygen.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/keygen.mdx
new file mode 100644
index 0000000000..3baf7bfd6a
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/keygen.mdx
@@ -0,0 +1,28 @@
+---
+layout: docs
+page_title: 'Commands: operator keygen'
+sidebar_title: keygen
+description: >
+  The `operator keygen` command generates an encryption key that can be used for
+  Nomad server's gossip traffic encryption. The keygen command uses a
+  cryptographically strong pseudo-random number generator to generate the key.
+---
+
+# Command: operator keygen
+
+The `operator keygen` command generates an encryption key that can be used for
+Nomad server's gossip traffic encryption.
The keygen command uses a
+cryptographically strong pseudo-random number generator to generate the key.
+
+## Usage
+
+```plaintext
+nomad operator keygen
+```
+
+## Example
+
+```shell-session
+$ nomad operator keygen
+YgZOXLMhC7TtZqeghMT8+w==
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/keyring.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/keyring.mdx
new file mode 100644
index 0000000000..1ac7282ba5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/keyring.mdx
@@ -0,0 +1,60 @@
+---
+layout: docs
+page_title: 'Commands: operator keyring'
+sidebar_title: keyring
+---
+
+# Command: operator keyring
+
+The `operator keyring` command is used to examine and modify the encryption keys
+used in Nomad servers. It is capable of distributing new encryption keys to the
+cluster, retiring old encryption keys, and changing the keys used by the cluster
+to encrypt messages.
+
+Nomad allows multiple encryption keys to be in use simultaneously. This is
+intended to provide a transition state while the cluster converges. It is the
+responsibility of the operator to ensure that only the required encryption keys
+are installed on the cluster. You can review the installed keys using the
+`-list` argument, and remove unneeded keys with `-remove`.
+
+All operations performed by this command can only be run against server nodes
+and will affect the entire cluster.
+
+All variations of the `keyring` command return 0 if all nodes reply and there
+are no errors. If any node fails to reply or reports failure, the exit code
+will be 1.
+
+## Usage
+
+```plaintext
+nomad operator keyring [options]
+```
+
+Only one actionable argument may be specified per run, including `-list`,
+`-install`, `-remove`, and `-use`.
+
+The list of available flags is:
+
+- `-list` - List all keys currently in use within the cluster.
+
+- `-install` - Install a new encryption key. This will broadcast the new key to
+  all members in the cluster.
+
+- `-use` - Change the primary encryption key, which is used to encrypt messages.
+  The key must already be installed before this operation can succeed.
+
+- `-remove` - Remove the given key from the cluster. This operation may only be
+  performed on keys which are not currently the primary key.
+
+## Output
+
+The output of the `nomad operator keyring -list` command consolidates
+information from all the Nomad servers from all datacenters and regions to
+provide a simple, easy-to-understand view of the cluster.
+
+```shell-session
+$ nomad operator keyring -list
+==> Gathering installed encryption keys...
+Key
+PGm64/neoebUBqYR/lZTbA==
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/raft-list-peers.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/raft-list-peers.mdx
new file mode 100644
index 0000000000..2a41bd2e4e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/raft-list-peers.mdx
@@ -0,0 +1,62 @@
+---
+layout: docs
+page_title: 'Commands: operator raft list-peers'
+sidebar_title: raft list-peers
+description: |
+  Display the current Raft peer configuration.
+---
+
+# Command: operator raft list-peers
+
+The Raft list-peers command is used to display the current Raft peer
+configuration.
+
+See the [Outage Recovery] guide for some examples of how this command is used.
+For an API to perform these operations programmatically, please see the
+documentation for the [Operator] endpoint.
+
+## Usage
+
+```plaintext
+nomad operator raft list-peers [options]
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Peers Options
+
+- `-stale`: The stale argument defaults to "false", which means the leader
+  provides the result. If the cluster is in an outage state without a leader, you
+  may need to set `-stale` to "true" to get the configuration from a non-leader
+  server.
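+
+For example, when the cluster has no leader, a hypothetical invocation that
+reads the peer configuration from a non-leader server:
+
+```shell-session
+$ nomad operator raft list-peers -stale=true
+```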
+
+## Examples
+
+An example output with three servers is as follows:
+
+```shell-session
+$ nomad operator raft list-peers
+Node ID Address State Voter
+nomad-server01.global 10.10.11.5:4647 10.10.11.5:4647 follower true
+nomad-server02.global 10.10.11.6:4647 10.10.11.6:4647 leader true
+nomad-server03.global 10.10.11.7:4647 10.10.11.7:4647 follower true
+```
+
+- `Node` is the node name of the server, as known to Nomad, or "(unknown)" if
+  the node is stale and not known.
+
+- `ID` is the ID of the server. This is the same as the `Address` but may be
+  upgraded to a GUID in a future version of Nomad.
+
+- `Address` is the IP:port for the server.
+
+- `State` is either "follower" or "leader" depending on the server's role in the
+  Raft configuration.
+
+- `Voter` is "true" or "false", indicating if the server has a vote in the Raft
+  configuration. Future versions of Nomad may add support for non-voting servers.
+
+[operator]: /api-docs/operator
+[outage recovery]: https://learn.hashicorp.com/nomad/operating-nomad/outage
diff --git a/content/nomad/v0.11.x/content/docs/commands/operator/raft-remove-peer.mdx b/content/nomad/v0.11.x/content/docs/commands/operator/raft-remove-peer.mdx
new file mode 100644
index 0000000000..9f0081e9c5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/operator/raft-remove-peer.mdx
@@ -0,0 +1,45 @@
+---
+layout: docs
+page_title: 'Commands: operator raft remove-peer'
+sidebar_title: raft remove-peer
+description: |
+  Remove a Nomad server from the Raft configuration.
+---
+
+# Command: operator raft remove-peer
+
+Remove the Nomad server with the given address from the Raft configuration.
+
+There are rare cases where a peer may be left behind in the Raft quorum even
+though the server is no longer present and known to the cluster. This command
+can be used to remove the failed server so that it no longer affects the Raft
+quorum.
If the server still shows in the output of the [`nomad server members`]
+command, it is preferable to clean up by running [`nomad server force-leave`]
+instead of this command.
+
+See the [Outage Recovery] guide for some examples of how this command is used.
+For an API to perform these operations programmatically, please see the
+documentation for the [Operator] endpoint.
+
+## Usage
+
+```plaintext
+nomad operator raft remove-peer [options]
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Remove Peer Options
+
+- `-peer-address`: Remove a Nomad server with the given address from the Raft
+  configuration. The format is "IP:port".
+
+- `-peer-id`: Remove a Nomad server with the given ID from the Raft
+  configuration. The format is "id".
+
+[`nomad server force-leave`]: /docs/commands/server/force-leave 'Nomad server force-leave command'
+[`nomad server members`]: /docs/commands/server/members 'Nomad server members command'
+[operator]: /api-docs/operator 'Nomad Operator API'
+[outage recovery]: https://learn.hashicorp.com/nomad/operating-nomad/outage
diff --git a/content/nomad/v0.11.x/content/docs/commands/plugin/index.mdx b/content/nomad/v0.11.x/content/docs/commands/plugin/index.mdx
new file mode 100644
index 0000000000..fee7edf365
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/plugin/index.mdx
@@ -0,0 +1,22 @@
+---
+layout: docs
+page_title: 'Commands: plugin'
+sidebar_title: plugin
+description: |
+  The plugin command is used to interact with plugins.
+---
+
+# Command: plugin
+
+The `plugin` command is used to interact with plugins.
+
+## Usage
+
+Usage: `nomad plugin <subcommand> [options]`
+
+Run `nomad plugin <subcommand> -h` for help on that subcommand.
The following +subcommands are available: + +- [`plugin status`][status] - Display status information about a plugin + +[status]: /docs/commands/plugin/status 'Display status information about a plugin' diff --git a/content/nomad/v0.11.x/content/docs/commands/plugin/status.mdx b/content/nomad/v0.11.x/content/docs/commands/plugin/status.mdx new file mode 100644 index 0000000000..df1cdf012c --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/plugin/status.mdx @@ -0,0 +1,85 @@ +--- +layout: docs +page_title: 'Commands: plugin status' +sidebar_title: status +description: | + Display information and status of plugins. +--- + +# Command: plugin status + +The `plugin status` command displays status information for [Container +Storage Interface (CSI)][csi] plugins. + +## Usage + +```plaintext +nomad plugin status [options] [plugin] +``` + +This command accepts an optional plugin ID or prefix as the sole argument. If there +is an exact match based on the provided plugin ID or prefix, then information about +the specific plugin is queried and displayed. Otherwise, a list of matching plugins +and information will be displayed. + +If the ID is omitted, the command lists out all of the existing plugins and a few +of the most useful status fields for each. + +## General Options + +@include 'general_options.mdx' + +## Status Options + +- `-type`: Display only plugins of a particular type. Currently only + the `csi` type is supported, so this option can be omitted when + querying the status of CSI plugins. + +- `-short`: Display short output. Used only when a single plugin is being queried. + Drops verbose plugin allocation data from the output. + +- `-verbose`: Show full information. Allocation create and modify times are + shown in `yyyy/mm/dd hh:mm:ss` format. 
+
+## Examples
+
+List of all plugins:
+
+```shell-session
+$ nomad plugin [-type csi] status
+ID Provider Controllers Healthy / Expected Nodes Healthy / Expected
+ebs-prod aws.ebs 1 / 1 1 / 1
+```
+
+Short view of a specific plugin:
+
+```shell-session
+$ nomad plugin [-type csi] status ebs-prod
+ID = ebs-prod
+Provider = aws.ebs
+Version = 1.0.1
+Controllers Healthy = 1
+Controllers Expected = 1
+Nodes Healthy = 1
+Nodes Expected = 1
+```
+
+Full status information of a plugin:
+
+```shell-session
+$ nomad plugin [-type csi] status ebs-prod
+ID = ebs-prod
+Provider = aws.ebs
+Version = 1.0.1
+Controllers Healthy = 1
+Controllers Expected = 1
+Nodes Healthy = 1
+Nodes Expected = 1
+
+Allocations
+ID Node ID Task Group Version Desired Status Created Modified
+0de05689 95303afc csi 0 run running 1m57s ago 1m19s ago
+b206088c 8bf94335 csi 0 run running 1m56s ago 1m19s ago
+```
+
+[csi]: https://github.com/container-storage-interface/spec
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/apply.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/apply.mdx
new file mode 100644
index 0000000000..0f113d3f80
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/apply.mdx
@@ -0,0 +1,40 @@
+---
+layout: docs
+page_title: 'Commands: quota apply'
+sidebar_title: apply
+description: |
+  The quota apply command is used to create or update quota specifications.
+---
+
+# Command: quota apply
+
+The `quota apply` command is used to create or update quota specifications.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad quota apply [options] <path>
+```
+
+The `quota apply` command requires the path to the specification file. The
+specification can be read from stdin by setting the path to "-".
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Apply Options
+
+- `-json`: Parse the input as a JSON quota specification.
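+
+For reference, a specification passed to `quota apply` might look like the
+following sketch. The field names shown are an assumption about the Nomad 0.11
+HCL quota format, not an exhaustive reference:
+
+```hcl
+# my-quota.hcl: hypothetical quota specification. A quota caps the
+# aggregate resources usable by namespaces attached to it.
+name        = "my-quota"
+description = "Limit the shared default namespace"
+
+# One limit block per region the quota applies to.
+limit {
+  region = "global"
+  region_limit {
+    cpu    = 2500 # total CPU in MHz
+    memory = 2000 # total memory in MB
+  }
+}
+```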
+
+## Examples
+
+Create a new quota specification:
+
+```shell-session
+$ nomad quota apply my-quota.hcl
+Successfully applied quota specification "my-quota"!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/delete.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/delete.mdx
new file mode 100644
index 0000000000..ccfbf5679e
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/delete.mdx
@@ -0,0 +1,35 @@
+---
+layout: docs
+page_title: 'Commands: quota delete'
+sidebar_title: delete
+description: |
+  The quota delete command is used to delete an existing quota specification.
+---
+
+# Command: quota delete
+
+The `quota delete` command is used to delete an existing quota specification.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad quota delete <quota>
+```
+
+The `quota delete` command requires the quota specification name as an argument.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Delete a quota specification:
+
+```shell-session
+$ nomad quota delete my-quota
+Successfully deleted quota "my-quota"!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/index.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/index.mdx
new file mode 100644
index 0000000000..264e91c1b1
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/index.mdx
@@ -0,0 +1,35 @@
+---
+layout: docs
+page_title: 'Commands: quota'
+sidebar_title: quota
+description: |
+  The quota command is used to interact with quota specifications.
+---
+
+# Command: quota
+
+The `quota` command is used to interact with quota specifications.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+Usage: `nomad quota <subcommand> [options]`
+
+Run `nomad quota <subcommand> -h` for help on that subcommand.
The following +subcommands are available: + +- [`quota apply`][quotaapply] - Create or update a quota specification +- [`quota delete`][quotadelete] - Delete a quota specification +- [`quota init`][quotainit] - Create an example quota specification file +- [`quota inspect`][quotainspect] - Inspect a quota specification +- [`quota list`][quotalist] - List quota specifications +- [`quota status`][quotastatus] - Display a quota's status and current usage + +[quotaapply]: /docs/commands/quota/apply +[quotadelete]: /docs/commands/quota/delete +[quotainit]: /docs/commands/quota/init +[quotainspect]: /docs/commands/quota/inspect +[quotalist]: /docs/commands/quota/list +[quotastatus]: /docs/commands/quota/status diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/init.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/init.mdx new file mode 100644 index 0000000000..c0d1168e39 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/quota/init.mdx @@ -0,0 +1,34 @@ +--- +layout: docs +page_title: 'Commands: quota init' +sidebar_title: init +description: | + Generate an example quota specification. +--- + +# Command: quota init + +The `quota init` command is used to create an example quota specification file +that can be used as a starting point to customize further. + +~> Quota commands are new in Nomad 0.7 and are only available with Nomad +Enterprise. + +## Usage + +```plaintext +nomad quota init +``` + +## Init Options + +- `-json`: Create an example JSON quota specification. 
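+
+To generate the example in JSON form instead, the flag can be combined with the
+command (a hypothetical invocation):
+
+```shell-session
+$ nomad quota init -json
+```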
+
+## Examples
+
+Create an example quota specification:
+
+```shell-session
+$ nomad quota init
+Example quota specification written to spec.hcl
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/inspect.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/inspect.mdx
new file mode 100644
index 0000000000..d34c100efd
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/inspect.mdx
@@ -0,0 +1,88 @@
+---
+layout: docs
+page_title: 'Commands: quota inspect'
+sidebar_title: inspect
+description: >
+  The quota inspect command is used to view raw information about a particular
+  quota specification.
+---
+
+# Command: quota inspect
+
+The `quota inspect` command is used to view raw information about a particular
+quota.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad quota inspect [options] <quota>
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Inspect Options
+
+- `-t`: Format and display the quota using a Go template.
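+
+For example, a Go template can pull a single field out of the structure shown
+in the example below; the `.Spec.Name` path is an assumption based on that
+JSON layout:
+
+```shell-session
+$ nomad quota inspect -t '{{ .Spec.Name }}' default-quota
+```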
+
+## Examples
+
+Inspect a quota specification:
+
+```shell-session
+$ nomad quota inspect default-quota
+{
+  "Spec": {
+    "CreateIndex": 8,
+    "Description": "Limit the shared default namespace",
+    "Limits": [
+      {
+        "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=",
+        "Region": "global",
+        "RegionLimit": {
+          "CPU": 2500,
+          "DiskMB": 0,
+          "MemoryMB": 2000,
+          "Networks": null
+        }
+      }
+    ],
+    "ModifyIndex": 56,
+    "Name": "default-quota"
+  },
+  "UsageLookupErrors": {},
+  "Usages": {
+    "global": {
+      "CreateIndex": 8,
+      "ModifyIndex": 56,
+      "Name": "default-quota",
+      "Used": {
+        "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=": {
+          "Hash": "NLOoV2WBU8ieJIrYXXx8NRb5C2xU61pVVWRDLEIMxlU=",
+          "Region": "global",
+          "RegionLimit": {
+            "CPU": 500,
+            "DiskMB": 0,
+            "MemoryMB": 256,
+            "Networks": [
+              {
+                "CIDR": "",
+                "Device": "",
+                "DynamicPorts": null,
+                "IP": "",
+                "MBits": 0,
+                "Mode": "",
+                "ReservedPorts": null
+              }
+            ]
+          }
+        }
+      }
+    }
+  }
+}
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/list.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/list.mdx
new file mode 100644
index 0000000000..4ec7223d1f
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/list.mdx
@@ -0,0 +1,40 @@
+---
+layout: docs
+page_title: 'Commands: quota list'
+sidebar_title: list
+description: |
+  The quota list command is used to list available quota specifications.
+---
+
+# Command: quota list
+
+The `quota list` command is used to list available quota specifications.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad quota list
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## List Options
+
+- `-json`: Output the quota specifications in a JSON format.
+
+- `-t`: Format and display the quota specifications using a Go template.
+
+## Examples
+
+List all quota specifications:
+
+```shell-session
+$ nomad quota list
+Name Description
+default Limit the shared default namespace
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/quota/status.mdx b/content/nomad/v0.11.x/content/docs/commands/quota/status.mdx
new file mode 100644
index 0000000000..0ca2350018
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/quota/status.mdx
@@ -0,0 +1,41 @@
+---
+layout: docs
+page_title: 'Commands: quota status'
+sidebar_title: status
+description: >
+  The quota status command is used to view the status of a particular quota
+  specification.
+---
+
+# Command: quota status
+
+The `quota status` command is used to view the status of a particular quota
+specification.
+
+~> Quota commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad quota status [options] <quota>
+```
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+View the status of a quota specification:
+
+```shell-session
+$ nomad quota status default-quota
+Name = default-quota
+Description = Limit the shared default namespace
+Limits = 1
+
+Quota Limits
+Region CPU Usage Memory Usage Network Usage
+global 500 / 2500 256 / 2000 30 / 50
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/sentinel/apply.mdx b/content/nomad/v0.11.x/content/docs/commands/sentinel/apply.mdx
new file mode 100644
index 0000000000..8f78ad97b0
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/sentinel/apply.mdx
@@ -0,0 +1,49 @@
+---
+layout: docs
+page_title: 'Commands: sentinel apply'
+sidebar_title: apply
+description: >
+  The sentinel apply command is used to write a new, or update an existing,
+  Sentinel policy.
+---
+
+# Command: sentinel apply
+
+The `sentinel apply` command is used to write a new, or update an existing,
+Sentinel policy.
+
+~> Sentinel commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad sentinel apply [options] <name> <file>
+```
+
+The `sentinel apply` command requires two arguments, the policy name and the
+policy file. The policy file can be read from stdin by specifying "-" as the
+file name.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Apply Options
+
+- `-description`: Sets a human-readable description for the policy.
+
+- `-scope`: (default: submit-job) Sets the scope of the policy and when it
+  should be enforced.
+
+- `-level`: (default: advisory) Sets the enforcement level of the policy. Must
+  be one of `advisory`, `soft-mandatory`, or `hard-mandatory`.
+
+## Examples
+
+Write a policy:
+
+```shell-session
+$ nomad sentinel apply -description "My test policy" foo test.sentinel
+Successfully wrote "foo" Sentinel policy!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/sentinel/delete.mdx b/content/nomad/v0.11.x/content/docs/commands/sentinel/delete.mdx
new file mode 100644
index 0000000000..a57a18c59a
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/sentinel/delete.mdx
@@ -0,0 +1,35 @@
+---
+layout: docs
+page_title: 'Commands: sentinel delete'
+sidebar_title: delete
+description: |
+  The sentinel delete command is used to delete a Sentinel policy.
+---
+
+# Command: sentinel delete
+
+The `sentinel delete` command is used to delete a Sentinel policy.
+
+~> Sentinel commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad sentinel delete [options] <name>
+```
+
+The `sentinel delete` command requires a single argument, the policy name.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+Delete a policy:
+
+```shell-session
+$ nomad sentinel delete foo
+Successfully deleted "foo" Sentinel policy!
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/sentinel/index.mdx b/content/nomad/v0.11.x/content/docs/commands/sentinel/index.mdx
new file mode 100644
index 0000000000..d1db0d0fea
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/sentinel/index.mdx
@@ -0,0 +1,31 @@
+---
+layout: docs
+page_title: 'Commands: sentinel'
+sidebar_title: sentinel
+description: |
+  The sentinel command is used to interact with Sentinel policies.
+---
+
+# Command: sentinel
+
+The `sentinel` command is used to interact with Sentinel policies.
+
+~> Sentinel commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+Usage: `nomad sentinel <subcommand> [options]`
+
+Run `nomad sentinel <subcommand> -h` for help on that subcommand. The following
+subcommands are available:
+
+- [`sentinel apply`][apply] - Create or update Sentinel policies
+- [`sentinel delete`][delete] - Delete an existing Sentinel policy
+- [`sentinel list`][list] - Display all Sentinel policies
+- [`sentinel read`][read] - Inspect an existing Sentinel policy
+
+[delete]: /docs/commands/sentinel/delete
+[list]: /docs/commands/sentinel/list
+[read]: /docs/commands/sentinel/read
+[apply]: /docs/commands/sentinel/apply
diff --git a/content/nomad/v0.11.x/content/docs/commands/sentinel/list.mdx b/content/nomad/v0.11.x/content/docs/commands/sentinel/list.mdx
new file mode 100644
index 0000000000..b80836ea00
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/sentinel/list.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: 'Commands: sentinel list'
+sidebar_title: list
+description: |
+  The sentinel list command is used to list all installed Sentinel policies.
+---
+
+# Command: sentinel list
+
+The `sentinel list` command is used to display all the installed Sentinel
+policies.
+
+~> Sentinel commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad sentinel list [options]
+```
+
+The `sentinel list` command requires no arguments.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Examples
+
+List all policies:
+
+```shell-session
+$ nomad sentinel list
+Name Scope Enforcement Level Description
+foo submit-job advisory my test policy
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/sentinel/read.mdx b/content/nomad/v0.11.x/content/docs/commands/sentinel/read.mdx
new file mode 100644
index 0000000000..e49c1a38af
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/sentinel/read.mdx
@@ -0,0 +1,46 @@
+---
+layout: docs
+page_title: 'Commands: sentinel read'
+sidebar_title: read
+description: |
+  The sentinel read command is used to inspect a Sentinel policy.
+---
+
+# Command: sentinel read
+
+The `sentinel read` command is used to inspect a Sentinel policy.
+
+~> Sentinel commands are new in Nomad 0.7 and are only available with Nomad
+Enterprise.
+
+## Usage
+
+```plaintext
+nomad sentinel read [options] <name>
+```
+
+The `sentinel read` command requires a single argument, the policy name.
+
+## General Options
+
+@include 'general_options.mdx'
+
+## Read Options
+
+- `-raw`: Output the raw policy only.
+
+## Examples
+
+Read a policy:
+
+```shell-session
+$ nomad sentinel read foo
+Name = foo
+Scope = submit-job
+Enforcement Level = advisory
+Description = my test policy
+Policy:
+
+main = rule { true }
+
+```
diff --git a/content/nomad/v0.11.x/content/docs/commands/server/force-leave.mdx b/content/nomad/v0.11.x/content/docs/commands/server/force-leave.mdx
new file mode 100644
index 0000000000..61a437888b
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/commands/server/force-leave.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: 'Commands: server force-leave'
+sidebar_title: force-leave
+description: >
+  The server force-leave command is used to force a server into the "left"
+  state.
+--- + +# Command: server force-leave + +The `server force-leave` command forces a server to enter the "left" state. +This can be used to eject server nodes which have failed and will not rejoin +the cluster. Note that if the server is actually still alive, it will +eventually rejoin the cluster again. + +## Usage + +```plaintext +nomad server force-leave [options] <node>
+``` + +This command expects only one argument: the node which should be forced +to enter the "left" state. + +## General Options + +@include 'general_options.mdx' + +## Examples + +Force-leave the server "node1": + +```shell-session +$ nomad server force-leave node1 + +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/server/index.mdx b/content/nomad/v0.11.x/content/docs/commands/server/index.mdx new file mode 100644 index 0000000000..a56135439b --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/server/index.mdx @@ -0,0 +1,28 @@ +--- +layout: docs +page_title: 'Commands: server' +sidebar_title: server +description: | + The server command is used to interact with Nomad servers. +--- + +# Command: server + +Command: `nomad server` + +The `server` command is used to interact with servers. + +## Usage + +Usage: `nomad server <subcommand> [options]` + +Run `nomad server -h` for help on that subcommand.
The following +subcommands are available: + +- [`server force-leave`][force-leave] - Force a server into the 'left' state +- [`server join`][join] - Join server nodes together +- [`server members`][members] - Display a list of known servers and their status + +[force-leave]: /docs/commands/server/force-leave "Force a server into the 'left' state" +[join]: /docs/commands/server/join 'Join server nodes together' +[members]: /docs/commands/server/members 'Display a list of known servers and their status' diff --git a/content/nomad/v0.11.x/content/docs/commands/server/join.mdx b/content/nomad/v0.11.x/content/docs/commands/server/join.mdx new file mode 100644 index 0000000000..3f4eba7a53 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/server/join.mdx @@ -0,0 +1,39 @@ +--- +layout: docs +page_title: 'Commands: server join' +sidebar_title: join +description: > + The server join command is used to join the local server to one or more Nomad + servers. +--- + +# Command: server join + +The `server join` command joins the local server to one or more Nomad servers. +Joining is only required for server nodes, and only needs to succeed against +one or more of the provided addresses. Once joined, the gossip layer will +handle discovery of the other server nodes in the cluster. + +## Usage + +```plaintext +nomad server join [options] <address> [<address> ...]
+``` + +One or more server addresses are required. If multiple server addresses are +specified, then an attempt will be made to join each one. If one or more nodes +are joined successfully, the exit code will be 0. Otherwise, the exit code will +be 1.
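The any-success exit-code semantics described above can be sketched as follows (an illustrative Python model, not Nomad source; `try_join` is a hypothetical stand-in for the actual join RPC):

```python
def join_servers(addresses, try_join):
    # Attempt to join every address; the command succeeds (exit 0)
    # if at least one attempt succeeds, since gossip will then
    # discover the remaining servers.
    joined = sum(1 for addr in addresses if try_join(addr))
    exit_code = 0 if joined > 0 else 1
    return joined, exit_code
```

Even with several unreachable addresses, a single successful join is enough for a zero exit code.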
+ +## General Options + +@include 'general_options.mdx' + +## Examples + +Join the local server to a remote server: + +```shell-session +$ nomad server join 10.0.0.8:4648 +Joined 1 servers successfully +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/server/members.mdx b/content/nomad/v0.11.x/content/docs/commands/server/members.mdx new file mode 100644 index 0000000000..f45bcc513a --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/server/members.mdx @@ -0,0 +1,50 @@ +--- +layout: docs +page_title: 'Commands: server members' +sidebar_title: members +description: > + The server members command is used to display a list of the known server + members and their status. +--- + +# Command: server members + +The `server members` command displays a list of the known servers in the cluster +and their current status. Member information is provided by the gossip protocol, +which is only run on server nodes. + +## Usage + +```plaintext +nomad server members [options] +``` + +## General Options + +@include 'general_options.mdx' + +## Server Members Options + +- `-detailed`: Dump the basic member information as well as the raw set of tags + for each member. This mode reveals additional information not displayed in the + standard output format. 
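The raw tag set printed by `-detailed` is a comma-separated list of `key=value` pairs. A minimal sketch of turning one such line into a dictionary (illustrative only, not part of Nomad):

```python
def parse_member_tags(tag_string):
    # Split "k1=v1,k2=v2,..." into a dict; split on the first "="
    # only, so values may themselves contain "=".
    return dict(pair.split("=", 1) for pair in tag_string.split(","))
```

For example, `parse_member_tags("dc=dc1,region=global")["region"]` yields `"global"`.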
+ +## Examples + +Default view: + +```shell-session +$ nomad server members +Name Addr Port Status Proto Build DC Region +node1.global 10.0.0.8 4648 alive 2 0.1.0dev dc1 global +node2.global 10.0.0.9 4648 alive 2 0.1.0dev dc1 global +``` + +Detailed view: + +```shell-session +$ nomad server members -detailed +Name Addr Port Tags +node1 10.0.0.8 4648 bootstrap=1,build=0.1.0dev,vsn=1,vsn_max=1,dc=dc1,port=4647,region=global,role=nomad,vsn_min=1 +node2 10.0.0.9 4648 bootstrap=0,build=0.1.0dev,vsn=1,vsn_max=1,dc=dc1,port=4647,region=global,role=nomad,vsn_min=1 +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/status.mdx b/content/nomad/v0.11.x/content/docs/commands/status.mdx new file mode 100644 index 0000000000..a57901788d --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/status.mdx @@ -0,0 +1,143 @@ +--- +layout: docs +page_title: 'Commands: status' +sidebar_title: status +description: | + Display the status output for any Nomad resource. +--- + +# Command: status + +The `status` command displays the status output for any Nomad resource. + +## Usage + +```plaintext +nomad status [options] +``` + +The status command accepts any Nomad identifier or identifier prefix as its sole +argument. The command detects the type of the identifier and routes to the +appropriate status command to display more detailed output. + +If the ID is omitted, the command lists out all of the existing jobs. This is +for backwards compatibility and should not be relied on. 
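The identifier handling described above amounts to prefix matching over known IDs. As a rough sketch (hypothetical illustration; the real command queries each resource type's API to resolve the identifier):

```python
def match_identifier(prefix, known_ids):
    # An exact match is returned alone; otherwise collect every ID
    # sharing the prefix so the caller can disambiguate or report
    # multiple matches.
    if prefix in known_ids:
        return [prefix]
    return [i for i in known_ids if i.startswith(prefix)]
```

A unique prefix resolves directly to one resource; an ambiguous prefix yields a list of candidates.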
+ +## General Options + +@include 'general_options.mdx' + +## Examples + +Display the status of a job: + +```shell-session +$ nomad status example +ID = example +Name = example +Submit Date = 08/28/17 23:01:39 UTC +Type = service +Priority = 50 +Datacenters = dc1 +Status = running +Periodic = false +Parameterized = false + +Summary +Task Group Queued Starting Running Failed Complete Lost +cache 0 0 1 0 0 0 + +Latest Deployment +ID = f5506391 +Status = running +Description = Deployment is running + +Deployed +Task Group Desired Placed Healthy Unhealthy +cache 1 1 0 0 + +Allocations +ID Node ID Task Group Version Desired Status Created At +e1d14a39 f9dabe93 cache 0 run running 08/28/17 23:01:39 UTC +``` + +Display the status of an allocation: + +```shell-session +$ nomad status e1d14a39 +ID = e1d14a39 +Eval ID = cc882755 +Name = example.cache[0] +Node ID = f9dabe93 +Job ID = example +Job Version = 0 +Client Status = running +Client Description = +Desired Status = run +Desired Description = +Created At = 08/28/17 23:01:39 UTC +Deployment ID = f5506391 +Deployment Health = healthy + +Task "redis" is "running" +Task Resources +CPU Memory Disk Addresses +4/500 MHz 6.3 MiB/256 MiB 300 MiB db: 127.0.0.1:21752 + +Task Events: +Started At = 08/28/17 23:01:39 UTC +Finished At = N/A +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +Time Type Description +08/28/17 23:01:39 UTC Started Task started by client +08/28/17 23:01:39 UTC Task Setup Building Task Directory +08/28/17 23:01:39 UTC Received Task received by client +``` + +Display the status of a deployment: + +```shell-session +$ nomad status f5506391 +ID = f5506391 +Job ID = example +Job Version = 0 +Status = successful +Description = Deployment completed successfully + +Deployed +Task Group Desired Placed Healthy Unhealthy +cache 1 1 1 0 +``` + +Display the status of a node: + +```shell-session +$ nomad status f9dabe93 +ID = f9dabe93 +Name = nomad-server01 +Class = +DC = dc1 +Drain = false +Status = ready 
+Drivers = docker,exec,java,qemu,raw_exec,rkt +Uptime = 4h17m24s + +Allocated Resources +CPU Memory Disk +500/8709 MHz 256 MiB/2.0 GiB 300 MiB/24 GiB + +Allocation Resource Utilization +CPU Memory +3/8709 MHz 6.3 MiB/2.0 GiB + +Host Resource Utilization +CPU Memory Disk +116/8709 MHz 335 MiB/2.0 GiB 12 GiB/38 GiB + +Allocations +ID Node ID Task Group Version Desired Status Created At +e1d14a39 f9dabe93 cache 0 run running 08/28/17 23:01:39 UTC +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/system/gc.mdx b/content/nomad/v0.11.x/content/docs/commands/system/gc.mdx new file mode 100644 index 0000000000..aff2c38074 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/system/gc.mdx @@ -0,0 +1,31 @@ +--- +layout: docs +page_title: 'Commands: system gc' +sidebar_title: gc +description: | + Run the system garbage collection process. +--- + +# Command: system gc + +Initializes a garbage collection of jobs, evaluations, allocations, and nodes. +This is an asynchronous operation. + +## Usage + +```plaintext +nomad system gc [options] +``` + +## General Options + +@include 'general_options.mdx' + +## Examples + +Running the `system gc` command produces no output unless an error occurs: + +```shell-session +$ nomad system gc + +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/system/index.mdx b/content/nomad/v0.11.x/content/docs/commands/system/index.mdx new file mode 100644 index 0000000000..afdf094f9b --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/system/index.mdx @@ -0,0 +1,25 @@ +--- +layout: docs +page_title: 'Commands: system' +sidebar_title: system +description: | + The system command is used to interact with the system API. +--- + +# Command: system + +The `system` command is used to interact with the system API. These calls are +used for system maintenance and should not be necessary for most users. + +## Usage + +Usage: `nomad system <subcommand> [options]` + +Run `nomad system -h` for help on that subcommand.
The following +subcommands are available: + +- [`system gc`][gc] - Run the system garbage collection process +- [`system reconcile summaries`][reconcile-summaries] - Reconcile the summaries of all registered jobs + +[gc]: /docs/commands/system/gc 'Run the system garbage collection process' +[reconcile-summaries]: /docs/commands/system/reconcile-summaries 'Reconciles the summaries of all registered jobs' diff --git a/content/nomad/v0.11.x/content/docs/commands/system/reconcile-summaries.mdx b/content/nomad/v0.11.x/content/docs/commands/system/reconcile-summaries.mdx new file mode 100644 index 0000000000..f8d1cc0719 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/system/reconcile-summaries.mdx @@ -0,0 +1,31 @@ +--- +layout: docs +page_title: 'Commands: system reconcile summaries' +sidebar_title: reconcile summaries +description: | + Reconciles the summaries of all registered jobs. +--- + +# Command: system reconcile summaries + +Reconciles the summaries of all registered jobs. + +## Usage + +```plaintext +nomad system reconcile summaries [options] +``` + +## General Options + +@include 'general_options.mdx' + +## Examples + +Running the `system reconcile summaries` command produces no output unless an error +occurs: + +```shell-session +$ nomad system reconcile summaries + +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/ui.mdx b/content/nomad/v0.11.x/content/docs/commands/ui.mdx new file mode 100644 index 0000000000..903032844d --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/ui.mdx @@ -0,0 +1,50 @@ +--- +layout: docs +page_title: 'Commands: ui' +sidebar_title: ui +description: | + The ui command is used to open the Nomad Web UI. +--- + +# Command: ui + +The `ui` command is used to open the Nomad Web UI. + +## Usage + +```plaintext +nomad ui [options] +``` + +The `ui` command can be called with no arguments, in which case the UI homepage +will be opened in the default browser.
+ +An identifier may be provided, in which case the UI will be opened to view the +details for that object. Supported identifiers are jobs, allocations and nodes. + +## General Options + +@include 'general_options.mdx' + +## Examples + +Open the UI homepage: + +```shell-session +$ nomad ui +Opening URL "http://127.0.0.1:4646" +``` + +Open the UI directly to look at a job: + +```shell-session +$ nomad ui redis-job +http://127.0.0.1:4646/ui/jobs/redis-job +``` + +Open the UI directly to look at an allocation: + +```shell-session +$ nomad ui d4005969 +Opening URL "http://127.0.0.1:4646/ui/allocations/d4005969-b16f-10eb-4fe1-a5374986083d" +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/version.mdx b/content/nomad/v0.11.x/content/docs/commands/version.mdx new file mode 100644 index 0000000000..b3d756dd9e --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/version.mdx @@ -0,0 +1,31 @@ +--- +layout: docs +page_title: 'Commands: version' +sidebar_title: version +description: | + Display the version and build data of Nomad +--- + +# Command: version + +The `version` command displays build information about the running binary, +including the release version and the exact revision. + +## Usage + +```plaintext +nomad version +``` + +## Output + +This command prints both the version number as well as the exact commit SHA used +during the build. The SHA may also have the string `+CHANGES` appended to the +end, indicating that local, uncommitted changes were detected at build time. 
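A version line of the shape described above (`Nomad <version> (<sha>[+CHANGES])`) can be split into its parts mechanically. A hedged sketch (illustrative only, not a Nomad API):

```python
import re

def parse_version(line):
    # Extract the version string, the commit SHA, and whether local
    # uncommitted changes were present at build time ("+CHANGES").
    m = re.match(r"Nomad (\S+) \(([0-9a-f]+)(\+CHANGES)?\)", line)
    if m is None:
        raise ValueError("unrecognized version line")
    return m.group(1), m.group(2), m.group(3) is not None
```

This is handy in automation that needs to detect dev or dirty builds before deploying.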
+ +## Examples + +```shell-session +$ nomad version +Nomad v0.0.0-615-gcf3c6aa-dev (cf3c6aa8a75a689987b689d75ae2ba73458465cb+CHANGES) +``` diff --git a/content/nomad/v0.11.x/content/docs/commands/volume/deregister.mdx b/content/nomad/v0.11.x/content/docs/commands/volume/deregister.mdx new file mode 100644 index 0000000000..06dc19c3ad --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/volume/deregister.mdx @@ -0,0 +1,31 @@ +--- +layout: docs +page_title: 'Commands: volume deregister' +sidebar_title: deregister +description: | + Deregister volumes with CSI plugins. +--- + +# Command: volume deregister + +The `volume deregister` command deregisters external storage volumes from +Nomad's [Container Storage Interface (CSI)][csi] support. Deregistering a +volume removes it from Nomad's state, but does not modify or delete the +volume on the remote storage provider. + +## Usage + +```plaintext +nomad volume deregister [options] [volume] +``` + +The `volume deregister` command requires a single argument, specifying +the ID of the volume to be deregistered. Deregistration will fail if the +volume is still in use by an allocation or in the process of being +unpublished. + +## General Options + +@include 'general_options.mdx' + +[csi]: https://github.com/container-storage-interface/spec diff --git a/content/nomad/v0.11.x/content/docs/commands/volume/index.mdx b/content/nomad/v0.11.x/content/docs/commands/volume/index.mdx new file mode 100644 index 0000000000..0f3db82667 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/volume/index.mdx @@ -0,0 +1,26 @@ +--- +layout: docs +page_title: 'Commands: volume' +sidebar_title: volume +description: | + The volume command is used to interact with volumes. +--- + +# Command: volume + +The `volume` command is used to interact with volumes. + +## Usage + +Usage: `nomad volume <subcommand> [options]` + +Run `nomad volume -h` for help on that subcommand. The following +subcommands are available: + +- [`volume register`][register] - Register a volume.
+- [`volume deregister`][deregister] - Deregister a volume. +- [`volume status`][status] - Display status information about a volume. + +[register]: /docs/commands/volume/register 'Register a volume' +[deregister]: /docs/commands/volume/deregister 'Deregister a volume' +[status]: /docs/commands/volume/status 'Display status information about a volume' diff --git a/content/nomad/v0.11.x/content/docs/commands/volume/register.mdx b/content/nomad/v0.11.x/content/docs/commands/volume/register.mdx new file mode 100644 index 0000000000..815f80135c --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/volume/register.mdx @@ -0,0 +1,115 @@ +--- +layout: docs +page_title: 'Commands: volume register' +sidebar_title: register +description: | + Register volumes with CSI plugins. +--- + +# Command: volume register + +The `volume register` command registers external storage volumes with +Nomad's [Container Storage Interface (CSI)][csi] support. The volume +must exist on the remote storage provider before it can be registered +and used by a task. + +## Usage + +```plaintext +nomad volume register [options] [file] +``` + +The `volume register` command requires a single argument, specifying +the path to a file containing a valid [volume +specification][volume_specification]. This file will be read and the +volume will be submitted to Nomad for registration. If the supplied path is +"-", the volume specification is read from STDIN. Otherwise it is read from the +file at the supplied path. + +## General Options + +@include 'general_options.mdx' + +## Volume Specification + +The file may be provided as either HCL or JSON.
An example HCL configuration: + +```hcl +id = "ebs_prod_db1" +name = "database" +type = "csi" +external_id = "vol-23452345" +plugin_id = "ebs-prod" +access_mode = "single-node-writer" +attachment_mode = "file-system" +mount_options { + fs_type = "ext4" + mount_flags = ["ro"] +} +secrets { + example_secret = "xyzzy" +} +parameters { + skuname = "Premium_LRS" +} +context { + endpoint = "http://192.168.1.101:9425" +} +``` + +## Volume Specification Parameters + +- `id` `(string: <required>)` - The unique ID of the volume. This will + be how [`volume`][csi_volumes] stanzas in a jobspec refer to the volume. + +- `name` `(string: <required>)` - The display name of the volume. + +- `type` `(string: <required>)` - The type of volume. Currently only + `"csi"` is supported. + +- `external_id` `(string: <required>)` - The ID of the physical volume + from the storage provider. For example, the volume ID of an AWS EBS + volume or Digital Ocean volume. + +- `plugin_id` `(string: <required>)` - The ID of the [CSI + plugin][csi_plugin] that manages this volume. + +- `access_mode` `(string: <required>)` - Defines whether a volume + should be available concurrently. Can be one of + `"single-node-reader-only"`, `"single-node-writer"`, + `"multi-node-reader-only"`, `"multi-node-single-writer"`, or + `"multi-node-multi-writer"`. Most CSI plugins support only + single-node modes. Consult the documentation of the storage provider + and CSI plugin. + +- `attachment_mode` `(string: <required>)` - The storage API that will + be used by the volume. Most storage providers will support + `"file-system"`, to mount pre-formatted file system volumes. Some + storage providers will support `"block-device"`, which will require + the job be configured with appropriate mount options. + +- `mount_options` ([mount_options][]: nil) - Options for + mounting `block-device` volumes without a pre-formatted file system. + - `fs_type`: file system type (ex. `"ext4"`) + - `mount_flags`: the flags passed to `mount` (ex.
`"ro,noatime"`) + +- `secrets` (map<string|string>: nil) - An optional + key-value map of strings used as credentials for publishing and + unpublishing volumes. + +- `parameters` (map<string|string>: nil) - An optional + key-value map of strings passed directly to the CSI plugin to + configure the volume. The details of these parameters are specific + to each storage provider, so please see the specific plugin + documentation for more information. + +- `context` (map<string|string>: nil) - An optional + key-value map of strings passed directly to the CSI plugin to + validate the volume. The details of these parameters are specific to + each storage provider, so please see the specific plugin + documentation for more information. + +[volume_specification]: #volume-specification +[csi]: https://github.com/container-storage-interface/spec +[csi_plugin]: /docs/job-specification/csi_plugin +[csi_volumes]: /docs/job-specification/volume diff --git a/content/nomad/v0.11.x/content/docs/commands/volume/status.mdx b/content/nomad/v0.11.x/content/docs/commands/volume/status.mdx new file mode 100644 index 0000000000..8788167d05 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/commands/volume/status.mdx @@ -0,0 +1,107 @@ +--- +layout: docs +page_title: 'Commands: volume status' +sidebar_title: status +description: | + Display information and status of volumes. +--- + +# Command: volume status + +The `volume status` command displays status information for [Container +Storage Interface (CSI)][csi] volumes. + +## Usage + +```plaintext +nomad volume status [options] [volume] +``` + +This command accepts an optional volume ID or prefix as the sole argument. If there +is an exact match based on the provided volume ID or prefix, then information about +the specific volume is queried and displayed. Otherwise, a list of matching volumes +and information will be displayed. + +If the ID is omitted, the command lists out all of the existing volumes and a few
of the most useful status fields for each.
+ +## General Options + +@include 'general_options.mdx' + +## Status Options + +- `-type`: Display only volumes of a particular type. Currently only + the `csi` type is supported, so this option can be omitted when + querying the status of CSI volumes. + +- `-plugin_id`: Display only volumes managed by a particular [CSI + plugin][csi_plugin]. + +- `-short`: Display short output. Used only when a single volume is + being queried. Drops verbose volume allocation data from the + output. + +- `-verbose`: Show full information. Allocation create and modify + times are shown in `yyyy/mm/dd hh:mm:ss` format. + +## Examples + +List of all volumes: + +```shell-session +$ nomad volume [-type csi] status +ID Name Plugin ID Schedulable Access Mode +ebs_prod_db1 database ebs-prod true single-node-writer +``` + +Short view of a specific volume: + +```shell-session +$ nomad volume status [-verbose] [-plugin=ebs-prod] ebs_prod_db1 +ID = ebs_prod_db1 +Name = database +Type = csi +External ID = vol-23452345 +Plugin ID = ebs-prod +Provider = aws.ebs +Version = 1.0.1 +Schedulable = true +Controllers Healthy = 1 +Controllers Expected = 1 +Nodes Healthy = 1 +Nodes Expected = 1 +Access Mode = single-node-writer +Attachment Mode = file-system +Mount Options = fs_type: ext4 flags: ro +Namespace = default +``` + +Full status information of a volume: + +```shell-session +$ nomad volume status [-verbose] [-plugin=ebs-prod] ebs_prod_db1 +ID = ebs_prod_db1 +Name = database +Type = csi +External ID = vol-23452345 +Plugin ID = ebs-prod +Provider = aws.ebs +Version = 1.0.1 +Schedulable = true +Controllers Healthy = 1 +Controllers Expected = 1 +Nodes Healthy = 1 +Nodes Expected = 1 +Access Mode = single-node-writer +Attachment Mode = file-system +Mount Options = fs_type: ext4 flags: ro +Namespace = default + +Allocations +ID Node ID Access Mode Task Group Version Desired [...] 
+b00fa322 28be17d5 write csi 0 run +``` + +[csi]: https://github.com/container-storage-interface/spec +[csi_plugin]: /docs/job-specification/csi_plugin diff --git a/content/nomad/v0.11.x/content/docs/configuration/acl.mdx b/content/nomad/v0.11.x/content/docs/configuration/acl.mdx new file mode 100644 index 0000000000..bb4e193416 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/acl.mdx @@ -0,0 +1,47 @@ +--- +layout: docs +page_title: acl Stanza - Agent Configuration +sidebar_title: acl +description: >- + The "acl" stanza configures the Nomad agent to enable ACLs and tune various + parameters. +--- + +# `acl` Stanza + + + +The `acl` stanza configures the Nomad agent to enable ACLs and tunes various +ACL parameters. Learn more about configuring Nomad's ACL system in the [Secure +Nomad with Access Control guide][secure-guide]. + +```hcl +acl { + enabled = true + token_ttl = "30s" + policy_ttl = "60s" +} +``` + +## `acl` Parameters + +- `enabled` `(bool: false)` - Specifies if ACL enforcement is enabled. All other + client configuration options depend on this value. + +- `token_ttl` `(string: "30s")` - Specifies the maximum time-to-live (TTL) for + cached ACL tokens. This does not affect servers, since they do not cache tokens. + Setting this value lower reduces how stale a token can be, but increases + the request load against servers. If a client cannot reach a server, for example + because of an outage, the TTL will be ignored and the cached value used. + +- `policy_ttl` `(string: "30s")` - Specifies the maximum time-to-live (TTL) for + cached ACL policies. This does not affect servers, since they do not cache policies. + Setting this value lower reduces how stale a policy can be, but increases + the request load against servers. If a client cannot reach a server, for example + because of an outage, the TTL will be ignored and the cached value used. 
+ +- `replication_token` `(string: "")` - Specifies the Secret ID of the ACL token + to use for replicating policies and tokens. This is used by servers in non-authoritative + regions to mirror the policies and tokens into the local region. + +[secure-guide]: https://learn.hashicorp.com/nomad/acls/fundamentals diff --git a/content/nomad/v0.11.x/content/docs/configuration/audit.mdx b/content/nomad/v0.11.x/content/docs/configuration/audit.mdx new file mode 100644 index 0000000000..3423d3aeec --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/audit.mdx @@ -0,0 +1,297 @@ +--- +layout: docs +page_title: audit Stanza - Agent Configuration +sidebar_title: audit +description: >- + The "audit" stanza configures audit logging behavior for the Nomad agent. + This is an Enterprise-only feature. +--- + +# `audit` Stanza + + + +The `audit` stanza configures audit logging behavior for the Nomad agent. +Audit logging is an Enterprise-only feature. + +```hcl +audit { + enabled = true +} +``` + +When enabled, each HTTP request made to a Nomad agent (client or server) will +generate two audit log entries. These two entries correspond to a stage, +`OperationReceived` and `OperationComplete`. Audit logging will generate an +`OperationReceived` event before the request is processed. An `OperationComplete` +event will be sent after the request has been processed, but before the response +body is returned to the end user. + +By default, with a minimally configured audit stanza (`audit { enabled = true }`), +the following default sink will be added with no filters. + +```hcl +audit { + enabled = true + sink "audit" { + type = "file" + delivery_guarantee = "enforced" + format = "json" + path = "/[data_dir]/audit/audit.log" + } +} +``` + +The sink will create an `audit.log` file located within the defined `data_dir` +directory inside an `audit` directory.
`delivery_guarantee` will be set to +`"enforced"`, meaning that all requests must be successfully written to the sink +in order for HTTP requests to successfully complete. + + +## `audit` Parameters + +- `enabled` `(bool: false)` - Specifies if audit logging should be enabled. + When enabled, audit logging will occur for every request, unless it is + filtered by a `filter`. + +- `sink` ([sink](#sink-stanza): default) - Configures a sink + for audit logs to be sent to. + +- `filter` (array<[filter](#filter-stanza)>: []) - Configures a filter + to exclude matching events from being sent to audit logging sinks. + +### `sink` Stanza + +The `sink` stanza is used to create audit logging sinks for events to be +sent to. Currently only a single sink is supported. + +The key of the stanza corresponds to the name of the sink, which is used +for logging purposes. + +```hcl +audit { + enabled = true + + sink "audit" { + type = "file" + delivery_guarantee = "enforced" + format = "json" + path = "/var/lib/nomad/audit/audit.log" + rotate_bytes = 100 + rotate_duration = "24h" + rotate_max_files = 10 + } +} +``` + +#### `sink` Parameters + +- `type` `(string: "file", required)` - Specifies the type of sink to create. + Currently only `"file"` type is supported. + +- `delivery_guarantee` `(string: "enforced", required)` - Specifies the + delivery guarantee that will be made for each audit log entry. Available + options are `"enforced"` and `"best-effort"`. `"enforced"` will + halt request execution if the audit log event fails to be written to its sink. + `"best-effort"` will not halt request execution, meaning a request could + potentially be un-audited. + +- `format` `(string: "json", required)` - Specifies the output format to be + sent to a sink. Currently only `"json"` format is supported. + +- `path` `(string: "[data_dir]/audit/audit.log")` - Specifies the path and file + name to use for the audit log.
By default Nomad will use its configured + [`data_dir`](/docs/configuration#data_dir) for a combined path of + `/data_dir/audit/audit.log`. If `rotate_bytes` or `rotate_duration` are set, + file rotation will occur. In this case the filename will be suffixed with + a timestamp: `"filename-{timestamp}.log"` + +- `rotate_bytes` `(int: 0)` - Specifies the number of bytes that should be + written to an audit log before it needs to be rotated. Unless specified, + there is no limit to the number of bytes that can be written to a log file. + +- `rotate_duration` `(duration: "24h")` - Specifies the maximum duration an + audit log should be written to before it needs to be rotated. Must be a + duration value such as `30s`. + +- `rotate_max_files` `(int: 0)` - Specifies the maximum number of older audit log + file archives to keep. If 0, no files are ever deleted. + +### `filter` Stanza + +The `filter` stanza is used to create filters to filter __out__ matching events +from being written to the audit log. By default, all events will be sent to an +audit log for all stages (OperationReceived and OperationComplete). Filters +are useful for operators who want to limit the performance impact of audit +logging as well as reduce the number of events generated. + +`endpoints`, `stages`, and `operations` support [globbed pattern](https://github.com/ryanuber/go-glob/blob/master/README.md#example) matching. + +Query parameters are ignored when evaluating filters.
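As a rough sketch of this matching logic, using Python's `fnmatch` as a stand-in for the go-glob matcher (their semantics differ slightly — `fnmatch` also treats `?` and `[...]` specially), an event is excluded from the sink only when it matches a filter on all three axes:

```python
from fnmatch import fnmatch

def is_filtered(event, filt):
    # An event is filtered out only when its endpoint, stage, AND
    # operation each match one of the filter's glob patterns.
    return (
        any(fnmatch(event["endpoint"], p) for p in filt["endpoints"])
        and any(fnmatch(event["stage"], p) for p in filt["stages"])
        and any(fnmatch(event["operation"], p) for p in filt["operations"])
    )
```

A filter with `stages = ["*"]` and `operations = ["*"]` therefore excludes every event for its listed endpoints, as in the examples below.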
+ +```hcl +audit { + enabled = true + + # Filter out all requests and all stages for /v1/metrics + filter "default" { + type = "HTTPEvent" + endpoints = ["/v1/metrics"] + stages = ["*"] + operations = ["*"] + } + + # Filter out requests where endpoint matches globbed pattern + filter "globbed example" { + type = "HTTPEvent" + endpoints = ["/v1/evaluation/*/allocations"] + stages = ["*"] + operations = ["*"] + } + + # Filter out OperationReceived GET requests for all endpoints + filter "OperationReceived GETs" { + type = "HTTPEvent" + endpoints = ["*"] + stages = ["OperationReceived"] + operations = ["GET"] + } +} +``` + +#### `filter` Parameters + +- `type` `(string: "HTTPEvent", required)` - Specifies the type of filter to + create. Currently only HTTPEvent is supported. + +- `endpoints` `(array: [])` - Specifies the list of endpoints to apply + the filter to. + +- `stages` `(array: [])` - Specifies the list of stages + (`"OperationReceived"`, `"OperationComplete"`, `"*"`) to apply the filter to + for a matching endpoint. + +- `operations` `(array: [])` - Specifies the list of operations to + apply the filter to for a matching endpoint. For HTTPEvent types this + corresponds to an HTTP verb (GET, PUT, POST, DELETE...). + +## Audit Log Format + +Below are two audit log entries for a request made to `/v1/job/web/summary`. +The first entry is for the `OperationReceived` stage. The second entry is for +the `OperationComplete` stage and includes the contents of the `OperationReceived` +stage plus a `response` key. 
+ +```json +{ + "created_at": "2020-03-24T13:09:35.703869927-04:00", + "event_type": "audit", + "payload": { + "id": "8b826146-b264-af15-6526-29cb905145aa", + "stage": "OperationReceived", + "type": "audit", + "timestamp": "2020-03-24T13:09:35.703865005-04:00", + "version": 1, + "auth": { + "accessor_id": "a162f017-bcf7-900c-e22a-a2a8cbbcef53", + "name": "Bootstrap Token", + "global": true, + "create_time": "2020-03-24T17:08:35.086591881Z" + }, + "request": { + "id": "02f0ac35-c7e8-0871-5a58-ee9dbc0a70ea", + "operation": "GET", + "endpoint": "/v1/job/web/summary", + "namespace": { + "id": "default" + }, + "request_meta": { + "remote_address": "127.0.0.1:33648", + "user_agent": "Go-http-client/1.1" + }, + "node_meta": { + "ip": "127.0.0.1:4646" + } + } + } +} +{ + "created_at": "2020-03-24T13:09:35.704224536-04:00", + "event_type": "audit", + "payload": { + "id": "8b826146-b264-af15-6526-29cb905145aa", + "stage": "OperationComplete", + "type": "audit", + "timestamp": "2020-03-24T13:09:35.703865005-04:00", + "version": 1, + "auth": { + "accessor_id": "a162f017-bcf7-900c-e22a-a2a8cbbcef53", + "name": "Bootstrap Token", + "global": true, + "create_time": "2020-03-24T17:08:35.086591881Z" + }, + "request": { + "id": "02f0ac35-c7e8-0871-5a58-ee9dbc0a70ea", + "operation": "GET", + "endpoint": "/v1/job/web/summary", + "namespace": { + "id": "default" + }, + "request_meta": { + "remote_address": "127.0.0.1:33648", + "user_agent": "Go-http-client/1.1" + }, + "node_meta": { + "ip": "127.0.0.1:4646" + } + }, + "response": { + "status_code": 200 + } + } +} + +``` + +If the request returns an error the audit log will reflect the error message. 
+ +```json +{ + "created_at": "2020-03-24T13:18:36.121978648-04:00", + "event_type": "audit", + "payload": { + "id": "21c6f97a-fbfb-1090-1e34-34d1ece57cc2", + "stage": "OperationComplete", + "type": "audit", + "timestamp": "2020-03-24T13:18:36.121428628-04:00", + "version": 1, + "auth": { + "accessor_id": "anonymous", + "name": "Anonymous Token", + "policies": [ + "anonymous" + ], + "create_time": "0001-01-01T00:00:00Z" + }, + "request": { + "id": "c696cc9e-962e-18b3-4097-e0a09070f89e", + "operation": "GET", + "endpoint": "/v1/jobs?prefix=web", + "namespace": { + "id": "default" + }, + "request_meta": { + "remote_address": "127.0.0.1:33874", + "user_agent": "Go-http-client/1.1" + }, + "node_meta": { + "ip": "127.0.0.1:4646" + } + }, + "response": { + "status_code": 403, + "error": "Permission denied" + } + } +} +``` diff --git a/content/nomad/v0.11.x/content/docs/configuration/autopilot.mdx b/content/nomad/v0.11.x/content/docs/configuration/autopilot.mdx new file mode 100644 index 0000000000..f986c69d70 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/autopilot.mdx @@ -0,0 +1,58 @@ +--- +layout: docs +page_title: autopilot Stanza - Agent Configuration +sidebar_title: autopilot +description: >- + The "autopilot" stanza configures the Nomad agent to configure Autopilot + behavior. +--- + +# `autopilot` Stanza + + + +The `autopilot` stanza configures the Nomad agent to configure Autopilot behavior. +For more information about Autopilot, see the [Autopilot Guide](https://learn.hashicorp.com/nomad/operating-nomad/autopilot). 
+
+```hcl
+autopilot {
+  cleanup_dead_servers      = true
+  last_contact_threshold    = "200ms"
+  max_trailing_logs         = 250
+  server_stabilization_time = "10s"
+  enable_redundancy_zones   = false
+  disable_upgrade_migration = false
+  enable_custom_upgrades    = false
+}
+```
+
+## `autopilot` Parameters
+
+- `cleanup_dead_servers` `(bool: true)` - Specifies automatic removal of dead
+  server nodes periodically and whenever a new server is added to the cluster.
+
+- `last_contact_threshold` `(string: "200ms")` - Specifies the maximum amount of
+  time a server can go without contact from the leader before being considered
+  unhealthy. Must be a duration value such as `10s`.
+
+- `max_trailing_logs` `(int: 250)` - Specifies the maximum number of log entries
+  that a server can trail the leader by before being considered unhealthy.
+
+- `server_stabilization_time` `(string: "10s")` - Specifies the minimum amount of
+  time a server must be stable in the 'healthy' state before being added to the
+  cluster. Only takes effect if all servers are running Raft protocol version 3
+  or higher. Must be a duration value such as `30s`.
+
+- `enable_redundancy_zones` `(bool: false)` - (Enterprise-only) Controls whether
+  Autopilot separates servers into zones for redundancy, in conjunction with the
+  [redundancy_zone](/docs/configuration/server#redundancy_zone) parameter.
+  Only one server in each zone can be a voting member at one time.
+
+- `disable_upgrade_migration` `(bool: false)` - (Enterprise-only) Disables Autopilot's
+  upgrade migration strategy in Nomad Enterprise of waiting until enough
+  newer-versioned servers have been added to the cluster before promoting any of
+  them to voters.
+
+- `enable_custom_upgrades` `(bool: false)` - (Enterprise-only) Specifies whether to
+  enable using custom upgrade versions when performing migrations, in conjunction with
+  the [upgrade_version](/docs/configuration/server#upgrade_version) parameter.
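+
+The Enterprise redundancy-zone setting works together with the
+[redundancy_zone](/docs/configuration/server#redundancy_zone) parameter in the
+`server` stanza. As a sketch (the zone name below is illustrative), each server
+declares its own zone and Autopilot keeps one voter per zone:
+
+```hcl
+server {
+  enabled = true
+
+  # Hypothetical zone name; use "zone-b", "zone-c", etc. on the other servers.
+  redundancy_zone = "zone-a"
+}
+
+autopilot {
+  enable_redundancy_zones = true
+}
+```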
diff --git a/content/nomad/v0.11.x/content/docs/configuration/client.mdx b/content/nomad/v0.11.x/content/docs/configuration/client.mdx new file mode 100644 index 0000000000..aa81a1aef6 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/client.mdx @@ -0,0 +1,436 @@ +--- +layout: docs +page_title: client Stanza - Agent Configuration +sidebar_title: client +description: |- + The "client" stanza configures the Nomad agent to accept jobs as assigned by + the Nomad server, join the cluster, and specify driver-specific configuration. +--- + +# `client` Stanza + + + +The `client` stanza configures the Nomad agent to accept jobs as assigned by +the Nomad server, join the cluster, and specify driver-specific configuration. + +```hcl +client { + enabled = true + servers = ["1.2.3.4:4647", "5.6.7.8:4647"] +} +``` + +## Plugin Options + +Nomad 0.9 now supports pluggable drivers. Operators should use the new +[plugin][plugin-stanza] syntax to modify driver configuration. To find the +plugin options supported by each individual Nomad driver, please see the +[drivers documentation](/docs/drivers). The pre-0.9 `client.options` +stanza will be supported in 0.9 for backward compatibility (except for the `lxc` +driver) but will be removed in a future release. + +## `client` Parameters + +- `alloc_dir` `(string: "[data_dir]/alloc")` - Specifies the directory to use + for allocation data. By default, this is the top-level + [data_dir](/docs/configuration#data_dir) suffixed with + "alloc", like `"/opt/nomad/alloc"`. This must be an absolute path. + +- `chroot_env` ([ChrootEnv](#chroot_env-parameters): nil) - + Specifies a key-value mapping that defines the chroot environment for jobs + using the Exec and Java drivers. + +- `enabled` `(bool: false)` - Specifies if client mode is enabled. All other + client configuration options depend on this value. + +- `max_kill_timeout` `(string: "30s")` - Specifies the maximum amount of time a + job is allowed to wait to exit. 
Individual jobs may customize their own kill
+  timeout, but it may not exceed this value.
+
+- `disable_remote_exec` `(bool: false)` - Specifies if the client should disable
+  remote execution of tasks running on this client.
+
+- `meta` `(map[string]string: nil)` - Specifies a key-value map that annotates
+  the node with user-defined metadata.
+
+- `network_interface` `(string: varied)` - Specifies the name of the interface
+  to force network fingerprinting on. When run in dev mode, this defaults to the
+  loopback interface. When not in dev mode, the interface attached to the
+  default route is used. The scheduler chooses from these fingerprinted IP
+  addresses when allocating ports for tasks.
+
+  If no non-local IP addresses are found, Nomad could fingerprint link-local IPv6
+  addresses depending on the client's
+  [`"fingerprint.network.disallow_link_local"`](#fingerprint-network-disallow_link_local)
+  configuration value.
+
+- `network_speed` `(int: 0)` - Specifies an override for the network link speed.
+  This value, if set, overrides any detected or defaulted link speed. Most
+  clients can determine their speed automatically, and thus in most cases this
+  should be left unset.
+
+- `cpu_total_compute` `(int: 0)` - Specifies an override for the total CPU
+  compute. This value should be set to `# Cores * Core MHz`. For example, a
+  quad-core running at 2 GHz would have a total compute of 8000 (4 \* 2000). Most
+  clients can determine their total CPU compute automatically, and thus in most
+  cases this should be left unset.
+
+- `memory_total_mb` `(int: 0)` - Specifies an override for the total memory. If set,
+  this value overrides any detected memory.
+
+- `node_class` `(string: "")` - Specifies an arbitrary string used to logically
+  group client nodes by user-defined class. This can be used during job
+  placement as a filter.
+
+- `options` ([Options](#options-parameters): nil) - Specifies a
+  key-value mapping of internal configuration for clients, such as for driver
+  configuration.
+
+- `reserved` ([Reserved](#reserved-parameters): nil) - Specifies
+  that Nomad should reserve a portion of the node's resources from receiving
+  tasks. This can be used to target a certain capacity usage for the node. For
+  example, 20% of the node's CPU could be reserved to target a CPU utilization
+  of 80%.
+
+- `servers` `(array: [])` - Specifies an array of addresses to the Nomad
+  servers this client should join. This list is used to register the client with
+  the server nodes and advertise the available resources so that the agent can
+  receive work. This may be specified as an IP address or DNS name, with or
+  without the port. If the port is omitted, the default port of `4647` is used.
+
+- `server_join` ([server_join][server-join]: nil) - Specifies
+  how the Nomad client will connect to Nomad servers. The `start_join` field
+  is not supported on the client. The `retry_join` fields may directly specify
+  the server address or use go-discover syntax for auto-discovery. See the
+  [server_join documentation][server-join] for more detail.
+
+- `state_dir` `(string: "[data_dir]/client")` - Specifies the directory to use
+  to store client state. By default, this is the top-level
+  [data_dir](/docs/configuration#data_dir) suffixed with
+  "client", like `"/opt/nomad/client"`. This must be an absolute path.
+
+- `gc_interval` `(string: "1m")` - Specifies the interval at which Nomad
+  attempts to garbage collect terminal allocation directories.
+
+- `gc_disk_usage_threshold` `(float: 80)` - Specifies the disk usage percent which
+  Nomad tries to maintain by garbage collecting terminal allocations.
+
+- `gc_inode_usage_threshold` `(float: 70)` - Specifies the inode usage percent
+  which Nomad tries to maintain by garbage collecting terminal allocations.
+
+- `gc_max_allocs` `(int: 50)` - Specifies the maximum number of allocations
+  which a client will track before triggering a garbage collection of terminal
+  allocations. This will _not_ limit the number of allocations a node can run at
+  a time, however after `gc_max_allocs` every new allocation will cause terminal
+  allocations to be GC'd.
+
+- `gc_parallel_destroys` `(int: 2)` - Specifies the maximum number of
+  parallel destroys allowed by the garbage collector. This value should be
+  relatively low to avoid high resource usage during garbage collections.
+
+- `no_host_uuid` `(bool: true)` - By default a random node UUID will be
+  generated, but setting this to `false` will use the system's UUID. Before
+  Nomad 0.6 the default was to use the system UUID.
+
+- `cni_path` `(string: "/opt/cni/bin")` - Sets the search path that is used for
+  CNI plugin discovery. Multiple paths can be searched using a colon-delimited
+  list.
+
+- `bridge_network_name` `(string: "nomad")` - Sets the name of the bridge to be
+  created by Nomad for allocations running with bridge networking mode on the
+  client.
+
+- `bridge_network_subnet` `(string: "172.26.66.0/23")` - Specifies the subnet
+  which the client will use to allocate IP addresses from.
+
+- `template` ([Template](#template-parameters): nil) - Specifies
+  controls on the behavior of task
+  [`template`](/docs/job-specification/template) stanzas.
+
+- `host_volume` ([host_volume](#host_volume-stanza): nil) - Exposes
+  paths from the host as volumes that can be mounted into jobs.
+
+### `chroot_env` Parameters
+
+Drivers based on [isolated fork/exec](/docs/drivers/exec) implement file
+system isolation using chroot on Linux. The `chroot_env` map allows the chroot
+environment to be configured using source paths on the host operating system.
+
+The mapping format is:
+
+```text
+source_path -> dest_path
+```
+
+The following example specifies a chroot which contains just enough to run the
+`ls` utility:
+
+```hcl
+client {
+  chroot_env {
+    "/bin/ls"           = "/bin/ls"
+    "/etc/ld.so.cache"  = "/etc/ld.so.cache"
+    "/etc/ld.so.conf"   = "/etc/ld.so.conf"
+    "/etc/ld.so.conf.d" = "/etc/ld.so.conf.d"
+    "/etc/passwd"       = "/etc/passwd"
+    "/lib"              = "/lib"
+    "/lib64"            = "/lib64"
+  }
+}
+```
+
+When `chroot_env` is unspecified, the `exec` driver will use a default chroot
+environment with the most commonly used parts of the operating system. Please
+see the [Nomad `exec` driver documentation](/docs/drivers/exec#chroot) for
+the full list.
+
+### `options` Parameters
+
+~> Note: client configuration options for drivers will soon be deprecated. See
+the [plugin stanza][plugin-stanza] documentation for more information.
+
+The following is not an exhaustive list; it includes only options for the Nomad
+client itself. To find the options supported by each individual Nomad driver,
+please see the [drivers documentation](/docs/drivers).
+
+- `"driver.whitelist"` `(string: "")` - Specifies a comma-separated list of
+  whitelisted drivers. If specified, drivers not in the whitelist will be
+  disabled. If the whitelist is empty, all drivers are fingerprinted and enabled
+  where applicable.
+
+  ```hcl
+  client {
+    options = {
+      "driver.whitelist" = "docker,qemu"
+    }
+  }
+  ```
+
+- `"driver.blacklist"` `(string: "")` - Specifies a comma-separated list of
+  blacklisted drivers. If specified, drivers in the blacklist will be
+  disabled.
+
+  ```hcl
+  client {
+    options = {
+      "driver.blacklist" = "docker,qemu"
+    }
+  }
+  ```
+
+- `"env.blacklist"` `(string: see below)` - Specifies a comma-separated list of
+  environment variable keys not to pass to these tasks. Nomad passes the host
+  environment variables to `exec`, `raw_exec` and `java` tasks.
+  
If a value is provided, **all** defaults are + overridden (they are not merged). + + ```hcl + client { + options = { + "env.blacklist" = "MY_CUSTOM_ENVVAR" + } + } + ``` + + The default list is: + + ```text + CONSUL_TOKEN + CONSUL_HTTP_TOKEN + VAULT_TOKEN + AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY + AWS_SESSION_TOKEN + GOOGLE_APPLICATION_CREDENTIALS + ``` + +- `"user.blacklist"` `(string: see below)` - Specifies a comma-separated + blacklist of usernames for which a task is not allowed to run. This only + applies if the driver is included in `"user.checked_drivers"`. If a value is + provided, **all** defaults are overridden (they are not merged). + + ```hcl + client { + options = { + "user.blacklist" = "root,ubuntu" + } + } + ``` + + The default list is: + + ```text + root + Administrator + ``` + +- `"user.checked_drivers"` `(string: see below)` - Specifies a comma-separated + list of drivers for which to enforce the `"user.blacklist"`. For drivers using + containers, this enforcement is usually unnecessary. If a value is provided, + **all** defaults are overridden (they are not merged). + + ```hcl + client { + options = { + "user.checked_drivers" = "exec,raw_exec" + } + } + ``` + + The default list is: + + ```text + exec + qemu + java + ``` + +- `"fingerprint.whitelist"` `(string: "")` - Specifies a comma-separated list of + whitelisted fingerprinters. If specified, any fingerprinters not in the + whitelist will be disabled. If the whitelist is empty, all fingerprinters are + used. + + ```hcl + client { + options = { + "fingerprint.whitelist" = "network" + } + } + ``` + +- `"fingerprint.blacklist"` `(string: "")` - Specifies a comma-separated list of + blacklisted fingerprinters. If specified, any fingerprinters in the blacklist + will be disabled. 
+
+  ```hcl
+  client {
+    options = {
+      "fingerprint.blacklist" = "network"
+    }
+  }
+  ```
+
+- `"fingerprint.network.disallow_link_local"` `(string: "false")` - Specifies
+  whether the network fingerprinter should ignore link-local addresses in the
+  case that no globally routable address is found. The fingerprinter will always
+  prefer globally routable addresses.
+
+  ```hcl
+  client {
+    options = {
+      "fingerprint.network.disallow_link_local" = "true"
+    }
+  }
+  ```
+
+### `reserved` Parameters
+
+- `cpu` `(int: 0)` - Specifies the amount of CPU to reserve, in MHz.
+
+- `memory` `(int: 0)` - Specifies the amount of memory to reserve, in MB.
+
+- `disk` `(int: 0)` - Specifies the amount of disk to reserve, in MB.
+
+- `reserved_ports` `(string: "")` - Specifies a comma-separated list of ports to
+  reserve on all fingerprinted network devices. Ranges can be specified by using
+  a hyphen separating the two inclusive ends.
+
+### `template` Parameters
+
+- `function_blacklist` `([]string: ["plugin"])` - Specifies a list of template
+  rendering functions that should be disallowed in job specs. By default the
+  `plugin` function is disallowed as it allows running arbitrary commands on
+  the host as root (unless Nomad is configured to run as a non-root user).
+
+- `disable_file_sandbox` `(bool: false)` - Allows templates access to arbitrary
+  files on the client host via the `file` function. By default templates can
+  access files only within the task directory.
+
+### `host_volume` Stanza
+
+The `host_volume` stanza is used to make volumes available to jobs.
+
+The key of the stanza corresponds to the name of the volume for use in the
+`source` parameter of a `"host"` type [`volume`](/docs/job-specification/volume)
+and ACLs.
+ +```hcl +client { + host_volume "ca-certificates" { + path = "/etc/ssl/certs" + read_only = true + } +} +``` + +#### `host_volume` Parameters + +- `path` `(string: "", required)` - Specifies the path on the host that should + be used as the source when this volume is mounted into a task. The path must + exist on client startup. + +- `read_only` `(bool: false)` - Specifies whether the volume should only ever be + allowed to be mounted `read_only`, or if it should be writeable. + +## `client` Examples + +### Common Setup + +This example shows the most basic configuration for a Nomad client joined to a +cluster. + +```hcl +client { + enabled = true + server_join { + retry_join = [ "1.1.1.1", "2.2.2.2" ] + retry_max = 3 + retry_interval = "15s" + } +} +``` + +### Reserved Resources + +This example shows a sample configuration for reserving resources to the client. +This is useful if you want to allocate only a portion of the client's resources +to jobs. + +```hcl +client { + enabled = true + + reserved { + cpu = 500 + memory = 512 + disk = 1024 + reserved_ports = "22,80,8500-8600" + } +} +``` + +### Custom Metadata, Network Speed, and Node Class + +This example shows a client configuration which customizes the metadata, network +speed, and node class. The scheduler can use this information while processing +[constraints][metadata_constraint]. The metadata is completely user configurable; +the values below are for illustrative purposes only. 
+ +```hcl +client { + enabled = true + network_speed = 500 + node_class = "prod" + + meta { + "owner" = "ops" + "cached_binaries" = "redis,apache,nginx,jq,cypress,nodejs" + "rack" = "rack-12-1" + } +} +``` + +[plugin-options]: #plugin-options +[plugin-stanza]: /docs/configuration/plugin +[server-join]: /docs/configuration/server_join 'Server Join' +[metadata_constraint]: /docs/job-specification/constraint#user-specified-metadata 'Nomad User-Specified Metadata Constraint Example' diff --git a/content/nomad/v0.11.x/content/docs/configuration/consul.mdx b/content/nomad/v0.11.x/content/docs/configuration/consul.mdx new file mode 100644 index 0000000000..29a7825807 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/consul.mdx @@ -0,0 +1,174 @@ +--- +layout: docs +page_title: consul Stanza - Agent Configuration +sidebar_title: consul +description: |- + The "consul" stanza configures the Nomad agent's communication with + Consul for service discovery and key-value integration. When + configured, tasks can register themselves with Consul, and the Nomad cluster + can automatically bootstrap itself. +--- + +# `consul` Stanza + + + +The `consul` stanza configures the Nomad agent's communication with +[Consul][consul] for service discovery and key-value integration. When +configured, tasks can register themselves with Consul, and the Nomad cluster can +[automatically bootstrap][bootstrap] itself. + +```hcl +consul { + address = "127.0.0.1:8500" + auth = "admin:password" + token = "abcd1234" +} +``` + +A default `consul` stanza is automatically merged with all Nomad agent +configurations. These sane defaults automatically enable Consul integration if +Consul is detected on the system. This allows for seamless bootstrapping of the +cluster with zero configuration. To put it another way: if you have a Consul +agent running on the same host as the Nomad agent with the default +configuration, Nomad will automatically connect and configure with Consul. 
+
+~> An important requirement is that each Nomad agent talks to a unique Consul
+agent. Nomad agents should be configured to talk to Consul agents and not
+Consul servers. If you are observing flapping services, you may have
+multiple Nomad agents talking to the same Consul agent. As such, avoid
+configuring Nomad to talk to Consul via a DNS name such as
+`consul.service.consul`.
+
+## `consul` Parameters
+
+- `address` `(string: "127.0.0.1:8500")` - Specifies the address to the local
+  Consul agent, given in the format `host:port`. Supports Unix sockets with the
+  format: `unix:///tmp/consul/consul.sock`. Will default to the
+  `CONSUL_HTTP_ADDR` environment variable if set.
+
+- `allow_unauthenticated` `(bool: true)` - Specifies if users submitting jobs to
+  the Nomad server should be required to provide their own Consul token, proving
+  they have access to the service identity policies required by the Consul Connect
+  enabled services listed in the job. This option should be
+  disabled in an untrusted environment.
+
+- `auth` `(string: "")` - Specifies the HTTP Basic Authentication information to
+  use for access to the Consul agent, given in the format `username:password`.
+
+- `auto_advertise` `(bool: true)` - Specifies if Nomad should advertise its
+  services in Consul. The services are named according to `server_service_name`
+  and `client_service_name`. Nomad servers and clients advertise their
+  respective services, each tagged appropriately with either the `http` or `rpc`
+  tag. Nomad servers also advertise a `serf` tagged service.
+
+- `ca_file` `(string: "")` - Specifies an optional path to the CA certificate
+  used for Consul communication. This defaults to the system bundle if
+  unspecified. Will default to the `CONSUL_CACERT` environment variable if set.
+
+- `cert_file` `(string: "")` - Specifies the path to the certificate used for
+  Consul communication. If this is set then you need to also set `key_file`.
+ +- `checks_use_advertise` `(bool: false)` - Specifies if Consul health checks + should bind to the advertise address. By default, this is the bind address. + +- `client_auto_join` `(bool: true)` - Specifies if the Nomad clients should + automatically discover servers in the same region by searching for the Consul + service name defined in the `server_service_name` option. The search occurs if + the client is not registered with any servers or it is unable to heartbeat to + the leader of the region, in which case it may be partitioned and searches for + other servers. + +- `client_service_name` `(string: "nomad-client")` - Specifies the name of the + service in Consul for the Nomad clients. + +- `client_http_check_name` `(string: "Nomad Client HTTP Check")` - Specifies the + HTTP health check name in Consul for the Nomad clients. + +- `key_file` `(string: "")` - Specifies the path to the private key used for + Consul communication. If this is set then you need to also set `cert_file`. + +- `server_service_name` `(string: "nomad")` - Specifies the name of the service + in Consul for the Nomad servers. + +- `server_http_check_name` `(string: "Nomad Server HTTP Check")` - Specifies the + HTTP health check name in Consul for the Nomad servers. + +- `server_serf_check_name` `(string: "Nomad Server Serf Check")` - Specifies + the Serf health check name in Consul for the Nomad servers. + +- `server_rpc_check_name` `(string: "Nomad Server RPC Check")` - Specifies + the RPC health check name in Consul for the Nomad servers. + +- `server_auto_join` `(bool: true)` - Specifies if the Nomad servers should + automatically discover and join other Nomad servers by searching for the + Consul service name defined in the `server_service_name` option. This search + only happens if the server does not have a leader. + +- `ssl` `(bool: false)` - Specifies if the transport scheme should use HTTPS to + communicate with the Consul agent. 
Will default to the `CONSUL_HTTP_SSL` + environment variable if set. + +- `tags` `(array: [])` - Specifies optional Consul tags to be + registered with the Nomad server and agent services. + +- `token` `(string: "")` - Specifies the token used to provide a per-request ACL + token. This option overrides the Consul Agent's default token. If the token is + not set here or on the Consul agent, it will default to Consul's anonymous policy, + which may or may not allow writes. + +- `verify_ssl` `(bool: true)`- Specifies if SSL peer verification should be used + when communicating to the Consul API client over HTTPS. Will default to the + `CONSUL_HTTP_SSL_VERIFY` environment variable if set. + +If the local Consul agent is configured and accessible by the Nomad agents, the +Nomad cluster will [automatically bootstrap][bootstrap] provided +`server_auto_join`, `client_auto_join`, and `auto_advertise` are all enabled +(which is the default). + +## `consul` Examples + +### Default + +This example shows the default Consul integration: + +```hcl +consul { + address = "127.0.0.1:8500" + server_service_name = "nomad" + client_service_name = "nomad-client" + auto_advertise = true + server_auto_join = true + client_auto_join = true +} +``` + +### Custom Address and Port + +This example shows pointing the Nomad agent at a different Consul address. Note +that you should **never** point directly at a Consul server; always point to a +local client. In this example, the Consul server is bound and listening on the +node's private IP address instead of localhost, so we use that: + +```hcl +consul { + address = "10.0.2.4:8500" +} +``` + +### Custom SSL + +This example shows configuring custom SSL certificates to communicate with +the Consul agent. 
The Consul agent should be configured to accept certificates +similarly, but that is not discussed here: + +```hcl +consul { + ssl = true + ca_file = "/var/ssl/bundle/ca.bundle" + cert_file = "/etc/ssl/consul.crt" + key_file = "/etc/ssl/consul.key" +} +``` + +[consul]: https://www.consul.io/ 'Consul by HashiCorp' +[bootstrap]: https://learn.hashicorp.com/nomad/operating-nomad/clustering 'Automatic Bootstrapping' diff --git a/content/nomad/v0.11.x/content/docs/configuration/index.mdx b/content/nomad/v0.11.x/content/docs/configuration/index.mdx new file mode 100644 index 0000000000..91d0e2b211 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/index.mdx @@ -0,0 +1,344 @@ +--- +layout: docs +page_title: Agent Configuration +sidebar_title: Configuration +description: Learn about the configuration options available for the Nomad agent. +--- + +# Nomad Configuration + +Nomad agents have a variety of parameters that can be specified via +configuration files or command-line flags. Configuration files are written in +[HCL][hcl]. Nomad can read and combine parameters from multiple configuration +files or directories to configure the Nomad agent. + +## Load Order and Merging + +The Nomad agent supports multiple configuration files, which can be provided +using the `-config` CLI flag. The flag can accept either a file or folder. In +the case of a folder, any `.hcl` and `.json` files in the folder will be loaded +and merged in lexicographical order. Directories are not loaded recursively. + +For example: + +```shell-session +$ nomad agent -config=server.conf -config=/etc/nomad -config=extra.json +``` + +This will load configuration from `server.conf`, from `.hcl` and `.json` files +under `/etc/nomad`, and finally from `extra.json`. + +As each file is processed, its contents are merged into the existing +configuration. When merging, any non-empty values from the latest config file +will append or replace parameters in the current configuration. 
An empty value +means `""` for strings, `0` for integer or float values, and `false` for +booleans. Since empty values are ignored you cannot disable a parameter like +`server` mode once you've enabled it. + +Here is an example Nomad agent configuration that runs in both client and server +mode. + +```hcl +data_dir = "/var/lib/nomad" + +bind_addr = "0.0.0.0" # the default + +advertise { + # Defaults to the first private IP address. + http = "1.2.3.4" + rpc = "1.2.3.4" + serf = "1.2.3.4:5648" # non-default ports may be specified +} + +server { + enabled = true + bootstrap_expect = 3 +} + +client { + enabled = true + network_speed = 10 +} + +plugin "raw_exec" { + config { + enabled = true + } +} + +consul { + address = "1.2.3.4:8500" +} + +``` + +~> Note that it is strongly recommended **not** to operate a node as both +`client` and `server`, although this is supported to simplify development and +testing. + +## General Parameters + +- `acl` `(`[`ACL`]`: nil)` - Specifies configuration which is specific to ACLs. + +- `addresses` `(Addresses: see below)` - Specifies the bind address for + individual network services. Any values configured in this stanza take + precedence over the default [bind_addr](#bind_addr). + The values support [go-sockaddr/template format][go-sockaddr/template]. + + - `http` - The address the HTTP server is bound to. This is the most common + bind address to change. + + - `rpc` - The address to bind the internal RPC interfaces to. Should be + exposed only to other cluster members if possible. + + - `serf` - The address used to bind the gossip layer to. Both a TCP and UDP + listener will be exposed on this address. Should be exposed only to other + cluster members if possible. + +- `advertise` `(Advertise: see below)` - Specifies the advertise address for + individual network services. This can be used to advertise a different address + to the peers of a server or a client node to support more complex network + configurations such as NAT. 
This configuration is optional, and defaults to
+  the bind address of the specific network service if it is not provided. Any
+  values configured in this stanza take precedence over the default
+  [bind_addr](#bind_addr).
+
+  If the bind address is `0.0.0.0` then the first private IP found is
+  advertised. You may advertise an alternate port as well.
+  The values support [go-sockaddr/template format][go-sockaddr/template].
+
+  - `http` - The address to advertise for the HTTP interface. This should be
+    reachable by all the nodes from which end users are going to use the Nomad
+    CLI tools.
+
+  - `rpc` - The address advertised to Nomad client nodes. This allows
+    advertising a different RPC address than is used by Nomad Servers such that
+    the clients can connect to the Nomad servers if they are behind a NAT.
+
+  - `serf` - The address advertised for the gossip layer. This address must be
+    reachable from all server nodes. It is not required that clients can reach
+    this address. Nomad servers will communicate to each other over RPC using
+    the advertised Serf IP and advertised RPC Port.
+
+- `audit` `(`[`Audit`]`: nil)` - Enterprise-only. Specifies audit logging
+  configuration.
+
+- `bind_addr` `(string: "0.0.0.0")` - Specifies which address the Nomad
+  agent should bind to for network services, including the HTTP interface as
+  well as the internal gossip protocol and RPC mechanism. This should be
+  specified in IP format, and can be used to easily bind all network services to
+  the same address. It is also possible to bind the individual services to
+  different addresses using the [addresses](#addresses) configuration option.
+  Dev mode (`-dev`) defaults to localhost.
+  The value supports [go-sockaddr/template format][go-sockaddr/template].
+
+- `client` `(`[`Client`]`: nil)` - Specifies configuration which is specific
+  to the Nomad client.
+
+- `consul` `(`[`Consul`]`: nil)` - Specifies configuration for
+  connecting to Consul.
+ +- `datacenter` `(string: "dc1")` - Specifies the data center of the local agent. + All members of a datacenter should share a local LAN connection. + +- `data_dir` `(string: required)` - Specifies a local directory used to store + agent state. Client nodes use this directory by default to store temporary + allocation data as well as cluster information. Server nodes use this + directory to store cluster state, including the replicated log and snapshot + data. This must be specified as an absolute path. + + ~> **WARNING**: This directory **must not** be set to a directory that is + [included in the chroot](/docs/drivers/exec#chroot) if you use the + [`exec`](/docs/drivers/exec) driver. + +- `disable_anonymous_signature` `(bool: false)` - Specifies if Nomad should + provide an anonymous signature for de-duplication with the update check. + +- `disable_update_check` `(bool: false)` - Specifies if Nomad should not check + for updates and security bulletins. + +- `enable_debug` `(bool: false)` - Specifies if the debugging HTTP endpoints + should be enabled. These endpoints can be used with profiling tools to dump + diagnostic information about Nomad's internals. + +- `enable_syslog` `(bool: false)` - Specifies if the agent should log to syslog. + This option only works on Unix based systems. + +- `http_api_response_headers` `(map: nil)` - Specifies + user-defined headers to add to the HTTP API responses. + +- `leave_on_interrupt` `(bool: false)` - Specifies if the agent should + gracefully leave when receiving the interrupt signal. By default, the agent + will exit forcefully on any signal. This value should only be set to true on + server agents if it is expected that a terminated server instance will never + join the cluster again. + +- `leave_on_terminate` `(bool: false)` - Specifies if the agent should + gracefully leave when receiving the terminate signal. By default, the agent + will exit forcefully on any signal. 
This value should only be set to true on
+  server agents if it is expected that a terminated server instance will never
+  join the cluster again.
+
+- `limits` - Available in Nomad 0.10.3 and later, this is a nested object that
+  configures limits that are enforced by the agent. The following parameters
+  are available:
+
+  - `https_handshake_timeout` `(string: "5s")` - Configures the limit for how
+    long the HTTPS server in both client and server agents will wait for a
+    client to complete a TLS handshake. This should be kept conservative as it
+    limits how many connections an unauthenticated attacker can open if
+    [`tls.http = true`][tls] is being used (strongly recommended in
+    production). Default value is `5s`. `0` disables HTTP handshake timeouts.
+
+  - `http_max_conns_per_client` `(int: 100)` - Configures a limit of how many
+    concurrent TCP connections a single client IP address is allowed to open to
+    the agent's HTTP server. This affects the HTTP servers in both client and
+    server agents. Default value is `100`. `0` disables HTTP connection limits.
+
+  - `rpc_handshake_timeout` `(string: "5s")` - Configures the limit for how
+    long servers will wait after a client TCP connection is established before
+    they complete the connection handshake. When TLS is used, the same timeout
+    applies to the TLS handshake separately from the initial protocol
+    negotiation. All Nomad clients should perform this immediately on
+    establishing a new connection. This should be kept conservative as it
+    limits how many connections an unauthenticated attacker can open if
+    TLS is being used to authenticate clients (strongly recommended in
+    production). When `tls.rpc` is true on servers, this limits how long the
+    connection and associated goroutines will be held open before the client
+    successfully authenticates. Default value is `5s`. `0` disables RPC handshake
+    timeouts.
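+
+    A `limits` stanza combining the settings above might look like the
+    following sketch (the values shown are the defaults):
+
+    ```hcl
+    limits {
+      # Default values; tighten or loosen as your environment requires.
+      https_handshake_timeout   = "5s"
+      http_max_conns_per_client = 100
+      rpc_handshake_timeout     = "5s"
+    }
+    ```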
+
+  - `rpc_max_conns_per_client` `(int: 100)` - Configures a limit of how
+    many concurrent TCP connections a single source IP address is allowed
+    to open to a single server. Client agents do not accept RPC TCP connections
+    directly and therefore are not affected. This limit affects both client
+    connections and other server connections. Nomad clients multiplex many RPC
+    calls over a single TCP connection, except for streaming endpoints such as
+    [log streaming][log-api] which require their own connection when routed
+    through servers. A server needs at least 2 TCP connections (1 Raft, 1 RPC)
+    per peer server locally and in any federated region. Servers also need a
+    TCP connection per routed streaming endpoint concurrently in use. Only
+    operators use streaming endpoints; as of 0.10.3 Nomad client code does not.
+    A reasonably low limit significantly reduces the ability of an
+    unauthenticated attacker to consume unbounded resources by holding open
+    many connections. You may need to increase this if WAN federated servers
+    connect via proxies or NAT gateways or similar, causing many legitimate
+    connections from a single source IP. Default value is `100`, which is
+    designed to support the majority of users. `0` disables RPC connection
+    limits. `26` is the minimum as `20` connections are always reserved for
+    non-streaming connections (Raft and RPC) to ensure streaming RPCs do not
+    prevent normal server operation. This minimum may be lowered in the future
+    when streaming RPCs no longer require their own TCP connection.
+
+- `log_level` `(string: "INFO")` - Specifies the verbosity of logs the Nomad
+  agent will output. Valid log levels include `WARN`, `INFO`, or `DEBUG` in
+  increasing order of verbosity.
+
+- `log_json` `(bool: false)` - Output logs in a JSON format.
+
+- `log_file` `(string: "")` - Specifies the path for logging. If the path
+  does not include a filename, the filename defaults to "nomad-{timestamp}.log".
+
+  This setting can be combined with `log_rotate_bytes` and `log_rotate_duration`
+  for fine-grained log rotation control.
+
+- `log_rotate_bytes` `(int: 0)` - Specifies the number of bytes that should be
+  written to a log before it needs to be rotated. Unless specified, there is no
+  limit to the number of bytes that can be written to a log file.
+
+- `log_rotate_duration` `(duration: "24h")` - Specifies the maximum duration a
+  log should be written to before it needs to be rotated. Must be a duration
+  value such as 30s.
+
+- `log_rotate_max_files` `(int: 0)` - Specifies the maximum number of older log
+  file archives to keep. If 0, no files are ever deleted.
+
+- `name` `(string: [hostname])` - Specifies the name of the local node. This
+  value is used to identify individual agents. When specified on a server, the
+  name must be unique within the region.
+
+- `plugin_dir` `(string: "[data_dir]/plugins")` - Specifies the directory to
+  use for looking up plugins. By default, this is the top-level
+  [data_dir](#data_dir) suffixed with "plugins", like `"/opt/nomad/plugins"`.
+  This must be an absolute path.
+
+- `plugin` `(`[`Plugin`]`: nil)` - Specifies configuration for a
+  specific plugin. The plugin stanza may be repeated, once for each plugin being
+  configured. The key of the stanza is the plugin's executable name relative to
+  the [plugin_dir](#plugin_dir).
+
+- `ports` `(Port: see below)` - Specifies the network ports used for different
+  services required by the Nomad agent.
+
+  - `http` - The port used to run the HTTP server.
+
+  - `rpc` - The port used for internal RPC communication between
+    agents and servers, and for inter-server traffic for the consensus algorithm
+    (raft).
+
+  - `serf` - The port used for the gossip protocol for cluster
+    membership. Both TCP and UDP should be routable between the server nodes on
+    this port.
+ + The default values are: + + ```hcl + ports { + http = 4646 + rpc = 4647 + serf = 4648 + } + ``` + +- `region` `(string: "global")` - Specifies the region the Nomad agent is a + member of. A region typically maps to a geographic region, for example `us`, + with potentially multiple zones, which map to [datacenters](#datacenter) such + as `us-west` and `us-east`. + +- `sentinel` `(`[`Sentinel`]`: nil)` - Specifies configuration for Sentinel + policies. + +- `server` `(`[`Server`]`: nil)` - Specifies configuration which is specific + to the Nomad server. + +- `syslog_facility` `(string: "LOCAL0")` - Specifies the syslog facility to + write to. This has no effect unless `enable_syslog` is true. + +- `tls` `(`[`TLS`][tls]`: nil)` - Specifies configuration for TLS. + +- `vault` `(`[`Vault`]`: nil)` - Specifies configuration for + connecting to Vault. + +## Examples + +### Custom Region and Datacenter + +This example shows configuring a custom region and data center for the Nomad +agent: + +```hcl +region = "europe" +datacenter = "ams" +``` + +### Enable CORS + +This example shows how to enable CORS on the HTTP API endpoints: + +```hcl +http_api_response_headers { + "Access-Control-Allow-Origin" = "*" +} +``` + +[`acl`]: /docs/configuration/acl 'Nomad Agent ACL Configuration' +[`audit`]: /docs/configuration/audit 'Nomad Agent Audit Logging Configuration' +[`client`]: /docs/configuration/client 'Nomad Agent client Configuration' +[`consul`]: /docs/configuration/consul 'Nomad Agent consul Configuration' +[`plugin`]: /docs/configuration/plugin 'Nomad Agent Plugin Configuration' +[`sentinel`]: /docs/configuration/sentinel 'Nomad Agent sentinel Configuration' +[`server`]: /docs/configuration/server 'Nomad Agent server Configuration' +[tls]: /docs/configuration/tls 'Nomad Agent tls Configuration' +[`vault`]: /docs/configuration/vault 'Nomad Agent vault Configuration' +[go-sockaddr/template]: https://godoc.org/github.com/hashicorp/go-sockaddr/template +[log-api]: 
/api-docs/client#stream-logs
+[hcl]: https://github.com/hashicorp/hcl 'HashiCorp Configuration Language'
diff --git a/content/nomad/v0.11.x/content/docs/configuration/plugin.mdx b/content/nomad/v0.11.x/content/docs/configuration/plugin.mdx
new file mode 100644
index 0000000000..6f583cd7dc
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/configuration/plugin.mdx
@@ -0,0 +1,37 @@
+---
+layout: docs
+page_title: plugin Stanza - Agent Configuration
+sidebar_title: plugin
+description: The "plugin" stanza is used to configure a Nomad plugin.
+---
+
+# `plugin` Stanza
+
+
+
+The `plugin` stanza is used to configure plugins.
+
+```hcl
+plugin "example-plugin" {
+  args = ["-my-flag"]
+  config {
+    foo = "bar"
+    bam {
+      baz = 1
+    }
+  }
+}
+```
+
+The name of the plugin is the plugin's executable name relative to the
+[plugin_dir](/docs/configuration#plugin_dir). If the plugin has a
+suffix, such as `.exe`, this should be omitted.
+
+## `plugin` Parameters
+
+- `args` `(array<string>: [])` - Specifies a set of arguments to pass to the
+  plugin binary when it is executed.
+
+- `config` `(hcl/json: nil)` - Specifies configuration values for the plugin
+  either as HCL or JSON. The accepted values are plugin specific. Please refer
+  to the individual plugin's documentation.
diff --git a/content/nomad/v0.11.x/content/docs/configuration/sentinel.mdx b/content/nomad/v0.11.x/content/docs/configuration/sentinel.mdx
new file mode 100644
index 0000000000..f83d147321
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/configuration/sentinel.mdx
@@ -0,0 +1,35 @@
+---
+layout: docs
+page_title: sentinel Stanza - Agent Configuration
+sidebar_title: sentinel
+description: >-
+  The "sentinel" stanza configures the Nomad agent for Sentinel policies and
+  tunes various parameters.
+---
+
+# `sentinel` Stanza
+
+
+
+The `sentinel` stanza configures the Sentinel policy engine and tunes various parameters.
+ +```hcl +sentinel { + import "custom-plugin" { + path = "/usr/bin/sentinel-custom-plugin" + args = ["-verbose", "foo"] + } +} +``` + +## `sentinel` Parameters + +- `import` ([Import](#import-parameters): nil) - + Specifies a plugin that should be made available for importing by Sentinel policies. + The name of the import matches the name that can be imported. + +### `import` Parameters + +- `path` `(string: "")` - Specifies the path to the import plugin. Must be executable by Nomad. + +- `args` `(array: [])` - Specifies arguments to pass to the plugin when starting it. diff --git a/content/nomad/v0.11.x/content/docs/configuration/server.mdx b/content/nomad/v0.11.x/content/docs/configuration/server.mdx new file mode 100644 index 0000000000..aafc402751 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/server.mdx @@ -0,0 +1,278 @@ +--- +layout: docs +page_title: server Stanza - Agent Configuration +sidebar_title: server +description: |- + The "server" stanza configures the Nomad agent to operate in server mode to + participate in scheduling decisions, register with service discovery, handle + join failures, and more. +--- + +# `server` Stanza + + + +The `server` stanza configures the Nomad agent to operate in server mode to +participate in scheduling decisions, register with service discovery, handle +join failures, and more. + +```hcl +server { + enabled = true + bootstrap_expect = 3 + server_join { + retry_join = [ "1.1.1.1", "2.2.2.2" ] + retry_max = 3 + retry_interval = "15s" + } +} +``` + +## `server` Parameters + +- `authoritative_region` `(string: "")` - Specifies the authoritative region, which + provides a single source of truth for global configurations such as ACL Policies and + global ACL tokens. Non-authoritative regions will replicate from the authoritative + to act as a mirror. By default, the local region is assumed to be authoritative. 
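+
+  As a sketch, a server in a hypothetical non-authoritative region `eu-west`
+  that replicates from an authoritative region named `us-east` might be
+  configured as:
+
+  ```hcl
+  # Hypothetical region names, shown for illustration only.
+  region = "eu-west"
+
+  server {
+    enabled              = true
+    authoritative_region = "us-east"
+  }
+  ```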
+
+- `bootstrap_expect` `(int: required)` - Specifies the number of server nodes to
+  wait for before bootstrapping. It is most common to use the odd-numbered
+  integers `3` or `5` for this value, depending on the cluster size. A value of
+  `1` does not provide any fault tolerance and is not recommended for production
+  use cases.
+
+- `data_dir` `(string: "[data_dir]/server")` - Specifies the directory to use
+  for server-specific data, including the replicated log. By default, this is
+  the top-level [data_dir](/docs/configuration#data_dir)
+  suffixed with "server", like `"/opt/nomad/server"`. This must be an absolute
+  path.
+
+- `enabled` `(bool: false)` - Specifies if this agent should run in server mode.
+  All other server options depend on this value being set.
+
+- `enabled_schedulers` `(array<string>: [all])` - Specifies which sub-schedulers
+  this server will handle. This can be used to restrict the evaluations that
+  worker threads will dequeue for processing.
+
+- `encrypt` `(string: "")` - Specifies the secret key to use for encryption of
+  Nomad server's gossip network traffic. This key must be 16 bytes that are
+  base64-encoded. The provided key is automatically persisted to the data
+  directory and loaded automatically whenever the agent is restarted. This means
+  that to encrypt Nomad server's gossip protocol, this option only needs to be
+  provided once on each agent's initial startup sequence. If it is provided
+  after Nomad has been initialized with an encryption key, then the provided key
+  is ignored and a warning will be displayed. See the
+  [encryption documentation][encryption] for more details on this option
+  and its impact on the cluster.
+
+- `node_gc_threshold` `(string: "24h")` - Specifies how long a node must be in a
+  terminal state before it is garbage collected and purged from the system. This
+  is specified using a label suffix like "30s" or "1h".
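+
+  As a sketch, the gossip `encrypt` key described above can be generated with
+  `nomad operator keygen` and supplied once at first startup (the key below is
+  an example only):
+
+  ```hcl
+  server {
+    enabled = true
+    # Example 16-byte base64 key; generate your own with `nomad operator keygen`.
+    encrypt = "cg8StVXbQJ0gPvMd9o7yrg=="
+  }
+  ```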
+
+- `job_gc_interval` `(string: "5m")` - Specifies the interval between the job
+  garbage collections. Only jobs that have been terminal for at least
+  `job_gc_threshold` will be collected. Lowering the interval will perform more
+  frequent but smaller collections. Raising the interval will perform collections
+  less frequently but collect more jobs at a time. Reducing this interval is
+  useful if there is a large throughput of tasks, leading to a large set of
+  dead jobs. This is specified using a label suffix like "30s" or "3m". `job_gc_interval`
+  was introduced in Nomad 0.10.0.
+
+- `job_gc_threshold` `(string: "4h")` - Specifies the minimum time a job must be
+  in the terminal state before it is eligible for garbage collection. This is
+  specified using a label suffix like "30s" or "1h".
+
+- `eval_gc_threshold` `(string: "1h")` - Specifies the minimum time an
+  evaluation must be in the terminal state before it is eligible for garbage
+  collection. This is specified using a label suffix like "30s" or "1h".
+
+- `deployment_gc_threshold` `(string: "1h")` - Specifies the minimum time a
+  deployment must be in the terminal state before it is eligible for garbage
+  collection. This is specified using a label suffix like "30s" or "1h".
+
+- `csi_volume_claim_gc_threshold` `(string: "1h")` - Specifies the minimum age of
+  a CSI volume before it is eligible to have its claims garbage collected.
+  This is specified using a label suffix like "30s" or "1h".
+
+- `csi_plugin_gc_threshold` `(string: "1h")` - Specifies the minimum age of a
+  CSI plugin before it is eligible for garbage collection if not in use.
+  This is specified using a label suffix like "30s" or "1h".
+
+- `default_scheduler_config` ([scheduler_configuration][update-scheduler-config]:
+  nil) - Specifies the initial default scheduler config when
+  bootstrapping the cluster. The parameter is ignored once the cluster is bootstrapped
+  or the value is updated through the [API endpoint][update-scheduler-config].
See [the
+  example section](#configuring-scheduler-config) for more details.
+  `default_scheduler_config` was introduced in Nomad 0.10.4.
+
+- `heartbeat_grace` `(string: "10s")` - Specifies the additional time given as a
+  grace period beyond the heartbeat TTL of nodes to account for network and
+  processing delays as well as clock skew. This is specified using a label
+  suffix like "30s" or "1h".
+
+- `min_heartbeat_ttl` `(string: "10s")` - Specifies the minimum time between
+  node heartbeats. This is used as a floor to prevent excessive updates. This is
+  specified using a label suffix like "30s" or "1h". Lowering the minimum TTL is
+  a tradeoff as it lowers failure detection time of nodes at the expense of
+  false positives and increased load on the leader.
+
+- `max_heartbeats_per_second` `(float: 50.0)` - Specifies the maximum target
+  rate of heartbeats being processed per second. This allows the TTL to be
+  increased to meet the target rate. Increasing the maximum heartbeats per
+  second is a tradeoff as it lowers failure detection time of nodes at the
+  expense of false positives and increased load on the leader.
+
+- `non_voting_server` `(bool: false)` - (Enterprise-only) Specifies whether
+  this server will act as a non-voting member of the cluster to help provide
+  read scalability.
+
+- `num_schedulers` `(int: [num-cores])` - Specifies the number of parallel
+  scheduler threads to run. This can be as many as one per core, or `0` to
+  disallow this server from making any scheduling decisions. This defaults to
+  the number of CPU cores.
+
+- `protocol_version` `(int: 1)` - Specifies the Nomad protocol version to use
+  when communicating with other Nomad servers. This value is typically not
+  required as the agent internally knows the latest version, but may be useful
+  in some upgrade scenarios.
+
+- `raft_protocol` `(int: 2)` - Specifies the Raft protocol version to use when
+  communicating with other Nomad servers.
This affects available Autopilot + features and is typically not required as the agent internally knows the + latest version, but may be useful in some upgrade scenarios. + +- `redundancy_zone` `(string: "")` - (Enterprise-only) Specifies the redundancy + zone that this server will be a part of for Autopilot management. For more + information, see the [Autopilot Guide](https://learn.hashicorp.com/nomad/operating-nomad/autopilot). + +- `rejoin_after_leave` `(bool: false)` - Specifies if Nomad will ignore a + previous leave and attempt to rejoin the cluster when starting. By default, + Nomad treats leave as a permanent intent and does not attempt to join the + cluster again when starting. This flag allows the previous state to be used to + rejoin the cluster. + +- `server_join` ([server_join][server-join]: nil) - Specifies + how the Nomad server will connect to other Nomad servers. The `retry_join` + fields may directly specify the server address or use go-discover syntax for + auto-discovery. See the [server_join documentation][server-join] for more detail. + +- `upgrade_version` `(string: "")` - A custom version of the format X.Y.Z to use + in place of the Nomad version when custom upgrades are enabled in Autopilot. + For more information, see the [Autopilot Guide](https://learn.hashicorp.com/nomad/operating-nomad/autopilot). + +### Deprecated Parameters + +- `retry_join` `(array: [])` - Specifies a list of server addresses to + retry joining if the first attempt fails. This is similar to + [`start_join`](#start_join), but only invokes if the initial join attempt + fails. The list of addresses will be tried in the order specified, until one + succeeds. After one succeeds, no further addresses will be contacted. This is + useful for cases where we know the address will become available eventually. + Use `retry_join` with an array as a replacement for `start_join`, **do not use + both options**. 
See the [server_join][server-join]
+  section for more information on the format of the string. This field is
+  deprecated in favor of the [server_join stanza][server-join].
+
+- `retry_interval` `(string: "30s")` - Specifies the time to wait between retry
+  join attempts. This field is deprecated in favor of the [server_join
+  stanza][server-join].
+
+- `retry_max` `(int: 0)` - Specifies the maximum number of join attempts to be
+  made before exiting with a return code of 1. By default, this is set to 0
+  which is interpreted as infinite retries. This field is deprecated in favor of
+  the [server_join stanza][server-join].
+
+- `start_join` `(array<string>: [])` - Specifies a list of server addresses to
+  join on startup. If Nomad is unable to join with any of the specified
+  addresses, agent startup will fail. See the [server address
+  format](/docs/configuration/server_join#server-address-format)
+  section for more information on the format of the string. This field is
+  deprecated in favor of the [server_join stanza][server-join].
+
+## `server` Examples
+
+### Common Setup
+
+This example shows a common Nomad agent `server` configuration stanza. The two
+IP addresses could also be DNS, and should point to the other Nomad servers in
+the cluster.
+
+```hcl
+server {
+  enabled          = true
+  bootstrap_expect = 3
+
+  server_join {
+    retry_join     = ["1.1.1.1", "2.2.2.2"]
+    retry_max      = 3
+    retry_interval = "15s"
+  }
+}
+```
+
+### Configuring Data Directory
+
+This example shows configuring a custom data directory for the server data.
+
+```hcl
+server {
+  data_dir = "/opt/nomad/server"
+}
+```
+
+### Automatic Bootstrapping
+
+The Nomad servers can automatically bootstrap if Consul is configured. For a
+more detailed explanation, please see the
+[automatic Nomad bootstrapping documentation](https://learn.hashicorp.com/nomad/operating-nomad/clustering).
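+
+As a sketch, with a local Consul agent running on its default port, the only
+server-side Nomad configuration needed for the servers to find each other is:
+
+```hcl
+# The Consul integration is enabled by default, so the servers register with
+# the local Consul agent and discover one another through it.
+server {
+  enabled          = true
+  bootstrap_expect = 3
+}
+```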
+
+### Restricting Schedulers
+
+This example shows restricting the schedulers that are enabled as well as the
+maximum number of cores to utilize when participating in scheduling decisions:
+
+```hcl
+server {
+  enabled            = true
+  enabled_schedulers = ["batch", "service"]
+  num_schedulers     = 7
+}
+```
+
+### Bootstrapping with a Custom Scheduler Config ((#configuring-scheduler-config))
+
+While [bootstrapping a cluster], you can use the `default_scheduler_config` stanza
+to prime the cluster with a [`SchedulerConfig`][update-scheduler-config]. The
+scheduler configuration determines which scheduling algorithm is configured
+(spread scheduling or binpacking) and which job types are eligible for preemption.
+
+~> **Warning:** Once the cluster is bootstrapped, you must configure this using
+   the [update scheduler configuration][update-scheduler-config] API. This
+   option is only consulted during bootstrap.
+
+The structure matches the [Update Scheduler Config][update-scheduler-config] API
+endpoint, which you should consult for canonical documentation. However, the
+attribute names must be adapted to HCL syntax by using snake case
+representations rather than camel case.
+
+This example shows configuring spread scheduling and enabling preemption for all
+job-type schedulers.
+ +```hcl +server { + default_scheduler_config { + scheduler_algorithm = "spread" + + preemption_config { + batch_scheduler_enabled = true + system_scheduler_enabled = true + service_scheduler_enabled = true + } + } +} +``` + +[encryption]: https://learn.hashicorp.com/nomad/transport-security/gossip-encryption 'Nomad Encryption Overview' +[server-join]: /docs/configuration/server_join 'Server Join' +[update-scheduler-config]: /api-docs/operator#update-scheduler-configuration 'Scheduler Config' +[bootstrapping a cluster]: /docs/faq#bootstrapping \ No newline at end of file diff --git a/content/nomad/v0.11.x/content/docs/configuration/server_join.mdx b/content/nomad/v0.11.x/content/docs/configuration/server_join.mdx new file mode 100644 index 0000000000..62f0a85829 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/configuration/server_join.mdx @@ -0,0 +1,230 @@ +--- +layout: docs +page_title: server_join Stanza - Agent Configuration +sidebar_title: server_join +description: >- + The "server_join" stanza specifies how the Nomad agent will discover and + connect to Nomad servers. +--- + +# `server_join` Stanza + + + +The `server_join` stanza specifies how the Nomad agent will discover and connect +to Nomad servers. + +```hcl +server_join { + retry_join = [ "1.1.1.1", "2.2.2.2" ] + retry_max = 3 + retry_interval = "15s" +} +``` + +## `server_join` Parameters + +- `retry_join` `(array: [])` - Specifies a list of server addresses to + join. This is similar to [`start_join`](#start_join), but will continue to + be attempted even if the initial join attempt fails, up to + [retry_max](#retry_max). Further, `retry_join` is available to + both Nomad servers and clients, while `start_join` is only defined for Nomad + servers. This is useful for cases where we know the address will become + available eventually. Use `retry_join` with an array as a replacement for + `start_join`, **do not use both options**. 
+
+  Addresses can be specified either as plain IP addresses or through an
+  interface to the [go-discover](https://github.com/hashicorp/go-discover)
+  library for automated cluster joining using cloud metadata. See the
+  [Cloud Auto-join](#cloud-auto-join) section below for more information.
+
+  ```hcl
+  server_join {
+    retry_join = ["1.1.1.1", "2.2.2.2"]
+  }
+  ```
+
+  Using the `go-discover` interface, this can be defined both in a client or
+  server configuration as well as provided as a command-line argument.
+
+  ```hcl
+  server_join {
+    retry_join = ["provider=aws tag_key=..."]
+  }
+  ```
+
+  See the [server address format](#server-address-format) for more information
+  about expected server address formats.
+
+- `retry_interval` `(string: "30s")` - Specifies the time to wait between retry
+  join attempts.
+
+- `retry_max` `(int: 0)` - Specifies the maximum number of join attempts to be
+  made before exiting with a return code of 1. By default, this is set to 0
+  which is interpreted as infinite retries.
+
+- `start_join` `(array<string>: [])` - Specifies a list of server addresses to
+  join on startup. If Nomad is unable to join with any of the specified
+  addresses, agent startup will fail. See the
+  [server address format](#server-address-format) section for more information
+  on the format of the string. This field is defined only for Nomad servers and
+  will result in a configuration parse error if included in a client
+  configuration.
+
+## Server Address Format
+
+This section describes the acceptable syntax and format for describing the
+location of a Nomad server. There are many ways to reference a Nomad server,
+including directly by IP address and resolving through DNS.
+
+### Directly via IP Address
+
+It is possible to address another Nomad server using its IP address.
This is
+done in the `ip:port` format, such as:
+
+```
+1.2.3.4:5678
+```
+
+If the port option is omitted, it defaults to the Serf port, which is 4648
+unless configured otherwise:
+
+```
+1.2.3.4 => 1.2.3.4:4648
+```
+
+### Via Domains or DNS
+
+It is possible to address another Nomad server using its DNS address. This is
+done in the `address:port` format, such as:
+
+```
+nomad-01.company.local:5678
+```
+
+If the port option is omitted, it defaults to the Serf port, which is 4648
+unless configured otherwise:
+
+```
+nomad-01.company.local => nomad-01.company.local:4648
+```
+
+### Via the go-discover interface
+
+As of Nomad 0.8.4, `retry_join` accepts a unified interface using the
+[go-discover](https://github.com/hashicorp/go-discover) library for doing
+automated cluster joining using cloud metadata. See [Cloud
+Auto-join](#cloud-auto-join) for more information.
+
+```
+"provider=aws tag_key=..." => 1.2.3.4:4648
+```
+
+## Cloud Auto-join
+
+The following sections describe the Cloud Auto-join `retry_join` options that are specific
+to a subset of supported cloud providers. For information on all providers, see further
+documentation in [go-discover](https://github.com/hashicorp/go-discover).
+
+### Amazon EC2
+
+This returns the first private IP address of all servers in the given
+region which have the given `tag_key` and `tag_value`.
+
+```json
+{
+  "retry_join": ["provider=aws tag_key=... tag_value=..."]
+}
+```
+
+- `provider` (required) - the name of the provider ("aws" in this case).
+- `tag_key` (required) - the key of the tag to auto-join on.
+- `tag_value` (required) - the value of the tag to auto-join on.
+- `region` (optional) - the AWS region to authenticate in.
+- `addr_type` (optional) - the type of address to discover: `private_v4`, `public_v4`, `public_v6`. Default is `private_v4`. (>= 1.0)
+- `access_key_id` (optional) - the AWS access key for authentication (see below for more information about authenticating).
+- `secret_access_key` (optional) - the AWS secret access key for authentication (see below for more information about authenticating). + +#### Authentication & Precedence + +- Static credentials `access_key_id=... secret_access_key=...` +- Environment variables (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) +- Shared credentials file (`~/.aws/credentials` or the path specified by `AWS_SHARED_CREDENTIALS_FILE`) +- ECS task role metadata (container-specific). +- EC2 instance role metadata. + + The only required IAM permission is `ec2:DescribeInstances`, and it is + recommended that you make a dedicated key used only for auto-joining. If the + region is omitted it will be discovered through the local instance's [EC2 + metadata + endpoint](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-identity-documents.html). + +### Microsoft Azure + +This returns the first private IP address of all servers in the given region +which have the given `tag_key` and `tag_value` in the tenant and subscription, or in +the given `resource_group` of a `vm_scale_set` for Virtual Machine Scale Sets. + +```json +{ + "retry_join": [ + "provider=azure tag_name=... tag_value=... tenant_id=... client_id=... subscription_id=... secret_access_key=..." + ] +} +``` + +- `provider` (required) - the name of the provider ("azure" in this case). +- `tenant_id` (required) - the tenant to join machines in. +- `client_id` (required) - the client to authenticate with. +- `secret_access_key` (required) - the secret client key. + +Use these configuration parameters when using tags: + +- `tag_name` - the name of the tag to auto-join on. +- `tag_value` - the value of the tag to auto-join on. + +Use these configuration parameters when using Virtual Machine Scale Sets (Consul 1.0.3 and later): + +- `resource_group` - the name of the resource group to filter on. +- `vm_scale_set` - the name of the virtual machine scale set to filter on. 
+
+  When using tags the only permission needed is the `ListAll` method for `NetworkInterfaces`. When using
+  Virtual Machine Scale Sets the only role action needed is `Microsoft.Compute/virtualMachineScaleSets/*/read`.
+
+### Google Compute Engine
+
+This returns the first private IP address of all servers in the given
+project which have the given `tag_value`.
+
+```json
+{
+  "retry_join": ["provider=gce project_name=... tag_value=..."]
+}
+```
+
+- `provider` (required) - the name of the provider ("gce" in this case).
+- `tag_value` (required) - the value of the tag to auto-join on.
+- `project_name` (optional) - the name of the project to auto-join on. Discovered if not set.
+- `zone_pattern` (optional) - restricts the list of zones through an RE2 compatible regular expression. If omitted, servers in all zones are returned.
+- `credentials_file` (optional) - the credentials file for authentication. See below for more information.
+
+#### Authentication & Precedence
+
+Discovery requires a [GCE Service
+Account](https://cloud.google.com/compute/docs/access/service-accounts).
+Credentials are searched using the following paths, in order of precedence:
+
+- Use credentials from `credentials_file`, if provided.
+- Use JSON file from `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
+- Use JSON file in a location known to the gcloud command-line tool.
+  - On Windows, this is `%APPDATA%/gcloud/application_default_credentials.json`.
+  - On other systems, `$HOME/.config/gcloud/application_default_credentials.json`.
+- On Google Compute Engine, use credentials from the metadata
+  server. In this final case any provided scopes are ignored.
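+
+As a sketch, the GCE discovery described above can be combined with the retry
+options in a `server_join` stanza (the project and tag shown are hypothetical):
+
+```hcl
+server_join {
+  # Hypothetical project name and instance tag.
+  retry_join     = ["provider=gce project_name=my-project tag_value=nomad-server"]
+  retry_max      = 5
+  retry_interval = "30s"
+}
+```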
diff --git a/content/nomad/v0.11.x/content/docs/configuration/telemetry.mdx b/content/nomad/v0.11.x/content/docs/configuration/telemetry.mdx
new file mode 100644
index 0000000000..0db8756c55
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/configuration/telemetry.mdx
@@ -0,0 +1,201 @@
+---
+layout: docs
+page_title: telemetry Stanza - Agent Configuration
+sidebar_title: telemetry
+description: |-
+  The "telemetry" stanza configures Nomad's publication of metrics and telemetry
+  to third-party systems.
+---
+
+# `telemetry` Stanza
+
+
+
+The `telemetry` stanza configures Nomad's publication of metrics and telemetry
+to third-party systems.
+
+```hcl
+telemetry {
+  publish_allocation_metrics = true
+  publish_node_metrics       = true
+}
+```
+
+This section of the documentation only covers the configuration options for the
+`telemetry` stanza. To understand the architecture and metrics themselves,
+please see the [Telemetry guide](/docs/telemetry).
+
+## `telemetry` Parameters
+
+Due to the number of configurable parameters to the `telemetry` stanza,
+parameters on this page are grouped by the telemetry provider.
+
+### Common
+
+The following options are available on all telemetry configurations.
+
+- `disable_hostname` `(bool: false)` - Specifies if gauge values should be
+  prefixed with the local hostname.
+
+- `collection_interval` `(duration: 1s)` - Specifies the time interval at which
+  the Nomad agent collects telemetry data.
+
+- `use_node_name` `(bool: false)` - Specifies if gauge values should be
+  prefixed with the name of the node, instead of the hostname. If set, it
+  overrides the [disable_hostname](#disable_hostname) value.
+
+- `publish_allocation_metrics` `(bool: false)` - Specifies if Nomad should
+  publish runtime metrics of allocations.
+
+- `publish_node_metrics` `(bool: false)` - Specifies if Nomad should publish
+  runtime metrics of nodes.
+
+- `backwards_compatible_metrics` `(bool: false)` - Specifies if Nomad should
+  publish metrics that are backwards compatible with versions below 0.7. Post
+  0.7, Nomad emits tagged metrics, and all new metrics are added only to the
+  tagged metrics. Note that this option is used to transition monitoring to
+  tagged metrics and will eventually be deprecated.
+
+- `disable_tagged_metrics` `(bool: false)` - Specifies if Nomad should not emit
+  tagged metrics and instead only emit metrics compatible with versions below
+  Nomad 0.7. Note that this option is used to transition monitoring to tagged
+  metrics and will eventually be deprecated.
+
+- `filter_default` `(bool: true)` - Controls whether to allow metrics that have
+  not been specified by the filter. Defaults to true, which allows all metrics
+  when no filters are provided. When set to false with no filters, no metrics
+  are sent.
+
+- `prefix_filter` `(list: [])` - A list of filter rules for allowing or
+  blocking metrics by prefix. A leading "+" enables any metrics with the given
+  prefix, and a leading "-" blocks them. If there is overlap between two rules,
+  the more specific rule takes precedence. Blocking takes priority if the same
+  prefix is listed multiple times.
+
+```hcl
+prefix_filter = ["-nomad.raft", "+nomad.raft.apply", "-nomad.memberlist"]
+```
+
+- `disable_dispatched_job_summary_metrics` `(bool: false)` - Specifies if Nomad
+  should ignore jobs dispatched from a parameterized job when publishing job
+  summary statistics. Since each job has a small memory overhead for tracking
+  summary statistics, it is sometimes desired to trade these statistics for
+  more memory when dispatching high volumes of jobs.
+
+### `statsite`
+
+These `telemetry` parameters apply to
+[statsite](https://github.com/armon/statsite).
+
+- `statsite_address` `(string: "")` - Specifies the address of a statsite server
+  to forward metrics data to.
+ +```hcl +telemetry { + statsite_address = "statsite.company.local:8125" +} +``` + +### `statsd` + +These `telemetry` parameters apply to +[statsd](https://github.com/etsy/statsd). + +- `statsd_address` `(string: "")` - Specifies the address of a statsd server to + forward metrics to. + +```hcl +telemetry { + statsd_address = "statsd.company.local:8125" +} +``` + +### `datadog` + +These `telemetry` parameters apply to +[DataDog statsd](https://github.com/DataDog/dd-agent). + +- `datadog_address` `(string: "")` - Specifies the address of a DataDog statsd + server to forward metrics to. + +- `datadog_tags` `(list: [])` - Specifies a list of global tags that will be + added to all telemetry packets sent to DogStatsD. It is a list of strings, + where each string looks like "my_tag_name:my_tag_value". + +```hcl +telemetry { + datadog_address = "dogstatsd.company.local:8125" + datadog_tags = ["my_tag_name:my_tag_value"] +} +``` + +### `prometheus` + +These `telemetry` parameters apply to [Prometheus](https://prometheus.io). + +- `prometheus_metrics` `(bool: false)` - Specifies whether the agent should + make Prometheus formatted metrics available at `/v1/metrics?format=prometheus`. + +### `circonus` + +These `telemetry` parameters apply to +[Circonus](http://circonus.com/). + +- `circonus_api_token` `(string: "")` - Specifies a valid Circonus API Token + used to create/manage check. If provided, metric management is enabled. + +- `circonus_api_app` `(string: "nomad")` - Specifies a valid app name associated + with the API token. + +- `circonus_api_url` `(string: "https://api.circonus.com/v2")` - Specifies the + base URL to use for contacting the Circonus API. + +- `circonus_submission_interval` `(string: "10s")` - Specifies the interval at + which metrics are submitted to Circonus. + +- `circonus_submission_url` `(string: "")` - Specifies the + `check.config.submission_url` field, of a Check API object, from a previously + created HTTPTRAP check. 
+
+- `circonus_check_id` `(string: "")` - Specifies the Check ID (**not check
+  bundle**) from a previously created HTTPTRAP check. This is the numeric
+  portion of the `check._cid` field in the Check API object.
+
+- `circonus_check_force_metric_activation` `(bool: false)` - Specifies whether
+  to force activation of metrics that already exist but are not currently
+  active. If check management is enabled, the default behavior is to add new
+  metrics as they are encountered. If the metric already exists in the check,
+  it will not be activated. This setting overrides that behavior.
+
+- `circonus_check_instance_id` `(string: ":")` - Serves
+  to uniquely identify the metrics coming from this _instance_. It can be used
+  to maintain metric continuity with transient or ephemeral instances as they
+  move around within an infrastructure. By default, this is set to
+  hostname:application name (e.g. "host123:nomad").
+
+- `circonus_check_search_tag` `(string: :)` - Specifies a
+  special tag which, when coupled with the instance id, helps to narrow down
+  the search results when neither a Submission URL nor a Check ID is provided.
+  By default, this is set to service:app (e.g. "service:nomad").
+
+- `circonus_check_display_name` `(string: "")` - Specifies a name to give a
+  check when it is created. This name is displayed in the Circonus UI Checks
+  list.
+
+- `circonus_check_tags` `(string: "")` - Comma-separated list of additional
+  tags to add to a check when it is created.
+
+- `circonus_broker_id` `(string: "")` - Specifies the ID of a specific Circonus
+  Broker to use when creating a new check. This is the numeric portion of the
+  `broker._cid` field in a Broker API object. If metric management is enabled
+  and neither a Submission URL nor a Check ID is provided, an attempt will be
+  made to search for an existing check using Instance ID and Search Tag. If one
+  is not found, a new HTTPTRAP check will be created. 
 By default, a random
+  Enterprise Broker is selected, or the default Circonus Public Broker is used.
+
+- `circonus_broker_select_tag` `(string: "")` - Specifies a special tag which
+  will be used to select a Circonus Broker when a Broker ID is not provided. The
+  best use of this is as a hint for which broker should be used based on
+  _where_ this particular instance is running (e.g. a specific geographic location or
+  datacenter, dc:sfo).
diff --git a/content/nomad/v0.11.x/content/docs/configuration/tls.mdx b/content/nomad/v0.11.x/content/docs/configuration/tls.mdx
new file mode 100644
index 0000000000..f04a0f0578
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/configuration/tls.mdx
@@ -0,0 +1,103 @@
+---
+layout: docs
+page_title: tls Stanza - Agent Configuration
+sidebar_title: tls
+description: |-
+  The "tls" stanza configures Nomad's TLS communication via HTTP and RPC to
+  enforce secure cluster communication between servers and clients.
+---
+
+# `tls` Stanza
+
+
+
+The `tls` stanza configures Nomad's TLS communication via HTTP and RPC to
+enforce secure cluster communication between servers and clients.
+
+```hcl
+tls {
+  http = true
+  rpc  = true
+}
+```
+
+~> An incorrect TLS configuration can result in failure to start the Nomad
+agent.
+
+This section of the documentation only covers the configuration options for the
+`tls` stanza. To understand how to set up the certificates themselves, please
+see the [Encryption Overview Guide](https://learn.hashicorp.com/nomad/transport-security/gossip-encryption).
+
+## `tls` Parameters
+
+- `ca_file` `(string: "")` - Specifies the path to the CA certificate to use for
+  Nomad's TLS communication.
+
+- `cert_file` `(string: "")` - Specifies the path to the certificate file used
+  for Nomad's TLS communication.
+
+- `key_file` `(string: "")` - Specifies the path to the key file to use for
+  Nomad's TLS communication.
+
+- `http` `(bool: false)` - Specifies if TLS should be enabled on the HTTP
+  endpoints on the Nomad agent, including the API.
+
+- `rpc` `(bool: false)` - Specifies if TLS should be enabled on the RPC
+  endpoints and [Raft][raft] traffic between the Nomad servers. Enabling this on
+  a Nomad client makes the client use TLS for making RPC requests to the Nomad
+  servers.
+
+- `rpc_upgrade_mode` `(bool: false)` - This option should be used only when the
+  cluster is being upgraded to TLS, and removed after the migration is
+  complete. This allows the agent to accept both TLS and plaintext traffic.
+
+- `tls_cipher_suites` `(string: "")` - Specifies the TLS cipher suites that will
+  be used by the agent as a comma-separated string. Known insecure ciphers are
+  disabled (3DES and RC4). By default, an agent is configured to use
+  TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
+  TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
+  TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
+  TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
+  TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 and
+  TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256.
+
+- `tls_min_version` `(string: "tls12")` - Specifies the minimum supported
+  version of TLS. Accepted values are "tls10", "tls11", "tls12".
+
+- `tls_prefer_server_cipher_suites` `(bool: false)` - Specifies whether
+  TLS connections should prefer the server's ciphersuites over the client's.
+
+- `verify_https_client` `(bool: false)` - Specifies if agents should require
+  client certificates for all incoming HTTPS requests. The client certificates
+  must be signed by the same CA as Nomad.
+
+- `verify_server_hostname` `(bool: false)` - Specifies if outgoing TLS
+  connections should verify the server's hostname.
+
+## `tls` Examples
+
+The following examples only show the `tls` stanzas. 
 Remember that the
+`tls` stanza is only valid in the placements listed above.
+
+### Enabling TLS
+
+This example enables TLS communication between all servers and clients using
+the given CA certificate, certificate, and key files.
+
+```hcl
+tls {
+  http = true
+  rpc  = true
+
+  ca_file   = "/etc/certs/ca.crt"
+  cert_file = "/etc/certs/nomad.crt"
+  key_file  = "/etc/certs/nomad.key"
+}
+```
+
+[raft]: https://github.com/hashicorp/raft 'Raft by HashiCorp'
diff --git a/content/nomad/v0.11.x/content/docs/configuration/vault.mdx b/content/nomad/v0.11.x/content/docs/configuration/vault.mdx
new file mode 100644
index 0000000000..11ccd6832a
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/configuration/vault.mdx
@@ -0,0 +1,148 @@
+---
+layout: docs
+page_title: vault Stanza - Agent Configuration
+sidebar_title: vault
+description: |-
+  The "vault" stanza configures Nomad's integration with HashiCorp's Vault.
+  When configured, Nomad can create and distribute Vault tokens to tasks
+  automatically.
+---
+
+# `vault` Stanza
+
+
+
+The `vault` stanza configures Nomad's integration with [HashiCorp's
+Vault][vault]. When configured, Nomad can create and distribute Vault tokens to
+tasks automatically. For more information on the architecture and setup, please
+see the [Nomad and Vault integration documentation][nomad-vault].
+
+```hcl
+vault {
+  enabled = true
+  address = "https://vault.company.internal:8200"
+}
+```
+
+## `vault` Parameters
+
+- `address` - `(string: "https://vault.service.consul:8200")` - Specifies the
+  address of the Vault server. This must include the protocol, host/ip, and port
+  given in the format `protocol://host:port`. If your Vault installation is
+  behind a load balancer, this should be the address of the load balancer.
+
+- `allow_unauthenticated` `(bool: true)` - Specifies if users submitting jobs to
+  the Nomad server may do so without providing their own Vault token, which
+  would otherwise prove they have access to the policies listed in the job.
+  This option should be disabled (set to `false`) in an untrusted environment.
+
+- `enabled` `(bool: false)` - Specifies if the Vault integration should be
+  activated.
+
+- `create_from_role` `(string: "")` - Specifies the role to create tokens from.
+  The token given to Nomad does not have to be created from this role but must
+  have "`update`" capability on "`auth/token/create/`" path in
+  Vault. If this value is unset and the token is created from a role, the value
+  is defaulted to the role the token is from. This is largely for backwards
+  compatibility. It is recommended to set the `create_from_role` field if Nomad
+  is deriving child tokens from a role.
+
+- `task_token_ttl` `(string: "")` - Specifies the TTL of created tokens when
+  using a root token. This is specified using a label suffix like "30s" or "1h".
+
+- `ca_file` `(string: "")` - Specifies an optional path to the CA
+  certificate used for Vault communication. If unspecified, this will fall back
+  to the default system CA bundle, which varies by OS and version.
+
+- `ca_path` `(string: "")` - Specifies an optional path to a folder
+  containing CA certificates to be used for Vault communication. If unspecified,
+  this will fall back to the default system CA bundle, which varies by OS and
+  version.
+
+- `cert_file` `(string: "")` - Specifies the path to the certificate used
+  for Vault communication. If this is set then you need to also set
+  `key_file`.
+
+- `key_file` `(string: "")` - Specifies the path to the private key used for
+  Vault communication. If this is set then you need to also set `cert_file`.
+
+- `namespace` `(string: "")` - Specifies the [Vault namespace](https://www.vaultproject.io/docs/enterprise/namespaces)
+  used by the Vault integration. 
 If non-empty, this namespace will be used on
+  all Vault API calls.
+
+- `tls_server_name` `(string: "")` - Specifies an optional string used to set
+  the SNI host when connecting to Vault via TLS.
+
+- `tls_skip_verify` `(bool: false)` - Specifies if SSL peer validation should be
+  skipped; by default, validation is enforced.
+
+  !> It is **strongly discouraged** to disable SSL verification. Instead, you
+  should install a custom CA bundle and validate against it. Disabling SSL
+  verification can allow an attacker to easily compromise your cluster.
+
+- `token` `(string: "")` - Specifies the parent Vault token to use to derive
+  child tokens for jobs requesting tokens.
+  Visit the [Vault Integration Guide](/docs/vault-integration)
+  to see how to generate an appropriate token in Vault.
+
+  !> It is **strongly discouraged** to place the token as a configuration
+  parameter like this, since the token could be checked into source control
+  accidentally. Users should set the `VAULT_TOKEN` environment variable when
+  starting the agent instead.
+
+## `vault` Examples
+
+The following examples only show the `vault` stanzas. Remember that the
+`vault` stanza is only valid in the placements listed above.
+
+### Nomad Server
+
+This example shows a Vault configuration for a Nomad server:
+
+```hcl
+vault {
+  enabled   = true
+  ca_path   = "/etc/certs/ca"
+  cert_file = "/var/certs/vault.crt"
+  key_file  = "/var/certs/vault.key"
+
+  # Address to communicate with Vault. The below is the default address if
+  # unspecified.
+  address = "https://vault.service.consul:8200"
+
+  # Embedding the token in the configuration is discouraged. Instead, users
+  # should set the VAULT_TOKEN environment variable when starting the Nomad
+  # agent.
+  token = "debecfdc-9ed7-ea22-c6ee-948f22cdd474"
+
+  # Setting the create_from_role option causes Nomad to create tokens for tasks
+  # via the provided role. This allows the role to manage what policies are
+  # allowed and disallowed for use by tasks. 
+  create_from_role = "nomad-cluster"
+}
+```
+
+### Nomad Client
+
+This example shows a Vault configuration for a Nomad client:
+
+```hcl
+vault {
+  enabled   = true
+  address   = "https://vault.service.consul:8200"
+  ca_path   = "/etc/certs/ca"
+  cert_file = "/var/certs/vault.crt"
+  key_file  = "/var/certs/vault.key"
+}
+```
+
+The key difference is that the token is not necessary on the client.
+
+## `vault` Configuration Reloads
+
+The Vault configuration can be reloaded on servers. This can be useful if a new
+token needs to be given to the servers without having to restart them. A reload
+can be accomplished by sending the process a `SIGHUP` signal.
+
+[vault]: https://www.vaultproject.io/ 'Vault by HashiCorp'
+[nomad-vault]: /docs/vault-integration 'Nomad Vault Integration'
diff --git a/content/nomad/v0.11.x/content/docs/devices/community.mdx b/content/nomad/v0.11.x/content/docs/devices/community.mdx
new file mode 100644
index 0000000000..f6e894e1f8
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/devices/community.mdx
@@ -0,0 +1,21 @@
+---
+layout: docs
+page_title: 'Device Plugins: Community Supported'
+sidebar_title: Community
+description: A list of community supported Device Plugins.
+---
+
+# Community Supported
+
+If you have authored a device plugin that you believe will be useful to the
+broader Nomad community and you are committed to maintaining the plugin, please
+file a PR to add your plugin to this page.
+
+## Authoring Device Plugins
+
+Nomad has a plugin system for defining device drivers. External device plugins
+will have the same user experience as built-in drivers. For details on
+authoring a device plugin, please refer to the [plugin authoring
+guide][plugin_guide]. 
+ +[plugin_guide]: /docs/internals/plugins diff --git a/content/nomad/v0.11.x/content/docs/devices/index.mdx b/content/nomad/v0.11.x/content/docs/devices/index.mdx new file mode 100644 index 0000000000..bb403ea6d3 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/devices/index.mdx @@ -0,0 +1,24 @@ +--- +layout: docs +page_title: Device Plugins +sidebar_title: Device Plugins +description: Device Plugins are used to expose devices to tasks in Nomad. +--- + +# Device Plugins + +Device plugins are used to detect and make devices available to tasks in Nomad. +Devices are physical hardware that exists on a node such as a GPU or an FPGA. By +having extensible device plugins, Nomad has the flexibility to support a broad +set of devices and allows the community to build additional device plugins as +needed. + +The list of supported device plugins is provided on the left of this page. +Each device plugin documents its configuration and installation requirements, +the attributes it fingerprints, and the environment variables it exposes to +tasks. + +For details on authoring a device plugin, please refer to the [plugin authoring +guide][plugin_guide]. + +[plugin_guide]: /docs/internals/plugins diff --git a/content/nomad/v0.11.x/content/docs/devices/nvidia.mdx b/content/nomad/v0.11.x/content/docs/devices/nvidia.mdx new file mode 100644 index 0000000000..222c9ae535 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/devices/nvidia.mdx @@ -0,0 +1,330 @@ +--- +layout: docs +page_title: 'Device Plugins: Nvidia' +sidebar_title: Nvidia +description: The Nvidia Device Plugin detects and makes Nvidia devices available to tasks. +--- + +# Nvidia GPU Device Plugin + +Name: `nvidia-gpu` + +The Nvidia device plugin is used to expose Nvidia GPUs to Nomad. The Nvidia +plugin is built into Nomad and does not need to be downloaded separately. + +## Fingerprinted Attributes + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Attribute          | Unit     |
+| ------------------ | -------- |
+| `memory`           | MiB      |
+| `power`            | W (Watt) |
+| `bar1`             | MiB      |
+| `driver_version`   | string   |
+| `cores_clock`      | MHz      |
+| `memory_clock`     | MHz      |
+| `pci_bandwidth`    | MB/s     |
+| `display_state`    | string   |
+| `persistence_mode` | string   |
+
+## Runtime Environment
+
+The `nvidia-gpu` device plugin exposes the following environment variables:
+
+- `NVIDIA_VISIBLE_DEVICES` - List of Nvidia GPU IDs available to the task.
+
+### Additional Task Configurations
+
+Additional environment variables can be set by the task to influence the runtime
+environment. See [Nvidia's
+documentation](https://github.com/NVIDIA/nvidia-container-runtime#environment-variables-oci-spec).
+
+## Installation Requirements
+
+In order to use the `nvidia-gpu` device plugin, the following prerequisites
+must be met:
+
+1. GNU/Linux x86_64 with kernel version > 3.10
+2. NVIDIA GPU with Architecture > Fermi (2.1)
+3. NVIDIA drivers >= 340.29 with binary `nvidia-smi`
+
+### Docker Driver Requirements
+
+In order to use the Nvidia driver plugin with the Docker driver, please follow
+the installation instructions for
+[`nvidia-docker`]().
+
+## Plugin Configuration
+
+```hcl
+plugin "nvidia-gpu" {
+  ignored_gpu_ids    = ["GPU-fef8089b", "GPU-ac81e44d"]
+  fingerprint_period = "1m"
+}
+```
+
+The `nvidia-gpu` device plugin supports the following configuration in the agent
+config:
+
+- `ignored_gpu_ids` `(array: [])` - Specifies the set of GPU UUIDs that
+  should be ignored when fingerprinting.
+
+- `fingerprint_period` `(string: "1m")` - The period in which to fingerprint for
+  device changes.
+
+## Restrictions
+
+The Nvidia integration only works with drivers that natively integrate with
+Nvidia's [container runtime
+library](https://github.com/NVIDIA/libnvidia-container).
+
+Nomad has tested support with the [`docker` driver][docker-driver] and plans to
+bring support to the built-in [`exec`][exec-driver] and [`java`][java-driver]
+drivers. Support for [`lxc`][lxc-driver] should be possible by installing the
+[Nvidia hook](https://github.com/lxc/lxc/blob/master/hooks/nvidia) but is not
+tested or documented by Nomad. 
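+
+As a sketch of the task configuration described under Additional Task
+Configurations above, a Docker task could set one of Nvidia's container runtime
+environment variables directly. The variable name comes from Nvidia's
+documentation, not Nomad; the image and value here are illustrative:
+
+```hcl
+task "cuda-smoke-test" {
+  driver = "docker"
+
+  config {
+    image   = "nvidia/cuda:9.0-base"
+    command = "nvidia-smi"
+  }
+
+  env {
+    # Hypothetical example: restrict the runtime to compute and utility
+    # capabilities (see Nvidia's container runtime documentation).
+    NVIDIA_DRIVER_CAPABILITIES = "compute,utility"
+  }
+}
+```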
+ +## Examples + +Inspect a node with a GPU: + +```shell-session +$ nomad node status 4d46e59f +ID = 4d46e59f +Name = nomad +Class = +DC = dc1 +Drain = false +Eligibility = eligible +Status = ready +Uptime = 19m43s +Driver Status = docker,mock_driver,raw_exec + +Node Events +Time Subsystem Message +2019-01-23T18:25:18Z Cluster Node registered + +Allocated Resources +CPU Memory Disk +0/15576 MHz 0 B/55 GiB 0 B/28 GiB + +Allocation Resource Utilization +CPU Memory +0/15576 MHz 0 B/55 GiB + +Host Resource Utilization +CPU Memory Disk +2674/15576 MHz 1.5 GiB/55 GiB 3.0 GiB/31 GiB + +Device Resource Utilization +nvidia/gpu/Tesla K80[GPU-e1f6f4f1-1ea5-7b9d-5f03-338a9dc32416] 0 / 11441 MiB + +Allocations +No allocations placed +``` + +Display detailed statistics on a node with a GPU: + +```shell-session +$ nomad node status -stats 4d46e59f +ID = 4d46e59f +Name = nomad +Class = +DC = dc1 +Drain = false +Eligibility = eligible +Status = ready +Uptime = 19m59s +Driver Status = docker,mock_driver,raw_exec + +Node Events +Time Subsystem Message +2019-01-23T18:25:18Z Cluster Node registered + +Allocated Resources +CPU Memory Disk +0/15576 MHz 0 B/55 GiB 0 B/28 GiB + +Allocation Resource Utilization +CPU Memory +0/15576 MHz 0 B/55 GiB + +Host Resource Utilization +CPU Memory Disk +2673/15576 MHz 1.5 GiB/55 GiB 3.0 GiB/31 GiB + +Device Resource Utilization +nvidia/gpu/Tesla K80[GPU-e1f6f4f1-1ea5-7b9d-5f03-338a9dc32416] 0 / 11441 MiB + +// ...TRUNCATED... 
+ +Device Stats +Device = nvidia/gpu/Tesla K80[GPU-e1f6f4f1-1ea5-7b9d-5f03-338a9dc32416] +BAR1 buffer state = 2 / 16384 MiB +Decoder utilization = 0 % +ECC L1 errors = 0 +ECC L2 errors = 0 +ECC memory errors = 0 +Encoder utilization = 0 % +GPU utilization = 0 % +Memory state = 0 / 11441 MiB +Memory utilization = 0 % +Power usage = 37 / 149 W +Temperature = 34 C + +Allocations +No allocations placed +``` + +Run the following example job to see that that the GPU was mounted in the +container: + +```hcl +job "gpu-test" { + datacenters = ["dc1"] + type = "batch" + + group "smi" { + task "smi" { + driver = "docker" + + config { + image = "nvidia/cuda:9.0-base" + command = "nvidia-smi" + } + + resources { + device "nvidia/gpu" { + count = 1 + + # Add an affinity for a particular model + affinity { + attribute = "${device.model}" + value = "Tesla K80" + weight = 50 + } + } + } + } + } +} +``` + +```shell-session +$ nomad run example.nomad +==> Monitoring evaluation "21bd7584" + Evaluation triggered by job "gpu-test" + Allocation "d250baed" created: node "4d46e59f", group "smi" + Evaluation status changed: "pending" -> "complete" +==> Evaluation "21bd7584" finished with status "complete" + +$ nomad alloc status d250baed +ID = d250baed +Eval ID = 21bd7584 +Name = gpu-test.smi[0] +Node ID = 4d46e59f +Job ID = example +Job Version = 0 +Client Status = complete +Client Description = All tasks have completed +Desired Status = run +Desired Description = +Created = 7s ago +Modified = 2s ago + +Task "smi" is "dead" +Task Resources +CPU Memory Disk Addresses +0/100 MHz 0 B/300 MiB 300 MiB + +Device Stats +nvidia/gpu/Tesla K80[GPU-e1f6f4f1-1ea5-7b9d-5f03-338a9dc32416] 0 / 11441 MiB + +Task Events: +Started At = 2019-01-23T18:25:32Z +Finished At = 2019-01-23T18:25:34Z +Total Restarts = 0 +Last Restart = N/A + +Recent Events: +Time Type Description +2019-01-23T18:25:34Z Terminated Exit Code: 0 +2019-01-23T18:25:32Z Started Task started by client +2019-01-23T18:25:29Z Task Setup 
Building Task Directory +2019-01-23T18:25:29Z Received Task received by client + +$ nomad alloc logs d250baed +Wed Jan 23 18:25:32 2019 ++-----------------------------------------------------------------------------+ +| NVIDIA-SMI 410.48 Driver Version: 410.48 | +|-------------------------------+----------------------+----------------------+ +| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | +| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | +|===============================+======================+======================| +| 0 Tesla K80 On | 00004477:00:00.0 Off | 0 | +| N/A 33C P8 37W / 149W | 0MiB / 11441MiB | 0% Default | ++-------------------------------+----------------------+----------------------+ + ++-----------------------------------------------------------------------------+ +| Processes: GPU Memory | +| GPU PID Type Process name Usage | +|=============================================================================| +| No running processes found | ++-----------------------------------------------------------------------------+ +``` + +[docker-driver]: /docs/drivers/docker 'Nomad docker Driver' +[exec-driver]: /docs/drivers/exec 'Nomad exec Driver' +[java-driver]: /docs/drivers/java 'Nomad java Driver' +[lxc-driver]: /docs/drivers/external/lxc 'Nomad lxc Driver' diff --git a/content/nomad/v0.11.x/content/docs/drivers/docker.mdx b/content/nomad/v0.11.x/content/docs/drivers/docker.mdx new file mode 100644 index 0000000000..f261ba45c1 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/docker.mdx @@ -0,0 +1,987 @@ +--- +layout: docs +page_title: 'Drivers: Docker' +sidebar_title: Docker +description: The Docker task driver is used to run Docker based tasks. +--- + +# Docker Driver + +Name: `docker` + +The `docker` driver provides a first-class Docker workflow on Nomad. The Docker +driver handles downloading containers, mapping ports, and starting, watching, +and cleaning up after containers. 
+
+## Task Configuration
+
+```hcl
+task "webservice" {
+  driver = "docker"
+
+  config {
+    image = "redis:3.2"
+    labels {
+      group = "webservice-cache"
+    }
+  }
+}
+```
+
+The `docker` driver supports the following configuration in the job spec. Only
+`image` is required.
+
+- `image` - The Docker image to run. The image may include a tag or custom URL
+  and should include `https://` if required. By default it will be fetched from
+  Docker Hub. If the tag is omitted or equal to `latest` the driver will always
+  try to pull the image. If the image to be pulled exists in a registry that
+  requires authentication, credentials must be provided to Nomad. Please see the
+  [Authentication section](#authentication).
+
+  ```hcl
+  config {
+    image = "https://hub.docker.internal/redis:3.2"
+  }
+  ```
+
+- `args` - (Optional) A list of arguments to the optional `command`. If no
+  `command` is specified, the arguments are passed directly to the container.
+  References to environment variables or any [interpretable Nomad
+  variables](/docs/runtime/interpolation) will be interpreted before
+  launching the task. For example:
+
+  ```hcl
+  config {
+    args = [
+      "-bind", "${NOMAD_PORT_http}",
+      "${nomad.datacenter}",
+      "${MY_ENV}",
+      "${meta.foo}",
+    ]
+  }
+  ```
+
+- `auth` - (Optional) Provide authentication for a private registry (see below).
+
+- `auth_soft_fail` `(bool: false)` - Don't fail the task on an auth failure.
+  Attempt to continue without auth.
+
+- `command` - (Optional) The command to run when starting the container.
+
+  ```hcl
+  config {
+    command = "my-command"
+  }
+  ```
+
+- `dns_search_domains` - (Optional) A list of DNS search domains for the container
+  to use.
+
+- `dns_options` - (Optional) A list of DNS options for the container to use.
+
+- `dns_servers` - (Optional) A list of DNS servers for the container to use
+  (e.g. ["8.8.8.8", "8.8.4.4"]). Requires Docker v1.10 or greater.
+
+- `entrypoint` - (Optional) A string list overriding the image's entrypoint.
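+
+  For example, a hypothetical sketch that bypasses the image's entrypoint and
+  runs a shell command instead (the image and command are illustrative
+  assumptions):
+
+  ```hcl
+  config {
+    image      = "redis:3.2"
+    entrypoint = ["/bin/sh", "-c"]
+    args       = ["redis-server --appendonly yes"]
+  }
+  ```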
+
+- `extra_hosts` - (Optional) A list of hosts, given as host:IP, to be added to
+  `/etc/hosts`.
+
+- `force_pull` - (Optional) `true` or `false` (default). Always pull the most
+  recent image instead of using an existing local image. Should be set to `true`
+  if repository tags are mutable. If the image's tag is `latest` or omitted, the
+  image will always be pulled regardless of this setting.
+
+- `hostname` - (Optional) The hostname to assign to the container. When
+  launching more than one instance of a task (using `count`) with this option
+  set, every container the task starts will have the same hostname.
+
+- `interactive` - (Optional) `true` or `false` (default). Keep STDIN open on
+  the container.
+
+- `sysctl` - (Optional) A key-value map of sysctl configurations to set to the
+  containers on start.
+
+  ```hcl
+  config {
+    sysctl {
+      net.core.somaxconn = "16384"
+    }
+  }
+  ```
+
+- `ulimit` - (Optional) A key-value map of ulimit configurations to set to the
+  containers on start.
+
+  ```hcl
+  config {
+    ulimit {
+      nproc = "4242"
+      nofile = "2048:4096"
+    }
+  }
+  ```
+
+- `privileged` - (Optional) `true` or `false` (default). Privileged mode gives
+  the container access to devices on the host. Note that this also requires the
+  Nomad agent and Docker daemon to be configured to allow privileged
+  containers.
+
+- `ipc_mode` - (Optional) The IPC mode to be used for the container. The default
+  is `none` for a private IPC namespace. Other values are `host` for sharing
+  the host IPC namespace or the name or id of an existing container. Note that
+  it is not possible to refer to Docker containers started by Nomad since their
+  names are not known in advance. Note that setting this option also requires the
+  Nomad agent to be configured to allow privileged containers.
+
+- `ipv4_address` - (Optional) The IPv4 address to be used for the container when
+  using user defined networks. Requires Docker 1.13 or greater.
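+
+  For example, a sketch pinning a static IPv4 address on a pre-existing
+  user-defined network (the network name and address are placeholder
+  assumptions):
+
+  ```hcl
+  config {
+    network_mode = "my-user-network"
+    ipv4_address = "172.18.0.20"
+  }
+  ```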
+
+- `ipv6_address` - (Optional) The IPv6 address to be used for the container when
+  using user defined networks. Requires Docker 1.13 or greater.
+
+- `labels` - (Optional) A key-value map of labels to set to the containers on
+  start.
+
+  ```hcl
+  config {
+    labels {
+      foo = "bar"
+      zip = "zap"
+    }
+  }
+  ```
+
+- `load` - (Optional) Load an image from a `tar` archive file instead of from a
+  remote repository. Equivalent to the `docker load -i ` command.
+
+  ```hcl
+  artifact {
+    source = "http://path.to/redis.tar"
+  }
+  config {
+    load = "redis.tar"
+    image = "redis"
+  }
+  ```
+
+- `logging` - (Optional) A key-value map of Docker logging options.
+  Defaults to `json-file` with log rotation (`max-file=2` and `max-size=2m`).
+
+  ```hcl
+  config {
+    logging {
+      type = "fluentd"
+      config {
+        fluentd-address = "localhost:24224"
+        tag = "your_tag"
+      }
+    }
+  }
+  ```
+
+- `mac_address` - (Optional) The MAC address for the container to use (e.g.
+  "02:68:b3:29:da:98").
+
+- `memory_hard_limit` - (Optional) The maximum allowable amount of memory used
+  (megabytes) by the container. If set, the [`memory`](/docs/job-specification/resources#memory)
+  parameter of the task resource configuration becomes a soft limit passed to the
+  docker driver as [`--memory_reservation`](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory),
+  and `memory_hard_limit` is passed as the [`--memory`](https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory)
+  hard limit. When the host is under memory pressure, the behavior of soft limit
+  activation is governed by the [Kernel](https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt).
+
+- `network_aliases` - (Optional) A list of network-scoped aliases that provide
+  a way for a container to be discovered by an alternate name by any other
+  container within the scope of a particular network. 
Network-scoped alias is supported only for + containers in user defined networks + + ```hcl + config { + network_mode = "user-network" + network_aliases = [ + "${NOMAD_TASK_NAME}", + "${NOMAD_TASK_NAME}-${NOMAD_ALLOC_INDEX}" + ] + } + ``` + +- `network_mode` - (Optional) The network mode to be used for the container. In + order to support userspace networking plugins in Docker 1.9 this accepts any + value. The default is `bridge` for all operating systems but Windows, which + defaults to `nat`. Other networking modes may not work without additional + configuration on the host (which is outside the scope of Nomad). Valid values + pre-docker 1.9 are `default`, `bridge`, `host`, `none`, or `container:name`. + +- `pid_mode` - (Optional) `host` or not set (default). Set to `host` to share + the PID namespace with the host. Note that this also requires the Nomad agent + to be configured to allow privileged containers. + See below for more details. + +- `port_map` - (Optional) A key-value map of port labels (see below). + +- `security_opt` - (Optional) A list of string flags to pass directly to + [`--security-opt`](https://docs.docker.com/engine/reference/run/#security-configuration). + For example: + + ```hcl + config { + security_opt = [ + "credentialspec=file://gmsaUser.json", + ] + } + ``` + +- `shm_size` - (Optional) The size (bytes) of /dev/shm for the container. + +- `storage_opt` - (Optional) A key-value map of storage options set to the containers on start. + This overrides the [host dockerd configuration](https://docs.docker.com/engine/reference/commandline/dockerd/#options-per-storage-driver). + For example: + + ```hcl + config { + storage_opt = { + size = "40G" + } + } + ``` + +- `SSL` - (Optional) If this is set to true, Nomad uses SSL to talk to the + repository. The default value is `true`. **Deprecated as of 0.5.3** + +- `tty` - (Optional) `true` or `false` (default). Allocate a pseudo-TTY for the + container. 
- `uts_mode` - (Optional) `host` or not set (default). Set to `host` to share
  the UTS namespace with the host. Note that this also requires the Nomad agent
  to be configured to allow privileged containers.

- `userns_mode` - (Optional) `host` or not set (default). Set to `host` to use
  the host's user namespace when user namespace remapping is enabled on the
  Docker daemon.

- `volumes` - (Optional) A list of `host_path:container_path` strings to bind
  host paths to container paths. Mounting host paths outside of the allocation
  directory can be disabled on clients by setting the `docker.volumes.enabled`
  option to false. This will limit volumes to directories that exist inside
  the allocation directory. We recommend using [`mounts`](#mounts) if you wish
  to have more control over volume definitions.

  ```hcl
  config {
    volumes = [
      # Use absolute paths to mount arbitrary paths on the host
      "/path/on/host:/path/in/container",

      # Use relative paths to rebind paths already in the allocation dir
      "relative/to/task:/also/in/container"
    ]
  }
  ```

- `volume_driver` - (Optional) The name of the volume driver used to mount
  volumes. Must be used along with `volumes`. If `volume_driver` is omitted,
  then relative paths will be mounted from inside the allocation dir. If a
  `"local"` or other driver is used, then they may be named volumes instead.
  If `docker.volumes.enabled` is false, then volume drivers and paths outside the
  allocation directory are disallowed.

  ```hcl
  config {
    volumes = [
      # Use a named volume created outside Nomad.
      "name-of-the-volume:/path/in/container"
    ]
    # Name of the Docker volume driver used by the container
    volume_driver = "pxd"
  }
  ```

- `work_dir` - (Optional) The working directory inside the container.

- `mounts` - (Optional) A list of
  [mounts](https://docs.docker.com/engine/reference/commandline/service_create/#add-bind-mounts-or-volumes)
  to be mounted into the container.
Volume, bind, and tmpfs type mounts are supported.

  ```hcl
  config {
    mounts = [
      # sample volume mount
      {
        type = "volume"
        target = "/path/in/container"
        source = "name-of-volume"
        readonly = false
        volume_options {
          no_copy = false
          labels {
            foo = "bar"
          }
          driver_config {
            name = "pxd"
            options = {
              foo = "bar"
            }
          }
        }
      },
      # sample bind mount
      {
        type = "bind"
        target = "/path/in/container"
        source = "/path/in/host"
        readonly = false
        bind_options {
          propagation = "rshared"
        }
      },
      # sample tmpfs mount
      {
        type = "tmpfs"
        target = "/path/in/container"
        readonly = false
        tmpfs_options {
          size = 100000 # size in bytes
        }
      }
    ]
  }
  ```

- `devices` - (Optional) A list of
  [devices](https://docs.docker.com/engine/reference/commandline/run/#add-host-device-to-container-device)
  to be exposed to the container. `host_path` is the only required field. By default, the container will be able to
  `read`, `write` and `mknod` these devices. Use the optional `cgroup_permissions` field to restrict permissions.

  ```hcl
  config {
    devices = [
      {
        host_path = "/dev/sda1"
        container_path = "/dev/xvdc"
        cgroup_permissions = "r"
      },
      {
        host_path = "/dev/sda2"
        container_path = "/dev/xvdd"
      }
    ]
  }
  ```

- `cap_add` - (Optional) A list of Linux capabilities as strings to pass directly to
  [`--cap-add`](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities).
  Effective capabilities (computed from `cap_add` and `cap_drop`) have to match the configured whitelist.
  The whitelist can be customized using the [`allow_caps`](#plugin_caps) plugin option key in the client node's configuration.
  For example:

  ```hcl
  config {
    cap_add = [
      "SYS_TIME",
    ]
  }
  ```

- `cap_drop` - (Optional) A list of Linux capabilities as strings to pass directly to
  [`--cap-drop`](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities).
  Effective capabilities (computed from `cap_add` and `cap_drop`) have to match the configured whitelist.
  The whitelist can be customized using the [`allow_caps`](#plugin_caps) plugin option key in the client node's configuration.
  For example:

  ```hcl
  config {
    cap_drop = [
      "MKNOD",
    ]
  }
  ```

- `cpu_hard_limit` - (Optional) `true` or `false` (default). Use hard CPU
  limiting instead of soft limiting. By default this is `false`, which means
  soft limiting is used and containers are able to burst above their CPU limit
  when there is idle capacity.

- `cpu_cfs_period` - (Optional) An integer value that specifies the duration in microseconds of the period
  during which the CPU usage quota is measured. The default is 100000 (0.1 second) and the maximum allowed
  value is 1000000 (1 second). See [here](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/sec-cpu#sect-cfs)
  for more details.

- `advertise_ipv6_address` - (Optional) `true` or `false` (default). Use the container's
  IPv6 address (GlobalIPv6Address in Docker) when registering services and checks.
  See [IPv6 Docker containers](/docs/job-specification/service#ipv6-docker-containers) for details.

- `readonly_rootfs` - (Optional) `true` or `false` (default). Mount
  the container's filesystem as read only.

- `runtime` - (Optional) A string representing a configured runtime to pass to docker.
  This is equivalent to the `--runtime` argument in the docker CLI.
  For example, to use gVisor:

  ```hcl
  config {
    # gVisor runtime is runsc
    runtime = "runsc"
  }
  ```

- `pids_limit` - (Optional) An integer value that specifies the PID limit for
  the container. Defaults to unlimited.

Additionally, the docker driver supports customization of the container's user through the task's [`user` option](/docs/job-specification/task#user).

### Container Name

Nomad creates a container after pulling an image.
Containers are named
`{taskName}-{allocId}`. This is necessary in order to place more than one
container from the same task on a host (e.g. with count > 1). This also means
that each container's name is unique across the cluster.

This is not configurable.

### Authentication

If you want to pull from a private repo (for example on dockerhub or quay.io),
you will need to specify credentials in your job via:

- the `auth` option in the task config.

- by storing explicit repository credentials or by specifying Docker
  `credHelpers` in a file and setting the auth [config](#plugin_auth_file)
  value on the client in the plugin options.

- by specifying an auth [helper](#plugin_auth_helper) on the client in the
  plugin options.

The `auth` object supports the following keys:

- `username` - (Optional) The account username.

- `password` - (Optional) The account password.

- `email` - (Optional) The account email.

- `server_address` - (Optional) The server domain/IP without the protocol.
  Docker Hub is used by default.

Example task config:

```hcl
task "example" {
  driver = "docker"

  config {
    image = "secret/service"

    auth {
      username = "dockerhub_user"
      password = "dockerhub_password"
    }
  }
}
```

Example Docker config, using two helper scripts in \$PATH,
"docker-credential-ecr" and "docker-credential-vault":

```json
{
  "auths": {
    "internal.repo": {
      "auth": "`echo -n '<username>:<password>' | base64 -w0`"
    }
  },
  "credHelpers": {
    "<alias>.dkr.ecr.<region>.amazonaws.com": "ecr-login"
  },
  "credsStore": "secretservice"
}
```

Example agent configuration, using a helper script "docker-credential-ecr" in
\$PATH:

```hcl
client {
  enabled = true
}

plugin "docker" {
  config {
    auth {
      # Nomad will prepend "docker-credential-" to the helper value and call
      # that script name.
      helper = "ecr"
    }
  }
}
```

!> **Be Careful!** At this time these credentials are stored in Nomad in plain
text.
Secrets management will be added in a later release. + +## Networking + +Docker supports a variety of networking configurations, including using host +interfaces, SDNs, etc. Nomad uses `bridged` networking by default, like Docker. + +You can specify other networking options, including custom networking plugins +in Docker 1.9. **You may need to perform additional configuration on the host +in order to make these work.** This additional configuration is outside the +scope of Nomad. + +### Allocating Ports + +You can allocate ports to your task using the port syntax described on the +[networking page](/docs/job-specification/network). Here is a recap: + +```hcl +task "example" { + driver = "docker" + + resources { + network { + port "http" {} + port "https" {} + } + } +} +``` + +### Forwarding and Exposing Ports + +A Docker container typically specifies which port a service will listen on by +specifying the `EXPOSE` directive in the `Dockerfile`. + +Because dynamic ports will not match the ports exposed in your Dockerfile, +Nomad will automatically expose all of the ports it allocates to your +container. + +These ports will be identified via environment variables. For example: + +```hcl +port "http" {} +``` + +If Nomad allocates port `23332` to your task for `http`, `23332` will be +automatically exposed and forwarded to your container, and the driver will set +an environment variable `NOMAD_PORT_http` with the value `23332` that you can +read inside your container. + +This provides an easy way to use the `host` networking option for better +performance. + +### Using the Port Map + +If you prefer to use the traditional port-mapping method, you can specify the +`port_map` option in your job specification. 
It looks like this: + +```hcl +task "example" { + driver = "docker" + + config { + image = "redis" + + port_map { + redis = 6379 + } + } + + resources { + network { + mbits = 20 + port "redis" {} + } + } +} +``` + +If Nomad allocates port `23332` to your task, the Docker driver will +automatically setup the port mapping from `23332` on the host to `6379` in your +container, so it will just work! + +Note that by default this only works with `bridged` networking mode. It may +also work with custom networking plugins which implement the same API for +expose and port forwarding. + +### Advertising Container IPs + +_New in Nomad 0.6._ + +When using network plugins like `weave` that assign containers a routable IP +address, that address will automatically be used in any `service` +advertisements for the task. You may override what address is advertised by +using the `address_mode` parameter on a `service`. See +[service](/docs/job-specification/service) for details. + +### Networking Protocols + +The Docker driver configures ports on both the `tcp` and `udp` protocols. + +This is not configurable. + +### Other Networking Modes + +Some networking modes like `container` or `none` will require coordination +outside of Nomad. First-class support for these options may be improved later +through Nomad plugins or dynamic job configuration. + +## Client Requirements + +Nomad requires Docker to be installed and running on the host alongside the +Nomad agent. Nomad was developed against Docker `1.8.2` and `1.9`. + +By default Nomad communicates with the Docker daemon using the daemon's Unix +socket. Nomad will need to be able to read/write to this socket. If you do not +run Nomad as root, make sure you add the Nomad user to the Docker group so +Nomad can communicate with the Docker daemon. 
+ +For example, on Ubuntu you can use the `usermod` command to add the `vagrant` +user to the `docker` group so you can run Nomad without root: + +```shell-session +$ sudo usermod -G docker -a vagrant +``` + +For the best performance and security features you should use recent versions +of the Linux Kernel and Docker daemon. + +If you would like to change any of the options related to the `docker` driver on +a Nomad client, you can modify them with the [plugin stanza][plugin-stanza] syntax. Below is an example of a configuration (many of the values are the default). See the next section for more information on the options. + +```hcl +plugin "docker" { + config { + endpoint = "unix:///var/run/docker.sock" + + auth { + config = "/etc/docker-auth.json" + helper = "docker-credential-aws" + } + + tls { + cert = "/etc/nomad/nomad.pub" + key = "/etc/nomad/nomad.pem" + ca = "/etc/nomad/nomad.cert" + } + + gc { + image = true + image_delay = "3m" + container = true + + dangling_containers { + enabled = true + dry_run = false + period = "5m" + creation_grace = "5m" + } + } + + volumes { + enabled = true + selinuxlabel = "z" + } + + allow_privileged = false + allow_caps = ["CHOWN", "NET_RAW"] + + # allow_caps can also be set to "ALL" + # allow_caps = ["ALL"] + } +} +``` + +## Plugin Options + +- `endpoint` - If using a non-standard socket, HTTP or another location, or if + TLS is being used, docker.endpoint must be set. If unset, Nomad will attempt + to instantiate a Docker client using the DOCKER_HOST environment variable and + then fall back to the default listen address for the given operating system. + Defaults to unix:///var/run/docker.sock on Unix platforms and + npipe:////./pipe/docker_engine for Windows. + +- `allow_privileged` - Defaults to `false`. Changing this to true will allow + containers to use privileged mode, which gives the containers full access to + the host's devices. Note that you must set a similar setting on the Docker + daemon for this to work. 
- `pull_activity_timeout` - Defaults to `2m`. If Nomad receives no communication
  from the Docker engine during an image pull within this timeframe, Nomad will
  timeout the request that initiated the pull command. (Minimum of `1m`)

- `allow_caps` - A list of allowed Linux capabilities. Defaults to
  `"CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE"`,
  which is the list of capabilities allowed by docker by default, as [defined
  here](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities).
  Allows the operator to control which capabilities can be obtained by tasks
  using the `cap_add` and `cap_drop` options. Supports the value `"ALL"` as a
  shortcut for whitelisting all capabilities.

- `allow_runtimes` - defaults to `["runc", "nvidia"]` - A list of the allowed
  docker runtimes a task may use.

- `auth` stanza:

  - `config` - Allows an operator to specify a
    JSON file which is in the dockercfg format containing authentication
    information for a private registry, from either (in order) `auths`,
    `credHelpers` or `credsStore`.
  - `helper` - Allows an operator to specify a
    [credsStore](https://docs.docker.com/engine/reference/commandline/login/#credential-helper-protocol)
    -like script on \$PATH to look up authentication information from external
    sources. The script's name must begin with `docker-credential-` and this
    option should include only the basename of the script, not the path.

- `tls` stanza:

  - `cert` - Path to the server's certificate file (`.pem`). Specify this
    along with `key` and `ca` to use a TLS client to connect to the docker
    daemon. `endpoint` must also be specified or this setting will be ignored.
  - `key` - Path to the client's private key (`.pem`). Specify this along with
    `cert` and `ca` to use a TLS client to connect to the docker daemon.
    `endpoint` must also be specified or this setting will be ignored.
  - `ca` - Path to the server's CA file (`.pem`).
Specify this along with
    `cert` and `key` to use a TLS client to connect to the docker daemon.
    `endpoint` must also be specified or this setting will be ignored.

- `disable_log_collection` - Defaults to `false`. Setting this to `true` will
  disable Nomad's log collection for Docker tasks. If you do not rely on Nomad's
  log capabilities and exclusively use host-based log aggregation, you may
  consider this option to avoid the log collection overhead.

- `gc` stanza:

  - `image` - Defaults to `true`. Changing this to `false` will prevent Nomad
    from removing images from stopped tasks.
  - `image_delay` - A time duration, as [defined
    here](https://golang.org/pkg/time/#ParseDuration), that defaults to `3m`.
    The delay controls how long Nomad will wait between an image being unused
    and deleting it. If a task is received that uses the same image within
    the delay, the image will be reused.
  - `container` - Defaults to `true`. This option can be used to prevent Nomad
    from removing a container when the task exits. Under a name conflict,
    Nomad may still remove the dead container.
  - `dangling_containers` stanza for controlling dangling container detection
    and cleanup:

    - `enabled` - Defaults to `true`. Enables dangling container handling.
    - `dry_run` - Defaults to `false`. Only log dangling containers without
      cleaning them up.
    - `period` - Defaults to `"5m"`. A time duration that controls the interval
      between Nomad's scans for dangling containers.
    - `creation_grace` - Defaults to `"5m"`. Grace period after a container is
      created during which the GC ignores it. Only used to prevent the GC from
      removing newly created containers before they are registered with the
      GC. Should not need adjusting higher but may be adjusted lower to GC
      more aggressively.

- `volumes` stanza:

  - `enabled` - Defaults to `true`. Allows tasks to bind host paths
    (`volumes`) inside their container and use volume drivers
    (`volume_driver`).
Binding relative paths is always allowed and will be + resolved relative to the allocation's directory. + - `selinuxlabel` - Allows the operator to set a SELinux label to the + allocation and task local bind-mounts to containers. If used with + `docker.volumes.enabled` set to false, the labels will still be applied to + the standard binds in the container. + +- `infra_image` - This is the Docker image to use when creating the parent + container necessary when sharing network namespaces between tasks. Defaults + to "gcr.io/google_containers/pause-amd64:3.0". + +## Client Configuration + +~> Note: client configuration options will soon be deprecated. Please use +[plugin options][plugin-options] instead. See the [plugin stanza][plugin-stanza] +documentation for more information. + +The `docker` driver has the following [client configuration +options](/docs/configuration/client#options): + +- `docker.endpoint` - If using a non-standard socket, HTTP or another location, + or if TLS is being used, `docker.endpoint` must be set. If unset, Nomad will + attempt to instantiate a Docker client using the `DOCKER_HOST` environment + variable and then fall back to the default listen address for the given + operating system. Defaults to `unix:///var/run/docker.sock` on Unix platforms + and `npipe:////./pipe/docker_engine` for Windows. + +- `docker.auth.config` - Allows an operator to specify a + JSON file which is in the dockercfg format containing authentication + information for a private registry, from either (in order) `auths`, + `credHelpers` or `credsStore`. + +- `docker.auth.helper` - Allows an operator to specify a + [credsStore](https://docs.docker.com/engine/reference/commandline/login/#credential-helper-protocol) + -like script on \$PATH to lookup authentication information from external + sources. The script's name must begin with `docker-credential-` and this + option should include only the basename of the script, not the path. 
- `docker.tls.cert` - Path to the server's certificate file (`.pem`). Specify
  this along with `docker.tls.key` and `docker.tls.ca` to use a TLS client to
  connect to the docker daemon. `docker.endpoint` must also be specified or this
  setting will be ignored.

- `docker.tls.key` - Path to the client's private key (`.pem`). Specify this
  along with `docker.tls.cert` and `docker.tls.ca` to use a TLS client to
  connect to the docker daemon. `docker.endpoint` must also be specified or this
  setting will be ignored.

- `docker.tls.ca` - Path to the server's CA file (`.pem`). Specify this along
  with `docker.tls.cert` and `docker.tls.key` to use a TLS client to connect to
  the docker daemon. `docker.endpoint` must also be specified or this setting
  will be ignored.

- `docker.cleanup.image` Defaults to `true`. Changing this to `false` will
  prevent Nomad from removing images from stopped tasks.

- `docker.cleanup.image.delay` A time duration, as [defined
  here](https://golang.org/pkg/time/#ParseDuration), that defaults to `3m`. The
  delay controls how long Nomad will wait between an image being unused and
  deleting it. If a task is received that uses the same image within the delay,
  the image will be reused.

- `docker.volumes.enabled`: Defaults to `true`. Allows tasks to bind host paths
  (`volumes`) inside their container and use volume drivers (`volume_driver`).
  Binding relative paths is always allowed and will be resolved relative to the
  allocation's directory.

- `docker.volumes.selinuxlabel`: Allows the operator to set a SELinux label to
  the allocation and task local bind-mounts to containers. If used with
  `docker.volumes.enabled` set to false, the labels will still be applied to the
  standard binds in the container.

- `docker.privileged.enabled` Defaults to `false`. Changing this to `true` will
  allow containers to use `privileged` mode, which gives the containers full
  access to the host's devices.
Note that you must set a similar setting on the + Docker daemon for this to work. + +- `docker.caps.whitelist`: A list of allowed Linux capabilities. Defaults to + `"CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP, SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE"`, which is the list of + capabilities allowed by docker by default, as [defined + here](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities). + Allows the operator to control which capabilities can be obtained by tasks + using `cap_add` and `cap_drop` options. Supports the value `"ALL"` as a + shortcut for whitelisting all capabilities. + +- `docker.cleanup.container`: Defaults to `true`. This option can be used to + disable Nomad from removing a container when the task exits. Under a name + conflict, Nomad may still remove the dead container. + +- `docker.nvidia_runtime`: Defaults to `nvidia`. This option allows operators to select the runtime that should be used in order to expose Nvidia GPUs to the container. + +Note: When testing or using the `-dev` flag you can use `DOCKER_HOST`, +`DOCKER_TLS_VERIFY`, and `DOCKER_CERT_PATH` to customize Nomad's behavior. If +`docker.endpoint` is set Nomad will **only** read client configuration from the +config file. + +An example is given below: + +```hcl +client { + options { + "docker.cleanup.image" = "false" + } +} +``` + +## Client Attributes + +The `docker` driver will set the following client attributes: + +- `driver.docker` - This will be set to "1", indicating the driver is + available. +- `driver.docker.bridge_ip` - The IP of the Docker bridge network if one + exists. +- `driver.docker.version` - This will be set to version of the docker server. + +Here is an example of using these properties in a job file: + +```hcl +job "docs" { + # Require docker version higher than 1.2. 
+ constraint { + attribute = "${driver.docker.version}" + operator = ">" + version = "1.2" + } +} +``` + +## Resource Isolation + +### CPU + +Nomad limits containers' CPU based on CPU shares. CPU shares allow containers +to burst past their CPU limits. CPU limits will only be imposed when there is +contention for resources. When the host is under load your process may be +throttled to stabilize QoS depending on how many shares it has. You can see how +many CPU shares are available to your process by reading `NOMAD_CPU_LIMIT`. +1000 shares are approximately equal to 1 GHz. + +Please keep the implications of CPU shares in mind when you load test workloads +on Nomad. + +### Memory + +Nomad limits containers' memory usage based on total virtual memory. This means +that containers scheduled by Nomad cannot use swap. This is to ensure that a +swappy process does not degrade performance for other workloads on the same +host. + +Since memory is not an elastic resource, you will need to make sure your +container does not exceed the amount of memory allocated to it, or it will be +terminated or crash when it tries to malloc. A process can inspect its memory +limit by reading `NOMAD_MEMORY_LIMIT`, but will need to track its own memory +usage. Memory limit is expressed in megabytes so 1024 = 1 GB. + +### IO + +Nomad's Docker integration does not currently provide QoS around network or +filesystem IO. These will be added in a later release. + +### Security + +Docker provides resource isolation by way of +[cgroups and namespaces](https://docs.docker.com/introduction/understanding-docker/#the-underlying-technology). +Containers essentially have a virtual file system all to themselves. If you +need a higher degree of isolation between processes for security or other +reasons, it is recommended to use full virtualization like +[QEMU](/docs/drivers/qemu). 
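The CPU shares and memory limit described in this section come from the task's `resources` stanza. As a minimal sketch (the values shown are illustrative):

```hcl
task "example" {
  driver = "docker"

  resources {
    cpu    = 1000 # ~1 GHz of CPU shares, surfaced to the task as NOMAD_CPU_LIMIT
    memory = 1024 # 1 GB (1024 MB), surfaced to the task as NOMAD_MEMORY_LIMIT
  }
}
```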
## Caveats

### Dangling Containers

Nomad 0.10.2 introduces a detector and a reaper for dangling Docker containers:
containers that Nomad started yet no longer manages or tracks. Though rare, these
can lead to unexpectedly running services, potentially with stale versions.

When the Docker daemon becomes unavailable as Nomad starts a task, it is possible
for Docker to successfully start the container but return a 500 error code from
the API call. In such cases, Nomad retries and eventually aims to kill such
containers. However, if the Docker Engine remains unhealthy, subsequent retries
and stop attempts may still fail, and the started container becomes a dangling
container that Nomad no longer manages.

The newly added reaper periodically scans for such containers. It only targets
containers with a `com.hashicorp.nomad.allocation_id` label, or containers that
match Nomad's conventions for naming and bind mounts (i.e. `/alloc`, `/secrets`,
`local`). Containers that don't match Nomad's container patterns are left
untouched.

Operators can run the reaper in a dry-run mode, where it only logs dangling
container IDs without killing them, or disable it entirely, through the
`gc.dangling_containers` config stanza.

### Docker for Windows

Docker for Windows only supports running Windows containers. Because Docker for
Windows is relatively new and rapidly evolving, you may want to consult the
[list of relevant issues on GitHub][winissues].
[winissues]: https://github.com/hashicorp/nomad/issues?q=is%3Aopen+is%3Aissue+label%3Adriver%2Fdocker+label%3Aplatform-windows
[plugin-options]: #plugin-options
[plugin-stanza]: /docs/configuration/plugin

diff --git a/content/nomad/v0.11.x/content/docs/drivers/exec.mdx b/content/nomad/v0.11.x/content/docs/drivers/exec.mdx
new file mode 100644
index 0000000000..8aaa71979b
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/drivers/exec.mdx
@@ -0,0 +1,140 @@
---
layout: docs
page_title: 'Drivers: Exec'
sidebar_title: Isolated Fork/Exec
description: The Exec task driver is used to run binaries using OS isolation primitives.
---

# Isolated Fork/Exec Driver

Name: `exec`

The `exec` driver is used to simply execute a particular command for a task.
However, unlike [`raw_exec`](/docs/drivers/raw_exec), it uses the underlying isolation
primitives of the operating system to limit the task's access to resources. While
simple, since the `exec` driver can invoke any command, it can be used to call
scripts or other wrappers which provide higher level features.

## Task Configuration

```hcl
task "webservice" {
  driver = "exec"

  config {
    command = "my-binary"
    args = ["-flag", "1"]
  }
}
```

The `exec` driver supports the following configuration in the job spec:

- `command` - The command to execute. Must be provided. If executing a binary
  that exists on the host, the path must be absolute and within the task's
  [chroot](#chroot). If executing a binary that is downloaded from
  an [`artifact`](/docs/job-specification/artifact), the path can be
  relative to the allocation's root directory.

- `args` - (Optional) A list of arguments to the `command`. References
  to environment variables or any [interpretable Nomad
  variables](/docs/runtime/interpolation) will be interpreted before
  launching the task.
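As a sketch of the interpolation behavior described above, `args` may reference Nomad runtime variables (the command and values shown are illustrative):

```hcl
task "example" {
  driver = "exec"

  config {
    command = "/bin/echo"
    # Interpreted before launch; expands to this allocation's ID.
    args = ["alloc=${NOMAD_ALLOC_ID}"]
  }
}
```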
## Examples

To run a binary present on the Node:

```hcl
task "example" {
  driver = "exec"

  config {
    # When running a binary that exists on the host, the path must be absolute.
    command = "/bin/sleep"
    args = ["1"]
  }
}
```

To execute a binary downloaded from an
[`artifact`](/docs/job-specification/artifact):

```hcl
task "example" {
  driver = "exec"

  config {
    command = "name-of-my-binary"
  }

  artifact {
    source = "https://internal.file.server/name-of-my-binary"
    options {
      checksum = "sha256:abd123445ds4555555555"
    }
  }
}
```

## Client Requirements

The `exec` driver can only be used on Linux and requires Nomad to be run as root.
`exec` is limited to this configuration because resource isolation is currently
only guaranteed on Linux. Further, the host must have cgroups mounted properly
in order for the driver to work.

If you are receiving the error:

```
* Constraint "missing drivers" filtered <> nodes
```

and using the exec driver, check to ensure that you are running Nomad as root.
This also applies when running Nomad in `-dev` mode.

## Plugin Options

- `no_pivot_root` - Defaults to `false`. When `true`, the driver uses `chroot`
  for file system isolation without `pivot_root`. This is useful for systems
  where the root is on a ramdisk.

## Client Attributes

The `exec` driver will set the following client attributes:

- `driver.exec` - This will be set to "1", indicating the driver is available.

## Resource Isolation

The resource isolation provided varies by the operating system of
the client and the configuration.

On Linux, Nomad will use cgroups and a chroot to isolate the
resources of a process; as such, the Nomad agent must be run as root.
+ +### Chroot + +The chroot is populated with data in the following directories from the host +machine: + +``` +[ + "/bin", + "/etc", + "/lib", + "/lib32", + "/lib64", + "/run/resolvconf", + "/sbin", + "/usr", +] +``` + +The task's chroot is populated by linking or copying the data from the host into +the chroot. Note that this can take considerable disk space. Since Nomad v0.5.3, +the client manages garbage collection locally which mitigates any issue this may +create. + +This list is configurable through the agent client +[configuration file](/docs/configuration/client#chroot_env). diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/firecracker-task-driver.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/firecracker-task-driver.mdx new file mode 100644 index 0000000000..00f1a0f7df --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/firecracker-task-driver.mdx @@ -0,0 +1,131 @@ +--- +layout: docs +page_title: 'Drivers: firecracker-task-driver' +sidebar_title: Firecracker driver +description: >- + The Firecracker task driver is used to run + firecracker(https://firecracker-microvm.github.io/) microvms. +--- + +# Firecracker task Driver + +Name: `firecracker-task-driver` + +The Firecracker task driver provides an interface for creating Linux microvms. +For more detailed instructions on how to set up and use this driver, please +refer to the [documentation][firecracker-task-guide]. + +## Task Configuration + +```hcl +task "test01" { + driver = "firecracker-task-driver" + + config { + KernelImage = "/home/build/hello-vmlinux.bin" + Firecracker = "/home/build/firecracker" + Vcpus = 1 + Mem = 128 + Network = "default" + } +} +``` + +The firecracker task driver supports the following parameters: + +- `KernelImage` - (Optional) Path to the kernel image to be used on the microvm. + Defaults to 'vmlinux' on nomad's allocation directory. + +- `BootDisk` - (Optional) Path to the ext4 rootfs to boot from. 
+  Defaults to 'rootfs.ext4' in Nomad's allocation directory.
+
+- `BootOptions` - (Optional) Kernel command line options to boot the microvm.
+  Defaults to "ro console=ttyS0 reboot=k panic=1 pci=off".
+
+- `Network` - (Optional) Network name of your container network configuration
+  file.
+
+- `Vcpus` - (Optional) Number of CPUs to assign to the microvm.
+
+- `Cputype` - (Optional) CPU template to use; available templates are C3 and T2.
+
+- `Mem` - (Optional) Amount of memory in megabytes to assign to the microvm.
+  Defaults to 512.
+
+- `Firecracker` - Location of the firecracker binary; the option may be omitted
+  if the environment variable FIRECRACKER_BIN is set. Defaults to
+  '/usr/bin/firecracker'.
+
+- `DisableHt` - (Optional) Disable CPU hyperthreading. Defaults to false.
+
+- `Log` - (Optional) Path to the file where firecracker logs are written.
+
+## Networking
+
+Network configuration is set up using CNI plugins. The steps to set up the
+firecracker task driver with CNI are the following:
+
+- Build the [CNI plugins][container network plugins] and [tc-redirect-tap][tc-redirect-tap]
+  and copy them to `/opt/cni`.
+
+- Create a network configuration to be used by microvms in `/etc/cni/conf.d/`,
+  for example: default.conflist.
+
+### Example network configuration
+
+```json
+{
+  "name": "default",
+  "cniVersion": "0.4.0",
+  "plugins": [
+    {
+      "type": "ptp",
+      "ipMasq": true,
+      "ipam": {
+        "type": "host-local",
+        "subnet": "192.168.127.0/24",
+        "resolvConf": "/etc/resolv.conf"
+      }
+    },
+    {
+      "type": "firewall"
+    },
+    {
+      "type": "tc-redirect-tap"
+    }
+  ]
+}
+```
+
+In this example the network is named default, and this name is the value used
+for Network in the task driver job spec. The filename must also match the name
+of the network and use the .conflist extension.
+
+## Client Requirements
+
+`firecracker-task-driver` requires the following:
+
+- Linux 4.14+. Firecracker currently supports physical Linux x86_64 and aarch64
+  hosts, running kernel version 4.14 or later.
However, the aarch64 support is + not feature complete (alpha stage) + +- The [Firecracker binary][firecracker binary] + +- KVM enabled in your Linux kernel, and you have read/write access to /dev/kvm + +- tun kernel module + +- The firecracker-task-driver binary placed in the [plugin_dir][plugin_dir] + directory + +- ip6tables package + +- [Container network plugins][container network plugins] + +- [tc-redirect-tap][tc-redirect-tap] + +[plugin_dir]: /docs/configuration#plugin_dir +[tc-redirect-tap]: https://github.com/firecracker-microvm/firecracker-go-sdk/tree/master/cni +[container network plugins]: https://github.com/containernetworking/plugins +[firecracker binary]: https://github.com/firecracker-microvm/firecracker/releases +[firecracker-task-guide]: https://github.com/cneira/firecracker-task-driver diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/iis.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/iis.mdx new file mode 100644 index 0000000000..2c83a6940b --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/iis.mdx @@ -0,0 +1,106 @@ +--- +layout: docs +page_title: 'Drivers: nomad-driver-iis' +sidebar_title: Windows IIS +description: >- + The IIS driver is used for running + Windows IIS services. +--- + +# Windows IIS Driver + +Name: `win_iis` + +The Nomad IIS driver provides an interface for running Windows IIS website tasks. +A "website" is a combination of an application pool and a site (app, vdir, etc.). +Each allocation will create an application pool and site with the name being the allocation ID (guid). + +For more detailed instructions on how to set up and use this driver, please +refer to the [project README.md][nomad-driver-iis]. 
+
+## Task Configuration
+
+```hcl
+task "iis-test" {
+  driver = "win_iis"
+
+  config {
+    path = "C:\\inetpub\\wwwroot"
+    apppool_identity {
+      identity = "SpecificUser"
+      username = "vagrant"
+      password = "vagrant"
+    }
+  }
+}
+```
+
+The IIS task driver supports the following parameters:
+
+- `path` (string) - (Required) The path of the website directory.
+
+- `site_config_path` (string) - (Optional) The path should point to a valid IIS Site XML that is generated from an export.
+
+- `apppool_config_path` (string) - (Optional) The path should point to a valid IIS Application Pool XML that is generated from an export.
+
+- `apppool_identity` - (Optional) The identity which the Application Pool will run under. Defaults to `ApplicationPoolIdentity`.
+
+  - `identity` (string) - (Required) An identity is required to be set for Application Pools. Accepted inputs are `LocalService`, `LocalSystem`, `NetworkService`, `SpecificUser`, and `ApplicationPoolIdentity`.
+  - `username` (string) - (Optional) If `SpecificUser` was chosen, then provide the username.
+  - `password` (string) - (Optional) If `SpecificUser` was chosen, then provide the password.
+
+- `bindings` - (Optional) Bindings set here will be set for IIS.
+
+  - `hostname` (string) - (Optional) IIS hostname for a binding.
+  - `ipaddress` (string) - (Optional) The IP address for the binding. Defaults to `*`.
+  - `resource_port` (string) - (Optional) Use a label from an established network stanza port. It is recommended to use this approach over using `port`.
+  - `port` (number) - (Optional) Specify a static port to use for the website. Either `resource_port` or `port` must be specified. Ports set this way will not be recognized by Nomad.
+  - `type` (string) - (Optional) Specifies which binding type IIS should use, such as `http` or `https`.
+  - `cert_hash` (string) - (Optional) For SSL support, supply the cert hash here of a cert installed on the system.
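+
+To tie a binding to a Nomad-managed port, `resource_port` can reference a port
+label from the task's network stanza. The following is a sketch; the
+`httplabel` name is an assumption, and the exact block syntax should be
+confirmed against the project README:
+
+```hcl
+task "iis-test" {
+  driver = "win_iis"
+
+  config {
+    path = "C:\\inetpub\\wwwroot"
+
+    bindings {
+      type          = "http"
+      # "httplabel" is a hypothetical port label defined in resources below.
+      resource_port = "httplabel"
+    }
+  }
+
+  resources {
+    network {
+      port "httplabel" {}
+    }
+  }
+}
+```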
+
+For optional parameters, the default value is assumed to be `nil` unless
+otherwise specified in the documentation.
+
+## Networking
+
+Currently, `nomad-driver-iis` only supports host networking. No special configuration is needed as `nomad-driver-iis`
+relies on IIS to manage the networking for Windows IIS website tasks.
+
+## Client Requirements
+
+`nomad-driver-iis` requires the following:
+
+- Windows 2016+
+
+- Web Server enabled for IIS
+
+- The Nomad IIS driver binary ([build instructions][nomad-driver-iis])
+
+## Plugin Options ((#plugin_options))
+
+- `enabled` - The `IIS` driver may be disabled on hosts by setting this option to `false` (defaults to `true`).
+
+- `stats_interval` - This value defines how frequently `TaskStats` are sent to the Nomad client (defaults to `1 second`).
+
+An example of using these plugin options with the new [plugin
+syntax][plugin] is shown below:
+
+```hcl
+plugin "win_iis" {
+  client {
+    enabled        = true
+    stats_interval = "30s"
+  }
+}
+```
+
+Please note the plugin name should match whatever name you have specified for the external driver in the [plugin_dir][plugin_dir] directory.
+
+## Client Attributes
+
+The `IIS` driver will set the following client attributes:
+
+- `driver.win_iis.iis_version` - Set to the version of IIS running on the Nomad client.
+
+[nomad-driver-iis]: https://github.com/Roblox/nomad-driver-iis
+[plugin]: /docs/configuration/plugin
+[plugin_dir]: /docs/configuration#plugin_dir
+[plugin-options]: #plugin_options
diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/index.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/index.mdx
new file mode 100644
index 0000000000..96c9b10494
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/drivers/external/index.mdx
@@ -0,0 +1,43 @@
+---
+layout: docs
+page_title: 'Task Driver Plugins: Community Supported'
+sidebar_title: Community
+description: A list of community supported Task Driver Plugins.
+--- + +# Community Supported + +If you have authored a task driver plugin that you believe will be useful to the +broader Nomad community and you are committed to maintaining the plugin, please +file a PR to add your plugin to this page. + +For details on authoring a task driver plugin, please refer to the [plugin +authoring guide][plugin_guide]. + +## Task Driver Plugins + +Nomad has a plugin system for defining task drivers. External task driver +plugins will have the same user experience as built in drivers. + +Below is a list of community-supported task drivers you can use with Nomad: + +- [LXC][lxc] +- [Rkt][rkt] +- [Podman][podman] +- [Singularity][singularity] +- [Jail task driver][jail-task-driver] +- [Pot][pot] +- [Firecracker][firecracker-task-driver] +- [Systemd-Nspawn][nspawn-driver] +- [Windows IIS][nomad-driver-iis] + +[lxc]: /docs/drivers/external/lxc +[rkt]: /docs/drivers/external/rkt +[plugin_guide]: /docs/internals/plugins +[singularity]: /docs/drivers/external/singularity +[jail-task-driver]: /docs/drivers/external/jail-task-driver +[podman]: /docs/drivers/external/podman +[pot]: /docs/drivers/external/pot +[firecracker-task-driver]: /docs/drivers/external/firecracker-task-driver +[nspawn-driver]: /docs/drivers/external/nspawn +[nomad-driver-iis]: /docs/drivers/external/iis diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/jail-task-driver.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/jail-task-driver.mdx new file mode 100644 index 0000000000..a880fca820 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/jail-task-driver.mdx @@ -0,0 +1,149 @@ +--- +layout: docs +page_title: 'Drivers: jail-task-driver' +sidebar_title: Jailtask driver +description: >- + The Jail task driver is used to run application containers using FreeBSD + jails. +--- + +# Jail task Driver + +Name: `jail-task-driver` + +The Jail task driver provides an interface for using FreeBSD jails for running application +containers. 
You can download the external jail-task-driver [here][jail-task-driver]. For more detailed instructions on how to set up and use this driver, please refer to the [guide][jail-task-guide].
+
+## Task Configuration
+
+```hcl
+task "http-echo-jail" {
+  driver = "jail-task-driver"
+  config {
+    Path              = "/zroot/iocage/jails/myjail/root"
+    Allow_raw_sockets = true
+    Allow_chflags     = true
+    Ip4_addr          = "em1|192.168.1.102"
+    Exec_start        = "/usr/local/bin/http-echo -listen :9999 -text hello"
+    Rctl = {
+      Vmemoryuse = {
+        Action = "deny"
+        Amount = "1G"
+        Per    = "process"
+      }
+      Openfiles = {
+        Action = "deny"
+        Amount = "500"
+      }
+    }
+  }
+}
+```
+
+The Jail task driver supports most of the [JAIL(8)][jail(8)] parameters. For a list of the currently supported parameters, please refer to the [Parameter Documentation][parameter-doc].
+
+- `Path` - (Optional) The directory which is to be the root of the jail.
+  Defaults to Nomad's allocation directory.
+
+- `Ip4` - (Optional) Control the availability of IPv4 addresses. Possible values are
+  **"inherit"** to allow unrestricted access to all system addresses,
+  **"new"** to restrict addresses via Ip4_addr, and **"disable"** to stop
+  the jail from using IPv4 entirely.
+
+~> Note: Setting the Ip4_addr parameter implies a value of **"new"**
+
+- `Ip4_addr` - (Optional) A list of IPv4 addresses assigned to the jail. If this is set,
+  the jail is restricted to using only these addresses. Any attempts to use other addresses fail,
+  and attempts to use wildcard addresses silently use the jailed address instead. For
+  IPv4 the first address given will be used as the source address when source address selection on
+  unbound sockets cannot find a better match. It is only possible to start multiple jails with
+  the same IP address if none of the jails has more than this
+  single overlapping IP address assigned to itself.
+
+- `Allow_raw_sockets` - (Optional) The jail root is allowed to create raw sockets.
Setting + this parameter allows utilities like ping(8) and traceroute(8) to operate inside the jail. + If this is set, the source IP addresses are enforced to comply with the IP address bound to the jail, + regardless of whether or not the IP_HDRINCL flag has been set on the socket. + Since raw sockets can be used to configure and interact with various network subsystems, extra caution + should be used where privileged access to jails is given out to untrusted parties. + +## Resource Isolation + +Resource isolation on jails is enforced by [RCTL(8)][rctl-doc] all parameters for resource control +are supported. + +- `Rctl` - (Optional) Set resource limits on the jail, for a list of currently supported parameters, please refer to the [Parameter Documentation][parameter-doc]. + + * `Vmemoryuse` - (Optional) Address space limit,in bytes + * `Cputime` - (Optional) CPU time, in seconds + * `Datasize` - (Optional) data size, in bytes + * `Stacksize` - (Optional stack size, in bytes + * `Coredumpsize` - (Optional) core dump size, in bytes + * `Memoryuse` - (Optional) resident set size, in bytes + * `Memorylocked` - (Optional) locked memory, in bytes + * `Maxproc` - (Optional) number of processes + * `Openfiles` - (Optional) file descriptor table size + * `Vmemoryuse` - (Optional) address space limit,in bytes + * `Pseudoterminals` - (Optional) number of PTYs + * `Swapuse` - (Optional) swap space that may be reserved or used, in bytes + * `Nthr` - (Optional) number of threads + * `Msgqqueued` - (Optional) number of queued SysV messages + * `Msgqsize` - (Optional) SysV message queue size, in bytes + * `Nmsgq` - (Optional) number of SysV message queues + * `Nsem` - (Optional) number of SysV semaphores + * `Nsemop` - (Optional) number of SysV semaphores modified in a single semop(2) call + * `Nshm` - (Optional) number of SysV shared memory segments + * `Shmsize` - (Optional) SysV shared memory size, in bytes + * `Wallclock` - (Optional) wallclock time, in seconds + * `Pcpu` - 
(Optional) %CPU, in percents of a single CPU core + * `Readbps` - (Optional) filesystem reads, in bytes per second + * `Writebps` - (Optional) filesystem writes, in bytes per second + * `Readiops` - (Optional) filesystem reads, in operations per second + * `Writeiops` - (Optional) filesystem writes, in operations per second + +## Networking + +The job spec could specify the `Ip4addr` parameter to add the jail's ip address to an specific interface at jail +startup or the `Vnet` parameter to create a virtual network stack. Please refer to [JAIL(8)][jail(8)] for more details. + +- `vnet jail` - Example taken from Lucas, Michael W. FreeBSD Mastery: Jails (IT Mastery Book 15). + +```hcl + task "test01" { + driver = "jail-task-driver" + config { + Path = "/zroot/iocage/jails/myjail/root" + Host_hostname = "nomad00" + Exec_clean = true + Exec_start = "sh /etc/rc" + Exec_stop = "sh /etc/rc.shutdown" + Mount_devfs = true + Exec_prestart = "logger trying to start " + Exec_poststart = "logger jail has started" + Exec_prestop = "logger shutting down jail " + Exec_poststop = "logger has shut down jail " + Exec_consolelog ="/var/tmp/vnet-example" + Vnet = true + Vnet_nic = "e0b_loghost" + Exec_prestart = "/usr/share/examples/jails/jib addm loghost em1" + Exec_poststop = "/usr/share/examples/jails/jib destroy loghost " + } + } +``` + +## Client Requirements + +`jail-task-driver` requires the following: + +- 64-bit FreeBSD 12.0-RELEASE host +- The FreeBSD's Nomad binary +- The jail-task-driver binary placed in the [plugin_dir][plugin_dir] directory. 
+
+- If resource control is going to be used, then [RACCT][racct-doc] must be enabled
+
+[jail-task-driver]: https://github.com/cneira/jail-task-driver/releases
+[jail-task-guide]: https://github.com/cneira/jail-task-driver#installation
+[jail(8)]: https://www.freebsd.org/cgi/man.cgi?jail(8)
+[racct-doc]: https://www.freebsd.org/doc/handbook/security-resourcelimits.html
+[rctl-doc]: https://www.freebsd.org/doc/handbook/security-resourcelimits.html
+[parameter-doc]: https://github.com/cneira/jail-task-driver/blob/master/Parameters.md
+[plugin_dir]: /docs/configuration#plugin_dir
diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/lxc.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/lxc.mdx
new file mode 100644
index 0000000000..e8d7cb4912
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/drivers/external/lxc.mdx
@@ -0,0 +1,189 @@
+---
+layout: docs
+page_title: 'Drivers: LXC'
+sidebar_title: LXC
+description: The LXC task driver is used to run application containers using LXC.
+---
+
+# LXC Driver
+
+Name: `lxc`
+
+The `lxc` driver provides an interface for using LXC for running application
+containers. You can download the external LXC driver [here][lxc-driver]. For more detailed instructions on how to set up and use this driver, please refer to the [LXC guide][lxc-guide].
+
+~> **Note:** The LXC client setup has changed in Nomad 0.9. You must use the new [plugin syntax][plugin] and install the external LXC driver in the [plugin_dir][plugin_dir] prior to upgrading. See [plugin options][plugin-options] below for an example. Note the job specification remains the same.
+
+## Task Configuration
+
+```hcl
+task "busybox" {
+  driver = "lxc"
+
+  config {
+    log_level = "trace"
+    verbosity = "verbose"
+    template = "/usr/share/lxc/templates/lxc-busybox"
+    template_args = []
+
+    # these optional values can be set in the template
+    distro = ""
+    release = ""
+    arch = ""
+    image_variant = "default"
+    image_server = "images.linuxcontainers.org"
+    gpg_key_id = ""
+    gpg_key_server = ""
+    disable_gpg = false
+    flush_cache = false
+    force_cache = false
+  }
+}
+```
+
+The `lxc` driver supports the following configuration in the job spec:
+
+- `template` - The LXC template to run.
+
+  ```hcl
+  config {
+    template = "/usr/share/lxc/templates/lxc-alpine"
+  }
+  ```
+
+- `template_args` - A list of argument strings to pass into the template.
+
+- `log_level` - (Optional) The LXC library's logging level. Defaults to `error`.
+  Must be one of `trace`, `debug`, `info`, `warn`, or `error`.
+
+  ```hcl
+  config {
+    log_level = "debug"
+  }
+  ```
+
+- `verbosity` - (Optional) Enables extra verbosity in the LXC library's
+  logging. Defaults to `quiet`. Must be one of `quiet` or `verbose`.
+
+  ```hcl
+  config {
+    verbosity = "quiet"
+  }
+  ```
+
+- `volumes` - (Optional) A list of `host_path:container_path` strings to bind-mount host paths to container paths. Mounting host paths outside of the allocation directory can be disabled on clients by setting the [`volumes_enabled`](#volumes_enabled) option to `false`. This will limit volumes to directories that exist inside the allocation directory.
+
+  ~> **Note:** Unlike the similar option for the docker driver, this
+  option must not have an absolute path as the `container_path`
+  component. This will cause an error when submitting a job.
+
+  Setting this does not affect the standard bind-mounts of `alloc`,
+  `local`, and `secrets`, which are always created.
+ + ```hcl + config { + volumes = [ + # Use absolute paths to mount arbitrary paths on the host + "/path/on/host:path/in/container", + + # Use relative paths to rebind paths already in the allocation dir + "relative/to/task:also/in/container" + ] + } + ``` + +- `release` - (Optional) The name/version of the distribution. By default this is set by the template. + +- `arch` - (Optional) The architecture of the container. By default this is set by the template. + +- `image_server` - (Optional) The hostname of the image server. Defaults to `images.linuxcontainers.org`. + +- `image_variant` - (Optional) The variant of the image. Defaults to `default` or as set by the template. + +- `disable_gpg` - (Optional) Disable GPG validation of images. Defaults to `false`, and enabling this flag is not recommended. + +- `flush_cache` - (Optional) Flush the local copy of the image (if present) and force it to be fetched from the image server. Defaults to `false`. + +- `force_cache` - (Optional) Force the use of the local copy even if expired. Defaults to `false`. + +- `gpg_key_server`: GPG key server used for checking image signatures. Default is set by the underlying LXC library. + +- `gpg_key_id`: GPG key ID used for checking image signatures. Default is set by the underlying LXC library. + +## Networking + +Currently the `lxc` driver only supports host networking. See the `none` +networking type in the `lxc.container.conf` [manual][lxc_man] for more +information. + +## Client Requirements + +The `lxc` driver requires the following: + +- 64-bit Linux host +- The `linux_amd64` Nomad binary +- The LXC driver binary placed in the [plugin_dir][plugin_dir] directory. +- `liblxc` to be installed +- `lxc-templates` to be installed + +## Plugin Options + +- `enabled` - The `lxc` driver may be disabled on hosts by setting this option to `false` (defaults to `true`). + +- `volumes_enabled` - Specifies whether host can bind-mount host paths to container paths (defaults to `true`). 
+ +- `lxc_path` - The location in which all containers are stored (commonly defaults to `/var/lib/lxc`). See [`lxc-create`][lxc-create] for more details. + +- `gc` stanza: + - `container` - Defaults to `true`. This option can be used to disable Nomad + from removing a container when the task exits. Under a name conflict, + Nomad may still remove the dead container. + +An example of using these plugin options with the new [plugin +syntax][plugin] is shown below: + +```hcl +plugin "nomad-driver-lxc" { + config { + enabled = true + volumes_enabled = true + lxc_path = "/var/lib/lxc" + gc { + container = false + } + } +} +``` + +Please note the plugin name should match whatever name you have specified for the external driver in the [plugin_dir][plugin_dir] directory. + +## Client Configuration + +-> Only use this section for pre-0.9 releases of Nomad. If you are using Nomad +0.9 or above, please see [plugin options][plugin-options] + +The `lxc` driver has the following [client-level options][client_options]: + +- `lxc.enable` - The `lxc` driver may be disabled on hosts by setting this + option to `false` (defaults to `true`). + +## Client Attributes + +The `lxc` driver will set the following client attributes: + +- `driver.lxc` - Set to `1` if LXC is found and enabled on the host node. +- `driver.lxc.version` - Version of `lxc` e.g.: `1.1.0`. + +## Resource Isolation + +This driver supports CPU and memory isolation via the `lxc` library. Network +isolation is not supported as of now. 
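+
+As an illustration of the isolation described above, the cgroup limits applied
+by the driver come from the task's standard `resources` stanza. A minimal
+sketch (values are arbitrary):
+
+```hcl
+task "busybox" {
+  driver = "lxc"
+
+  config {
+    template = "/usr/share/lxc/templates/lxc-busybox"
+  }
+
+  # CPU and memory limits below are enforced via cgroups by the lxc library.
+  resources {
+    cpu    = 500 # MHz
+    memory = 256 # MB
+  }
+}
+```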
+
+[lxc-create]: https://linuxcontainers.org/lxc/manpages/man1/lxc-create.1.html
+[lxc-driver]: https://releases.hashicorp.com/nomad-driver-lxc
+[lxc-guide]: https://learn.hashicorp.com/nomad/using-plugins/lxc
+[lxc_man]: https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html#lbAM
+[plugin]: /docs/configuration/plugin
+[plugin_dir]: /docs/configuration#plugin_dir
+[plugin-options]: #plugin-options
+[client_options]: /docs/configuration/client#options
diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/nspawn.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/nspawn.mdx
new file mode 100644
index 0000000000..78eb7458cf
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/drivers/external/nspawn.mdx
@@ -0,0 +1,162 @@
+---
+layout: docs
+page_title: 'Drivers: Systemd-Nspawn'
+sidebar_title: Systemd-Nspawn
+description: The Nspawn task driver is used to run application containers using Systemd-Nspawn.
+---
+
+# Nspawn Driver
+
+Name: `nspawn`
+
+The `nspawn` driver provides an interface for using Systemd-Nspawn for running application
+containers. You can download the external Systemd-Nspawn driver [here][nspawn-driver]. For more detailed instructions on how to set up and use this driver, please refer to the [guide][nspawn-guide].
+
+## Task Configuration
+
+```hcl
+task "debian" {
+  driver = "nspawn"
+  config {
+    image = "/var/lib/machines/Debian"
+    resolv_conf = "copy-host"
+  }
+}
+```
+
+The `nspawn` driver supports the following configuration in the job spec:
+
+* [`boot`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-b) -
+  (Optional) `true` (default) or `false`. Search for an init program and invoke
+  it as PID 1. Arguments specified in `command` will be used as arguments for
+  the init program.
+* [`ephemeral`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-x) -
+  (Optional) `true` or `false` (default). Make an ephemeral copy of the image
+  before starting the container.
+* [`process_two`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-a) -
+  (Optional) `true` or `false` (default). Start the command specified with
+  `command` as PID 2, using a minimal stub init as PID 1.
+* [`read_only`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--read-only) -
+  (Optional) `true` or `false` (default). Mount the used image as read only.
+* [`user_namespacing`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-U) -
+  (Optional) `true` (default) or `false`. Enable user namespacing features
+  inside the container.
+* `command` - (Optional) A list of strings to pass as the used command to the
+  container.
+
+  ```hcl
+  config {
+    command = [ "/bin/bash", "-c", "dhclient && nginx && tail -f /var/log/nginx/access.log" ]
+  }
+  ```
+* [`console`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--console=MODE) -
+  (Optional) Configures how to set up standard input, output and error output
+  for the container.
+* `image` - Path to the image to be used in the container. This can either be a
+  [directory](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-D)
+  or the path to a file system
+  [image](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-i)
+  or block device. Can be specified as a relative path from the configured Nomad
+  plugin directory. **This option is mandatory**.
+* [`pivot_root`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--pivot-root=) -
+  (Optional) Pivot the specified directory to be the container's root directory.
+* [`resolv_conf`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--resolv-conf=) -
+  (Optional) Configure how `/etc/resolv.conf` is handled inside the container.
+* [`user`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#-u) -
+  (Optional) Change to the specified user in the container's user database.
+* [`volatile`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--volatile) - + (Optional) Boot the container in volatile mode. +* [`working_directory`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--chdir=) - + (Optional) Set the working directory inside the container. +* [`bind`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--bind=) - + (Optional) Files or directories to bind mount inside the container. + + ```hcl + config { + bind { + "/var/lib/postgresql" = "/postgres" + } + } + ``` +* [`bind_read_only`](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--bind=) - + (Optional) Files or directories to bind mount read only inside the container. + + ```hcl + config { + bind_read_only { + "/etc/passwd" = "/etc/passwd" + } + } + + ``` +* `environment` - (Optional) Environment variables to pass to the init process + in the container. + + ```hcl + config { + environment = { + FOO = "bar" + } + } + ``` +* `port_map` - (Optional) A key-value map of port labels. Works the same way as + in the [docker + driver][docker_driver]. + **Note:** `systemd-nspawn` will not expose ports to the loopback interface of + your host. + + ```hcl + config { + port_map { + http = 80 + } + } + ``` + + +## Networking + +Currently the `nspawn` driver only supports host networking. + +## Client Requirements + +The `nspawn` driver requires the following: + +* 64-bit Linux host +* The `linux_amd64` Nomad binary +* The Nspawn driver binary placed in the [plugin_dir][plugin_dir] directory. +* `systemd-nspawn` to be installed +* Nomad running with root privileges + +## Plugin Options + +* `enabled` - The `nspawn` driver may be disabled on hosts by setting this option to `false` (defaults to `true`). 
+ +An example of using these plugin options with the new [plugin +syntax][plugin] is shown below: + +```hcl +plugin "nspawn" { + config { + enabled = true + } +} +``` + + +## Client Attributes + +The `nspawn` driver will set the following client attributes: + +* `driver.nspawn` - Set to `true` if Systemd-Nspawn is found and enabled on the + host node and Nomad is running with root privileges. +* `driver.nspawn.version` - Version of `systemd-nspawn` e.g.: `244`. + + +[nspawn-driver]: https://github.com/JanMa/nomad-driver-nspawn/releases +[nspawn-guide]: https://github.com/JanMa/nomad-driver-nspawn +[plugin]: /docs/configuration/plugin +[plugin_dir]: /docs/configuration#plugin_dir +[plugin-options]: #plugin-options +[client_options]: /docs/configuration/client#options +[docker_driver]: /docs/drivers/docker#using-the-port-map diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/podman.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/podman.mdx new file mode 100644 index 0000000000..1965a4ffb6 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/podman.mdx @@ -0,0 +1,244 @@ +--- +layout: docs +page_title: 'Drivers: podman' +sidebar_title: Podman +description: >- + The Podman task driver uses podman (https://podman.io/) for containerizing + tasks. +--- + +# Podman Task Driver + +Name: `podman` + +Homepage: https://github.com/pascomnet/nomad-driver-podman + +The podman task driver plugin for Nomad uses the [Pod Manager (podman)][podman] +daemonless container runtime for executing Nomad tasks. Podman supports OCI +containers and its command line tool is meant to be [a drop-in replacement for +Docker's][podman-cli]. + +See the project's [homepage][homepage] for details. + +## Client Requirements + +- Linux host with [`podman`][podman] installed. +- [`nomad-driver-podman`][releases] binary in Nomad's [`plugin_dir`][plugin_dir]. 
+ +You need a varlink enabled podman binary and a system socket activation unit, see https://podman.io/blogs/2019/01/16/podman-varlink.html. + +Since the Nomad agent, nomad-driver-podman plugin binary, and podman will +reside on the same host, skip the ssh aspects of the podman varlink +documentation above. + +## Task Configuration + +Due to Podman's similarity to Docker, the example job created by [`nomad init -short`][nomad-init] is easily adapted to use Podman instead: + +```hcl +job "example" { + datacenters = ["dc1"] + + group "cache" { + task "redis" { + driver = "podman" + + config { + image = "docker://redis:3.2" + + port_map { + db = 6379 + } + } + + resources { + cpu = 500 + memory = 256 + + network { + mbits = 10 + port "db" {} + } + } + } + } +} +``` + +- `image` - The image to run. + +```hcl +config { + image = "docker://redis" +} +``` + +- `command` - (Optional) The command to run when starting the container. + +```hcl +config { + command = "some-command" +} +``` + +- `args` - (Optional) A list of arguments to the optional command. If no + _command_ is specified, the arguments are passed directly to the container. + +```hcl +config { + args = [ + "arg1", + "arg2", + ] +} +``` + +- `volumes` - (Optional) A list of `host_path:container_path` strings to bind + host paths to container paths. + +```hcl +config { + volumes = [ + "/some/host/data:/container/data" + ] +} +``` + +- `tmpfs` - (Optional) A list of `/container_path` strings for tmpfs mount + points. See `podman run --tmpfs` options for details. + +```hcl +config { + tmpfs = [ + "/var" + ] +} +``` + +- `hostname` - (Optional) The hostname to assign to the container. When + launching more than one of a task (using count) with this option set, every + container the task starts will have the same hostname. + +- `init` - Run an init inside the container that forwards signals and reaps processes. + +```hcl +config { + init = true +} +``` + +- `init_path` - Path to the container-init binary. 
+

```hcl
config {
  init = true
  init_path = "/usr/libexec/podman/catatonit"
}
```

- `user` - Run the command as a specific user/uid within the container. See
  [task configuration][task].

- `memory_reservation` - Memory soft limit (unit = b (bytes), k (kilobytes), m
  (megabytes), or g (gigabytes)).

After a memory reservation is set, when the system detects memory contention or
low memory, containers are forced to restrict their consumption to their
reservation. You should therefore always set this value below the memory limit,
otherwise the hard limit will take precedence. By default, the memory
reservation is the same as the memory limit.

```hcl
config {
  memory_reservation = "100m"
}
```

- `memory_swap` - A limit value equal to memory plus swap. The swap limit
  should always be larger than the [memory value][memory-value].

The unit can be b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). If
you don't specify a unit, b is used. Set the value to `-1` to enable unlimited
swap.

```hcl
config {
  memory_swap = "180m"
}
```

- `memory_swappiness` - Tune a container's memory swappiness behavior. Accepts
  an integer between 0 and 100.

```hcl
config {
  memory_swappiness = 60
}
```

## Networking

Podman supports forwarding and exposing ports like Docker. See [Docker Driver
configuration][docker-ports] for details.

## Plugin Options

The podman plugin has options which may be customized in the agent's
configuration file.

- `volumes` stanza:

  - `enabled` - Defaults to `true`. Allows tasks to bind host paths (volumes)
    inside their container.
  - `selinuxlabel` - Allows the operator to set a SELinux label to the
    allocation and task local bind-mounts to containers. If used with
    `volumes.enabled` set to `false`, the labels will still be applied to the
    standard binds in the container.
+ +```hcl +plugin "nomad-driver-podman" { + config { + volumes { + enabled = true + selinuxlabel = "z" + } + } +} +``` + +- `gc` stanza: + + - `container` - Defaults to `true`. This option can be used to disable + Nomad from removing a container when the task exits. + +```hcl +plugin "nomad-driver-podman" { + config { + gc { + container = false + } + } +} +``` + +- `recover_stopped` - Defaults to `true`. Allows the driver to start and reuse + a previously stopped container after a Nomad client restart. + Consider a simple single node system and a complete reboot. All previously managed containers + will be reused instead of disposed and recreated. + +```hcl +plugin "nomad-driver-podman" { + config { + recover_stopped = false + } +} +``` + +[docker-ports]: /docs/drivers/docker#forwarding-and-exposing-ports +[homepage]: https://github.com/pascomnet/nomad-driver-podman +[memory-value]: /docs/job-specification/resources#memory +[nomad-init]: /docs/commands/job/init +[plugin_dir]: /docs/configuration#plugin_dir +[podman]: https://podman.io/ +[podman-cli]: https://podman.io/whatis.html +[releases]: https://github.com/pascomnet/nomad-driver-podman/releases +[task]: /docs/job-specification/task#user diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/pot.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/pot.mdx new file mode 100644 index 0000000000..d2cff5da53 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/pot.mdx @@ -0,0 +1,102 @@ +--- +layout: docs +page_title: 'Drivers: pot' +sidebar_title: Pot +description: >- + The Pot task driver is used to run pot (https://github.com/pizzamig/pot) + containers using FreeBSD jails. +--- + +# Pot Task Driver + +Name: `pot` + +The Pot task driver provides an interface for using [pot][pot-github-repo] for dynamically running applications inside a FreeBSD Jail. +You can download the external nomad-pot-driver [here][nomad-pot-driver]. 
+

## Task Configuration

```hcl
task "nginx-pot" {
  driver = "pot"

  config {
    image = "https://pot-registry.zapto.org/registry/"
    pot = "FBSD120-nginx"
    tag = "1.0"
    command = "nginx"
    args = [
      "-g 'daemon off;'"
    ]
    network_mode = "public-bridge"
    port_map = {
      http = "80"
    }
    copy = [
      "/root/index.html:/usr/local/www/nginx-dist/index.html",
      "/root/nginx.conf:/usr/local/etc/nginx/nginx.conf"
    ]
    mount = [
      "/tmp/test:/root/test",
    ]
    mount_read_only = [
      "/tmp/test2:/root/test2"
    ]
    extra_hosts = [
      "artifactory.yourdomain.com:192.168.0.1",
      "mail.yourdomain.com:192.168.0.2"
    ]
  }
}
```

The pot task driver supports the following parameters:

- `image` - The URL of the HTTP registry from which to fetch the image.

- `pot` - Name of the image in the registry.

- `tag` - Version of the image.

- `command` - Command that is executed once the jail is started.

- `args` - (Optional) A list of arguments for the command given in the
  `command` parameter.

- `network_mode` - (Optional) Defines the network mode of the pot. Default: **"public-bridge"**

  Possible values are:

  **"public-bridge"** - pot creates an internal virtual network with a NAT table through which all traffic is sent.

  **"host"** - pot binds the jail directly to a host port.

- `port_map` - (Optional) Sets the port on which the application is listening inside of the jail. If not set, the application will inherit the port configuration from the image.

- `copy` - (Optional) Copies a file from the host machine to the pot jail in the given directory.

- `mount` - (Optional) Mounts a read/write folder from the host machine to the pot jail.

- `mount_read_only` - (Optional) Mounts a read-only directory inside the pot jail.

- `extra_hosts` - (Optional) A list of hosts, given as `host:IP`, to be added to `/etc/hosts`.

## Client Requirements

The pot task driver requires the following:

- A 64-bit FreeBSD 12.0-RELEASE host.
+
- The FreeBSD Nomad binary (available as a package).
- The `pot-task-driver` binary placed in the [plugin_dir][plugin_dir] directory.
- [pot][pot-github-repo] installed according to the installation [guide][pot-install-guide].
- A webserver from which to serve the images (a simple file server is sufficient).
- The following lines included in your `rc.conf`:

```
nomad_user="root"
nomad_env="PATH=/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/sbin:/bin"
```

[nomad-pot-driver]: https://github.com/trivago/nomad-pot-driver
[plugin_dir]: /docs/configuration#plugin_dir
[pot-github-repo]: https://github.com/pizzamig/pot
[pot-install-guide]: https://github.com/pizzamig/pot/blob/master/share/doc/pot/Installation.md diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/rkt.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/rkt.mdx new file mode 100644 index 0000000000..38da283a6a --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/rkt.mdx @@ -0,0 +1,229 @@ +--- +layout: docs +page_title: 'Drivers: Rkt' +sidebar_title: 'Rkt (Deprecated)' +description: The rkt task driver is used to run application containers using rkt. +--- +
+~> **Deprecation Warning!**
Nomad introduced the rkt driver in version 0.2.0. The rkt project had some
early adoption; in recent times user adoption has trended away from rkt towards
other projects. Project activity has declined and there are unpatched CVEs.
The project has been [archived by the CNCF](https://github.com/rkt/rkt/issues/4004#issuecomment-507358362).

Nomad 0.11 converted the rkt driver to an external driver. We will not prioritize features
or pull requests that affect the rkt driver. The external driver is available as an [open source
repository](https://github.com/hashicorp/nomad-driver-rkt) for community ownership.

# Rkt Driver

Name: `rkt`

The `rkt` driver provides an interface for using rkt for running
application containers.
+ +## Task Configuration + +```hcl +task "webservice" { + driver = "rkt" + + config { + image = "redis:3.2" + } +} +``` + +The `rkt` driver supports the following configuration in the job spec: + +- `image` - The image to run. May be specified by name, hash, ACI address + or docker registry. + + ```hcl + config { + image = "https://hub.docker.internal/redis:3.2" + } + ``` + +- `command` - (Optional) A command to execute on the ACI. + + ```hcl + config { + command = "my-command" + } + ``` + +- `args` - (Optional) A list of arguments to the optional `command`. References + to environment variables or any [interpretable Nomad + variables](/docs/runtime/interpolation) will be interpreted before + launching the task. + + ```hcl + config { + args = [ + "-bind", "${NOMAD_PORT_http}", + "${nomad.datacenter}", + "${MY_ENV}", + "${meta.foo}", + ] + } + ``` + +- `trust_prefix` - (Optional) The trust prefix to be passed to rkt. Must be + reachable from the box running the nomad agent. If not specified, the image is + run with `--insecure-options=all`. + +- `insecure_options` - (Optional) List of insecure options for rkt. Consult `rkt --help` + for list of supported values. This list overrides the `--insecure-options=all` default when + no `trust_prefix` is provided in the job config, which can be effectively used to enforce + secure runs, using `insecure_options = ["none"]` option. + + ```hcl + config { + image = "example.com/image:1.0" + insecure_options = ["image", "tls", "ondisk"] + } + ``` + +- `dns_servers` - (Optional) A list of DNS servers to be used in the container. + Alternatively a list containing just `host` or `none`. `host` uses the host's + `resolv.conf` while `none` forces use of the image's name resolution configuration. + +- `dns_search_domains` - (Optional) A list of DNS search domains to be used in + the containers. 
+

- `net` - (Optional) A list of networks to be used by the containers.

- `port_map` - (Optional) A key/value map of ports used by the container. The
  value is the port name specified in the image manifest file. When running
  Docker images with rkt the port names will be of the form `${PORT}-tcp`. See
  [networking](#networking) below for more details.

  ```hcl
  port_map {
    # If running a Docker image that exposes port 8080
    app = "8080-tcp"
  }
  ```

- `debug` - (Optional) Enables rkt's command debug option.

- `no_overlay` - (Optional) When enabled, uses the `--no-overlay=true` flag for `rkt run`.
  Useful when running jobs on older systems affected by https://github.com/rkt/rkt/issues/1922.

- `volumes` - (Optional) A list of `host_path:container_path[:readOnly]` strings to bind
  host paths to container paths. The mount is read-write by default; an optional
  third parameter, `readOnly`, can be provided to make it read-only.

  ```hcl
  config {
    volumes = ["/path/on/host:/path/in/container", "/readonly/path/on/host:/path/in/container:readOnly"]
  }
  ```

- `group` - (Optional) Specifies the group that will run the task. Sets the
  `--group` flag and overrides the group specified by the image. The
  [`user`][user] may be specified at the task level.

## Networking

The `rkt` driver can pass the `--net` and `--port` options to the rkt client.
Hence, there are two ways to use host ports: with `--net=host`, or with
`--port=PORT` combined with the task's network resources.

Example:

```hcl
task "redis" {
  # Use rkt to run the task.
  driver = "rkt"

  config {
    # Use docker image with port defined
    image = "docker://redis:latest"
    port_map {
      app = "6379-tcp"
    }
  }

  service {
    port = "app"
  }

  resources {
    network {
      mbits = 10
      port "app" {
        static = 12345
      }
    }
  }
}
```

### Allocating Ports

You can allocate ports to your task using the port syntax described on the
[networking page](/docs/job-specification/network).
+

When you use port allocation, the image manifest needs to declare public ports
and the host must have a configured network. For more information, please refer
to the [rkt networking documentation](https://coreos.com/rkt/docs/latest/networking/overview).

## Client Requirements

The `rkt` driver requires the following:

- The Nomad client agent to be running as the root user.
- rkt to be installed and in your system's `$PATH`.
- The `trust_prefix` must be accessible by the node running Nomad. This can be an
  internal source, private to your cluster, but it must be reachable by the client
  over HTTP.

## Plugin Options

- `volumes_enabled` - Defaults to `true`. Allows tasks to bind host paths
  (`volumes`) inside their container. Binding relative paths is always allowed
  and will be resolved relative to the allocation's directory.

## Client Configuration

~> Note: client configuration options will soon be deprecated. Please use [plugin options][plugin-options] instead. See the [plugin stanza][plugin-stanza] documentation for more information.

The `rkt` driver has the following [client configuration
options](/docs/configuration/client#options):

- `rkt.volumes.enabled` - Defaults to `true`. Allows tasks to bind host paths
  (`volumes`) inside their container. Binding relative paths is always allowed
  and will be resolved relative to the allocation's directory.

## Client Attributes

The `rkt` driver will set the following client attributes:

- `driver.rkt` - Set to `1` if rkt is found on the host node. Nomad determines
  this by executing `rkt version` on the host and parsing the output.
- `driver.rkt.version` - Version of `rkt`, e.g.: `1.27.0`. Note that the
  minimum required version is `1.27.0`.
- `driver.rkt.appc.version` - Version of `appc` that `rkt` is using, e.g.: `1.1.0`.

Here is an example of using these properties in a job file:

```hcl
job "docs" {
  # Only run this job where the rkt version is higher than 1.2.
+
  constraint {
    attribute = "${driver.rkt.version}"
    operator  = ">"
    value     = "1.2"
  }
}
```

## Resource Isolation

This driver supports CPU and memory isolation by delegating to `rkt`. Network
isolation is not currently supported.

[user]: /docs/job-specification/task#user
[plugin-options]: #plugin-options
[plugin-stanza]: /docs/configuration/plugin diff --git a/content/nomad/v0.11.x/content/docs/drivers/external/singularity.mdx b/content/nomad/v0.11.x/content/docs/drivers/external/singularity.mdx new file mode 100644 index 0000000000..e160e4d5e3 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/external/singularity.mdx @@ -0,0 +1,187 @@ +--- +layout: docs +page_title: 'Drivers: Singularity' +sidebar_title: Singularity +description: >- + The Singularity task driver is used to run application containers using + Singularity. +--- +
+# Singularity Driver

Name: `Singularity`

The `Singularity` driver provides an interface for using Singularity for running application
containers. You can download the external Singularity driver [here][singularity-driver].

## Task Configuration

```hcl
task "lolcow" {
  driver = "Singularity"

  config {
    # this example runs an image from the Sylabs container library with the
    # canonical lolcow example
    image = "library://sylabsed/examples/lolcow:latest"
    # command can be run, exec or test
    command = "run"
  }
}
```

The `Singularity` driver supports the following configuration in the job spec:

- `image` - The Singularity image to run. It can be a local path or a supported URI.

  ```hcl
  config {
    image = "library://sylabsed/examples/lolcow:latest"
  }
  ```

- `verbose` - (Optional) Enables extra verbosity in the Singularity runtime logging.
  Defaults to `false`.

  ```hcl
  config {
    verbose = "false"
  }
  ```

- `debug` - (Optional) Enables extra debug output in the Singularity runtime
  logging. Defaults to `false`.
+ + ```hcl + config { + debug = "false" + } + ``` + +- `command` - Singularity command action; can be `run`, `exec` or `test`. + + ```hcl + config { + command = "run" + } + ``` + +- `args` - (Optional) Singularity command action arguments, when trying to pass arguments to `run`, `exec` or `test`. + Multiple args can be given by a comma separated list. + + ```hcl + config { + args = [ "echo", "hello Cloud" ] + } + ``` + +- [`binds`][bind] - (Optional) A user-bind path specification. This spec has the format `src[:dest[:opts]]`, where src and + dest are outside and inside paths. If dest is not given, it is set equal to src. + Mount options ('opts') may be specified as 'ro' (read-only) or 'rw' (read/write, which + is the default). Multiple bind paths can be given by a comma separated list. + + ```hcl + config { + bind = [ "host/path:/container/path" ] + } + ``` + +- [`overlay`][overlay] - (Optional) Singularity command action flag, to enable an overlayFS image for persistent data + storage or as read-only layer of container. Multiple overlay paths can be given by a comma separated list. + + ```hcl + config { + overlay = [ "host/path/to/overlay" ] + } + ``` + +- [`security`][security] - (Optional) Allows the root user to leverage security modules such as + SELinux, AppArmor, and seccomp within your Singularity container. + You can also change the UID and GID of the user within the container at runtime. + + ```hcl + config { + security = [ "uid:1000 " ] + } + ``` + +- `contain` - (Optional) Use minimal `/dev` and empty other directories (e.g. /tmp and \$HOME) instead of sharing filesystems from your host. + + ```hcl + config { + contain = "false" + } + ``` + +- `workdir` - (Optional) Working directory to be used for `/tmp`, `/var/tmp` and \$HOME (if -c/--contain was also used). + + ```hcl + config { + workdir = "/path/to/folder" + } + ``` + +- `pwd` - (Optional) Initial working directory for payload process inside the container. 
+

  ```hcl
  config {
    pwd = "/path/to/folder"
  }
  ```

## Networking

Currently the `Singularity` driver only supports host networking. For more
detailed instructions on how to set up networking options, please refer to the
Singularity [user guide][singularity-network].

## Client Requirements

The `Singularity` driver requires the following:

- A 64-bit Linux host
- The `linux_amd64` Nomad binary
- The Singularity driver binary placed in the [plugin_dir][plugin_dir] directory
- [`Singularity`][singularity] v3.1.1+ to be installed

## Plugin Options ((#plugin_options))

- `enabled` - The `Singularity` driver may be disabled on hosts by setting this option to `false` (defaults to `true`).

- `singularity_cache` - The location in which all containers are stored (commonly defaults to `/var/lib/singularity`). See [`Singularity-cache`][singularity-cache] for more details.

An example of using these plugin options with the new [plugin
syntax][plugin] is shown below:

```hcl
plugin "nomad-driver-Singularity" {
  config {
    enabled = true
    singularity_cache = "/var/lib/singularity"
  }
}
```

Please note the plugin name should match whatever name you have specified for the external driver in the [plugin_dir][plugin_dir] directory.

## Client Attributes

The `Singularity` driver will set the following client attributes:

- `driver.singularity` - Set to `1` if Singularity is found and enabled on the host node.
- `driver.singularity.version` - Version of `Singularity`, e.g.: `3.1.0`.

## Resource Isolation

This driver supports CPU and memory isolation via the `Singularity` cgroups
feature. Network isolation is supported via the `--net` and `--network`
features (Singularity v3.1.1+ required).
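Because the driver only supports host networking and delegates CPU and memory limits to cgroups, a task's `resources` stanza maps directly onto the host. A minimal sketch of a Singularity task that reserves a host port (the port label and all values here are illustrative, not part of the driver's documentation):

```hcl
task "lolcow" {
  driver = "Singularity"

  config {
    image   = "library://sylabsed/examples/lolcow:latest"
    command = "run"
  }

  resources {
    cpu    = 200
    memory = 128

    network {
      mbits = 1

      # With host networking, this port is claimed on the host itself,
      # not inside an isolated container network namespace.
      port "http" {
        static = 8080
      }
    }
  }
}
```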
+ +[singularity-driver]: https://github.com/sylabs/nomad-driver-singularity +[singularity_man]: https://linuxcontainers.org/Singularity/manpages/man5/Singularity.container.conf.5.html#lbAM +[plugin]: /docs/configuration/plugin +[plugin_dir]: /docs/configuration#plugin_dir +[plugin-options]: #plugin_options +[singularity]: https://github.com/sylabs/singularity +[singularity-cache]: https://www.sylabs.io/guides/3.1/user-guide/appendix.html#c +[bind]: https://www.sylabs.io/guides/3.1/user-guide/bind_paths_and_mounts.html +[security]: https://www.sylabs.io/guides/3.1/user-guide/security_options.html +[overlay]: https://www.sylabs.io/guides/3.1/user-guide/persistent_overlays.html +[singularity-network]: https://www.sylabs.io/guides/3.1/user-guide/networking.html diff --git a/content/nomad/v0.11.x/content/docs/drivers/index.mdx b/content/nomad/v0.11.x/content/docs/drivers/index.mdx new file mode 100644 index 0000000000..ad191d9413 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/index.mdx @@ -0,0 +1,42 @@ +--- +layout: docs +page_title: Task Drivers +sidebar_title: Task Drivers +description: Task Drivers are used to integrate with the host OS to run tasks in Nomad. +--- + +# Task Drivers + +Task drivers are used by Nomad clients to execute a task and provide resource +isolation. By having extensible task drivers, Nomad has the flexibility to +support a broad set of workloads across all major operating systems. + +Starting with Nomad 0.9, task drivers are now pluggable. This gives users the +flexibility to introduce their own drivers without having to recompile Nomad. +You can view the [plugin stanza][plugin] documentation for examples on how to +use the `plugin` stanza in Nomad's client configuration. Note that we have +introduced new syntax when specifying driver options in the client configuration +(see [docker][docker_plugin] for an example). 
Keep in mind that even though all +built-in drivers are now plugins, Nomad remains a single binary and maintains +backwards compatibility except with the `lxc` driver. + +The list of supported task drivers is provided on the left of this page. Each +task driver documents the configuration available in a [job +specification](/docs/job-specification), the environments it can be +used in, and the resource isolation mechanisms available. + +For details on authoring a task driver plugin, please refer to the [plugin +authoring guide][plugin_guide]. + +Task driver resource isolation is intended to provide a degree of separation of +Nomad client CPU / memory / storage between tasks. Resource isolation +effectiveness is dependent upon individual task driver implementations and +underlying client operating systems. Task drivers do include various +security-related controls, but the Nomad client to task interface should not be +considered a security boundary. See the [access control guide][acl_guide] for +more information on how to protect Nomad cluster operations. + +[plugin]: /docs/configuration/plugin +[docker_plugin]: /docs/drivers/docker#client-requirements +[plugin_guide]: /docs/internals/plugins +[acl_guide]: https://learn.hashicorp.com/nomad?track=acls#operations-and-development diff --git a/content/nomad/v0.11.x/content/docs/drivers/java.mdx b/content/nomad/v0.11.x/content/docs/drivers/java.mdx new file mode 100644 index 0000000000..06f9ea5efd --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/java.mdx @@ -0,0 +1,140 @@ +--- +layout: docs +page_title: 'Drivers: Java' +sidebar_title: Java +description: The Java task driver is used to run Jars using the JVM. +--- + +# Java Driver + +Name: `java` + +The `java` driver is used to execute Java applications packaged into a Java Jar +file. The driver requires the Jar file to be accessible from the Nomad +client via the [`artifact` downloader](/docs/job-specification/artifact). 
+ +## Task Configuration + +```hcl +task "webservice" { + driver = "java" + + config { + jar_path = "local/example.jar" + jvm_options = ["-Xmx2048m", "-Xms256m"] + } +} +``` + +The `java` driver supports the following configuration in the job spec: + +- `class` - (Optional) The name of the class to run. If `jar_path` is specified + and the manifest specifies a main class, this is optional. If shipping classes + rather than a Jar, please specify the class to run and the `class_path`. + +- `class_path` - (Optional) The `class_path` specifies the class path used by + Java to lookup classes and Jars. + +- `jar_path` - (Optional) The path to the downloaded Jar. In most cases this will just be + the name of the Jar. However, if the supplied artifact is an archive that + contains the Jar in a subfolder, the path will need to be the relative path + (`subdir/from_archive/my.jar`). + +- `args` - (Optional) A list of arguments to the Jar's main method. References + to environment variables or any [interpretable Nomad + variables](/docs/runtime/interpolation) will be interpreted before + launching the task. + +- `jvm_options` - (Optional) A list of JVM options to be passed while invoking + java. These options are passed without being validated in any way by Nomad. + +## Examples + +A simple config block to run a Java Jar: + +```hcl +task "web" { + driver = "java" + + config { + jar_path = "local/hello.jar" + jvm_options = ["-Xmx2048m", "-Xms256m"] + } + + # Specifying an artifact is required with the "java" driver. This is the + # mechanism to ship the Jar to be run. + artifact { + source = "https://internal.file.server/hello.jar" + + options { + checksum = "md5:123445555555555" + } + } +} +``` + +A simple config block to run a Java class: + +```hcl +task "web" { + driver = "java" + + config { + class = "Hello" + class_path = "${NOMAD_TASK_DIR}" + jvm_options = ["-Xmx2048m", "-Xms256m"] + } + + # Specifying an artifact is required with the "java" driver. 
This is the + # mechanism to ship the Jar to be run. + artifact { + source = "https://internal.file.server/Hello.class" + + options { + checksum = "md5:123445555555555" + } + } +} +``` + +## Client Requirements + +The `java` driver requires Java to be installed and in your system's `$PATH`. On +Linux, Nomad must run as root since it will use `chroot` and `cgroups` which +require root privileges. The task must also specify at least one artifact to +download, as this is the only way to retrieve the Jar being run. + +## Client Attributes + +The `java` driver will set the following client attributes: + +- `driver.java` - Set to `1` if Java is found on the host node. Nomad determines + this by executing `java -version` on the host and parsing the output +- `driver.java.version` - Version of Java, ex: `1.6.0_65` +- `driver.java.runtime` - Runtime version, ex: `Java(TM) SE Runtime Environment (build 1.6.0_65-b14-466.1-11M4716)` +- `driver.java.vm` - Virtual Machine information, ex: `Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-466.1, mixed mode)` + +Here is an example of using these properties in a job file: + +```hcl +job "docs" { + # Only run this job where the JVM is higher than version 1.6.0. + constraint { + attribute = "${driver.java.version}" + operator = ">" + value = "1.6.0" + } +} +``` + +## Resource Isolation + +The resource isolation provided varies by the operating system of +the client and the configuration. + +On Linux, Nomad will attempt to use cgroups, namespaces, and chroot +to isolate the resources of a process. If the Nomad agent is not +running as root, many of these mechanisms cannot be used. + +As a baseline, the Java jars will be run inside a Java Virtual Machine, +providing a minimum amount of isolation. 
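Because `jvm_options` are passed to `java` without validation, it is easy to request a heap larger than the task's memory allocation; on Linux the cgroup limit would then terminate the JVM. A sketch of keeping the two in agreement (the specific values are illustrative, not prescribed by Nomad):

```hcl
task "web" {
  driver = "java"

  config {
    jar_path = "local/hello.jar"

    # Keep the maximum heap safely below the 1024 MB limit declared in
    # resources, leaving headroom for JVM metaspace and native overhead.
    jvm_options = ["-Xmx768m", "-Xms256m"]
  }

  resources {
    memory = 1024
  }
}
```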
diff --git a/content/nomad/v0.11.x/content/docs/drivers/qemu.mdx b/content/nomad/v0.11.x/content/docs/drivers/qemu.mdx new file mode 100644 index 0000000000..60cd98cf5b --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/qemu.mdx @@ -0,0 +1,140 @@ +--- +layout: docs +page_title: 'Drivers: Qemu' +sidebar_title: Qemu +description: The Qemu task driver is used to run virtual machines using Qemu/KVM. +--- + +# Qemu Driver + +Name: `qemu` + +The `qemu` driver provides a generic virtual machine runner. Qemu can utilize +the KVM kernel module to utilize hardware virtualization features and provide +great performance. Currently the `qemu` driver can map a set of ports from the +host machine to the guest virtual machine, and provides configuration for +resource allocation. + +The `qemu` driver can execute any regular `qemu` image (e.g. `qcow`, `img`, +`iso`), and is currently invoked with `qemu-system-x86_64`. + +The driver requires the image to be accessible from the Nomad client via the +[`artifact` downloader](/docs/job-specification/artifact). + +## Task Configuration + +```hcl +task "webservice" { + driver = "qemu" + + config { + image_path = "/path/to/my/linux.img" + accelerator = "kvm" + graceful_shutdown = true + args = ["-nodefaults", "-nodefconfig"] + } +} +``` + +The `qemu` driver supports the following configuration in the job spec: + +- `image_path` - The path to the downloaded image. In most cases this will just + be the name of the image. However, if the supplied artifact is an archive that + contains the image in a subfolder, the path will need to be the relative path + (`subdir/from_archive/my.img`). + +- `accelerator` - (Optional) The type of accelerator to use in the invocation. + If the host machine has `qemu` installed with KVM support, users can specify + `kvm` for the `accelerator`. Default is `tcg`. 
+

- `graceful_shutdown` `(bool: false)` - Using the [qemu
  monitor](https://en.wikibooks.org/wiki/QEMU/Monitor), send an ACPI shutdown
  signal to virtual machines rather than simply terminating them. This emulates
  a physical power button press, and gives instances a chance to shut down
  cleanly. If the VM is still running after `kill_timeout`, it will be
  forcefully terminated. (Note that
  [prior to qemu 2.10.1](https://github.com/qemu/qemu/commit/ad9579aaa16d5b385922d49edac2c96c79bcfb6),
  the monitor socket path is limited to 108 characters. Graceful shutdown will
  be disabled if qemu is < 2.10.1 and the generated monitor path exceeds this
  length. You may encounter this issue if you set long
  [data_dir](/docs/configuration#data_dir)
  or
  [alloc_dir](/docs/configuration/client#alloc_dir)
  paths.) This feature is currently not supported on Windows.

- `port_map` - (Optional) A key-value map of port labels.

  ```hcl
  config {
    # Forward the host port with the label "db" to the guest VM's port 6539.
    port_map {
      db = 6539
    }
  }
  ```

- `args` - (Optional) A list of strings that is passed to qemu as command line
  options.

## Examples

A simple config block to run a `qemu` image:

```hcl
task "virtual" {
  driver = "qemu"

  config {
    image_path  = "local/linux.img"
    accelerator = "kvm"
    args        = ["-nodefaults", "-nodefconfig"]
  }

  # Specifying an artifact is required with the "qemu" driver. This is the
  # mechanism to ship the image to be run.
  artifact {
    source = "https://internal.file.server/linux.img"

    options {
      checksum = "md5:123445555555555"
    }
  }
}
```

## Client Requirements

The `qemu` driver requires Qemu to be installed and in your system's `$PATH`.
The task must also specify at least one artifact to download, as this is the
only way to retrieve the image being run.
+ +## Client Attributes + +The `qemu` driver will set the following client attributes: + +- `driver.qemu` - Set to `1` if Qemu is found on the host node. Nomad determines + this by executing `qemu-system-x86_64 -version` on the host and parsing the output +- `driver.qemu.version` - Version of `qemu-system-x86_64`, ex: `2.4.0` + +Here is an example of using these properties in a job file: + +```hcl +job "docs" { + # Only run this job where the qemu version is higher than 1.2.3. + constraint { + attribute = "${driver.qemu.version}" + operator = ">" + value = "1.2.3" + } +} +``` + +## Resource Isolation + +Nomad uses Qemu to provide full software virtualization for virtual machine +workloads. Nomad can use Qemu KVM's hardware-assisted virtualization to deliver +better performance. + +Virtualization provides the highest level of isolation for workloads that +require additional security, and resource use is constrained by the Qemu +hypervisor rather than the host kernel. VM network traffic still flows through +the host's interface(s). diff --git a/content/nomad/v0.11.x/content/docs/drivers/raw_exec.mdx b/content/nomad/v0.11.x/content/docs/drivers/raw_exec.mdx new file mode 100644 index 0000000000..581b3b3ecc --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/drivers/raw_exec.mdx @@ -0,0 +1,146 @@ +--- +layout: docs +page_title: 'Drivers: Raw Exec' +sidebar_title: Raw Fork/Exec +description: The Raw Exec task driver simply fork/execs and provides no isolation. +--- + +# Raw Fork/Exec Driver + +Name: `raw_exec` + +The `raw_exec` driver is used to execute a command for a task without any +isolation. Further, the task is started as the same user as the Nomad process. +As such, it should be used with extreme care and is disabled by default. 
+
+## Task Configuration
+
+```hcl
+task "webservice" {
+  driver = "raw_exec"
+
+  config {
+    command = "my-binary"
+    args    = ["-flag", "1"]
+  }
+}
+```
+
+The `raw_exec` driver supports the following configuration in the job spec:
+
+- `command` - The command to execute. Must be provided. If executing a binary
+  that exists on the host, the path must be absolute. If executing a binary that
+  is downloaded from an [`artifact`](/docs/job-specification/artifact), the
+  path can be relative to the allocation's root directory.
+
+- `args` - (Optional) A list of arguments to the `command`. References
+  to environment variables or any [interpretable Nomad
+  variables](/docs/runtime/interpolation) will be interpreted before
+  launching the task.
+
+## Examples
+
+To run a binary present on the node:
+
+```hcl
+task "example" {
+  driver = "raw_exec"
+
+  config {
+    # When running a binary that exists on the host, the path must be absolute.
+    command = "/bin/sleep"
+    args    = ["1"]
+  }
+}
+```
+
+To execute a binary downloaded from an [`artifact`](/docs/job-specification/artifact):
+
+```hcl
+task "example" {
+  driver = "raw_exec"
+
+  config {
+    command = "name-of-my-binary"
+  }
+
+  artifact {
+    source = "https://internal.file.server/name-of-my-binary"
+    options {
+      checksum = "sha256:abd123445ds4555555555"
+    }
+  }
+}
+```
+
+## Client Requirements
+
+The `raw_exec` driver can run on all supported operating systems. For security
+reasons, it is disabled by default. To enable raw exec, the Nomad client
+configuration must explicitly enable the `raw_exec` driver in the plugin's
+options:
+
+```hcl
+plugin "raw_exec" {
+  config {
+    enabled = true
+  }
+}
+```
+
+Nomad versions before v0.9 use the following client configuration.
This configuration is
+also supported in Nomad v0.9.0, but is deprecated in favor of the plugin stanza:
+
+```hcl
+client {
+  options = {
+    "driver.raw_exec.enable" = "1"
+  }
+}
+```
+
+## Plugin Options
+
+- `enabled` - Specifies whether the driver should be enabled or disabled.
+  Defaults to `false`.
+
+- `no_cgroups` - Specifies whether the driver should not use
+  cgroups to manage the process group launched by the driver. By default,
+  cgroups are used to manage the process tree to ensure full cleanup of all
+  processes started by the task. The driver uses cgroups by default on
+  Linux and when `/sys/fs/cgroup/freezer/nomad` is writable for the
+  Nomad process. Using a cgroup significantly reduces Nomad's CPU
+  usage when collecting process metrics.
+
+## Client Options
+
+~> Note: client configuration options will soon be deprecated. Please use
+[plugin options][plugin-options] instead. See the [plugin stanza][plugin-stanza] documentation for more information.
+
+- `driver.raw_exec.enable` - Specifies whether the driver should be enabled or
+  disabled. Defaults to `false`.
+
+- `driver.raw_exec.no_cgroups` - Specifies whether the driver should not use
+  cgroups to manage the process group launched by the driver. By default,
+  cgroups are used to manage the process tree to ensure full cleanup of all
+  processes started by the task. The driver only uses cgroups when Nomad is
+  launched as root on Linux and cgroups are detected.
+
+## Client Attributes
+
+The `raw_exec` driver will set the following client attributes:
+
+- `driver.raw_exec` - This will be set to "1", indicating the driver is available.
+
+## Resource Isolation
+
+The `raw_exec` driver provides no isolation.
+
+If the launched process creates a new process group, it is possible that Nomad
+will leak processes on shutdown unless the application forwards signals
+properly. Nomad will not leak any processes if cgroups are being used to manage
+the process tree.
Cgroups are used on Linux when Nomad is being run with
+appropriate privileges, the cgroup system is mounted, and the operator hasn't
+disabled cgroups for the driver.
+
+[plugin-options]: #plugin-options
+[plugin-stanza]: /docs/configuration/plugin
diff --git a/content/nomad/v0.11.x/content/docs/enterprise/index.mdx b/content/nomad/v0.11.x/content/docs/enterprise/index.mdx
new file mode 100644
index 0000000000..ae0120db93
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/enterprise/index.mdx
@@ -0,0 +1,84 @@
+---
+layout: docs
+page_title: Nomad Enterprise
+sidebar_title: Nomad Enterprise
+description: >-
+  Nomad Enterprise adds operations, collaboration, and governance capabilities
+  to Nomad.
+
+  Features include Namespaces, Resource Quotas, Sentinel Policies, and Advanced
+  Autopilot.
+---
+
+# Nomad Enterprise
+
+Nomad Enterprise adds collaboration, operational, and governance capabilities to Nomad. Nomad Enterprise is available as a base Platform package with an optional Governance & Policy add-on module.
+
+Please navigate to the sub-sections for detailed information about each package and its features.
+
+## Nomad Enterprise Platform
+
+Nomad Enterprise Platform enables operators to easily upgrade Nomad and enhances performance and availability through Advanced Autopilot features such as Automated Upgrades, Enhanced Read Scalability, and Redundancy Zones.
+
+### Automated Upgrades
+
+Automated Upgrades allows operators to deploy a complete cluster of new servers and then simply wait for the upgrade to complete. As the new servers join the cluster, server logic checks the version of each Nomad server node. If the version is higher than the version on the current set of voters, it will avoid promoting the new servers to voters until the number of new servers matches the number of existing servers at the previous version. Once the numbers match, Nomad will begin to promote new servers and demote old ones.
+ +See the [Autopilot - Upgrade Migrations](https://learn.hashicorp.com/nomad/operating-nomad/autopilot#upgrade-migrations) documentation for a thorough overview. + +### Enhanced Read Scalability + +This feature enables an operator to introduce non-voting server nodes to a Nomad cluster. Non-voting servers will receive the replication stream but will not take part in quorum (required by the leader before log entries can be committed). Adding explicit non-voters will scale reads and scheduling without impacting write latency. + +See the [Autopilot - Read Scalability](https://learn.hashicorp.com/nomad/operating-nomad/autopilot#server-read-and-scheduling-scaling) documentation for a thorough overview. + +### Redundancy Zones + +Redundancy Zones enables an operator to deploy a non-voting server as a hot standby server on a per availability zone basis. For example, in an environment with three availability zones an operator can run one voter and one non-voter in each availability zone, for a total of six servers. If an availability zone is completely lost, only one voter will be lost, so the cluster remains available. If a voter is lost in an availability zone, Nomad will promote the non-voter to a voter automatically, putting the hot standby server into service quickly. + +See the [Autopilot - Redundancy Zones](https://learn.hashicorp.com/nomad/operating-nomad/autopilot#redundancy-zones) documentation for a thorough overview. + +## Governance & Policy + +Governance & Policy features are part of an add-on module that enables an organization to securely operate Nomad at scale across multiple teams through features such as Audit Logging, Namespaces, Resource Quotas, Sentinel Policies, and Preemption. + +### Audit Logging + +Secure clusters with enhanced risk management and operational traceability to fulfill compliance requirements. This Enterprise feature provides administrators with a complete set of records for all user-issued actions in Nomad. 
+
+With Audit Logging, enterprises can now proactively identify access anomalies, ensure enforcement of their security policies, and diagnose cluster behavior by viewing preceding user operations. Designed as an HTTP API-based audit logging system, it captures each audit event with relevant request and response information in a JSON format that is easily digestible and familiar to operators.
+
+See the [Audit Logging Documentation](/docs/configuration/audit) for a thorough overview.
+
+### Namespaces
+
+Namespaces enable multiple teams to safely use a shared multi-region Nomad environment and reduce cluster fleet size. In Nomad Enterprise, a shared cluster can be partitioned into multiple namespaces which allow jobs and their associated objects to be isolated from each other and other users of the cluster.
+
+Namespaces enhance the usability of a shared cluster by isolating teams from the jobs of others, by providing fine-grained access control to jobs when coupled with ACLs, and by preventing bad actors from negatively impacting the whole cluster.
+
+See the [Namespaces Guide](https://learn.hashicorp.com/nomad/governance-and-policy/namespaces) for a thorough overview.
+
+### Resource Quotas
+
+Resource Quotas enable an operator to limit resource consumption across teams or projects to reduce waste and align budgets. In Nomad Enterprise, operators can define quota specifications and apply them to namespaces. When a quota is attached to a namespace, the jobs within the namespace may not consume more resources than the quota specification allows.
+
+This allows operators to partition a shared cluster and ensure that no single actor can consume all of the cluster's resources.
+
+See the [Resource Quotas Guide](https://learn.hashicorp.com/nomad/governance-and-policy/quotas) for a thorough overview.
+
+### Sentinel Policies
+
+In Nomad Enterprise, operators can create Sentinel policies for fine-grained policy enforcement.
Sentinel policies build on top of the ACL system and allow operators to define policies such as disallowing jobs to be submitted to production on Fridays or only allowing users to run jobs that use pre-authorized Docker images. Sentinel policies are defined as code, giving operators considerable flexibility to meet compliance requirements. + +See the [Sentinel Policies Guide](https://learn.hashicorp.com/nomad/governance-and-policy/sentinel) for a thorough overview. + +### Preemption + +When a Nomad cluster is at capacity for a given set of placement constraints, any allocations that result from a newly scheduled service or batch job will remain in the pending state until sufficient resources become available - regardless of the defined priority. + +Preemption enables Nomad's scheduler to automatically evict lower priority allocations of service and batch jobs so that allocations from higher priority jobs can be placed. This behavior ensures that critical workloads can run when resources are limited or when partial outages require workloads to be rescheduled across a smaller set of client nodes. + +## Try Nomad Enterprise + +Click [here](https://www.hashicorp.com/go/nomad-enterprise) to set up a demo or request a trial +of Nomad Enterprise. diff --git a/content/nomad/v0.11.x/content/docs/faq.mdx b/content/nomad/v0.11.x/content/docs/faq.mdx new file mode 100644 index 0000000000..f8e13bef28 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/faq.mdx @@ -0,0 +1,65 @@ +--- +layout: docs +page_title: Frequently Asked Questions +sidebar_title: FAQ +description: Frequently asked questions and answers for Nomad +--- + +# Frequently Asked Questions + +## Q: What is Checkpoint? / Does Nomad call home? + +Nomad makes use of a HashiCorp service called [Checkpoint](https://checkpoint.hashicorp.com) +which is used to check for updates and critical security bulletins. +Only anonymous information, which cannot be used to identify the user or host, is +sent to Checkpoint. 
An anonymous ID is sent to help de-duplicate warning messages; this ID can be
+disabled. Use of the Checkpoint service as a whole is also optional and can be
+disabled.
+
+See [`disable_anonymous_signature`](/docs/configuration#disable_anonymous_signature)
+and [`disable_update_check`](/docs/configuration#disable_update_check).
+
+## Q: Is Nomad eventually or strongly consistent?
+
+Nomad makes use of both a [consensus protocol](/docs/internals/consensus) and
+a [gossip protocol](/docs/internals/gossip). The consensus protocol is strongly
+consistent, and is used for all state replication and scheduling. The gossip protocol
+is used to manage the addresses of servers for automatic clustering and multi-region
+federation. This means all data that is managed by Nomad is strongly consistent.
+
+## Q: Is Nomad's `datacenter` parameter the same as Consul's?
+
+No. For those familiar with Consul, [Consul's notion of a
+datacenter][consul_dc] is more equivalent to a [Nomad region][nomad_region].
+Nomad supports grouping nodes into multiple datacenters, which should reflect
+nodes being colocated, while being managed by a single set of Nomad servers.
+
+Consul, on the other hand, does not have this two-tier approach to servers and
+agents and instead [relies on federation to create larger logical
+clusters][consul_fed].
+
+## Q: What is "bootstrapping" a Nomad cluster? ((#bootstrapping))
+
+Bootstrapping is the process by which a Nomad cluster elects its first leader
+and writes the initial cluster state to that leader's state store. Bootstrapping
+will not occur until at least a given number of servers, defined by
+[`bootstrap_expect`], have connected to each other. Once this process has
+completed, the cluster is said to be bootstrapped and is ready to use.
+
+Certain configuration options are only used to influence the creation of the
+initial cluster state during bootstrapping and are not consulted again so long
+as the state data remains intact.
These typically are values that must be +consistent across server members. For example, the [`default_scheduler_config`] +option allows an operator to set the SchedulerConfig to non-default values +during this bootstrap process rather than requiring an immediate call to the API +once the cluster is up and running. + +If the state is completely destroyed, whether intentionally or accidentally, on +all of the Nomad servers in the same outage, the cluster will re-bootstrap based +on the Nomad defaults and any configuration present that impacts the bootstrap +process. + +[consul_dc]: https://www.consul.io/docs/agent/options.html#_datacenter +[consul_fed]: https://www.consul.io/docs/guides/datacenters.html +[nomad_region]: /docs/configuration#datacenter +[`bootstrap_expect`]: /docs/configuration/server#bootstrap_expect +[`default_scheduler_config`]: /docs/configuration/server#default_scheduler_config diff --git a/content/nomad/v0.11.x/content/docs/index.mdx b/content/nomad/v0.11.x/content/docs/index.mdx new file mode 100644 index 0000000000..5d17bf4c3f --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/index.mdx @@ -0,0 +1,14 @@ +--- +layout: docs +page_title: Documentation +description: |- + Welcome to the Nomad documentation! This documentation is more of a reference + guide for all available features and options of Nomad. +--- + +# Nomad Documentation + +Welcome to the Nomad documentation! This documentation is a reference guide for +all available features and options of Nomad. If you are just getting +started with Nomad, please start with the +[introduction and getting started guide](/intro) instead. 
diff --git a/content/nomad/v0.11.x/content/docs/install/index.mdx b/content/nomad/v0.11.x/content/docs/install/index.mdx new file mode 100644 index 0000000000..8be776939e --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/install/index.mdx @@ -0,0 +1,72 @@ +--- +layout: docs +page_title: Installing Nomad +sidebar_title: Installing Nomad +description: Learn how to install Nomad. +--- + +# Installing Nomad + +Installing Nomad is simple. There are two approaches to installing Nomad: + +1. Using a [precompiled binary](#precompiled-binaries) +1. Installing [from source](#from-source) + +Downloading a precompiled binary is easiest, and we provide downloads over +TLS along with SHA-256 sums to verify the binary. + +## Precompiled Binaries ((#precompiled-binaries)) + +To install the precompiled binary, +[download](/downloads) the appropriate package for your system. +Nomad is currently packaged as a zip file. We do not have any near term +plans to provide system packages. + +Once the zip is downloaded, unzip it into any directory. The +`nomad` (or `nomad.exe` for Windows) binary inside is all that is +necessary to run Nomad. Any additional files, if any, are not +required to run Nomad. + +Copy the binary to anywhere on your system. If you intend to access it +from the command-line, make sure to place it somewhere on your `PATH`. + +## Compiling from Source ((#from-source)) + +To compile from source, you will need [Go](https://golang.org) installed and +configured properly (including a `GOPATH` environment variable set), as well +as a copy of [`git`](https://www.git-scm.com/) in your `PATH`. + +1. Clone the Nomad repository from GitHub into your `GOPATH`: + + ```shell-session + $ mkdir -p $GOPATH/src/github.com/hashicorp && cd $_ + $ git clone https://github.com/hashicorp/nomad.git + $ cd nomad + ``` + +1. Bootstrap the project. This will download and compile libraries and tools + needed to compile Nomad: + + ```shell-session + $ make bootstrap + ``` + +1. 
Build Nomad for your current system and put the
+   binary in `./bin/` (relative to the git checkout). The `make dev` target is
+   just a shortcut that builds `nomad` for only your local build environment (no
+   cross-compiled targets).
+
+   ```shell-session
+   $ make dev
+   ```
+
+## Verifying the Installation
+
+To verify Nomad is properly installed, run `nomad -v` on your system. You should
+see version output. If you are executing it from the command line, make sure it is
+on your `PATH` or you may get an error about `nomad` not being found.
+
+```shell-session
+$ nomad -v
+
+```
diff --git a/content/nomad/v0.11.x/content/docs/install/production/deployment-guide.mdx b/content/nomad/v0.11.x/content/docs/install/production/deployment-guide.mdx
new file mode 100644
index 0000000000..01aa1df979
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/install/production/deployment-guide.mdx
@@ -0,0 +1,229 @@
+---
+layout: docs
+page_title: Nomad Deployment Guide
+sidebar_title: Reference Install Guide
+description: |-
+  This deployment guide covers the steps required to install and
+  configure a single HashiCorp Nomad cluster as defined in the
+  Nomad Reference Architecture
+ea_version: 0.9
+---
+
+# Nomad Reference Install Guide
+
+This deployment guide covers the steps required to install and configure a single HashiCorp Nomad cluster as defined in the [Nomad Reference Architecture](/docs/install/production/reference-architecture).
+
+These instructions are for installing and configuring Nomad on Linux hosts running the systemd system and service manager.
+
+## Reference Material
+
+This deployment guide is designed to work in combination with the [Nomad Reference Architecture](/docs/install/production/reference-architecture) and [Consul Deployment Guide](https://www.consul.io/docs/guides/deployment-guide.html). Although it is not a strict requirement to follow the Nomad Reference Architecture, please ensure you are familiar with the overall architecture design.
For example, you should install Nomad server agents on multiple physical or virtual hosts (with correct anti-affinity rules) for high availability.
+
+## Overview
+
+To provide a highly-available single cluster architecture, we recommend Nomad server agents be deployed to more than one host, as shown in the [Nomad Reference Architecture](/docs/install/production/reference-architecture).
+
+![Reference diagram](/img/nomad_reference_diagram.png)
+
+These setup steps should be completed on all Nomad hosts:
+
+- [Download Nomad](#download-nomad)
+- [Install Nomad](#install-nomad)
+- [Configure systemd](#configure-systemd)
+- [Configure Nomad](#configure-nomad)
+- [Start Nomad](#start-nomad)
+
+## Download Nomad
+
+Precompiled Nomad binaries are available for download at [https://releases.hashicorp.com/nomad/](https://releases.hashicorp.com/nomad/) and Nomad Enterprise binaries are available for download by following the instructions made available to HashiCorp Enterprise customers.
+
+```text
+export NOMAD_VERSION="0.9.0"
+curl --silent --remote-name https://releases.hashicorp.com/nomad/${NOMAD_VERSION}/nomad_${NOMAD_VERSION}_linux_amd64.zip
+```
+
+You may perform checksum verification of the zip packages using the SHA256SUMS and SHA256SUMS.sig files available for the specific release version. HashiCorp provides [a guide on checksum verification](https://www.hashicorp.com/security) for precompiled binaries.
+
+## Install Nomad
+
+Unzip the downloaded package and move the `nomad` binary to `/usr/local/bin/`. Verify that `nomad` is available on the system path.
+
+```text
+unzip nomad_${NOMAD_VERSION}_linux_amd64.zip
+sudo chown root:root nomad
+sudo mv nomad /usr/local/bin/
+nomad version
+```
+
+The `nomad` command features opt-in autocompletion for flags, subcommands, and arguments (where supported). Enable autocompletion.
+
+```text
+nomad -autocomplete-install
+complete -C /usr/local/bin/nomad nomad
+```
+
+Create a data directory for Nomad.
+ +```text +sudo mkdir --parents /opt/nomad +``` + +## Configure systemd + +Systemd uses [documented sane defaults](https://www.freedesktop.org/software/systemd/man/systemd.directives.html) so only non-default values must be set in the configuration file. + +Create a Nomad service file at `/etc/systemd/system/nomad.service`. + +```text +sudo touch /etc/systemd/system/nomad.service +``` + +Add this configuration to the Nomad service file: + +```text +[Unit] +Description=Nomad +Documentation=https://nomadproject.io/docs/ +Wants=network-online.target +After=network-online.target + +[Service] +ExecReload=/bin/kill -HUP $MAINPID +ExecStart=/usr/local/bin/nomad agent -config /etc/nomad.d +KillMode=process +KillSignal=SIGINT +LimitNOFILE=infinity +LimitNPROC=infinity +Restart=on-failure +RestartSec=2 +StartLimitBurst=3 +StartLimitIntervalSec=10 +TasksMax=infinity + +[Install] +WantedBy=multi-user.target +``` + +The following parameters are set for the `[Unit]` stanza: + +- [`Description`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Description=) - Free-form string describing the nomad service +- [`Documentation`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Documentation=) - Link to the nomad documentation +- [`Wants`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Wants=) - Configure a dependency on the network service +- [`After`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#After=) - Configure an ordering dependency on the network service being started before the nomad service + +The following parameters are set for the `[Service]` stanza: + +- [`ExecReload`](https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecReload=) - Send Nomad a `SIGHUP` signal to trigger a configuration reload +- [`ExecStart`](https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStart=) - Start Nomad with the `agent` argument and path to a directory of configuration 
files +- [`KillMode`](https://www.freedesktop.org/software/systemd/man/systemd.kill.html#KillMode=) - Treat nomad as a single process +- [`LimitNOFILE`, `LimitNPROC`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties) - Disable limits for file descriptors and processes +- [`RestartSec`](https://www.freedesktop.org/software/systemd/man/systemd.service.html#RestartSec=) - Restart nomad after 2 seconds of it being considered 'failed' +- [`Restart`](https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=) - Restart nomad unless it returned a clean exit code +- [`StartLimitBurst`, `StartLimitIntervalSec`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#StartLimitIntervalSec=interval) - Configure unit start rate limiting +- [`TasksMax`](https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#TasksMax=N) - Disable task limits (only available in systemd >= 226) + +The following parameters are set for the `[Install]` stanza: + +- [`WantedBy`](https://www.freedesktop.org/software/systemd/man/systemd.unit.html#WantedBy=) - Creates a weak dependency on nomad being started by the multi-user run level + +## Configure Nomad + +Nomad uses [documented sane defaults](/docs/configuration) so only non-default values must be set in the configuration file. Configuration can be read from multiple files and is loaded in lexical order. See the [full description](/docs/configuration) for more information about configuration loading and merge semantics. + +Some configuration settings are common to both server and client Nomad agents, while some configuration settings must only exist on one or the other. Follow the [common configuration](#common-configuration) guidance on all hosts and then the specific guidance depending on whether you are configuring a Nomad [server](#server-configuration) or [client](#client-configuration). 
+ +- [Common Nomad configuration](#common-configuration) +- [Configure a Nomad server](#server-configuration) +- [Configure a Nomad client](#client-configuration) + +### Common configuration + +Create a configuration file at `/etc/nomad.d/nomad.hcl`: + +```text +sudo mkdir --parents /etc/nomad.d +sudo chmod 700 /etc/nomad.d +sudo touch /etc/nomad.d/nomad.hcl +``` + +Add this configuration to the `nomad.hcl` configuration file: + +~> **Note:** Replace the `datacenter` parameter value with the identifier you will use for the datacenter this Nomad cluster is deployed in. + +```hcl +datacenter = "dc1" +data_dir = "/opt/nomad" +``` + +- [`datacenter`](/docs/configuration#datacenter) - The datacenter in which the agent is running. +- [`data_dir`](/docs/configuration#data_dir) - The data directory for the agent to store state. + +### Server configuration + +Create a configuration file at `/etc/nomad.d/server.hcl`: + +```text +sudo touch /etc/nomad.d/server.hcl +``` + +Add this configuration to the `server.hcl` configuration file: + +~> **NOTE** Replace the `bootstrap_expect` value with the number of Nomad servers you will use; three or five [is recommended](/docs/internals/consensus#deployment-table). + +```hcl +server { + enabled = true + bootstrap_expect = 3 +} +``` + +- [`server`](/docs/configuration/server#enabled) - Specifies if this agent should run in server mode. All other server options depend on this value being set. +- [`bootstrap_expect`](/docs/configuration/server#bootstrap_expect) - The number of expected servers in the cluster. Either this value should not be provided or the value must agree with other servers in the cluster. 
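+
+Taken together, a server agent following this guide loads the equivalent of the
+following merged configuration from `/etc/nomad.d` (a sketch combining the
+common `nomad.hcl` and `server.hcl` files above):
+
+```hcl
+# Merged view of /etc/nomad.d/nomad.hcl and /etc/nomad.d/server.hcl
+datacenter = "dc1"
+data_dir   = "/opt/nomad"
+
+server {
+  enabled          = true
+  bootstrap_expect = 3
+}
+```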
+ +### Client configuration + +Create a configuration file at `/etc/nomad.d/client.hcl`: + +```text +sudo touch /etc/nomad.d/client.hcl +``` + +Add this configuration to the `client.hcl` configuration file: + +```hcl +client { + enabled = true +} +``` + +- [`client`](/docs/configuration/client#enabled) - Specifies if this agent should run in client mode. All other client options depend on this value being set. + +~> **NOTE** The [`options`](/docs/configuration/client#options-parameters) parameter can be used to enable or disable specific configurations on Nomad clients, unique to your use case requirements. + +### ACL configuration + +The [Access Control](https://learn.hashicorp.com/nomad?track=acls#operations-and-development) guide provides instructions on configuring and enabling ACLs. + +### TLS configuration + +Securing Nomad's cluster communication with mutual TLS (mTLS) is recommended for production deployments and can even ease operations by preventing mistakes and misconfigurations. Nomad clients and servers should not be publicly accessible without mTLS enabled. + +The [Securing Nomad with TLS](https://learn.hashicorp.com/nomad/transport-security/enable-tls) guide provides instructions on configuring and enabling TLS. + +## Start Nomad + +Enable and start Nomad using the systemctl command responsible for controlling systemd managed services. Check the status of the nomad service using systemctl. + +```text +sudo systemctl enable nomad +sudo systemctl start nomad +sudo systemctl status nomad +``` + +## Next Steps + +- Read [Outage Recovery](https://learn.hashicorp.com/nomad/operating-nomad/outage) to learn + the steps required to recover from a Nomad cluster outage. +- Read [Autopilot](https://learn.hashicorp.com/nomad/operating-nomad/autopilot) to learn about + features in Nomad 0.8 to allow for automatic operator-friendly + management of Nomad servers. 
diff --git a/content/nomad/v0.11.x/content/docs/install/production/index.mdx b/content/nomad/v0.11.x/content/docs/install/production/index.mdx
new file mode 100644
index 0000000000..3912ba9d6a
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/install/production/index.mdx
@@ -0,0 +1,40 @@
+---
+layout: docs
+page_title: Installing Nomad for Production
+sidebar_title: Production
+description: Learn how to install Nomad for Production.
+---
+
+# Installing Nomad for Production
+
+This section covers how to install Nomad for production.
+
+There are multiple steps to cover for a successful Nomad deployment:
+
+## Installing Nomad
+
+This page lists the two primary methods of installing Nomad and how to verify a successful installation.
+
+Please refer to the [Installing Nomad](/docs/install) sub-section.
+
+## Hardware Requirements
+
+This page details the recommended machine resources (instances), port requirements, and network topology for Nomad.
+
+Please refer to the [Hardware Requirements](/docs/install/production/requirements) sub-section.
+
+## Setting Nodes with Nomad Agent
+
+These pages explain the Nomad agent process and how to set the server and client nodes in the cluster.
+
+Please refer to the [Set Server & Client Nodes](/docs/install/production/nomad-agent) and [Nomad Agent documentation](/docs/commands/agent) pages.
+
+## Reference Architecture
+
+This document provides recommended practices and a reference architecture for HashiCorp Nomad production deployments. This reference architecture conveys a general architecture that should be adapted to accommodate the specific needs of each implementation.
+
+Please refer to the [Reference Architecture](/docs/install/production/reference-architecture) sub-section.
+
+## Install Guide Based on Reference Architecture
+
+This guide provides an end-to-end walkthrough of the steps required to install a single production-ready Nomad cluster as defined in the Reference Architecture section.
+
+Please refer to the [Reference Install Guide](/docs/install/production/deployment-guide) sub-section.
diff --git a/content/nomad/v0.11.x/content/docs/install/production/nomad-agent.mdx b/content/nomad/v0.11.x/content/docs/install/production/nomad-agent.mdx
new file mode 100644
index 0000000000..b97f40ce44
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/install/production/nomad-agent.mdx
@@ -0,0 +1,142 @@
+---
+layout: docs
+page_title: Nomad Agent
+sidebar_title: Set Server & Client Nodes
+description: |-
+  The Nomad agent is a long running process which can be used either in
+  a client or server mode.
+---
+
+# Setting Nodes with Nomad Agent
+
+The Nomad agent is a long-running process which runs on every machine that
+is part of the Nomad cluster. The behavior of the agent depends on whether it is
+running in client or server mode. Clients are responsible for running tasks,
+while servers are responsible for managing the cluster.
+
+Client mode agents are relatively simple. They make use of fingerprinting
+to determine the capabilities and resources of the host machine, as well as
+determining what [drivers](/docs/drivers) are available. Clients
+register with servers to provide the node information, heartbeat to provide
+liveness, and run any tasks assigned to them.
+
+Servers take on the responsibility of being part of the
+[consensus protocol](/docs/internals/consensus) and [gossip protocol](/docs/internals/gossip).
+The consensus protocol, powered by Raft, allows the servers to perform
+leader election and state replication. The gossip protocol allows for simple
+clustering of servers and multi-region federation. The higher burden on the
+server nodes means that usually they should be run on dedicated instances --
+they are more resource intensive than a client node.
+
+Client nodes make up the majority of the cluster, and are very lightweight as
+they interface with the server nodes and maintain very little state of their
+own.
Each cluster has usually 3 or 5 server mode agents and potentially +thousands of clients. + +## Running an Agent + +The agent is started with the [`nomad agent` command](/docs/commands/agent). This +command blocks, running forever or until told to quit. The agent command takes a variety +of configuration options, but most have sane defaults. + +When running `nomad agent`, you should see output similar to this: + +```shell-session +$ nomad agent -dev +==> Starting Nomad agent... +==> Nomad agent configuration: + + Client: true + Log Level: INFO + Region: global (DC: dc1) + Server: true + +==> Nomad agent started! Log data will stream in below: + + [INFO] serf: EventMemberJoin: server-1.node.global 127.0.0.1 + [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core] +... +``` + +There are several important messages that `nomad agent` outputs: + +- **Client**: This indicates whether the agent has enabled client mode. + Client nodes fingerprint their host environment, register with servers, + and run tasks. + +- **Log Level**: This indicates the configured log level. Only messages with + an equal or higher severity will be logged. This can be tuned to increase + verbosity for debugging, or reduced to avoid noisy logging. + +- **Region**: This is the region and datacenter in which the agent is configured + to run. Nomad has first-class support for multi-datacenter and multi-region + configurations. The `-region` and `-dc` flags can be used to set the region + and datacenter. The default is the `global` region in `dc1`. + +- **Server**: This indicates whether the agent has enabled server mode. + Server nodes have the extra burden of participating in the consensus protocol, + storing cluster state, and making scheduling decisions. + +## Stopping an Agent + +An agent can be stopped in two ways: gracefully or forcefully. By default, +any signal to an agent (interrupt, terminate, kill) will cause the agent +to forcefully stop. 
Graceful termination can be configured by setting either `leave_on_interrupt`
+or `leave_on_terminate` to respond to the respective signal.
+
+When gracefully exiting, clients will update their status to terminal on
+the servers so that tasks can be migrated to healthy agents. Servers
+will notify their intention to leave the cluster, which allows them to
+leave the [consensus](/docs/internals/consensus) peer set.
+
+It is especially important that a server node be allowed to leave gracefully
+so that there will be a minimal impact on availability as the server leaves
+the consensus peer set. If a server does not gracefully leave, and will not
+return into service, the [`server force-leave` command](/docs/commands/server/force-leave)
+should be used to eject it from the consensus peer set.
+
+## Lifecycle
+
+Every agent in the Nomad cluster goes through a lifecycle. Understanding
+this lifecycle is useful for building a mental model of an agent's interactions
+with a cluster and how the cluster treats a node.
+
+When a client agent is first started, it fingerprints the host machine to
+identify its attributes, capabilities, and [task drivers](/docs/drivers).
+These are reported to the servers during an initial registration. The addresses
+of known servers are provided to the agent via configuration, potentially using
+DNS for resolution. Using [Consul](https://www.consul.io) provides a way to
+avoid hard-coding addresses and resolving them on demand.
+
+While a client is running, it heartbeats with the servers to
+maintain liveness. If the heartbeats fail, the servers assume the client node
+has failed and stop assigning new tasks while migrating existing tasks.
+It is impossible to distinguish between a network failure and an agent crash,
+so both cases are handled the same. Once the network recovers or a crashed
+agent restarts, the node status will be updated and normal operation resumed.
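The window the servers allow for missed heartbeats is tunable in the server configuration. The sketch below is illustrative, not a recommended setting — verify the option name and default against your Nomad version's `server` stanza documentation:

```hcl
server {
  enabled = true

  # Extra grace period granted after a missed client heartbeat before the
  # node is marked down. The value here is an assumption for illustration.
  heartbeat_grace = "30s"
}
```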
+ +To prevent an accumulation of nodes in a terminal state, Nomad does periodic +garbage collection of nodes. By default, if a node is in a failed or 'down' +state for over 24 hours it will be garbage collected from the system. + +Servers are slightly more complex as they perform additional functions. They +participate in a [gossip protocol](/docs/internals/gossip) both to cluster +within a region and to support multi-region configurations. When a server is +first started, it does not know the address of other servers in the cluster. +To discover its peers, it must _join_ the cluster. This is done with the +[`server join` command](/docs/commands/server/join) or by providing the +proper configuration on start. Once a node joins, this information is gossiped +to the entire cluster, meaning all nodes will eventually be aware of each other. + +When a server _leaves_, it specifies its intent to do so, and the cluster marks that +node as having _left_. If the server has _left_, replication to it will stop and it +is removed from the consensus peer set. If the server has _failed_, replication +will attempt to make progress to recover from a software or network failure. + +## Permissions + +Nomad servers should be run with the lowest possible permissions. Nomad clients +must be run as root due to the OS isolation mechanisms that require root +privileges. In all cases, it is recommended you create a `nomad` user with the +minimal set of required privileges. 
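On systemd-based Linux distributions, one way to follow this guidance is to run the server agent under a dedicated account. The unit below is a hedged sketch only — the `nomad` user, binary path, and config directory are assumptions for illustration, not shipped defaults:

```ini
# /etc/systemd/system/nomad.service (illustrative sketch)
[Unit]
Description=Nomad server agent
After=network-online.target

[Service]
# Server agents do not require root; client agents generally do.
User=nomad
Group=nomad
ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d
Restart=on-failure

[Install]
WantedBy=multi-user.target
```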
diff --git a/content/nomad/v0.11.x/content/docs/install/production/reference-architecture.mdx b/content/nomad/v0.11.x/content/docs/install/production/reference-architecture.mdx new file mode 100644 index 0000000000..ea7d1f7f37 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/install/production/reference-architecture.mdx @@ -0,0 +1,134 @@ +--- +layout: docs +page_title: Nomad Reference Architecture +sidebar_title: Reference Architecture +description: |- + This document provides recommended practices and a reference + architecture for HashiCorp Nomad production deployments. +ea_version: 0.9 +--- + +# Nomad Reference Architecture + +This document provides recommended practices and a reference architecture for HashiCorp Nomad production deployments. This reference architecture conveys a general architecture that should be adapted to accommodate the specific needs of each implementation. + +The following topics are addressed: + +- [Reference Architecture](#ra) +- [Deployment Topology within a Single Region](#one-region) +- [Deployment Topology across Multiple Regions](#multi-region) +- [Network Connectivity Details](#net) +- [Deployment System Requirements](#system-reqs) +- [High Availability](#high-availability) +- [Failure Scenarios](#failure-scenarios) + +This document describes deploying a Nomad cluster in combination with, or with access to, a [Consul cluster](/docs/integrations/consul-integration). We recommend the use of Consul with Nomad to provide automatic clustering, service discovery, health checking and dynamic configuration. + +## Reference Architecture ((#ra)) + +A Nomad cluster typically comprises three or five servers (but no more than seven) and a number of client agents. Nomad differs slightly from Consul in that it divides infrastructure into regions which are served by one Nomad server cluster, but can manage multiple datacenters or availability zones. For example, a _US Region_ can include datacenters _us-east-1_ and _us-west-2_. 
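Agents declare their place in this hierarchy through the `region` and `datacenter` settings in the agent configuration. A minimal sketch for the example above (the names are illustrative):

```hcl
# Agent in the US region, placed in the us-east-1 datacenter
region     = "us"
datacenter = "us-east-1"
```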
+ +In a Nomad multi-region architecture, communication happens via [WAN gossip](/docs/internals/gossip). Additionally, Nomad can integrate easily with Consul to provide features such as automatic clustering, service discovery, and dynamic configurations. Thus we recommend you use Consul in your Nomad deployment to simplify the deployment. + +In cloud environments, a single cluster may be deployed across multiple availability zones. For example, in AWS each Nomad server can be deployed to an associated EC2 instance, and those EC2 instances distributed across multiple AZs. Similarly, Nomad server clusters can be deployed to multiple cloud regions to allow for region level HA scenarios. + +For more information on Nomad server cluster design, see the [cluster requirements documentation](/docs/install/production/requirements). + +The design shared in this document is the recommended architecture for production environments, as it provides flexibility and resilience. Nomad utilizes an existing Consul server cluster; however, the deployment design of the Consul server cluster is outside the scope of this document. + +Nomad to Consul connectivity is over HTTP and should be secured with TLS as well as a Consul token to provide encryption of all traffic. This is done using Nomad's [Automatic Clustering with Consul](https://learn.hashicorp.com/nomad/operating-nomad/clustering). + +### Deployment Topology within a Single Region ((#one-region)) + +A single Nomad cluster is recommended for applications deployed in the same region. + +Each cluster is expected to have either three or five servers. This strikes a balance between availability in the case of failure and performance, as [Raft](https://raft.github.io/) consensus gets progressively slower as more servers are added. + +The time taken by a new server to join an existing large cluster may increase as the size of the cluster increases. 
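The expected server count is typically declared up front with `bootstrap_expect`, so the cluster elects a leader only once a quorum of servers has joined. A minimal sketch for a three-server cluster:

```hcl
server {
  enabled          = true
  bootstrap_expect = 3
}
```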
+ +#### Reference Diagram + +![Reference diagram](/img/nomad_reference_diagram.png) + +### Deployment Topology across Multiple Regions ((#multi-region)) + +By deploying Nomad server clusters in multiple regions, the user is able to interact with the Nomad servers by targeting any region from any Nomad server even if that server resides in a separate region. However, most data is not replicated between regions as they are fully independent clusters. The exceptions are [ACL tokens and policies][acl], as well as [Sentinel policies in Nomad Enterprise][sentinel], which _are_ replicated between regions. + +Nomad server clusters in different datacenters can be federated using WAN links. The server clusters can be joined to communicate over the WAN on port `4648`. This same port is used for single datacenter deployments over LAN as well. + +Additional documentation is available to learn more about [Nomad server federation](https://learn.hashicorp.com/nomad/operating-nomad/federation). + +## Network Connectivity Details ((#net)) + +![Nomad network diagram](/img/nomad_network_arch.png) + +Nomad servers are expected to be able to communicate in high bandwidth, low latency network environments and have below 10 millisecond latencies between cluster members. Nomad servers can be spread across cloud regions or datacenters if they satisfy these latency requirements. + +Nomad client clusters require the ability to receive traffic as noted above in the Network Connectivity Details; however, clients can be separated into any type of infrastructure (multi-cloud, on-prem, virtual, bare metal, etc.) as long as they are reachable and can receive job requests from the Nomad servers. + +Additional documentation is available to learn more about [Nomad networking](/docs/install/production/requirements#network-topology). 
+
+## Deployment System Requirements ((#system-reqs))
+
+Nomad server agents are responsible for maintaining the cluster state, responding to RPC queries (read operations), and processing all write operations. Given that Nomad server agents do most of the heavy lifting, server sizing is critical for the overall performance efficiency and health of the Nomad cluster.
+
+### Nomad Servers
+
+| Size  | CPU      | Memory       | Disk   | Typical Cloud Instance Types              |
+| ----- | -------- | ------------ | ------ | ----------------------------------------- |
+| Small | 2 core   | 8-16 GB RAM  | 50 GB  | **AWS:** m5.large, m5.xlarge              |
+|       |          |              |        | **Azure:** Standard_D2_v3, Standard_D4_v3 |
+|       |          |              |        | **GCE:** n1-standard-8, n1-standard-16    |
+| Large | 4-8 core | 32-64 GB RAM | 100 GB | **AWS:** m5.2xlarge, m5.4xlarge           |
+|       |          |              |        | **Azure:** Standard_D4_v3, Standard_D8_v3 |
+|       |          |              |        | **GCE:** n1-standard-16, n1-standard-32   |
+
+#### Hardware Sizing Considerations
+
+- The small size would be appropriate for most initial production
+  deployments, or for development/testing environments.
+
+- The large size is for production environments where there is a
+  consistently high workload.
+
+~> **NOTE** For large workloads, ensure that the disks support a high number of IOPS to keep up with the rapid Raft log update rate.
+
+Nomad clients can be set up with specialized workloads as well. For example, if workloads require GPU processing, a Nomad datacenter can be created to serve those GPU-specific jobs and joined to a Nomad server cluster. For more information on specialized workloads, see the documentation on [job constraints](/docs/job-specification/constraint) to target specific client nodes.
+
+## High Availability
+
+A Nomad server cluster is the highly available unit of deployment within a single datacenter. A recommended approach is to deploy a three or five node Nomad server cluster.
With this configuration, during a Nomad server outage, failover is handled immediately without human intervention. + +When setting up high availability across regions, multiple Nomad server clusters are deployed and connected via WAN gossip. Nomad clusters in regions are fully independent from each other and do not share jobs, clients, or state. Data residing in a single region-specific cluster is not replicated to other clusters in other regions. + +## Failure Scenarios + +Typical distribution in a cloud environment is to spread Nomad server nodes into separate Availability Zones (AZs) within a high bandwidth, low latency network, such as an AWS Region. The diagram below shows Nomad servers deployed in multiple AZs promoting a single voting member per AZ and providing both AZ-level and node-level failure protection. + +![Nomad fault tolerance](/img/nomad_fault_tolerance.png) + +Additional documentation is available to learn more about [cluster sizing and failure tolerances](/docs/internals/consensus#deployment-table) as well as [outage recovery](https://learn.hashicorp.com/nomad/operating-nomad/outage). + +### Availability Zone Failure + +In the event of a single AZ failure, only a single Nomad server will be affected which would not impact job scheduling as long as there is still a Raft quorum (i.e. 2 available servers in a 3 server cluster, 3 available servers in a 5 server cluster, etc.). There are two scenarios that could occur should an AZ fail in a multiple AZ setup: leader loss or follower loss. + +#### Leader Server Loss + +If the AZ containing the Nomad leader server fails, the remaining quorum members would elect a new leader. The new leader then begins to accept new log entries and replicates these entries to the remaining followers. + +#### Follower Server Loss + +If the AZ containing a Nomad follower server fails, there is no immediate impact to the Nomad leader server or cluster operations. 
However, there still must be a Raft quorum in order to properly manage a future failure of the Nomad leader server. + +### Region Failure + +In the event of a region-level failure (which would contain an entire Nomad server cluster), clients will still be able to submit jobs to another region that is properly federated. However, there will likely be data loss as Nomad server clusters do not replicate their data to other region clusters. See [Multi-region Federation](https://learn.hashicorp.com/nomad/operating-nomad/federation) for more setup information. + +## Next Steps + +- Read [Deployment Guide](/docs/install/production/deployment-guide) to learn + the steps required to install and configure a single HashiCorp Nomad cluster. + +[acl]: https://learn.hashicorp.com/nomad?track=acls#operations-and-development +[sentinel]: https://learn.hashicorp.com/nomad/governance-and-policy/sentinel diff --git a/content/nomad/v0.11.x/content/docs/install/production/requirements.mdx b/content/nomad/v0.11.x/content/docs/install/production/requirements.mdx new file mode 100644 index 0000000000..da6eeb7ac5 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/install/production/requirements.mdx @@ -0,0 +1,109 @@ +--- +layout: docs +page_title: Hardware Requirements +sidebar_title: Hardware Requirements +description: |- + Learn about Nomad client and server requirements such as memory and CPU + recommendations, network topologies, and more. +--- + +# Hardware Requirements + +## Resources (RAM, CPU, etc.) + +**Nomad servers** may need to be run on large machine instances. We suggest +having between 4-8+ cores, 16-32 GB+ of memory, 40-80 GB+ of **fast** disk and +significant network bandwidth. The core count and network recommendations are to +ensure high throughput as Nomad heavily relies on network communication and as +the Servers are managing all the nodes in the region and performing scheduling. 
+The memory and disk requirements exist because Nomad stores all state
+in memory and writes two snapshots of this data to disk, which causes heavy IO
+in busy clusters with lots of writes. Disk should therefore be at least
+2 times the memory available to the server when deploying a high load
+cluster. When running on AWS, prefer NVMe or Provisioned IOPS SSD storage
+for the data directory.
+
+These recommendations are guidelines and operators should always monitor the
+resource usage of Nomad to determine if the machines are under or over-sized.
+
+**Nomad clients** support reserving resources on the node that should not be
+used by Nomad. This should be used to target a specific resource utilization per
+node and to reserve resources for applications running outside of Nomad's
+supervision such as Consul and the operating system itself.
+
+Please see the [reservation configuration](/docs/configuration/client#reserved) for
+more detail.
+
+## Network Topology
+
+**Nomad servers** are expected to have sub 10 millisecond network latencies
+between each other to ensure liveness and high throughput scheduling. Nomad
+servers can be spread across multiple datacenters if they have low latency
+connections between them to achieve high availability.
+
+For example, on AWS every region comprises multiple availability zones which
+have very low latency links between them, so every zone can be modeled as a
+Nomad datacenter and every zone can have a single Nomad server which could be
+connected to form a quorum and a region.
+
+Nomad servers use Raft for state replication. Because Raft is strongly
+consistent and needs a quorum of servers to function, we recommend running an
+odd number of Nomad servers in a region. Usually running 3-5 servers in a
+region is recommended. The cluster can withstand a failure of one server in a
+cluster of three servers and two failures in a cluster of five servers.
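These tolerances follow from Raft's majority rule: a cluster of n servers needs floor(n/2) + 1 live members to form a quorum. A quick shell sketch of the arithmetic:

```shell
# Quorum size and failure tolerance for common Nomad server counts.
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( (n - 1) / 2 ))
  echo "$n servers: quorum of $quorum, tolerates $tolerated failure(s)"
done
```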
Adding more servers
+to the quorum adds more time to replicate state, so throughput decreases;
+we therefore don't recommend having more than seven servers in a region.
+
+**Nomad clients** do not have the same latency requirements as servers since they
+are not participating in Raft. Thus clients can have 100+ millisecond latency to
+their servers. This allows having a set of Nomad servers that service clients
+spread geographically over a continent, or even the world, in the case of a
+single "global" region and many datacenters.
+
+## Ports Used
+
+Nomad requires three different ports to work properly on servers and two on
+clients, using TCP, UDP, or both protocols. Below we document the requirements
+for each port.
+
+- HTTP API (Default 4646). This is used by clients and servers to serve the HTTP
+  API. TCP only.
+
+- RPC (Default 4647). This is used for internal RPC communication between client
+  agents and servers, and for inter-server traffic. TCP only.
+
+- Serf WAN (Default 4648). This is used by servers to gossip both over the LAN and
+  WAN to other servers. It isn't required that Nomad clients can reach this address.
+  TCP and UDP.
+
+When tasks ask for dynamic ports, they are allocated out of the port range
+between 20,000 and 32,000. This is well under the ephemeral port range suggested
+by the [IANA](https://en.wikipedia.org/wiki/Ephemeral_port). If your operating
+system's default ephemeral port range overlaps with Nomad's dynamic port range,
+you should tune the OS to avoid this overlap.
+
+On Linux this can be checked and set as follows:
+
+```shell-session
+$ cat /proc/sys/net/ipv4/ip_local_port_range
+32768 60999
+$ echo "49152 65535" > /proc/sys/net/ipv4/ip_local_port_range
+```
+
+## Bridge Networking and `iptables`
+
+Nomad's task group networks and Consul Connect integration use bridge networking and iptables to send traffic between containers.
The Linux kernel bridge module has three "tunables" that control whether traffic crossing the bridge is processed by iptables. Some operating systems (Red Hat, CentOS, and Fedora in particular) configure these tunables to optimize for VM workloads where iptables rules might not be correctly configured for guest traffic.
+
+These tunables can be set to allow iptables processing for the bridge network as follows:
+
+```shell-session
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
+```
+
+To preserve these settings on startup of a client node, add a file including the following to `/etc/sysctl.d/` or remove the file your Linux distribution puts in that directory.
+
+```text
+net.bridge.bridge-nf-call-arptables = 1
+net.bridge.bridge-nf-call-ip6tables = 1
+net.bridge.bridge-nf-call-iptables = 1
+```
diff --git a/content/nomad/v0.11.x/content/docs/install/quickstart/index.mdx b/content/nomad/v0.11.x/content/docs/install/quickstart/index.mdx
new file mode 100644
index 0000000000..26fcf81356
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/install/quickstart/index.mdx
@@ -0,0 +1,43 @@
+---
+layout: docs
+page_title: Installing Nomad for QuickStart
+sidebar_title: Quickstart
+description: Learn how to install Nomad locally or in a sandbox.
+---
+
+# Quickstart
+
+This page lists multiple methods of installing Nomad locally or in a sandbox
+environment.
+
+These installations are designed to get you started with Nomad easily and should
+be used only for experimentation purposes. If you are looking to install Nomad
+in production, please refer to our [Production
+Installation](/docs/install/production) guide.
+
+## Local
+
+Install Nomad on your local machine.
+
+- [Installing the Pre-compiled Binary][installing-binary]
+- [Installing Nomad with Vagrant][vagrant-environment]
+
+## Cloud
+
+Install Nomad on the public cloud.
+ +- AWS + - [CloudFormation](https://aws.amazon.com/quickstart/architecture/nomad/) + - [Terraform](https://github.com/hashicorp/nomad/blob/master/terraform/aws/README.md) +- Azure + - [Terraform](https://github.com/hashicorp/nomad/tree/master/terraform/azure) + +## Katacoda + +Experiment with Nomad in your browser via KataCoda's interactive learning platform. + +- [Introduction to Nomad](https://www.katacoda.com/hashicorp/scenarios/nomad-introduction) +- [Nomad Playground](https://katacoda.com/hashicorp/scenarios/playground) + +[installing-binary]: /docs/install/#precompiled-binaries +[vagrant-environment]: https://learn.hashicorp.com/nomad/getting-started/install#vagrant-setup-optional diff --git a/content/nomad/v0.11.x/content/docs/install/windows-service.mdx b/content/nomad/v0.11.x/content/docs/install/windows-service.mdx new file mode 100644 index 0000000000..dba9e7bd9e --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/install/windows-service.mdx @@ -0,0 +1,75 @@ +--- +layout: docs +page_title: Nomad as a Windows Service +sidebar_title: Windows Service +description: Discusses how to register and run Nomad as a native Windows service. +--- + +# Installing Nomad as a Windows service + +Nomad can be run as a native Windows service. In order to do this, you will need +to register the Nomad application with the Windows Service Control Manager using +[`sc.exe`], configure Nomad to log to a file, and then start the Nomad service. + +~> **Note:** These steps should be run in a PowerShell session with Administrator +capabilities. + +## Register Nomad with Windows + +[Download] the Nomad binary for your architecture. + +Use the [`sc.exe`] command to create a Service named "Nomad". The binPath +argument should include the fully qualified path to the Nomad executable and any +arguments to the nomad command: agent, -config, etc. 
+ +```plaintext +sc.exe create "Nomad" binPath="«full path to nomad.exe» agent -config=«path to config file or directory»" start= auto +[SC] CreateService SUCCESS +``` + +If you receive a success message, your service is registered with the service +manager. + +If you get an error, please verify the path to the binary and check the +arguments, by running the contents of `binPath` directly in a PowerShell session +and observing the results. + +## Configure Nomad to log to file + +Because Windows services run non-interactively and Nomad does not log to the +Windows Event Viewer, you will need to configure file-based logging in Nomad. + +To do this, set the [`log_file`][logging] argument in your Nomad configuration +file or in the binPath argument of the [`sc.exe`] command used to register the +service. + +## Start the Nomad service + +You have two ways to start the service. + +- Go to the Windows Service Manager, and look for **Nomad** in the service name + column. Click the _Start_ button to start the service. + +- Using the [`sc.exe`] command: + + ```plaintext + sc.exe start "Nomad" + + SERVICE_NAME: Nomad + TYPE : 10 WIN32_OWN_PROCESS + STATE : 4 RUNNING + (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN) + WIN32_EXIT_CODE : 0 (0x0) + SERVICE_EXIT_CODE : 0 (0x0) + CHECKPOINT : 0x0 + WAIT_HINT : 0x0 + PID : 8008 + FLAGS : + ``` + +The service automatically starts up during/after boot, so you don't need to +launch Nomad from the command-line again. 
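Putting the pieces together, a minimal Windows agent configuration that enables the file logging described above might look like this — the paths below are illustrative assumptions, not defaults:

```hcl
# C:\Nomad\nomad.hcl (illustrative sketch)
data_dir = "C:/Nomad/data"

# Send agent logs to a file, since Windows services run non-interactively.
log_file = "C:/Nomad/nomad.log"
```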
+ +[`sc.exe`]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682107(v=vs.85).aspx +[download]: /downloads +[logging]: /docs/configuration#log_file diff --git a/content/nomad/v0.11.x/content/docs/integrations/consul-connect.mdx b/content/nomad/v0.11.x/content/docs/integrations/consul-connect.mdx new file mode 100644 index 0000000000..83e77d2d48 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/integrations/consul-connect.mdx @@ -0,0 +1,343 @@ +--- +layout: docs +page_title: Consul Connect +sidebar_title: Consul Connect +description: >- + Learn how to use Nomad with Consul Connect to enable secure service to service + communication +--- + +# Consul Connect + +~> **Note:** This guide requires Nomad 0.10.0 or later and Consul 1.6.0 or +later. + +~> **Note:** Nomad's Connect integration requires Linux network namespaces. +Nomad Connect will not run on Windows or macOS. + +[Consul Connect](https://www.consul.io/docs/connect) provides +service-to-service connection authorization and encryption using mutual +Transport Layer Security (TLS). Applications can use sidecar proxies in a +service mesh configuration to automatically establish TLS connections for +inbound and outbound connections without being aware of Connect at all. + +# Nomad with Consul Connect Integration + +Nomad integrates with Consul to provide secure service-to-service communication +between Nomad jobs and task groups. In order to support Consul Connect, Nomad +adds a new networking mode for jobs that enables tasks in the same task group to +share their networking stack. With a few changes to the job specification, job +authors can opt into Connect integration. When Connect is enabled, Nomad will +launch a proxy alongside the application in the job file. The proxy (Envoy) +provides secure communication with other applications in the cluster. 
+
+Nomad job specification authors can use Nomad's Consul Connect integration to
+implement [service segmentation](https://www.consul.io/segmentation.html) in a
+microservice architecture running in public clouds without having to directly
+manage TLS certificates. This is transparent to job specification authors as
+security features in Connect continue to work even as the application scales up
+or down or gets rescheduled by Nomad.
+
+To use the Consul Connect integration with Consul ACLs enabled, see the
+[Secure Nomad Jobs with Consul Connect](https://learn.hashicorp.com/nomad/consul-integration/nomad-connect-acl)
+guide.
+
+# Nomad Consul Connect Example
+
+The following section walks through an example to enable secure communication
+between a web dashboard and a backend counting service. The web dashboard and
+the counting service are managed by Nomad. Nomad additionally configures Envoy
+proxies to run alongside these applications. The dashboard is configured to
+connect to the counting service via localhost on port 9001. The proxy is managed
+by Nomad, and handles mTLS communication to the counting service.
+
+## Prerequisites
+
+### Consul
+
+Connect integration with Nomad requires [Consul 1.6 or
+later](https://releases.hashicorp.com/consul/1.6.0/). The Consul agent can be
+run in dev mode with the following command:
+
+**Note**: Nomad's Connect integration requires Consul in your `$PATH`.
+
+```shell-session
+$ consul agent -dev
+```
+
+To use Connect on a non-dev Consul agent, you will minimally need to enable the
+gRPC port and set `connect` to enabled by adding some additional information to
+your Consul client configurations, depending on format.
+
+For HCL configurations:
+
+```hcl
+# ...
+
+ports {
+  grpc = 8502
+}
+
+connect {
+  enabled = true
+}
+```
+
+For JSON configurations:
+
+```javascript
+{
+  // ...
+  "ports": {
+    "grpc": 8502
+  },
+  "connect": {
+    "enabled": true
+  }
+}
+```
+
+### Nomad
+
+Nomad must schedule onto a routable interface in order for the proxies to
+connect to each other. The following steps show how to start a Nomad dev agent
+configured for Connect.
+
+```shell-session
+$ sudo nomad agent -dev-connect
+```
+
+### CNI Plugins
+
+Nomad uses CNI plugins to configure the network namespace used to secure the
+Consul Connect sidecar proxy. All Nomad client nodes using network namespaces
+must have CNI plugins installed.
+
+The following commands install CNI plugins:
+
+```shell-session
+$ curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz
+$ sudo mkdir -p /opt/cni/bin
+$ sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
+```
+
+Ensure that your Linux operating system distribution has been configured to allow
+container traffic through the bridge network to be routed via iptables. These
+tunables can be set as follows:
+
+```shell-session
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
+$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
+```
+
+To preserve these settings on startup of a client node, add a file including the
+following to `/etc/sysctl.d/` or remove the file your Linux distribution puts in
+that directory.
+ +``` +net.bridge.bridge-nf-call-arptables = 1 +net.bridge.bridge-nf-call-ip6tables = 1 +net.bridge.bridge-nf-call-iptables = 1 +``` + +## Run the Connect-enabled Services + +Once Nomad and Consul are running, submit the following Connect-enabled services +to Nomad by copying the HCL into a file named `connect.nomad` and running: +`nomad run connect.nomad` + +```hcl +job "countdash" { + datacenters = ["dc1"] + + group "api" { + network { + mode = "bridge" + } + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + } + + task "web" { + driver = "docker" + + config { + image = "hashicorpnomad/counter-api:v1" + } + } + } + + group "dashboard" { + network { + mode = "bridge" + + port "http" { + static = 9002 + to = 9002 + } + } + + service { + name = "count-dashboard" + port = "9002" + + connect { + sidecar_service { + proxy { + upstreams { + destination_name = "count-api" + local_bind_port = 8080 + } + } + } + } + } + + task "dashboard" { + driver = "docker" + + env { + COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}" + } + + config { + image = "hashicorpnomad/counter-dashboard:v1" + } + } + } +} +``` + +The job contains two task groups: an API service and a web frontend. + +### API Service + +The API service is defined as a task group with a bridge network: + +```hcl + group "api" { + network { + mode = "bridge" + } + + # ... + } +``` + +Since the API service is only accessible via Consul Connect, it does not define +any ports in its network. The service stanza enables Connect: + +```hcl + group "api" { + + # ... + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + } + + # ... + + } +``` + +The `port` in the service stanza is the port the API service listens on. The +Envoy proxy will automatically route traffic to that port inside the network +namespace. 

### Web Frontend

The web frontend is defined as a task group with a bridge network and a static
forwarded port:

```hcl
  group "dashboard" {
    network {
      mode = "bridge"

      port "http" {
        static = 9002
        to     = 9002
      }
    }

    # ...

  }
```

The `static = 9002` parameter requests that the Nomad scheduler reserve port
9002 on a host network interface. The `to = 9002` parameter forwards that host
port to port 9002 inside the network namespace.

This allows you to connect to the web frontend in a browser by visiting
`http://<host>:9002` as shown below:

[![Count Dashboard][count-dashboard]][count-dashboard]

The web frontend connects to the API service via Consul Connect:

```hcl
    service {
      name = "count-dashboard"
      port = "9002"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "count-api"
              local_bind_port  = 8080
            }
          }
        }
      }
    }
```

The `upstreams` stanza defines the remote service to access (`count-api`) and
what port to expose that service on inside the network namespace (`8080`).

The web frontend is configured to communicate with the API service with an
environment variable:

```hcl
      env {
        COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
      }
```

The web frontend is configured via `$COUNTING_SERVICE_URL`, so you must
interpolate the upstream's address into that environment variable. Note that
dashes (`-`) are converted to underscores (`_`) in environment variables, so
`count-api` becomes `count_api`.

## Limitations

- The `consul` binary must be present in Nomad's `$PATH` to run the Envoy
  proxy sidecar on client nodes.
- Consul Connect Native is not yet supported ([#6083][gh6083]).
- Only the Docker, `exec`, `raw_exec`, and `java` drivers support network
  namespaces and Connect.
- Changes to the `connect` stanza may not properly trigger a job update
  ([#6459][gh6459]).
  Changing a `meta` variable is the suggested workaround, as this will always
  cause an update to occur.
- Consul Connect and network namespaces are only supported on Linux.

[count-dashboard]: /img/count-dashboard.png
[gh6083]: https://github.com/hashicorp/nomad/issues/6083
[gh6120]: https://github.com/hashicorp/nomad/issues/6120
[gh6701]: https://github.com/hashicorp/nomad/issues/6701
[gh6459]: https://github.com/hashicorp/nomad/issues/6459

diff --git a/content/nomad/v0.11.x/content/docs/integrations/consul-integration.mdx b/content/nomad/v0.11.x/content/docs/integrations/consul-integration.mdx
new file mode 100644
index 0000000000..fa95528fa5
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/integrations/consul-integration.mdx
@@ -0,0 +1,76 @@
---
layout: docs
page_title: Consul Integration
sidebar_title: Consul
description: Learn how to integrate Nomad with Consul and add service discovery to jobs
---

# Consul Integration

[Consul][] is a tool for discovering and configuring services in your
infrastructure. Consul's key features include service discovery, health
checking, a KV store, and robust support for multi-datacenter deployments.
Nomad's integration with Consul enables automatic clustering, built-in service
registration, and dynamic rendering of configuration files and environment
variables. The sections below describe the integration in more detail.

## Configuration

In order to use Consul with Nomad, you will need to configure and install
Consul on your nodes alongside Nomad, or schedule it as a system job. Nomad
does not currently run Consul for you.

To enable Consul integration, please see the
[Nomad agent Consul integration](/docs/configuration/consul)
configuration.

## Automatic Clustering with Consul

Nomad servers and clients will be automatically informed of each other's
existence when a running Consul cluster already exists and the Consul agent is
installed and configured on each host.
Please see the [Automatic Clustering with
Consul](https://learn.hashicorp.com/nomad/operating-nomad/clustering) guide for
more information.

## Service Discovery

Nomad schedules workloads of various types across a cluster of generic hosts.
Because of this, placement is not known in advance and you will need to use
service discovery to connect tasks to other services deployed across your
cluster. Nomad integrates with Consul to provide service discovery and
monitoring.

To configure a job to register with service discovery, please see the
[`service` job specification documentation][service].

## Dynamic Configuration

Nomad's job specification includes a [`template` stanza](/docs/job-specification/template)
that utilizes a Consul ecosystem tool called
[Consul Template](https://github.com/hashicorp/consul-template). This mechanism
creates a convenient way to ship configuration files that are populated from
environment variables, Consul data, Vault secrets, or just general
configurations within a Nomad task.

For more information on Nomad's template stanza and how it leverages Consul
Template, please see the
[`template` job specification documentation](/docs/job-specification/template).

## Assumptions

- Consul 0.7.2 or later is needed for `tls_skip_verify` in HTTP checks.

- Consul 0.6.4 or later is needed for using script checks.

- Consul 0.6.0 or later is needed for using TCP checks.

- The service discovery feature in Nomad depends on operators making sure that
  the Nomad client can reach the Consul agent.

- Tasks running inside Nomad also need to reach out to the Consul agent if
  they want to use any of the Consul APIs. For example, a task running inside a
  Docker container in bridge mode won't be able to talk to a Consul agent
  running on the loopback interface of the host, since the container in bridge
  mode has its own network interface and doesn't see interfaces in the global
  network namespace of the host.
  There are a couple of ways to solve this: run the container in host
  networking mode, or have the Consul agent listen on an interface in the
  network namespace of the container.

[consul]: https://www.consul.io/ 'Consul by HashiCorp'
[service]: /docs/job-specification/service 'Nomad service Job Specification'

diff --git a/content/nomad/v0.11.x/content/docs/integrations/index.mdx b/content/nomad/v0.11.x/content/docs/integrations/index.mdx
new file mode 100644
index 0000000000..5778b3e468
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/integrations/index.mdx
@@ -0,0 +1,12 @@
---
layout: docs
page_title: Nomad HashiStack Integrations
sidebar_title: Integrations
description: This section features Nomad's integrations with Consul and Vault.
---

# HashiCorp Integrations

Nomad integrates seamlessly with Consul and Vault for service discovery and
secrets management.

Please navigate to the appropriate sub-sections for more information.

diff --git a/content/nomad/v0.11.x/content/docs/integrations/vault-integration.mdx b/content/nomad/v0.11.x/content/docs/integrations/vault-integration.mdx
new file mode 100644
index 0000000000..f3e7f9e1dd
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/integrations/vault-integration.mdx
@@ -0,0 +1,709 @@
---
layout: docs
page_title: Vault Integration and Retrieving Dynamic Secrets
sidebar_title: Vault
description: |-
  Learn how to deploy an application in Nomad and retrieve dynamic credentials
  by integrating with Vault.
---

# Vault Integration

Nomad integrates seamlessly with [Vault][vault] and allows your application to
retrieve dynamic credentials for various tasks. In this guide, you will deploy a
web application that needs to authenticate against [PostgreSQL][postgresql] to
display data from a table to the user.

## Reference Material

- [Vault Integration Documentation][vault-integration]
- [Nomad Template Stanza Integration with Vault][nomad-template-vault]
- [Secrets Task Directory][secrets-task-directory]

## Estimated Time to Complete

20 minutes

## Challenge

Think of a scenario where a Nomad operator needs to deploy an application that
can quickly and safely retrieve dynamic credentials to authenticate against a
database and return information.

## Solution

Deploy Vault and configure the nodes in your Nomad cluster to integrate with it.
Use the appropriate [templating syntax][nomad-template-vault] to retrieve
credentials from Vault and then store those credentials in the
[secrets][secrets-task-directory] task directory to be consumed by the Nomad
task.

## Prerequisites

To perform the tasks described in this guide, you need to have a Nomad
environment with Consul and Vault installed. You can use this [repo][repo] to
easily provision a sandbox environment. This guide will assume a cluster with
one server node and three client nodes.

-> **Please Note:** This guide is for demo purposes and is only using a single
Nomad server with Vault installed alongside. In a production cluster, 3 or 5
Nomad server nodes are recommended along with a separate Vault cluster.

## Steps

### Step 1: Initialize Vault Server

Run the following command to initialize the Vault server and receive an
[unseal][seal] key and initial root [token][token]. Be sure to note the unseal
key and initial root token, as you will need these two pieces of information.

```shell-session
$ vault operator init -key-shares=1 -key-threshold=1
```

The `vault operator init` command above creates a single Vault unseal key for
convenience. For a production environment, it is recommended that you create at
least five unseal key shares and securely distribute them to independent
operators.
The `vault operator init` command defaults to five key shares and a
key threshold of three. If you provisioned more than one server, the others will
become standby nodes but should still be unsealed.

### Step 2: Unseal Vault

Run the following command and then provide your unseal key to Vault.

```shell-session
$ vault operator unseal
```

The output of unsealing Vault will look similar to the following:

```text
Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           1
Threshold              1
Version                0.11.4
Cluster Name           vault-cluster-d12535e5
Cluster ID             49383931-c782-fdc6-443e-7681e7b15aca
HA Enabled             true
HA Cluster             n/a
HA Mode                standby
Active Node Address    <none>
```

### Step 3: Log in to Vault

Use the [login][login] command to authenticate yourself against Vault using the
initial root token you received earlier. You will need to authenticate to run
the necessary commands to write policies, create roles, and configure a
connection to your database.

```shell-session
$ vault login
```

If your login is successful, you will see output similar to what is shown below:

```text
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.
...
```

### Step 4: Write the Policy for the Nomad Server Token

To use the Vault integration, you must provide a Vault token to your Nomad
servers. Although you can provide your root token to easily get started, the
recommended approach is to use a token created from a token [role][role]. This
first requires writing a policy that you will attach to the token you provide
to your Nomad servers. By using this approach, you can limit the set of
[policies][policy] that tasks managed by Nomad can access.

For this exercise, use the following policy for the token you will create for
your Nomad server.
Place this policy in a file named `nomad-server-policy.hcl`.

```hcl
# Allow creating tokens under "nomad-cluster" token role. The token role name
# should be updated if "nomad-cluster" is not used.
path "auth/token/create/nomad-cluster" {
  capabilities = ["update"]
}

# Allow looking up "nomad-cluster" token role. The token role name should be
# updated if "nomad-cluster" is not used.
path "auth/token/roles/nomad-cluster" {
  capabilities = ["read"]
}

# Allow looking up the token passed to Nomad to validate the token has the
# proper capabilities. This is provided by the "default" policy.
path "auth/token/lookup-self" {
  capabilities = ["read"]
}

# Allow looking up incoming tokens to validate they have permissions to access
# the tokens they are requesting. This is only required if
# `allow_unauthenticated` is set to false.
path "auth/token/lookup" {
  capabilities = ["update"]
}

# Allow revoking tokens that should no longer exist. This allows revoking
# tokens for dead tasks.
path "auth/token/revoke-accessor" {
  capabilities = ["update"]
}

# Allow checking the capabilities of our own token. This is used to validate the
# token upon startup.
path "sys/capabilities-self" {
  capabilities = ["update"]
}

# Allow our own token to be renewed.
path "auth/token/renew-self" {
  capabilities = ["update"]
}
```

You can now write a policy called `nomad-server` by running the following
command:

```shell-session
$ vault policy write nomad-server nomad-server-policy.hcl
```

You should see the following output:

```text
Success! Uploaded policy: nomad-server
```

You will generate the actual token in the next few steps.

### Step 5: Create a Token Role

At this point, you must create a Vault token role that Nomad can use. The token
role allows you to limit what Vault policies are accessible by jobs submitted
to Nomad.
We will use the following token role:

```json
{
  "allowed_policies": "access-tables",
  "token_explicit_max_ttl": 0,
  "name": "nomad-cluster",
  "orphan": true,
  "token_period": 259200,
  "renewable": true
}
```

Notice that the `access-tables` policy is listed under the `allowed_policies`
key. We have not created this policy yet, but it will be used by our job to
retrieve credentials to access the database. A job running in our Nomad cluster
will only be allowed to use the `access-tables` policy.

If you would like to allow all policies to be used by any job in the Nomad
cluster except for the ones you specifically prohibit, then use the
`disallowed_policies` key instead and simply list the policies that should not
be granted. If you take this approach, be sure to include `nomad-server` in the
disallowed policies group. An example of this is shown below:

```json
{
  "disallowed_policies": "nomad-server",
  "token_explicit_max_ttl": 0,
  "name": "nomad-cluster",
  "orphan": true,
  "token_period": 259200,
  "renewable": true
}
```

Save the role configuration in a file named `nomad-cluster-role.json` and
create the token role named `nomad-cluster`.

```shell-session
$ vault write /auth/token/roles/nomad-cluster @nomad-cluster-role.json
```

You should see the following output:

```text
Success! Data written to: auth/token/roles/nomad-cluster
```

### Step 6: Generate the Token for the Nomad Server

Run the following command to create a token for your Nomad server:

```shell-session
$ vault token create -policy nomad-server -period 72h -orphan
```

The `-orphan` flag is included when generating the Nomad server token above to
prevent revocation of the token when its parent expires. Vault typically creates
tokens with a parent-child relationship. When an ancestor token is revoked, all
of its descendant tokens and their associated leases are revoked as well.

If everything works, you should see output similar to the following:

```text
Key                  Value
---                  -----
token                1gr0YoLyTBVZl5UqqvCfK9RJ
token_accessor       5fz20DuDbxKgweJZt3cMynya
token_duration       72h
token_renewable      true
token_policies       ["default" "nomad-server"]
identity_policies    []
policies             ["default" "nomad-server"]
```

### Step 7: Edit the Nomad Server Configuration to Enable Vault Integration

At this point, you are ready to edit the [vault stanza][vault-stanza] in the
Nomad server's configuration file located at `/etc/nomad.d/nomad.hcl`. Provide
the token you generated in the previous step in the `vault` stanza of your Nomad
server configuration. The token can also be provided as an environment variable
called `VAULT_TOKEN`. Be sure to specify the `nomad-cluster` role in the
[create_from_role][create-from-role] option. If using
[Vault Namespaces](https://www.vaultproject.io/docs/enterprise/namespaces),
modify both the client and server configuration to include the namespace;
alternatively, it can be provided in the environment variable `VAULT_NAMESPACE`.
After following these steps and enabling Vault, the `vault` stanza in your Nomad
server configuration will be similar to what is shown below:

```hcl
vault {
  enabled          = true
  address          = "http://active.vault.service.consul:8200"
  task_token_ttl   = "1h"
  create_from_role = "nomad-cluster"
  token            = "<your nomad server token>"
  namespace        = "<vault namespace>"
}
```

Restart the Nomad server:

```shell-session
$ sudo systemctl restart nomad
```

NOTE: Nomad servers will renew the token automatically.

Vault integration needs to be enabled on the client nodes as well, but this has
been configured for you already in this environment.
You will see the `vault`
stanza in your Nomad clients' configuration (located at
`/etc/nomad.d/nomad.hcl`) looks similar to the following:

```hcl
vault {
  enabled = true
  address = "http://active.vault.service.consul:8200"
}
```

Please note that the Nomad clients do not need to be provided with a Vault
token.

### Step 8: Deploy Database

The next few steps will involve configuring a connection between Vault and our
database, so let's deploy one that we can connect to. Create a Nomad job called
`db.nomad` with the following content:

```hcl
job "postgres-nomad-demo" {
  datacenters = ["dc1"]

  group "db" {

    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/postgres-nomad-demo:latest"
        port_map {
          db = 5432
        }
      }

      resources {
        network {
          port "db" {
            static = 5432
          }
        }
      }

      service {
        name = "database"
        port = "db"

        check {
          type     = "tcp"
          interval = "2s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

Run the job as shown below:

```shell-session
$ nomad run db.nomad
```

Verify the job is running with the following command:

```shell-session
$ nomad status postgres-nomad-demo
```

The result of the status command will look similar to the output below:

```text
ID            = postgres-nomad-demo
Name          = postgres-nomad-demo
Submit Date   = 2018-11-15T21:01:00Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
db          0       0         1        0       0         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
701e2699  5de1330c  db          0        run      running  1m56s ago  1m33s ago
```

Now we can move on to configuring the connection between Vault and our database.

### Step 9: Enable the Database Secrets Engine

We are using the database secrets engine for Vault in this exercise so that we
can generate dynamic credentials for our PostgreSQL database.
Run the following command to enable it:

```shell-session
$ vault secrets enable database
```

If the previous command was successful, you will see the following output:

```text
Success! Enabled the database secrets engine at: database/
```

### Step 10: Configure the Database Secrets Engine

Create a file named `connection.json` and place the following information in
it:

```json
{
  "plugin_name": "postgresql-database-plugin",
  "allowed_roles": "accessdb",
  "connection_url": "postgresql://{{username}}:{{password}}@database.service.consul:5432/postgres?sslmode=disable",
  "username": "postgres",
  "password": "postgres123"
}
```

The information above allows Vault to connect to our database and create users
with specific privileges. We will specify the `accessdb` role soon. In a
production setting, it is recommended to give Vault credentials with enough
privileges to generate database credentials dynamically and manage their
lifecycle.

Run the following command to configure the connection between the database
secrets engine and our database:

```shell-session
$ vault write database/config/postgresql @connection.json
```

If the operation is successful, there will be no output.

### Step 11: Create a Vault Role to Manage Database Privileges

Recall from the previous step that we specified `accessdb` in the
`allowed_roles` key of our connection information. Let's set up that role now.
Create a file called `accessdb.sql` with the following content:

```sql
CREATE USER "{{name}}" WITH ENCRYPTED PASSWORD '{{password}}' VALID UNTIL
'{{expiration}}';
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO "{{name}}";
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "{{name}}";
GRANT ALL ON SCHEMA public TO "{{name}}";
```

The SQL above will be used in the [creation_statements][creation-statements]
parameter of our next command to specify the privileges that the dynamic
credentials being generated will possess. In our case, the dynamic database user
will have broad privileges that include the ability to read from the tables that
our application will need to access.

Run the following command to create the role:

```shell-session
$ vault write database/roles/accessdb db_name=postgresql \
  creation_statements=@accessdb.sql default_ttl=1h max_ttl=24h
```

You should see the following output after running the previous command:

```text
Success! Data written to: database/roles/accessdb
```

### Step 12: Generate PostgreSQL Credentials

You should now be able to generate dynamic credentials to access your database.
Run the following command to generate a set of credentials:

```shell-session
$ vault read database/creds/accessdb
```

The previous command should return output similar to what is shown below:

```text
Key                Value
---                -----
lease_id           database/creds/accessdb/3JozEMSMqw0vHHhvla15sKTW
lease_duration     1h
lease_renewable    true
password           A1a-3pMGjpDXHZ2Qzuf7
username           v-root-accessdb-5LA65urB4daA8KYy2xku-1542318363
```

Congratulations! You have configured Vault's connection to your database and can
now generate credentials with the previously specified privileges. Now we need
to deploy our application and make sure that it will be able to communicate with
Vault and obtain the credentials as well.
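Once a task has credentials like these, consuming them is ordinary database plumbing. As an illustrative sketch only (the helper below is hypothetical; the host, port, and database defaults mirror the `connection.json` from Step 10, and the sample values mirror the output above), an application could assemble them into a standard PostgreSQL connection string:

```python
def build_dsn(username, password, host="database.service.consul",
              port=5432, db="postgres"):
    # Assemble a PostgreSQL connection string from Vault-issued credentials.
    # Defaults mirror the connection.json used to configure the secrets engine.
    return f"postgresql://{username}:{password}@{host}:{port}/{db}?sslmode=disable"

# Values mirror the sample `vault read database/creds/accessdb` output above.
print(build_dsn(
    username="v-root-accessdb-5LA65urB4daA8KYy2xku-1542318363",
    password="A1a-3pMGjpDXHZ2Qzuf7",
))
```

In a Nomad job, this assembly typically happens in the `template` stanza instead, so the credentials never have to pass through application configuration by hand.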

### Step 13: Create the `access-tables` Policy for Your Nomad Job to Use

Recall from [Step 5][step-5] that we specified a policy named `access-tables` in
our `allowed_policies` section of the token role. We will create this policy now
and give it the capability to read from the `database/creds/accessdb` endpoint
(the same endpoint we read from in the previous step to generate credentials for
our database). We will then specify this policy in our Nomad job, which will
allow it to retrieve credentials for itself to access the database.

On the Nomad server (which is also running Vault), create a file named
`access-tables-policy.hcl` with the following content:

```hcl
path "database/creds/accessdb" {
  capabilities = ["read"]
}
```

Create the `access-tables` policy with the following command:

```shell-session
$ vault policy write access-tables access-tables-policy.hcl
```

You should see the following output:

```text
Success! Uploaded policy: access-tables
```

### Step 14: Deploy Your Job with the Appropriate Policy and Templating

Now we are ready to deploy our web application and give it the necessary policy
and configuration to communicate with our database. Create a file called
`web-app.nomad` and save the following content in it.

```hcl
job "nomad-vault-demo" {
  datacenters = ["dc1"]

  group "demo" {
    task "server" {

      vault {
        policies = ["access-tables"]
      }

      driver = "docker"
      config {
        image = "hashicorp/nomad-vault-demo:latest"
        port_map {
          http = 8080
        }

        volumes = [
          "secrets/config.json:/etc/demo/config.json"
        ]
      }

      template {
        data = <<EOF
{{ with secret "database/creds/accessdb" }}
  {
    "host": "database.service.consul",
    "port": 5432,
    "username": "{{ .Data.username }}",
    "password": "{{ .Data.password }}",
    "db": "postgres"
  }
{{ end }}
EOF

        destination = "secrets/config.json"
      }

      resources {
        network {
          port "http" {
            static = 8080
          }
        }
      }

      service {
        name = "nomad-vault-demo"
        port = "http"

        check {
          type     = "tcp"
          interval = "2s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

Run the job as shown below:

```shell-session
$ nomad run web-app.nomad
```

There are several ways to verify that the deployment was successful:

- Connect to the application at the allocation's address and port. If the task
  was able to obtain dynamic credentials from Vault and authenticate against
  the database, it will render a page similar to the following:

  ```html
  <!DOCTYPE html>
  <html>
  <body>

  <h1>Welcome!</h1>
  <h2>If everything worked correctly, you should be able to see a list of names below</h2>

  <hr />

  <h4>John Doe</h4>

  <h4>Peter Parker</h4>

  <h4>Clifford Roosevelt</h4>

  <h4>Bruce Wayne</h4>

  <h4>Steven Clark</h4>

  <h4>Mary Jane</h4>

  </body>
  </html>
  ```

- You can also deploy [fabio][fabio] and visit any Nomad client at its public IP
  address using a fixed port. The details of this method are beyond the scope of
  this guide, but you can refer to the [Load Balancing with Fabio][fabio-lb]
  guide for more information on this topic. Alternatively, you could use the
  `nomad` [alloc status][alloc-status] command along with the AWS console to
  determine the public IP and port your service is running on (remember to open
  the port in your AWS security group if you choose this method).

[![Web Service][web-service]][web-service]

[alloc-status]: /docs/commands/alloc/status
[consul-template]: https://github.com/hashicorp/consul-template
[consul-temp-syntax]: https://github.com/hashicorp/consul-template#secret
[create-from-role]: /docs/configuration/vault#create_from_role
[creation-statements]: https://www.vaultproject.io/api/secret/databases#creation_statements
[destination]: /docs/job-specification/template#destination
[fabio]: https://github.com/fabiolb/fabio
[fabio-lb]: https://learn.hashicorp.com/nomad/load-balancing/fabio
[inline]: /docs/job-specification/template#inline-template
[login]: https://www.vaultproject.io/docs/commands/login
[nomad-alloc-fs]: /docs/commands/alloc/fs
[nomad-template-vault]: /docs/job-specification/template#vault-integration
[policy]: https://www.vaultproject.io/docs/concepts/policies
[postgresql]: https://www.postgresql.org/about/
[remote-template]: /docs/job-specification/template#remote-template
[repo]: https://github.com/hashicorp/nomad/tree/master/terraform
[role]: https://www.vaultproject.io/docs/auth/token
[seal]: https://www.vaultproject.io/docs/concepts/seal
[secrets-task-directory]: /docs/runtime/environment#secrets
[step-5]: /docs/integrations/vault-integration#step-5-create-a-token-role
[template]: /docs/job-specification/template
[token]: https://www.vaultproject.io/docs/concepts/tokens
[vault]: https://www.vaultproject.io/

[vault-integration]: /docs/vault-integration
[vault-jobspec]: /docs/job-specification/vault
[vault-stanza]: /docs/configuration/vault
[web-service]: /img/nomad-demo-app.png

diff --git a/content/nomad/v0.11.x/content/docs/internals/architecture.mdx b/content/nomad/v0.11.x/content/docs/internals/architecture.mdx
new file mode 100644
index 0000000000..18919474bd
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/architecture.mdx
@@ -0,0 +1,137 @@
---
layout: docs
page_title: Architecture
sidebar_title: Architecture
description: Learn about the internal architecture of Nomad.
---

# Architecture

Nomad is a complex system that has many different pieces. To help both users and
developers of Nomad build a mental model of how it works, this page documents
the system architecture.

~> **Advanced Topic!** This page covers technical details
of Nomad. You do not need to understand these details to
effectively use Nomad. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code.

# Glossary

Before describing the architecture, we provide a glossary of terms to help
clarify what is being discussed:

- **Job** - A Job is a specification provided by users that declares a workload for
  Nomad. A Job is a form of _desired state_; the user is expressing that the job should
  be running, but not where it should be run. The responsibility of Nomad is to make sure
  the _actual state_ matches the user desired state. A Job is composed of one or more
  task groups.

- **Task Group** - A Task Group is a set of tasks that must be run together. For example, a
  web server may require that a log shipping co-process is always running as well. A task
  group is the unit of scheduling, meaning the entire group must run on the same client node and
  cannot be split.

- **Driver** - A Driver represents the basic means of executing your **Tasks**.
  Example Drivers include Docker, Qemu, Java, and static binaries.

- **Task** - A Task is the smallest unit of work in Nomad. Tasks are executed by drivers,
  which allow Nomad to be flexible in the types of tasks it supports. Tasks
  specify their driver, configuration for the driver, constraints, and resources required.

- **Client** - A Client of Nomad is a machine that tasks can be run on. All clients run the
  Nomad agent. The agent is responsible for registering with the servers, watching for any
  work to be assigned, and executing tasks. The Nomad agent is a long-lived process which
  interfaces with the servers.

- **Allocation** - An Allocation is a mapping between a task group in a job and a client
  node. A single job may have hundreds or thousands of task groups, meaning an equivalent
  number of allocations must exist to map the work to client machines. Allocations are created
  by the Nomad servers as part of scheduling decisions made during an evaluation.

- **Evaluation** - Evaluations are the mechanism by which Nomad makes scheduling decisions.
  When either the _desired state_ (jobs) or _actual state_ (clients) changes, Nomad creates
  a new evaluation to determine if any actions must be taken. An evaluation may result
  in changes to allocations if necessary.

- **Server** - Nomad servers are the brains of the cluster. There is a cluster of servers
  per region and they manage all jobs and clients, run evaluations, and create task allocations.
  The servers replicate data between each other and perform leader election to ensure high
  availability. Servers federate across regions to make Nomad globally aware.

- **Regions and Datacenters** - Nomad models infrastructure as regions and
  datacenters. Regions may contain multiple datacenters. Servers are assigned to
  a specific region, managing state and making scheduling decisions within that
  region. Multiple regions can be federated together.
  For example, you may
  have a `US` region with the `us-east-1` and `us-west-1` datacenters,
  connected to the `EU` region with the `eu-fr-1` and `eu-uk-1` datacenters.
  Requests that are made between regions are forwarded to the appropriate servers.
  Data is _not_ replicated between regions.

- **Bin Packing** - Bin Packing is the process of filling bins with items in a way that
  maximizes the utilization of bins. This extends to Nomad, where the clients are "bins"
  and the items are task groups. Nomad optimizes resources by efficiently bin packing
  tasks onto client machines.

# High-Level Overview

Looking at only a single region, at a high level Nomad looks like this:

[![Regional Architecture](/img/nomad-architecture-region.png)](/img/nomad-architecture-region.png)

Within each region, we have both clients and servers. Servers are responsible for
accepting jobs from users, managing clients, and [computing task placements](/docs/internals/scheduling/scheduling).
Each region may have clients from multiple datacenters, allowing a small number of servers
to handle very large clusters.

In some cases, for either availability or scalability, you may need to run multiple
regions. Nomad supports federating multiple regions together into a single cluster.
At a high level, this setup looks like this:

[![Global Architecture](/img/nomad-architecture-global.png)](/img/nomad-architecture-global.png)

Regions are fully independent from each other, and do not share jobs, clients, or
state. They are loosely coupled using a gossip protocol, which allows users to
submit jobs to any region or query the state of any region transparently. Requests
are forwarded to the appropriate server to be processed and the results returned.
Data is _not_ replicated between regions.

The servers in each region are all part of a single consensus group. This means
that they work together to elect a single leader which has extra duties.
The leader is responsible for processing all queries and transactions. Nomad is
optimistically concurrent, meaning all servers participate in making scheduling
decisions in parallel. The leader provides the additional coordination necessary
to do this safely and to ensure clients are not oversubscribed.

Each region is expected to have either three or five servers. This strikes a
balance between availability in the case of failure and performance, as
consensus gets progressively slower as more servers are added. However, there is
no limit to the number of clients per region.

Clients are configured to communicate with their regional servers using remote
procedure calls (RPC) to register themselves, send heartbeats for liveness, wait
for new allocations, and update the status of allocations. A client registers
with the servers to provide the resources available, attributes, and installed
drivers. Servers use this information for scheduling decisions and create
allocations to assign work to clients.

Users make use of the Nomad CLI or API to submit jobs to the servers. A job
represents a desired state and provides the set of tasks that should be run. The
servers are responsible for scheduling the tasks, which is done by finding an
optimal placement for each task such that resource utilization is maximized
while satisfying all constraints specified by the job. Resource utilization is
maximized by bin packing, in which the scheduler tries to make use of all the
resources of a machine without exhausting any dimension. Job constraints can be
used to ensure an application is running in an appropriate environment.
Constraints can be technical requirements based on hardware features such as
architecture and availability of GPUs, or software features like operating
system and kernel version, or they can be business constraints like ensuring
PCI-compliant workloads run on appropriate servers.
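The multi-dimensional bin-packing idea can be sketched as a toy first-fit placement loop. Everything here — the `client` and `task` types and the two-dimension capacity model — is invented for illustration and is not Nomad's scheduler code, which uses far more sophisticated scoring:

```go
package main

import "fmt"

// client is a "bin": a node with remaining capacity in several dimensions.
type client struct {
	name     string
	cpu, mem int // remaining capacity (MHz, MB)
}

// task is an "item": a unit of work with resource requests.
type task struct {
	name     string
	cpu, mem int
}

// firstFit places a task on the first client that can hold it without
// exhausting any single resource dimension.
func firstFit(clients []*client, t task) (string, bool) {
	for _, c := range clients {
		if c.cpu >= t.cpu && c.mem >= t.mem {
			c.cpu -= t.cpu
			c.mem -= t.mem
			return c.name, true
		}
	}
	return "", false // no client has room: the task stays pending
}

func main() {
	clients := []*client{{"node-a", 4000, 8192}, {"node-b", 2000, 4096}}
	for _, t := range []task{{"web", 3500, 2048}, {"cache", 1000, 4096}} {
		node, ok := firstFit(clients, t)
		fmt.Println(t.name, node, ok)
	}
}
```

Note that a placement fails if *any* dimension is exhausted, even when the others have plenty of headroom — this is the sense in which the scheduler avoids "exhausting any dimension".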
# Getting in Depth

This has been a brief high-level overview of the architecture of Nomad. There
are more details available for each of the sub-systems. The [consensus protocol](/docs/internals/consensus),
[gossip protocol](/docs/internals/gossip), and [scheduler design](/docs/internals/scheduling/scheduling)
are all documented in more detail.

For other details, either consult the code, ask in IRC, or reach out to the
mailing list.

diff --git a/content/nomad/v0.11.x/content/docs/internals/consensus.mdx b/content/nomad/v0.11.x/content/docs/internals/consensus.mdx
new file mode 100644
index 0000000000..280729ab05
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/consensus.mdx
@@ -0,0 +1,212 @@
---
layout: docs
page_title: Consensus Protocol
sidebar_title: Consensus Protocol
description: |-
  Nomad uses a consensus protocol to provide Consistency as defined by CAP.
  The consensus protocol is based on Raft: In Search of an Understandable
  Consensus Algorithm. For a visual explanation of Raft, see The Secret Lives of
  Data.
---

# Consensus Protocol

Nomad uses a [consensus protocol](https://en.wikipedia.org/wiki/Consensus_%28computer_science%29)
to provide [Consistency (as defined by CAP)](https://en.wikipedia.org/wiki/CAP_theorem).
The consensus protocol is based on
["Raft: In Search of an Understandable Consensus Algorithm"](https://raft.github.io/raft.pdf).
For a visual explanation of Raft, see [The Secret Lives of Data](http://thesecretlivesofdata.com/raft).

~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You do not need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.

## Raft Protocol Overview

Raft is a consensus algorithm that is based on
[Paxos](https://en.wikipedia.org/wiki/Paxos_%28computer_science%29).
Compared +to Paxos, Raft is designed to have fewer states and a simpler, more +understandable algorithm. + +There are a few key terms to know when discussing Raft: + +- **Log** - The primary unit of work in a Raft system is a log entry. The problem + of consistency can be decomposed into a _replicated log_. A log is an ordered + sequence of entries. We consider the log consistent if all members agree on + the entries and their order. + +- **FSM** - [Finite State Machine](https://en.wikipedia.org/wiki/Finite-state_machine). + An FSM is a collection of finite states with transitions between them. As new logs + are applied, the FSM is allowed to transition between states. Application of the + same sequence of logs must result in the same state, meaning behavior must be deterministic. + +- **Peer set** - The peer set is the set of all members participating in log replication. + For Nomad's purposes, all server nodes are in the peer set of the local region. + +- **Quorum** - A quorum is a majority of members from a peer set: for a set of size `n`, + quorum requires at least `⌊(n/2)+1⌋` members. + For example, if there are 5 members in the peer set, we would need 3 nodes + to form a quorum. If a quorum of nodes is unavailable for any reason, the + cluster becomes _unavailable_ and no new logs can be committed. + +- **Committed Entry** - An entry is considered _committed_ when it is durably stored + on a quorum of nodes. Once an entry is committed it can be applied. + +- **Leader** - At any given time, the peer set elects a single node to be the leader. + The leader is responsible for ingesting new log entries, replicating to followers, + and managing when an entry is considered committed. + +Raft is a complex protocol and will not be covered here in detail (for those who +desire a more comprehensive treatment, the full specification is available in this +[paper](https://raft.github.io/raft.pdf)). 
+We will, however, attempt to provide a high level description which may be useful +for building a mental model. + +Raft nodes are always in one of three states: follower, candidate, or leader. All +nodes initially start out as a follower. In this state, nodes can accept log entries +from a leader and cast votes. If no entries are received for some time, nodes +self-promote to the candidate state. In the candidate state, nodes request votes from +their peers. If a candidate receives a quorum of votes, then it is promoted to a leader. +The leader must accept new log entries and replicate to all the other followers. +In addition, if stale reads are not acceptable, all queries must also be performed on +the leader. + +Once a cluster has a leader, it is able to accept new log entries. A client can +request that a leader append a new log entry (from Raft's perspective, a log entry +is an opaque binary blob). The leader then writes the entry to durable storage and +attempts to replicate to a quorum of followers. Once the log entry is considered +_committed_, it can be _applied_ to a finite state machine. The finite state machine +is application specific; in Nomad's case, we use +[MemDB](https://github.com/hashicorp/go-memdb) to maintain cluster state. + +Obviously, it would be undesirable to allow a replicated log to grow in an unbounded +fashion. Raft provides a mechanism by which the current state is snapshotted and the +log is compacted. Because of the FSM abstraction, restoring the state of the FSM must +result in the same state as a replay of old logs. This allows Raft to capture the FSM +state at a point in time and then remove all the logs that were used to reach that +state. This is performed automatically without user intervention and prevents unbounded +disk usage while also minimizing time spent replaying logs. 
One of the advantages of +using MemDB is that it allows Nomad to continue accepting new transactions even while +old state is being snapshotted, preventing any availability issues. + +Consensus is fault-tolerant up to the point where quorum is available. +If a quorum of nodes is unavailable, it is impossible to process log entries or reason +about peer membership. For example, suppose there are only 2 peers: A and B. The quorum +size is also 2, meaning both nodes must agree to commit a log entry. If either A or B +fails, it is now impossible to reach quorum. This means the cluster is unable to add +or remove a node or to commit any additional log entries. This results in +_unavailability_. At this point, manual intervention would be required to remove +either A or B and to restart the remaining node in bootstrap mode. + +A Raft cluster of 3 nodes can tolerate a single node failure while a cluster +of 5 can tolerate 2 node failures. The recommended configuration is to either +run 3 or 5 Nomad servers per region. This maximizes availability without +greatly sacrificing performance. The [deployment table](#deployment_table) below +summarizes the potential cluster size options and the fault tolerance of each. + +In terms of performance, Raft is comparable to Paxos. Assuming stable leadership, +committing a log entry requires a single round trip to half of the cluster. +Thus, performance is bound by disk I/O and network latency. + +## Raft in Nomad + +Only Nomad server nodes participate in Raft and are part of the peer set. All +client nodes forward requests to servers. The clients in Nomad only need to know +about their allocations and query that information from the servers, while the +servers need to maintain the global state of the cluster. + +Since all servers participate as part of the peer set, they all know the current +leader. When an RPC request arrives at a non-leader server, the request is +forwarded to the leader. 
If the RPC is a _query_ type, meaning it is read-only,
the leader generates the result based on the current state of the FSM. If
the RPC is a _transaction_ type, meaning it modifies state, the leader
generates a new log entry and applies it using Raft. Once the log entry is
committed and applied to the FSM, the transaction is complete.

Because of the nature of Raft's replication, performance is sensitive to network
latency. For this reason, each region elects an independent leader and maintains
a disjoint peer set. Data is partitioned by region, so each leader is responsible
only for data in its region. When a request is received for a remote region,
the request is forwarded to the correct leader. This design allows for lower latency
transactions and higher availability without sacrificing consistency.

## Consistency Modes

Although all writes to the replicated log go through Raft, reads are more
flexible. To support various trade-offs that developers may want, Nomad
supports 2 different consistency modes for reads.

The two read modes are:

- `default` - Raft makes use of leader leasing, providing a time window
  in which the leader assumes its role is stable. However, if a leader
  is partitioned from the remaining peers, a new leader may be elected
  while the old leader is holding the lease. This means there are 2 leader
  nodes. There is no risk of a split-brain since the old leader will be
  unable to commit new logs. However, if the old leader services any reads,
  the values are potentially stale. The default consistency mode relies only
  on leader leasing, exposing clients to potentially stale values. We make
  this trade-off because reads are fast, usually strongly consistent, and
  only stale in a hard-to-trigger situation. The time window of stale reads
  is also bounded since the leader will step down due to the partition.

- `stale` - This mode allows any server to service the read regardless of
  whether it is the leader.
  This means reads can be arbitrarily stale but are generally
  within 50 milliseconds of the leader. The trade-off is very fast and scalable
  reads, but with stale values. This mode allows reads without a leader, meaning
  a cluster that is unavailable will still be able to respond.

## Deployment Table ((#deployment_table))

Below is a table that shows quorum size and failure tolerance for various
cluster sizes. The recommended deployment is either 3 or 5 servers. A single
server deployment is _**highly**_ discouraged as data loss is inevitable in a
failure scenario.

| Servers | Quorum Size | Failure Tolerance |
| ------- | ----------- | ----------------- |
| 1       | 1           | 0                 |
| 2       | 2           | 0                 |
| 3       | 2           | 1                 |
| 4       | 3           | 1                 |
| 5       | 3           | 2                 |
| 6       | 4           | 2                 |
| 7       | 4           | 3                 |
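The values above follow directly from majority arithmetic; a short, illustrative sketch that reproduces the table:

```go
package main

import "fmt"

// quorum returns the majority size for n servers: (n/2)+1 with integer
// division. tolerance is the minority that can fail while a quorum survives.
func quorum(n int) int    { return n/2 + 1 }
func tolerance(n int) int { return n - quorum(n) }

func main() {
	for n := 1; n <= 7; n++ {
		fmt.Printf("servers=%d quorum=%d tolerance=%d\n", n, quorum(n), tolerance(n))
	}
}
```

This also shows why even cluster sizes are unattractive: going from 3 to 4 servers raises the quorum size without improving failure tolerance.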
diff --git a/content/nomad/v0.11.x/content/docs/internals/gossip.mdx b/content/nomad/v0.11.x/content/docs/internals/gossip.mdx
new file mode 100644
index 0000000000..400e0c6629
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/gossip.mdx
@@ -0,0 +1,35 @@
---
layout: docs
page_title: Gossip Protocol
sidebar_title: Gossip Protocol
description: |-
  Nomad uses a gossip protocol to manage membership. All of this is provided
  through the use of the Serf library.
---

# Gossip Protocol

Nomad uses a [gossip protocol](https://en.wikipedia.org/wiki/Gossip_protocol)
to manage membership. This is provided through the use of the [Serf library](https://www.serf.io/).
The gossip protocol used by Serf is based on
["SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol"](https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf),
with a few minor adaptations. There are more details about [Serf's protocol here](https://www.serf.io/docs/internals/gossip.html).

~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You do not need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.

## Gossip in Nomad

Nomad makes use of a single global WAN gossip pool that all servers participate in.
Membership information provided by the gossip pool allows servers to perform
cross-region requests. The integrated failure detection allows Nomad to gracefully
handle an entire region losing connectivity, or just a single server in a remote
region. The gossip protocol is also used to detect servers in the same region to
perform automatic clustering via the [consensus protocol](/docs/internals/consensus).

All of these features are provided by leveraging [Serf](https://www.serf.io/),
which is used as an embedded library.
From a user perspective, this is not important, since the abstraction should be
masked by Nomad. However, it can be useful for developers to understand how this
library is leveraged.

diff --git a/content/nomad/v0.11.x/content/docs/internals/index.mdx b/content/nomad/v0.11.x/content/docs/internals/index.mdx
new file mode 100644
index 0000000000..d9c0711f9c
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/index.mdx
@@ -0,0 +1,18 @@
---
layout: docs
page_title: Internals
sidebar_title: Internals
description: >-
  This section covers the internals of Nomad and explains technical details of
  Nomad's operation.
---

# Nomad Internals

This section covers the internals of Nomad and explains the technical
details of how Nomad functions, its architecture, and sub-systems.

-> **Note:** Knowledge of Nomad internals is not
required to use Nomad. If you aren't interested in the internals
of Nomad, you may safely skip this section. If you are operating Nomad,
we recommend understanding the internals.

diff --git a/content/nomad/v0.11.x/content/docs/internals/plugins/base.mdx b/content/nomad/v0.11.x/content/docs/internals/plugins/base.mdx
new file mode 100644
index 0000000000..8a71735125
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/plugins/base.mdx
@@ -0,0 +1,80 @@
---
layout: docs
page_title: Base Plugin
sidebar_title: Base
description: Learn how to author a Nomad plugin.
---

# Base Plugin

The base plugin is a special plugin type implemented by all plugins. It allows
for common plugin operations such as defining a configuration schema and
version information.

## Plugin API

#### `PluginInfo() (*PluginInfoResponse, error)`

A `PluginInfoResponse` contains metadata about the plugin.
```go
PluginInfoResponse{
    // Type is the plugin type which is implemented
    Type: PluginTypeDriver,
    // Plugin API versions supported by the plugin
    PluginApiVersions: []string{drivers.ApiVersion010},
    // Version of the plugin
    PluginVersion: "0.1.0",
    // Name of the plugin
    Name: "foodriver",
}
```

#### `ConfigSchema() (*hclspec.Spec, error)`

The `ConfigSchema` function allows a plugin to tell Nomad the schema for its
configuration. This configuration is given in a [plugin block][pluginblock] of
the client configuration. The schema is defined with the [hclspec][hclspec]
package.

#### `SetConfig(config *Config) error`

The `SetConfig` function is called when starting the plugin for the first
time. The `Config` given has two different configuration fields. The first,
`PluginConfig`, is an encoded configuration from the `plugin` block of the
client config. The second, `AgentConfig`, is the Nomad agent's configuration
which is given to all plugins.

## HCL Specifications

`*hclspec.Spec` is a struct that defines the schema to validate an HCL entity
against. The full documentation of the different HCL attribute types can be
found on the [hclspec godoc][hclspec].

For a basic example, let's look at the driver configuration for the `raw_exec`
driver:

```hcl
job "example" {
...
  driver = "raw_exec"
  config {
    command = "/bin/sleep"
    args = ["100"]
  }
}
```

The `config` block is what is validated against the `hclspec.Spec`. It has two
keys: `command`, which takes a string attribute, and `args`, which takes a list
attribute.
The corresponding `*hclspec.Spec` would be: + +```go + spec := hclspec.NewObject(map[string]*hclspec.Spec{ + "command": hclspec.NewAttr("command", "string", true), + "args": hclspec.NewAttr("args", "list(string)", false), + }) +``` + +[hclspec]: https://godoc.org/github.com/hashicorp/nomad/plugins/shared/hclspec +[pluginblock]: /docs/configuration/plugin diff --git a/content/nomad/v0.11.x/content/docs/internals/plugins/csi.mdx b/content/nomad/v0.11.x/content/docs/internals/plugins/csi.mdx new file mode 100644 index 0000000000..e0c3c57ec1 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/internals/plugins/csi.mdx @@ -0,0 +1,128 @@ +--- +layout: docs +page_title: Storage Plugins +sidebar_title: Storage +description: Learn how Nomad manages dynamic storage plugins. +--- + +# Storage Plugins + +Nomad has built-in support for scheduling compute resources such as +CPU, memory, and networking. Nomad's storage plugin support extends +this to allow scheduling tasks with externally created storage +volumes. Storage plugins are third-party plugins that conform to the +[Container Storage Interface (CSI)][csi-spec] specification. + +Storage plugins are created dynamically as Nomad jobs, unlike device +and task driver plugins that need to be installed and configured on +each client. Each dynamic plugin type has its own type-specific job +spec block; currently there is only the `csi_plugin` type. Nomad +tracks which clients have instances of a given plugin, and +communicates with plugins over a Unix domain socket that it creates +inside the plugin's tasks. + +## CSI Plugins + +Every storage vendor has its own APIs and workflows, and the +industry-standard Container Storage Interface specification unifies +these APIs in a way that's agnostic to both the storage vendor and the +container orchestrator. Each storage provider can build its own CSI +plugin. 
Jobs can claim storage volumes from AWS Elastic Block
Storage (EBS), GCP persistent disks, Ceph, Portworx, vSphere, etc. The
Nomad scheduler will be aware of volumes created by CSI plugins and
schedule workloads based on the availability of volumes on a given
Nomad client node. A list of available CSI plugins can be found in the
[Kubernetes CSI documentation][csi-drivers-list]. Any of these plugins
should work with Nomad out of the box.

A CSI plugin task requires the [`csi_plugin`][csi_plugin] block:

```hcl
csi_plugin {
  id        = "csi-hostpath"
  type      = "monolith"
  mount_dir = "/csi"
}
```

There are three **types** of CSI plugins. **Controller Plugins**
communicate with the storage provider's APIs. For example, for a job
that needs an AWS EBS volume, Nomad will tell the controller plugin
that it needs a volume to be "published" to the client node, and the
controller will make the API calls to AWS to attach the EBS volume to
the right EC2 instance. **Node Plugins** do the work on each client
node, like creating mount points. **Monolith Plugins** are plugins
that perform both the controller and node roles in the same
instance. Not every plugin provider has or needs a controller; that's
specific to the provider implementation.

You should almost always run node plugins as Nomad `system` jobs to
ensure volume claims are released when a Nomad client is drained. Use
constraints for the node plugin jobs based on the availability of
volumes. For example, AWS EBS volumes are specific to particular
availability zones within a region. Controller plugins can be run as
`service` jobs.

Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
plugin task, and communicates over the gRPC protocol expected by the
CSI specification. The `mount_dir` field tells Nomad where the plugin
expects to find the socket file.

### Plugin Lifecycle and State

CSI plugins report their health like other Nomad jobs.
If the plugin
crashes or otherwise terminates, Nomad will launch it again using the
same `restart` and `reschedule` logic used for other jobs. If plugins
are unhealthy, Nomad will mark the volumes they manage as
"unschedulable".

Storage plugins don't have any responsibility (or ability) to monitor
the state of tasks that claim their volumes. Nomad sends mount and
publish requests to storage plugins when a task claims a volume, and
unmount and unpublish requests when a task stops.

The dynamic plugin registry persists state to the Nomad client so that
it can restore volume managers for plugin jobs after client restarts
without disrupting storage.

### Volume Lifecycle

The Nomad scheduler decides whether a given client can run an
allocation based on whether it has a node plugin present for the
volume. But before a task can use a volume, the client needs to "claim"
the volume for the allocation. The client makes an RPC call to the
server and waits for a response; the allocation's tasks won't start
until the volume has been claimed and is ready.

If the volume's plugin requires a controller, the server will send an
RPC to the Nomad client where that controller is running. The Nomad
client will forward this request over the controller plugin's gRPC
socket. The controller plugin will make the requested volume available
to the node that needs it.

Once the controller is done (or if there's no controller required),
the server will increment the count of claims on the volume and return
to the client. This count passes through Nomad's state store so that
Nomad has a consistent view of which volumes are available for
scheduling.

The client then makes RPC calls to the node plugin running on that
client, and the node plugin mounts the volume to a staging area in
the Nomad data directory. Nomad will bind-mount this staged directory
into each task that mounts the volume.
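The claim accounting described above can be pictured with a toy model. All names here are invented for illustration; the real bookkeeping lives in the Nomad server's state store and RPC handlers:

```go
package main

import (
	"errors"
	"fmt"
)

// volume is a toy stand-in for a server-side volume record.
type volume struct {
	id        string
	maxClaims int // e.g. 1 for a single-writer block volume
	claims    int
}

// claim is what the server does once any controller "publish" step has
// succeeded: bump the claim count so all schedulers share one view.
func (v *volume) claim() error {
	if v.claims >= v.maxClaims {
		return errors.New("volume has no free claims")
	}
	v.claims++
	return nil
}

// release mirrors the GC path: unpublish frees claim capacity again.
func (v *volume) release() {
	if v.claims > 0 {
		v.claims--
	}
}

func main() {
	v := &volume{id: "ebs-0", maxClaims: 1}
	fmt.Println(v.claim()) // <nil>
	fmt.Println(v.claim()) // volume has no free claims
	v.release()
	fmt.Println(v.claim()) // <nil>
}
```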
This cycle is reversed when a task that claims a volume becomes
terminal. The client updates the server frequently about changes to
allocations, including terminal state. When the server receives a
terminal state for a job with volume claims, it creates a volume claim
garbage collection (GC) evaluation to be handled by the core job
scheduler. The GC job will send "detach" RPCs to the node plugin. The
node plugin unmounts the bind-mount from the allocation and unmounts
the volume from the plugin (if it's not in use by another task). The
GC job will then send "unpublish" RPCs to the controller plugin (if
any), and decrement the claim count for the volume. At this point the
volume's claim capacity has been freed up for scheduling.

[csi-spec]: https://github.com/container-storage-interface/spec
[csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
[csi_plugin]: /docs/job-specification/csi_plugin

diff --git a/content/nomad/v0.11.x/content/docs/internals/plugins/devices.mdx b/content/nomad/v0.11.x/content/docs/internals/plugins/devices.mdx
new file mode 100644
index 0000000000..01829331bb
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/plugins/devices.mdx
@@ -0,0 +1,82 @@
---
layout: docs
page_title: Device Plugins
sidebar_title: Devices
description: Learn how to author a Nomad device plugin.
---

# Devices

Nomad has built-in support for scheduling compute resources such as CPU, memory,
and networking. Nomad device plugins are used to support scheduling tasks with
other devices, such as GPUs. They are responsible for fingerprinting these
devices and working with the Nomad client to make them available to assigned
tasks.

For a real-world example of a Nomad device plugin implementation, see the [Nvidia
GPU plugin](https://github.com/hashicorp/nomad/tree/master/devices/gpu/nvidia).
+ +## Authoring Device Plugins + +Authoring a device plugin in Nomad consists of implementing the +[DevicePlugin][deviceplugin] interface alongside +a main package to launch the plugin. + +The [device plugin skeleton project][skeletonproject] exists to help bootstrap +the development of new device plugins. It provides most of the boilerplate +necessary for a device plugin, along with detailed comments. + +### Lifecycle and State + +A device plugin is long-lived. Nomad will ensure that one instance of the plugin is +running. If the plugin crashes or otherwise terminates, Nomad will launch another +instance of it. + +However, unlike [task drivers](/docs/internals/plugins/task-drivers), device plugins do not currently +have an interface for persisting state to the Nomad client. Instead, the device +plugin API emphasizes fingerprinting devices and reporting their status. After +helping to provision a task with a scheduled device, a device plugin does not +have any responsibility (or ability) to monitor the task. + +## Device Plugin API + +The [base plugin][baseplugin] must be implemented in addition to the following +functions. + +### `Fingerprint(context.Context) (<-chan *FingerprintResponse, error)` + +The `Fingerprint` [function][fingerprintfn] is called by the client when the plugin is started. +It allows the plugin to provide Nomad with a list of discovered devices, along with their +attributes, for the purpose of scheduling workloads using devices. +The channel returned should immediately send an initial +[`FingerprintResponse`][fingerprintresponse], then send periodic updates at +an appropriate interval until the context is canceled. + +Each fingerprint response consists of either an error or a list of device groups. +A device group is a list of detected devices that are identical for the purpose of +scheduling; that is, they will have identical attributes. 
### `Stats(context.Context, time.Duration) (<-chan *StatsResponse, error)`

The `Stats` [function][statsfn] returns a channel on which the plugin should
emit device statistics at the specified interval until either an error is
encountered or the specified context is canceled. The `StatsResponse` object
allows [dimensioned][dimensioned] statistics to be returned for each device in a device group.

### `Reserve(deviceIDs []string) (*ContainerReservation, error)`

The `Reserve` [function][reservefn] accepts a list of device IDs and returns the information
necessary for the client to make those devices available to a task. Currently,
the `ContainerReservation` object allows the plugin to specify environment
variables for the task, as well as a list of host devices and files to be mounted
into the task's filesystem. Any orchestration required to prepare the device for
use should also be performed in this function.

[deviceplugin]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/device/device.go#L20-L33
[baseplugin]: /docs/internals/plugins/base
[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-device-plugin
[fingerprintresponse]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/device/device.go#L37-L43
[fingerprintfn]: https://github.com/hashicorp/nomad-skeleton-device-plugin/blob/v0.1.0/device/device.go#L159-L165
[statsfn]: https://github.com/hashicorp/nomad-skeleton-device-plugin/blob/v0.1.0/device/device.go#L169-L176
[reservefn]: https://github.com/hashicorp/nomad-skeleton-device-plugin/blob/v0.1.0/device/device.go#L189-L245
[dimensioned]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/shared/structs/stats.go#L33-L34

diff --git a/content/nomad/v0.11.x/content/docs/internals/plugins/index.mdx b/content/nomad/v0.11.x/content/docs/internals/plugins/index.mdx
new file mode 100644
index 0000000000..d9bb9ae245
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/plugins/index.mdx
@@ -0,0 +1,32 @@
---
layout: docs
page_title: Plugins
sidebar_title: Plugins
description: Learn about how external plugins work in Nomad.
---

# Plugins

Nomad 0.9 introduced a plugin framework which allows users to extend the
functionality of some components within Nomad. The design of the plugin system
is inspired by the lessons learned from plugin systems implemented in other
HashiCorp products such as Terraform and Vault.

The following components are currently pluggable within Nomad:

- [Task Drivers](/docs/internals/plugins/task-drivers)
- [Devices](/docs/internals/plugins/devices)

# Architecture

The Nomad plugin framework uses the [go-plugin][goplugin] project to expose
a language-independent plugin interface. Plugins implement a set of gRPC
services and methods which Nomad manages by running the plugin and calling the
implemented RPCs. This means that plugins are free to be implemented in the
author's language of choice.

To make plugin development easier, a set of Go interfaces and structs exist for
each plugin type that abstract away go-plugin and the gRPC interface. The
guides in this documentation reference these abstractions for ease of use.

[goplugin]: https://github.com/hashicorp/go-plugin

diff --git a/content/nomad/v0.11.x/content/docs/internals/plugins/task-drivers.mdx b/content/nomad/v0.11.x/content/docs/internals/plugins/task-drivers.mdx
new file mode 100644
index 0000000000..7a45b0cb95
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/plugins/task-drivers.mdx
@@ -0,0 +1,212 @@
---
layout: docs
page_title: Task Driver Plugins
sidebar_title: Task Drivers
description: Learn how to author a Nomad task driver plugin.
---

# Task Drivers

Task drivers in Nomad are the runtime components that execute workloads. For
a real-world example of a Nomad task driver plugin implementation, see the [LXC
driver source][lxcdriver].
+ +## Authoring Task Driver Plugins + +Authoring a task driver (shortened to driver in this documentation) in Nomad +consists of implementing the [DriverPlugin][driverplugin] interface and adding +a main package to launch the plugin. A driver plugin is long-lived and its +lifetime is not bound to the Nomad client. This means that the Nomad client can +be restarted without restarting the driver. Nomad will ensure that one +instance of the driver is running, meaning if the driver crashes or otherwise +terminates, Nomad will launch another instance of it. + +Drivers should maintain as little state as possible. State for a task is stored +by the Nomad client on task creation. This enables a pattern where the driver +can maintain an in-memory state of the running tasks, and if necessary the +Nomad client can recover tasks into the driver state. + +The [driver plugin skeleton project][skeletonproject] exists to help bootstrap +the development of new driver plugins. It provides most of the boilerplate +necessary for a driver plugin, along with detailed comments. + +## Task Driver Plugin API + +The [base plugin][baseplugin] must be implemented in addition to the following +functions. + +### `TaskConfigSchema() (*hclspec.Spec, error)` + +This function returns the schema for the driver configuration of the task. For +more information on `hclspec.Spec` see the HCL section in the [base +plugin][baseplugin] documentation. + +### `Capabilities() (*Capabilities, error)` + +Capabilities define what features the driver implements. Example: + +```go +Capabilities { + // Does the driver support sending OS signals to the task? + SendSignals: true, + // Does the driver support executing a command within the task execution + // environment? + Exec: true, + // What filesystem isolation is supported by the driver. 
Options include
+  // FSIsolationImage, FSIsolationChroot, and FSIsolationNone
+  FSIsolation: FSIsolationImage,
+
+  // NetIsolationModes lists the set of isolation modes supported by the driver.
+  // Options include NetIsolationModeHost, NetIsolationModeGroup,
+  // NetIsolationModeTask, and NetIsolationModeNone. (Illustrative values below.)
+  NetIsolationModes: []NetIsolationMode{NetIsolationModeHost, NetIsolationModeGroup},
+
+  // MustInitiateNetwork tells Nomad that the driver must create the network
+  // namespace and that the CreateNetwork and DestroyNetwork RPCs are implemented.
+  MustInitiateNetwork: false,
+
+  // MountConfigs tells Nomad which mounting config options the driver
+  // supports. This is used to check whether mounting host volumes or CSI
+  // volumes is allowed. Options include MountConfigSupportAll (default),
+  // or MountConfigSupportNone.
+  MountConfigs: MountConfigSupportAll,
+}
+```
+
+### `Fingerprint(context.Context) (<-chan *Fingerprint, error)`
+
+This function is called by the client when the plugin is started. It allows the
+driver to indicate its health to the client. The channel returned should
+immediately send an initial Fingerprint, then send periodic updates at an
+interval that is appropriate for the driver until the context is canceled.
+
+The fingerprint consists of a `HealthState` and `HealthDescription` to inform
+the client about its health. Additionally, an `Attributes` field is available
+for the driver to add additional attributes to the client node. The fingerprint
+`HealthState` can be one of three states.
+
+- `HealthStateUndetected`: Indicates that the necessary dependencies for the
+  driver are not detected on the system. Ex. the Java runtime for the `java`
+  driver
+- `HealthStateUnhealthy`: Indicates that something is wrong with the driver
+  runtime. Ex.
docker daemon stopped for the Docker driver
+- `HealthStateHealthy`: All systems go
+
+### `StartTask(*TaskConfig) (*TaskHandle, *DriverNetwork, error)`
+
+This function takes a [`TaskConfig`][taskconfig], which includes all of the
+configuration needed to launch the task. Additionally, the driver
+configuration can be decoded from the `TaskConfig` by calling
+`*TaskConfig.DecodeDriverConfig(t interface{})`, passing in a pointer to the
+driver-specific configuration struct. The `TaskConfig` includes an `ID` field
+by which future operations on the task are referenced.
+
+Drivers return a [`*TaskHandle`][taskhandle], which contains the information
+required for the driver to reattach to the running task in the case of plugin
+crashes or restarts. Some of this required state is specific to the driver
+implementation, so a `DriverState` field exists to allow the driver to encode
+custom state into the struct. Helper methods `GetDriverState` and
+`SetDriverState` exist on the `TaskHandle`, removing the need for the driver
+to handle serialization.
+
+A `*DriverNetwork` can optionally be returned to describe the network of the
+task if it is modified by the driver. An example of this is in the Docker
+driver, where tasks can be attached to a specific Docker network.
+
+If an error occurs, the driver is expected to clean up any created resources
+prior to returning the error.
+
+#### Logging
+
+Nomad handles all rotation and plumbing of task logs. In order for task stdout
+and stderr to be received by Nomad, they must be written to the correct
+location. Prior to starting the task through the driver, the Nomad client
+creates FIFOs for stdout and stderr. These paths are given to the driver in
+the `TaskConfig`. The [`fifo` package][fifopackage] can be used to support
+cross-platform writing to these paths.
+
+#### TaskHandle Schema Versioning
+
+A `Version` field is available on the `TaskHandle` struct to facilitate
+backward-compatible recovery of tasks.
This field is opaque to Nomad, but allows the
+driver to handle recovery of tasks that were created by an older version of
+the plugin.
+
+### `RecoverTask(*TaskHandle) error`
+
+When a driver is restarted, it is not expected to persist any internal state
+to disk. To support this, Nomad will attempt to recover a task that was
+previously started if the driver does not recognize the task ID. During task
+recovery, Nomad calls `RecoverTask`, passing the `TaskHandle` that was
+returned by the `StartTask` function. If no error is returned, it is
+expected that the driver can now operate on the task by referencing the task
+ID. If an error occurs, the Nomad client will mark the task as `lost`.
+
+### `WaitTask(context.Context, id string) (<-chan *ExitResult, error)`
+
+The `WaitTask` function is expected to return a channel that will send an
+`*ExitResult` when the task exits, or close the channel when the context is
+canceled. It is also expected that calling `WaitTask` on an exited task will
+immediately send an `*ExitResult` on the returned channel.
+
+### `StopTask(taskID string, timeout time.Duration, signal string) error`
+
+The `StopTask` function is expected to stop a running task by sending the
+given signal to it. If the task does not stop during the given timeout, the
+driver must forcefully kill the task.
+
+`StopTask` does not clean up resources of the task or remove it from the
+driver's internal state. A call to `WaitTask` after `StopTask` is valid and
+should be handled.
+
+### `DestroyTask(taskID string, force bool) error`
+
+The `DestroyTask` function cleans up and removes a task that has terminated.
+If `force` is set to `true`, the driver must destroy the task even if it is
+still running. If `WaitTask` is called after `DestroyTask`, it should return
+`drivers.ErrTaskNotFound`, as no task state should exist after `DestroyTask`
+is called.
+
+### `InspectTask(taskID string) (*TaskStatus, error)`
+
+The `InspectTask` function returns detailed status information for the
+referenced `taskID`.
+
+### `TaskStats(context.Context, id string, time.Duration) (<-chan *cstructs.TaskResourceUsage, error)`
+
+The `TaskStats` function returns a channel over which the driver should send
+stats at the given interval. The driver must send stats at the given interval
+until the given context is canceled or the task terminates.
+
+### `TaskEvents(context.Context) (<-chan *TaskEvent, error)`
+
+The Nomad client publishes events associated with an allocation. The
+`TaskEvents` function allows the driver to publish driver-specific events
+about tasks, and the Nomad client will associate them with the correct
+allocation.
+
+An `Eventer` utility, available in the
+`github.com/hashicorp/nomad/drivers/shared/eventer` package, implements an
+event loop and publishing mechanism for use in the `TaskEvents` function.
+
+### `SignalTask(taskID string, signal string) error`
+
+> Optional - can be skipped by embedding `drivers.DriverSignalTaskNotSupported`
+
+The `SignalTask` function is used by drivers which support sending OS signals
+(`SIGHUP`, `SIGKILL`, `SIGUSR1`, etc.) to the task. It is an optional function
+and is listed as a capability in the driver `Capabilities` struct.
+
+### `ExecTask(taskID string, cmd []string, timeout time.Duration) (*ExecTaskResult, error)`
+
+> Optional - can be skipped by embedding `drivers.DriverExecTaskNotSupported`
+
+The `ExecTask` function is used by the Nomad client to execute commands inside
+the task execution context. For example, the Docker driver executes commands
+inside the running container. `ExecTask` is called for Consul script checks.
+ +[lxcdriver]: https://github.com/hashicorp/nomad-driver-lxc +[driverplugin]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/drivers/driver.go#L39-L57 +[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-driver-plugin +[baseplugin]: /docs/internals/plugins/base +[taskconfig]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskConfig +[taskhandle]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskHandle +[fifopackage]: https://godoc.org/github.com/hashicorp/nomad/client/lib/fifo diff --git a/content/nomad/v0.11.x/content/docs/internals/scheduling/index.mdx b/content/nomad/v0.11.x/content/docs/internals/scheduling/index.mdx new file mode 100644 index 0000000000..76e2deaf0d --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/internals/scheduling/index.mdx @@ -0,0 +1,20 @@ +--- +layout: docs +page_title: Scheduling +sidebar_title: Scheduling +description: Learn about how scheduling works in Nomad. +--- + +# Scheduling + +Scheduling is a core function of Nomad. It is the process of assigning tasks +from jobs to client machines. The design is heavily inspired by Google's work on +both [Omega: flexible, scalable schedulers for large compute clusters][omega] and +[Large-scale cluster management at Google with Borg][borg]. See the links below +for implementation details on scheduling in Nomad. + +- [Scheduling Internals](/docs/internals/scheduling/scheduling) - An overview of how the scheduler works. +- [Preemption](/docs/internals/scheduling/preemption) - Details of preemption, an advanced scheduler feature introduced in Nomad 0.9. 
+
+[omega]: https://research.google.com/pubs/pub41684.html
+[borg]: https://research.google.com/pubs/pub43438.html
diff --git a/content/nomad/v0.11.x/content/docs/internals/scheduling/preemption.mdx b/content/nomad/v0.11.x/content/docs/internals/scheduling/preemption.mdx
new file mode 100644
index 0000000000..50f4119d85
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/scheduling/preemption.mdx
@@ -0,0 +1,99 @@
+---
+layout: docs
+page_title: Preemption
+sidebar_title: Preemption
+description: Learn about how preemption works in Nomad.
+---
+
+# Preemption
+
+Preemption allows Nomad to kill existing allocations in order to place allocations for a higher-priority job.
+The evicted allocation is temporarily displaced until the cluster has capacity to run it. This allows operators to
+run high-priority jobs even under resource contention across the cluster.
+
+~> **Advanced Topic!** This page covers technical details of Nomad. You do not need to understand these details to effectively use Nomad. The details are documented here for those who wish to learn about them without having to go spelunking through the source code.
+
+# Preemption in Nomad
+
+Every job in Nomad has a priority associated with it. Priorities impact scheduling at the evaluation and planning
+stages by sorting the respective queues accordingly (higher priority jobs get moved ahead in the queues).
+
+Prior to Nomad 0.9, when a cluster was at capacity, any allocations resulting from a newly scheduled or updated
+job remained in the pending state until sufficient resources became available - regardless of the defined priority.
+This led to priority inversion, where a low-priority task could prevent high-priority tasks from completing.
+
+Nomad has preemption capabilities for service, batch, and system jobs.
The Nomad scheduler can be configured to evict lower-priority running allocations
+to free up capacity for new allocations resulting from relatively higher priority jobs, sending evicted allocations back
+into the plan queue.
+
+# Details
+
+~> **Enterprise Functionality** System job preemption is available as an Open Source
+feature, while Batch and Service job preemption are only available as
+Enterprise features.
+
+Preemption is enabled by default in Nomad 0.9. Operators can use the [scheduler config](/api-docs/operator#update-scheduler-configuration) API endpoint to disable preemption.
+
+Nomad uses the [job priority](/docs/job-specification/job#priority) field to determine what running allocations can be preempted.
+In order to prevent a cascade of preemptions due to jobs close in priority being preempted, only allocations from jobs with a priority
+delta of more than 10 from the job needing placement are eligible for preemption.
+
+For example, consider a node with the following distribution of allocations:
+
+| Job             | Priority | Allocations | Total Used Capacity                                                      |
+| --------------- | -------- | ----------- | ------------------------------------------------------------------------ |
+| cache           | 70       | a6          | <2 GB Memory, 0.5 GB Disk, 1 CPU>                                         |
+| batch-analytics | 50       | a4, a5      | <1 GB Memory, 0.5 GB Disk, 0.5 CPU>, <1 GB Memory, 0.5 GB Disk, 0.5 CPU> |
+| email-marketing | 20       | a1, a2      | <0.5 GB Memory, 0.8 GB Disk>, <0.5 GB Memory, 0.2 GB Disk>               |
+
+If a job `webapp` with priority `75` needs placement on the above node, only allocations from `batch-analytics` and `email-marketing` are considered
+eligible to be preempted because they are of a lower priority. Allocations from the `cache` job will never be preempted because the difference
+between its priority (`70`) and the priority of `webapp` (`75`) is less than the required delta of `10`.
+
+Allocations are selected starting from the lowest priority, and scored according
+to how closely they fit the job's required capacity.
For example, if the `75` priority job needs 1 GB disk and 2 GB memory, Nomad will preempt
+allocations `a1`, `a2` and `a4` to satisfy those requirements.
+
+# Preemption Visibility
+
+Operators can use the [allocation API](/api-docs/allocations#read-allocation) or the `alloc status` command to get visibility into
+whether an allocation has been preempted. Preempted allocations will have their `DesiredStatus` set to `evict`. The `Allocation` object
+in the API also has two additional fields related to preemption.
+
+- `PreemptedAllocs` - This field is set on an allocation that caused preemption. It contains the allocation IDs of allocations
+  that were preempted to place this allocation. In the above example, allocations created for the job `webapp` will have the values
+  `a1`, `a2` and `a4` set.
+- `PreemptedByAllocID` - This field is set on allocations that were preempted by the scheduler. It contains the allocation ID of the allocation
+  that preempted it. In the above example, allocations `a1`, `a2` and `a4` will have this field set to the ID of the allocation from the job `webapp`.
+
+# Integration with Nomad plan
+
+`nomad plan` allows operators to dry-run the scheduler. If the scheduler determines that
+preemption is necessary to place the job, it shows additional information in the CLI output for
+`nomad plan` as seen below.
+
+```shell-session
+$ nomad plan example.nomad
+
++ Job: "test"
++ Task Group: "test" (1 create)
+  + Task: "test" (forces create)
+
+Scheduler dry-run:
+- All tasks successfully allocated.
+
+Preemptions:
+
+Alloc ID  Job ID    Task Group
+ddef9521  my-batch  analytics
+ae59fe45  my-batch  analytics
```
+
+Note that the allocations shown in the `nomad plan` output above
+are not guaranteed to be the same ones picked when running the job later.
+They provide the operator with a sample of the type of allocations that could be preempted.
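The eligibility rule above (a priority delta of more than 10, with candidates considered lowest-priority first) can be sketched as follows; the `alloc` type and the function names are illustrative, not Nomad internals:

```go
package main

import (
	"fmt"
	"sort"
)

// alloc is a hypothetical, simplified view of a running allocation.
type alloc struct {
	ID          string
	JobPriority int
}

// eligible applies the documented rule: an allocation may only be
// preempted when the placing job's priority exceeds the allocation's
// job priority by more than 10.
func eligible(placingPriority, allocPriority int) bool {
	return placingPriority-allocPriority > 10
}

// candidates filters eligible allocations and orders them lowest-priority
// first, which is the order the scheduler considers them in.
func candidates(placingPriority int, allocs []alloc) []alloc {
	var out []alloc
	for _, a := range allocs {
		if eligible(placingPriority, a.JobPriority) {
			out = append(out, a)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].JobPriority < out[j].JobPriority
	})
	return out
}

func main() {
	// The node from the example table: cache(70), batch-analytics(50),
	// email-marketing(20).
	node := []alloc{
		{"a6", 70}, {"a4", 50}, {"a5", 50}, {"a1", 20}, {"a2", 20},
	}
	for _, a := range candidates(75, node) {
		fmt.Println(a.ID) // prints a1, a2, a4, a5 (a6 is never eligible)
	}
}
```

With a `webapp` job at priority 75, the cache allocation (delta 5) is filtered out, matching the worked example above.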
+
diff --git a/content/nomad/v0.11.x/content/docs/internals/scheduling/scheduling.mdx b/content/nomad/v0.11.x/content/docs/internals/scheduling/scheduling.mdx
new file mode 100644
index 0000000000..96c9ff1983
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/internals/scheduling/scheduling.mdx
@@ -0,0 +1,94 @@
+---
+layout: docs
+page_title: Scheduling
+sidebar_title: Internals
+description: Learn about how scheduling works in Nomad.
+---
+
+# Scheduling in Nomad
+
+[![Nomad Data Model][img-data-model]][img-data-model]
+
+There are four primary "nouns" in Nomad: jobs, nodes, allocations, and
+evaluations. Jobs are submitted by users and represent a _desired state_. A job
+is a declarative description of tasks to run that are bounded by constraints
+and require resources. Tasks can be scheduled on nodes in the cluster running
+the Nomad client. The mapping of tasks in a job to clients is done using
+allocations. An allocation is used to declare that a set of tasks in a job
+should be run on a particular node. Scheduling is the process of determining
+the appropriate allocations and is done as part of an evaluation.
+
+An evaluation is created any time the external state, either desired or
+emergent, changes. The desired state is based on jobs, meaning the desired
+state changes if a new job is submitted, an existing job is updated, or a job
+is deregistered. The emergent state is based on the client nodes, and so we
+must handle the failure of any clients in the system. These events trigger the
+creation of a new evaluation, as Nomad must _evaluate_ the state of the world
+and reconcile it with the desired state.
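The relationships between these four nouns can be sketched as simplified types. All names and fields below are illustrative stand-ins for this discussion, not Nomad's actual API structures:

```go
package main

import "fmt"

// Job is a declarative description of tasks to run (desired state).
type Job struct {
	ID         string
	Priority   int
	TaskGroups []string // tasks bounded by constraints and resources
}

// Node is a machine in the cluster running the Nomad client.
type Node struct {
	ID string
}

// Allocation declares that a set of tasks in a job should run on a
// particular node; it is the mapping scheduling produces.
type Allocation struct {
	JobID  string
	NodeID string
	Tasks  []string
}

// Evaluation is created whenever desired or emergent state changes
// (job registered, updated, or deregistered; node failed) and drives
// the scheduling process.
type Evaluation struct {
	JobID       string
	TriggeredBy string // e.g. "job-register", "node-update"
	Status      string // starts as "pending"
}

func main() {
	ev := Evaluation{JobID: "docs", TriggeredBy: "job-register", Status: "pending"}
	fmt.Println(ev.Status) // prints "pending"
}
```

The point of the sketch is the direction of the arrows: jobs and nodes are inputs, allocations are outputs, and evaluations are the unit of work connecting them.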
+
+This diagram shows the flow of an evaluation through Nomad:
+
+[![Nomad Evaluation Flow][img-eval-flow]][img-eval-flow]
+
+The lifecycle of an evaluation begins with an event causing the evaluation to
+be created. Evaluations are created in the `pending` state and are enqueued
+into the evaluation broker. There is a single evaluation broker, which runs on
+the leader server. The evaluation broker is used to manage the queue of pending
+evaluations, provide priority ordering, and ensure at-least-once delivery.
+
+Nomad servers run scheduling workers, defaulting to one per CPU core, which are
+used to process evaluations. The workers dequeue evaluations from the broker,
+and then invoke the appropriate scheduler as specified by the job. Nomad ships
+with a `service` scheduler that optimizes for long-lived services, a `batch`
+scheduler that is used for fast placement of batch jobs, a `system` scheduler
+that is used to run jobs on every node, and a `core` scheduler that is used
+for internal maintenance. Nomad can be extended to support custom schedulers as
+well.
+
+Schedulers are responsible for processing an evaluation and generating an
+allocation _plan_. The plan is the set of allocations to evict, update, or
+create. The specific logic used to generate a plan may vary by scheduler, but
+generally the scheduler needs to first reconcile the desired state with the
+real state to determine what must be done. New allocations need to be placed
+and existing allocations may need to be updated, migrated, or stopped.
+
+Placing allocations is split into two distinct phases: feasibility checking and
+ranking. In the first phase, the scheduler finds nodes that are feasible by
+filtering unhealthy nodes, those missing necessary drivers, and those failing
+the specified constraints.
+
+The second phase is ranking, where the scheduler scores feasible nodes to find
+the best fit.
Scoring is primarily based on bin packing, which is used to +optimize the resource utilization and density of applications, but is also +augmented by affinity and anti-affinity rules. Nomad automatically applies a job +anti-affinity rule which discourages colocating multiple instances of a task +group. The combination of this anti-affinity and bin packing optimizes for +density while reducing the probability of correlated failures. + +Once the scheduler has ranked enough nodes, the highest ranking node is +selected and added to the allocation plan. + +When planning is complete, the scheduler submits the plan to the leader which +adds the plan to the plan queue. The plan queue manages pending plans, provides +priority ordering, and allows Nomad to handle concurrency races. Multiple +schedulers are running in parallel without locking or reservations, making +Nomad optimistically concurrent. As a result, schedulers might overlap work on +the same node and cause resource over-subscription. The plan queue allows the +leader node to protect against this and do partial or complete rejections of a +plan. + +As the leader processes plans, it creates allocations when there is no conflict +and otherwise informs the scheduler of a failure in the plan result. The plan +result provides feedback to the scheduler, allowing it to terminate or explore +alternate plans if the previous plan was partially or completely rejected. + +Once the scheduler has finished processing an evaluation, it updates the status +of the evaluation and acknowledges delivery with the evaluation broker. This +completes the lifecycle of an evaluation. Allocations that were created, +modified or deleted as a result will be picked up by client nodes and will +begin execution. 
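The two placement phases, feasibility filtering followed by bin-packing-style ranking, can be sketched as below. The node shape and the scoring formula are simplified assumptions for illustration, not the scheduler's actual logic:

```go
package main

import "fmt"

// node is a hypothetical, simplified client node for this sketch.
type node struct {
	ID      string
	Healthy bool
	Drivers map[string]bool
	FreeMem int // MB
}

// feasible implements phase one: filter out unhealthy nodes, nodes
// missing the required driver, and nodes failing the resource constraint.
func feasible(nodes []node, driver string, memMB int) []node {
	var out []node
	for _, n := range nodes {
		if n.Healthy && n.Drivers[driver] && n.FreeMem >= memMB {
			out = append(out, n)
		}
	}
	return out
}

// score implements phase two in spirit: a bin-packing-style score that
// prefers nodes with less free memory, packing work densely.
func score(n node, memMB int) float64 {
	return float64(memMB) / float64(n.FreeMem)
}

func main() {
	nodes := []node{
		{"n1", true, map[string]bool{"docker": true}, 4096},
		{"n2", false, map[string]bool{"docker": true}, 8192}, // unhealthy
		{"n3", true, map[string]bool{"exec": true}, 8192},    // missing driver
	}
	var best node
	bestScore := -1.0
	for _, n := range feasible(nodes, "docker", 1024) {
		if s := score(n, 1024); s > bestScore {
			best, bestScore = n, s
		}
	}
	fmt.Println(best.ID) // prints "n1"
}
```

A real ranker would also fold in affinity, anti-affinity, and reschedule penalties as described above; the two-phase shape is what matters here.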
+ +[omega]: https://research.google.com/pubs/pub41684.html +[borg]: https://research.google.com/pubs/pub43438.html +[img-data-model]: /img/nomad-data-model.png +[img-eval-flow]: /img/nomad-evaluation-flow.png diff --git a/content/nomad/v0.11.x/content/docs/job-specification/affinity.mdx b/content/nomad/v0.11.x/content/docs/job-specification/affinity.mdx new file mode 100644 index 0000000000..b6a7cf447f --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/affinity.mdx @@ -0,0 +1,279 @@ +--- +layout: docs +page_title: affinity Stanza - Job Specification +sidebar_title: affinity +description: |- + The "affinity" stanza allows restricting the set of eligible nodes. + Affinities may filter on attributes or metadata. Additionally affinities may + be specified at the job, group, or task levels for ultimate flexibility. +--- + +# `affinity` Stanza + + + +The `affinity` stanza allows operators to express placement preference for a set of nodes. Affinities may +be expressed on [attributes][interpolation] or [client metadata][client-meta]. +Additionally affinities may be specified at the [job][job], [group][group], or +[task][task] levels for ultimate flexibility. + +```hcl +job "docs" { + # Prefer nodes in the us-west1 datacenter + affinity { + attribute = "${node.datacenter}" + value = "us-west1" + weight = 100 + } + + group "example" { + # Prefer the "r1" rack + affinity { + attribute = "${meta.rack}" + value = "r1" + weight = 50 + } + + task "server" { + # Prefer nodes where "my_custom_value" is greater than 5 + affinity { + attribute = "${meta.my_custom_value}" + operator = ">" + value = "3" + weight = 50 + } + } + } +} +``` + +Affinities apply to task groups but may be specified within job and task stanzas as well. +Job affinities apply to all groups within the job. Task affinities apply to the whole task group +that the task is a part of. + +Nomad will use affinities when computing scores for placement. 
Nodes that match affinities will
+have their scores boosted. Affinity scores are combined with other scoring factors such as bin packing.
+Operators can use weights to express relative preference across multiple affinities. If no nodes match a given affinity,
+placement is still successful. This is different from [constraints][constraint], where placement is
+restricted only to nodes that meet the constraint's criteria.
+
+## `affinity` Parameters
+
+- `attribute` `(string: "")` - Specifies the name or reference of the attribute
+  to examine for the affinity. This can be any of the [Nomad interpolated
+  values](/docs/runtime/interpolation#interpreted_node_vars).
+
+- `operator` `(string: "=")` - Specifies the comparison operator. The ordering is
+  compared lexically. Possible values include:
+
+  ```text
+  =
+  !=
+  >
+  >=
+  <
+  <=
+  regexp
+  set_contains_all
+  set_contains_any
+  version
+  ```
+
+  For a detailed explanation of these values and their behavior, please see
+  the [operator values section](#operator-values).
+
+- `value` `(string: "")` - Specifies the value to compare the attribute against
+  using the specified operation. This can be a literal value, another attribute,
+  or any [Nomad interpolated
+  values](/docs/runtime/interpolation#interpreted_node_vars).
+
+- `weight` `(integer: 50)` - Specifies a weight for the affinity. The weight is used
+  during scoring and must be an integer between -100 and 100. Negative weights act as
+  anti-affinities, causing nodes that match them to be scored lower. Weights can be used
+  when there is more than one affinity to express relative preference across them.
+
+### `operator` Values
+
+This section details the specific values for the "operator" parameter in the
+Nomad job specification for affinities. The operator is always specified as a
+string, but the string can take on different values which change the behavior of
+the overall affinity evaluation.
+
+```hcl
+affinity {
+  operator = "..."
+}
+```
+
+- `"regexp"` - Specifies a regular expression affinity against the attribute.
+  The syntax of the regular expressions accepted is the same general syntax used
+  by Perl, Python, and many other languages. More precisely, it is the syntax
+  accepted by RE2 and described in the [Google RE2
+  syntax](https://golang.org/s/re2syntax).
+
+  ```hcl
+  affinity {
+    attribute = "..."
+    operator  = "regexp"
+    value     = "[a-z0-9]"
+    weight    = 50
+  }
+  ```
+
+- `"set_contains_all"` - Specifies a contains affinity against the attribute. The
+  attribute and the list being checked are split using commas. This will check
+  that the given attribute contains **all** of the specified elements.
+
+  ```hcl
+  affinity {
+    attribute = "..."
+    operator  = "set_contains_all"
+    value     = "a,b,c"
+    weight    = 50
+  }
+  ```
+
+- `"set_contains"` - Same as `set_contains_all`.
+
+- `"set_contains_any"` - Specifies a contains affinity against the attribute. The
+  attribute and the list being checked are split using commas. This will check
+  that the given attribute contains **any** of the specified elements.
+
+  ```hcl
+  affinity {
+    attribute = "..."
+    operator  = "set_contains_any"
+    value     = "a,b,c"
+    weight    = 50
+  }
+  ```
+
+- `"version"` - Specifies a version affinity against the attribute. This
+  supports a comma-separated list of values, including the pessimistic
+  operator. For more specific examples, please see the [go-version
+  repository](https://github.com/hashicorp/go-version).
+
+  ```hcl
+  affinity {
+    attribute = "..."
+    operator  = "version"
+    value     = ">= 0.1.0, < 0.2"
+    weight    = 50
+  }
+  ```
+
+## `affinity` Examples
+
+The following examples only show the `affinity` stanzas. Remember that the
+`affinity` stanza is only valid in the placements listed above.
+
+### Kernel Data
+
+This example adds a preference for running on nodes with a kernel version
+higher than "3.19".
+ +```hcl +affinity { + attribute = "${attr.kernel.version}" + operator = "version" + value = "> 3.19" + weight = 50 +} +``` + +### Operating Systems + +This example adds a preference to running on nodes that are running Ubuntu +14.04 + +```hcl +affinity { + attribute = "${attr.os.name}" + value = "ubuntu" + weight = 50 +} + +affinity { + attribute = "${attr.os.version}" + value = "14.04" + weight = 100 +} +``` + +### Meta Data + +The following example adds a preference to running on nodes with specific rack metadata + +```hcl +affinity { + attribute = "${meta.rack}" + value = "rack1" + weight = 50 +} +``` + +The following example adds a preference to running on nodes in a specific datacenter. + +```hcl +affinity { + attribute = "${node.datacenter}" + value = "us-west1" + weight = 50 +} +``` + +### Cloud Metadata + +When possible, Nomad populates node attributes from the cloud environment. These +values are accessible as filters in affinities. This example adds a preference to run this +task on nodes that are memory-optimized on AWS. + +```hcl +affinity { + attribute = "${attr.platform.aws.instance-type}" + value = "m4.xlarge" + weight = 50 +} +``` + +[job]: /docs/job-specification/job 'Nomad job Job Specification' +[group]: /docs/job-specification/group 'Nomad group Job Specification' +[client-meta]: /docs/configuration/client#meta 'Nomad meta Job Specification' +[task]: /docs/job-specification/task 'Nomad task Job Specification' +[interpolation]: /docs/runtime/interpolation 'Nomad interpolation' +[node-variables]: /docs/runtime/interpolation#node-variables- 'Nomad interpolation-Node variables' +[constraint]: /docs/job-specification/constraint 'Nomad Constraint job Specification' + +### Placement Details + +Operators can run `nomad alloc status -verbose` to get more detailed information on various +factors, including affinities that affect the final placement. 
+ +#### Example Placement Metadata + +The following is a snippet from the CLI output of `nomad alloc status -verbose ` showing scoring metadata. + +```text +Placement Metrics +Node binpack job-anti-affinity node-reschedule-penalty node-affinity final score +30bd48cc-d760-1096-9bab-13caac424af5 0.225 -0.6 0 1 0.208 +f2aa8b59-96b8-202f-2258-d98c93e360ab 0.225 -0.6 0 1 0.208 +86df0f74-15cc-3a0e-23f0-ad7306131e0d 0.0806 0 0 0 0.0806 +7d6c2e9e-b080-5995-8b9d-ef1695458b52 0.0806 0 0 0 0.0806 +``` + +The placement score is affected by the following factors. + +- `bin-packing` - Scores nodes according to how well they fit requirements. Optimizes for using minimal number of nodes. +- `job-anti-affinity` - A penalty added for additional instances of the same job on a node, used to avoid having too many instances + of a job on the same node. +- `node-reschedule-penalty` - Used when the job is being rescheduled. Nomad adds a penalty to avoid placing the job on a node where + it has failed to run before. +- `node-affinity` - Used when the criteria specified in the `affinity` stanza matches the node. diff --git a/content/nomad/v0.11.x/content/docs/job-specification/artifact.mdx b/content/nomad/v0.11.x/content/docs/job-specification/artifact.mdx new file mode 100644 index 0000000000..9338e935d2 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/artifact.mdx @@ -0,0 +1,189 @@ +--- +layout: docs +page_title: artifact Stanza - Job Specification +sidebar_title: artifact +description: |- + The "artifact" stanza instructs Nomad to fetch and unpack a remote resource, + such as a file, tarball, or binary, and permits downloading artifacts from a + variety of locations using a URL as the input source. +--- + +# `artifact` Stanza + + + +The `artifact` stanza instructs Nomad to fetch and unpack a remote resource, +such as a file, tarball, or binary. 
Nomad downloads artifacts using the popular +[`go-getter`][go-getter] library, which permits downloading artifacts from a +variety of locations using a URL as the input source. + +```hcl +job "docs" { + group "example" { + task "server" { + artifact { + source = "https://example.com/file.tar.gz" + destination = "local/some-directory" + options { + checksum = "md5:df6a4178aec9fbdc1d6d7e3634d1bc33" + } + } + } + } +} +``` + +Nomad supports downloading `http`, `https`, `git`, `hg` and `S3` artifacts. If +these artifacts are archived (`zip`, `tgz`, `bz2`, `xz`), they are +automatically unarchived before the starting the task. + +## `artifact` Parameters + +- `destination` `(string: "local/")` - Specifies the directory path to download + the artifact, relative to the root of the task's directory. If omitted, the + default value is to place the artifact in `local/`. The destination is treated + as a directory unless `mode` is set to `file`. Source files will be downloaded + into that directory path. + +- `mode` `(string: "any")` - One of `any`, `file`, or `dir`. If set to `file` + the `destination` must be a file, not a directory. By default the + `destination` will be `local/`. + +- `options` `(map: nil)` - Specifies configuration parameters to + fetch the artifact. The key-value pairs map directly to parameters appended to + the supplied `source` URL. Please see the [`go-getter` + documentation][go-getter] for a complete list of options and examples + +- `source` `(string: )` - Specifies the URL of the artifact to download. + See [`go-getter`][go-getter] for details. + +## `artifact` Examples + +The following examples only show the `artifact` stanzas. Remember that the +`artifact` stanza is only valid in the placements listed above. + +### Download File + +This example downloads the artifact from the provided URL and places it in +`local/file.txt`. The `local/` path is relative to the task's directory. 
+
+```hcl
+artifact {
+  source = "https://example.com/file.txt"
+}
+```
+
+### Download using git
+
+This example downloads the artifact from the provided GitHub URL and places it at
+`local/repo`, as specified by the optional `destination` parameter.
+
+```hcl
+artifact {
+  source      = "git::https://github.com/example/nomad-examples"
+  destination = "local/repo"
+}
+```
+
+To download from a private repository, the `sshkey` option needs to be set. The
+key must be a base64-encoded string, which can be generated with
+`base64 -w0 `.
+
+```hcl
+artifact {
+  source      = "git@github.com:example/nomad-examples"
+  destination = "local/repo"
+  options {
+    sshkey = ""
+  }
+}
+```
+
+### Download and Unarchive
+
+This example downloads the archive and unarchives the result into `local/`.
+Because the source URL ends in an archive extension, Nomad will automatically
+decompress it:
+
+```hcl
+artifact {
+  source = "https://example.com/file.tar.gz"
+}
+```
+
+To disable automatic unarchiving, set the `archive` option to false:
+
+```hcl
+artifact {
+  source = "https://example.com/file.tar.gz"
+  options {
+    archive = false
+  }
+}
+```
+
+### Download and Verify Checksums
+
+This example downloads an artifact and verifies the resulting artifact's
+checksum before proceeding. If the checksum is invalid, an error will be
+returned.
+
+```hcl
+artifact {
+  source = "https://example.com/file.zip"
+
+  options {
+    checksum = "md5:df6a4178aec9fbdc1d6d7e3634d1bc33"
+  }
+}
+```
+
+### Download from an S3-compatible Bucket
+
+These examples download artifacts from Amazon S3. There are several different
+types of [S3 bucket addressing][s3-bucket-addr] and [S3 region-specific
+endpoints][s3-region-endpoints]. As of Nomad 0.6, non-Amazon S3-compatible
+endpoints like [Minio] are supported, but you must explicitly set the "s3::"
+prefix.
+ +This example uses path-based notation on a publicly-accessible bucket: + +```hcl +artifact { + source = "https://my-bucket-example.s3-us-west-2.amazonaws.com/my_app.tar.gz" +} +``` + +If a bucket requires authentication, you can avoid the use of credentials by +using [EC2 IAM instance profiles][iam-instance-profiles]. If this is not possible, +credentials may be supplied via the `options` parameter: + +```hcl +artifact { + options { + aws_access_key_id = "" + aws_access_key_secret = "" + aws_access_token = "" + } +} +``` + +To force the S3-specific syntax, use the `s3::` prefix: + +```hcl +artifact { + source = "s3::https://my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz" +} +``` + +Alternatively you can use virtual hosted style: + +```hcl +artifact { + source = "https://my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz" +} +``` + +[go-getter]: https://github.com/hashicorp/go-getter 'HashiCorp go-getter Library' +[minio]: https://www.minio.io/ +[s3-bucket-addr]: http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro 'Amazon S3 Bucket Addressing' +[s3-region-endpoints]: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region 'Amazon S3 Region Endpoints' +[iam-instance-profiles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html 'EC2 IAM instance profiles' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/check_restart.mdx b/content/nomad/v0.11.x/content/docs/job-specification/check_restart.mdx new file mode 100644 index 0000000000..5b884f1cc1 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/check_restart.mdx @@ -0,0 +1,141 @@ +--- +layout: docs +page_title: check_restart Stanza - Job Specification +sidebar_title: check_restart +description: |- + The "check_restart" stanza instructs Nomad when to restart tasks with + unhealthy service checks. 
+--- + +# `check_restart` Stanza + + + +As of Nomad 0.7 the `check_restart` stanza instructs Nomad when to restart +tasks with unhealthy service checks. When a health check in Consul has been +unhealthy for the `limit` specified in a `check_restart` stanza, it is +restarted according to the task group's [`restart` policy][restart_stanza]. The +`check_restart` settings apply to [`check`s][check_stanza], but may also be +placed on [`service`s][service_stanza] to apply to all checks on a service. +If `check_restart` is set on both the check and service, the stanzas are +merged with the check values taking precedence. + +```hcl +job "mysql" { + group "mysqld" { + + restart { + attempts = 3 + delay = "10s" + interval = "10m" + mode = "fail" + } + + task "server" { + service { + tags = ["leader", "mysql"] + + port = "db" + + check { + type = "tcp" + port = "db" + interval = "10s" + timeout = "2s" + } + + check { + type = "script" + name = "check_table" + command = "/usr/local/bin/check_mysql_table_status" + args = ["--verbose"] + interval = "60s" + timeout = "5s" + + check_restart { + limit = 3 + grace = "90s" + ignore_warnings = false + } + } + } + } + } +} +``` + +- `limit` `(int: 0)` - Restart task when a health check has failed `limit` + times. For example 1 causes a restart on the first failure. The default, + `0`, disables health check based restarts. Failures must be consecutive. A + single passing check will reset the count, so flapping services may not be + restarted. + +- `grace` `(string: "1s")` - Duration to wait after a task starts or restarts + before checking its health. + +- `ignore_warnings` `(bool: false)` - By default checks with both `critical` + and `warning` statuses are considered unhealthy. Setting `ignore_warnings = true` treats a `warning` status like `passing` and will not trigger a restart. + +## Example Behavior + +Using the example `mysql` above would have the following behavior: + +```hcl +check_restart { + # ... + grace = "90s" + # ... 
+}
+```
+
+When the `server` task first starts and is registered in Consul, its health
+will not be checked for 90 seconds. This gives the server time to start up.
+
+```hcl
+check_restart {
+  limit = 3
+  # ...
+}
+```
+
+After the grace period if the script check fails, it has 180 seconds (`60s
+interval * 3 limit`) to pass before a restart is triggered. Once a restart is
+triggered the task group's [`restart` policy][restart_stanza] takes control:
+
+```hcl
+restart {
+  # ...
+  delay = "10s"
+  # ...
+}
+```
+
+The [`restart` stanza][restart_stanza] controls the restart behavior of the
+task. In this case it will stop the task and then wait 10 seconds before
+starting it again.
+
+Once the task restarts Nomad waits the `grace` period again before starting to
+check the task's health.
+
+```hcl
+restart {
+  attempts = 3
+  # ...
+  interval = "10m"
+  mode     = "fail"
+}
+```
+
+If the check continues to fail, the task will be restarted up to `attempts`
+times within an `interval`. If the `restart` attempts are reached within the
+`interval` then the `mode` controls the behavior. In this case the task would
+fail and not be restarted again. See the [`restart` stanza][restart_stanza] for
+details.
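+
+For checks that legitimately flap between `passing` and `warning` states, the
+`ignore_warnings` parameter described above can keep warnings from counting
+toward the restart limit. The sketch below is illustrative, not from this page:
+the check name, command, and thresholds are hypothetical.
+
+```hcl
+check {
+  type     = "script"
+  name     = "disk_usage"                     # hypothetical check
+  command  = "/usr/local/bin/check_disk"      # hypothetical script
+  interval = "60s"
+  timeout  = "5s"
+
+  check_restart {
+    limit           = 3
+    # Treat "warning" like "passing"; only "critical" counts toward the limit.
+    ignore_warnings = true
+  }
+}
+```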
+ +[check_stanza]: /docs/job-specification/service#check-parameters 'check stanza' +[restart_stanza]: /docs/job-specification/restart 'restart stanza' +[service_stanza]: /docs/job-specification/service 'service stanza' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/connect.mdx b/content/nomad/v0.11.x/content/docs/job-specification/connect.mdx new file mode 100644 index 0000000000..15567cd1e2 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/connect.mdx @@ -0,0 +1,176 @@ +--- +layout: docs +page_title: connect Stanza - Job Specification +sidebar_title: connect +description: The "connect" stanza allows specifying options for Consul Connect integration +--- + +# `connect` Stanza + + + +The `connect` stanza allows configuring various options for +[Consul Connect](/docs/integrations/consul-connect). It is +valid only within the context of a service definition at the task group +level. For using `connect` when Consul ACLs are enabled, be sure to read through +the [Secure Nomad Jobs with Consul Connect](https://learn.hashicorp.com/nomad/consul-integration/nomad-connect-acl) +guide. + +```hcl +job "countdash" { + datacenters = ["dc1"] + + group "api" { + network { + mode = "bridge" + } + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + } + + task "web" { + driver = "docker" + + config { + image = "hashicorpnomad/counter-api:v2" + } + } + } +} +``` + +## `connect` Parameters + +- `sidecar_service` - ([sidecar_service][]: nil) - This is used to configure the sidecar + service injected by Nomad for Consul Connect. + +- `sidecar_task` - ([sidecar_task][]:nil) - This modifies the configuration of the Envoy + proxy task. + +## `connect` Examples + +The following example is a minimal connect stanza with defaults and is +sufficient to start an Envoy proxy sidecar for allowing incoming connections +via Consul Connect. 
+ +```hcl + connect { + sidecar_service {} + } +``` + +The following example includes specifying [`upstreams`][upstreams]. + +```hcl + connect { + sidecar_service { + proxy { + upstreams { + destination_name = "count-api" + local_bind_port = 8080 + } + } + } + } +``` + +The following is the complete `countdash` example. It includes an API service +and a frontend Dashboard service which connects to the API service as a Connect +upstream. Once running, the dashboard is accessible at `localhost:9002`. + +```hcl +job "countdash" { + datacenters = ["dc1"] + + group "api" { + network { + mode = "bridge" + } + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + + check { + expose = true + type = "http" + name = "api-health" + path = "/health" + interval = "10s" + timeout = "3s" + } + } + + task "web" { + driver = "docker" + + config { + image = "hashicorpnomad/counter-api:v2" + } + } + } + + group "dashboard" { + network { + mode = "bridge" + + port "http" { + static = 9002 + to = 9002 + } + } + + service { + name = "count-dashboard" + port = "9002" + + connect { + sidecar_service { + proxy { + upstreams { + destination_name = "count-api" + local_bind_port = 8080 + } + } + } + } + } + + task "dashboard" { + driver = "docker" + + env { + COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}" + } + + config { + image = "hashicorpnomad/counter-dashboard:v2" + } + } + } +} +``` + +### Limitations + +[Consul Connect Native services][native] and [Nomad variable +interpolation][interpolation] are _not_ yet supported. 
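+
+As a sketch of the `sidecar_task` parameter listed above, the injected Envoy
+proxy task can be tuned alongside `sidecar_service`. The resource values below
+are illustrative assumptions, not recommendations from this page:
+
+```hcl
+  connect {
+    sidecar_service {}
+
+    # Override the default Envoy sidecar task's resources.
+    sidecar_task {
+      resources {
+        cpu    = 100 # MHz (illustrative)
+        memory = 64  # MB  (illustrative)
+      }
+    }
+  }
+```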
+
+[job]: /docs/job-specification/job 'Nomad job Job Specification'
+[group]: /docs/job-specification/group 'Nomad group Job Specification'
+[task]: /docs/job-specification/task 'Nomad task Job Specification'
+[interpolation]: /docs/runtime/interpolation 'Nomad interpolation'
+[sidecar_service]: /docs/job-specification/sidecar_service 'Nomad sidecar service Specification'
+[sidecar_task]: /docs/job-specification/sidecar_task 'Nomad sidecar task config Specification'
+[upstreams]: /docs/job-specification/upstreams 'Nomad sidecar service upstreams Specification'
+[native]: https://www.consul.io/docs/connect/native.html
diff --git a/content/nomad/v0.11.x/content/docs/job-specification/constraint.mdx b/content/nomad/v0.11.x/content/docs/job-specification/constraint.mdx
new file mode 100644
index 0000000000..6bddf52d10
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/constraint.mdx
@@ -0,0 +1,305 @@
+---
+layout: docs
+page_title: constraint Stanza - Job Specification
+sidebar_title: constraint
+description: |-
+  The "constraint" stanza allows restricting the set of eligible nodes.
+  Constraints may filter on attributes or metadata. Additionally constraints may
+  be specified at the job, group, or task levels for ultimate flexibility.
+---
+
+# `constraint` Stanza
+
+
+
+The `constraint` stanza allows restricting the set of eligible nodes. Constraints
+may filter on [attributes][interpolation] or [client metadata][client-meta].
+Additionally constraints may be specified at the [job][job], [group][group], or
+[task][task] levels for ultimate flexibility.
+
+~> **It is possible to define irreconcilable constraints in a job.**
+For example, because all [tasks within a group are scheduled on the same client node][group],
+specifying different [`${attr.unique.hostname}`][node-variables] constraints at
+the task level will cause a job to be unplaceable.
+
+```hcl
+job "docs" {
+  # All tasks in this job must run on linux.
+ constraint { + attribute = "${attr.kernel.name}" + value = "linux" + } + + group "example" { + # All groups in this job should be scheduled on different hosts. + constraint { + operator = "distinct_hosts" + value = "true" + } + + task "server" { + # All tasks must run where "my_custom_value" is greater than 3. + constraint { + attribute = "${meta.my_custom_value}" + operator = ">" + value = "3" + } + } + } +} +``` + +Placing constraints at both the job level and at the group level is redundant +since constraints are applied hierarchically. The job constraints will affect +all groups (and tasks) in the job. + +## `constraint` Parameters + +- `attribute` `(string: "")` - Specifies the name or reference of the attribute + to examine for the constraint. This can be any of the [Nomad interpolated + values](/docs/runtime/interpolation#interpreted_node_vars). + +- `operator` `(string: "=")` - Specifies the comparison operator. The ordering is + compared lexically. Possible values include: + + ```text + = + != + > + >= + < + <= + distinct_hosts + distinct_property + regexp + set_contains + version + semver + is_set + is_not_set + ``` + + For a detailed explanation of these values and their behavior, please see + the [operator values section](#operator-values). + +- `value` `(string: "")` - Specifies the value to compare the attribute against + using the specified operation. This can be a literal value, another attribute, + or any [Nomad interpolated + values](/docs/runtime/interpolation#interpreted_node_vars). + +### `operator` Values + +This section details the specific values for the "operator" parameter in the +Nomad job specification for constraints. The operator is always specified as a +string, but the string can take on different values which change the behavior of +the overall constraint evaluation. + +```hcl +constraint { + operator = "..." +} +``` + +- `"distinct_hosts"` - Instructs the scheduler to not co-locate any groups on + the same machine. 
When specified as a job constraint, it applies to all groups
+  in the job. When specified as a group constraint, the effect is constrained to
+  that group. This constraint can not be specified at the task level. Note that
+  the `attribute` parameter should be omitted when using this constraint.
+
+  ```hcl
+  constraint {
+    operator = "distinct_hosts"
+    value    = "true"
+  }
+  ```
+
+  The constraint may also be specified as follows for a more compact
+  representation:
+
+  ```hcl
+  constraint {
+    distinct_hosts = true
+  }
+  ```
+
+- `"distinct_property"` - Instructs the scheduler to select nodes that have a
+  distinct value of the specified property. The `value` parameter specifies how
+  many allocations are allowed to share the value of a property. The `value`
+  must be 1 or greater and if omitted, defaults to 1. When specified as a job
+  constraint, it applies to all groups in the job. When specified as a group
+  constraint, the effect is constrained to that group. This constraint can not
+  be specified at the task level.
+
+  ```hcl
+  constraint {
+    operator  = "distinct_property"
+    attribute = "${meta.rack}"
+    value     = "3"
+  }
+  ```
+
+  The constraint may also be specified as follows for a more compact
+  representation:
+
+  ```hcl
+  constraint {
+    distinct_property = "${meta.rack}"
+    value             = "3"
+  }
+  ```
+
+- `"regexp"` - Specifies a regular expression constraint against the attribute.
+  The syntax of the regular expressions accepted is the same general syntax used
+  by Perl, Python, and many other languages. More precisely, it is the syntax
+  accepted by RE2 and described in the [Google RE2
+  syntax](https://golang.org/s/re2syntax).
+
+  ```hcl
+  constraint {
+    attribute = "..."
+    operator  = "regexp"
+    value     = "[a-z0-9]"
+  }
+  ```
+
+- `"set_contains"` - Specifies a contains constraint against the attribute. The
+  attribute and the list being checked are split using commas. This will check
+  that the given attribute contains **all** of the specified elements.
+
+  ```hcl
+  constraint {
+    attribute = "..."
+    operator  = "set_contains"
+    value     = "a,b,c"
+  }
+  ```
+
+- `"version"` - Specifies a version constraint against the attribute. This
+  supports a comma-separated list of constraints, including the pessimistic
+  operator. `version` will not consider a prerelease (eg `1.6.0-beta`)
+  sufficient to match a non-prerelease constraint (eg `>= 1.0`). Use the
+  `semver` constraint for strict [Semantic Versioning 2.0][semver2] ordering.
+  Please see the [go-version
+  repository](https://github.com/hashicorp/go-version) for more specific
+  examples.
+
+  ```hcl
+  constraint {
+    attribute = "..."
+    operator  = "version"
+    value     = ">= 0.1.0, < 0.2"
+  }
+  ```
+
+- `"semver"` - Specifies a version constraint against the attribute. Only
+  [Semantic Versioning 2.0][semver2] compliant versions and comparison
+  operators are supported, so there is no pessimistic operator. Unlike `version`,
+  this operator considers prereleases (eg `1.6.0-beta`) sufficient to satisfy
+  non-prerelease constraints (eg `>= 1.0`). _Added in Nomad v0.10.2._
+
+  ```hcl
+  constraint {
+    attribute = "..."
+    operator  = "semver"
+    value     = ">= 0.1.0, < 0.2"
+  }
+  ```
+
+- `"is_set"` - Specifies that a given attribute must be present. This can be
+  combined with the `"!="` operator to require that an attribute has been set
+  before checking for equality. The default behavior for `"!="` is to include
+  nodes that don't have that attribute set.
+
+- `"is_not_set"` - Specifies that a given attribute must not be present.
+
+## `constraint` Examples
+
+The following examples only show the `constraint` stanzas. Remember that the
+`constraint` stanza is only valid in the placements listed above.
+
+### Kernel Data
+
+This example restricts the task to running on nodes which have a kernel version
+higher than "3.19".
+
+```hcl
+constraint {
+  attribute = "${attr.kernel.version}"
+  operator  = "version"
+  value     = "> 3.19"
+}
+```
+
+### Distinct Property
+
+A potential use case of the `distinct_property` constraint is to spread a
+service with `count > 1` across racks to minimize correlated failure. Nodes can
+be annotated with which rack they are on using [custom client
+metadata][client-meta] with values such as "rack-12-1", "rack-12-2", etc.
+The following constraint would ensure that an individual rack is not running
+more than 2 instances of the task group.
+
+```hcl
+constraint {
+  distinct_property = "${meta.rack}"
+  value             = "2"
+}
+```
+
+### Operating Systems
+
+This example restricts the task to running on nodes that are running Ubuntu
+14.04.
+
+```hcl
+constraint {
+  attribute = "${attr.os.name}"
+  value     = "ubuntu"
+}
+
+constraint {
+  attribute = "${attr.os.version}"
+  value     = "14.04"
+}
+```
+
+### Cloud Metadata
+
+When possible, Nomad populates node attributes from the cloud environment. These
+values are accessible as filters in constraints. This example constrains this
+task to only run on nodes that are memory-optimized on AWS.
+
+```hcl
+constraint {
+  attribute = "${attr.platform.aws.instance-type}"
+  value     = "m4.xlarge"
+}
+```
+
+### User-Specified Metadata
+
+This example restricts the task to running on nodes where the binaries for
+redis, cypress, and nginx are all cached locally. This particular example is
+utilizing [custom client metadata][client-meta].
+
+```hcl
+constraint {
+  attribute    = "${meta.cached_binaries}"
+  set_contains = "redis,cypress,nginx"
+}
+```
+
+[job]: /docs/job-specification/job 'Nomad job Job Specification'
+[group]: /docs/job-specification/group 'Nomad group Job Specification'
+[task]: /docs/job-specification/task 'Nomad task Job Specification'
+[interpolation]: /docs/runtime/interpolation 'Nomad interpolation'
+[node-variables]: /docs/runtime/interpolation#node-variables- 'Nomad interpolation-Node variables'
+[client-meta]: /docs/configuration/client#custom-metadata-network-speed-and-node-class 'Nomad Custom Metadata, Network Speed, and Node Class'
+[semver2]: https://semver.org/spec/v2.0.0.html 'Semantic Versioning 2.0'
diff --git a/content/nomad/v0.11.x/content/docs/job-specification/csi_plugin.mdx b/content/nomad/v0.11.x/content/docs/job-specification/csi_plugin.mdx
new file mode 100644
index 0000000000..cff59997df
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/csi_plugin.mdx
@@ -0,0 +1,102 @@
+---
+layout: docs
+page_title: csi_plugin Stanza - Job Specification
+sidebar_title: csi_plugin Beta
+description: >-
+  The "csi_plugin" stanza allows the task to specify it provides a
+  Container Storage Interface plugin to the cluster.
+---
+
+# `csi_plugin` Stanza
+
+
+
+The "csi_plugin" stanza allows the task to specify it provides a
+Container Storage Interface plugin to the cluster. Nomad will
+automatically register the plugin so that it can be used by other jobs
+to claim [volumes][csi_volumes].
+
+```hcl
+csi_plugin {
+  id        = "csi-hostpath"
+  type      = "monolith"
+  mount_dir = "/csi"
+}
+```
+
+## `csi_plugin` Parameters
+
+- `id` `(string: )` - This is the ID for the plugin. Some
+  plugins will require both controller and node plugin types (see
+  below); you need to use the same ID for both so that Nomad knows they
+  belong to the same plugin.
+ +- `type` `(string: )` - One of `node`, `controller`, or + `monolith`. Each plugin supports one or more types. Each Nomad + client node where you want to mount a volume will need a `node` + plugin instance. Some plugins will also require one or more + `controller` plugin instances to communicate with the storage + provider's APIs. Some plugins can serve as both `controller` and + `node` at the same time, and these are called `monolith` + plugins. Refer to your CSI plugin's documentation. + +- `mount_dir` `(string: )` - The directory path inside the + container where the plugin will expect a Unix domain socket for + bidirectional communication with Nomad. + + +~> **Note:** Plugins running as `node` or `monolith` require root +privileges (or `CAP_SYS_ADMIN` on Linux) to mount volumes on the +host. With the Docker task driver, you can use the `privileged = true` +configuration, but no other default task drivers currently have this +option. + +~> **Note:** During node drains, jobs that claim volumes should be +moved before the `node` or `monolith` plugin for those +volumes. Because [`system`][system] jobs are moved last during node drains, you +should run `node` or `monolith` plugins as `system` jobs. + +## `csi_plugin` Examples + +```hcl +job "plugin-efs" { + datacenters = ["dc1"] + + # you can run node plugins as service jobs as well, but running + # as a system job ensures all nodes in the DC have a copy. + type = "system" + + group "nodes" { + task "plugin" { + driver = "docker" + + config { + image = "amazon/aws-efs-csi-driver:latest" + + args = [ + "node", + "--endpoint=unix://csi/csi.sock", + "--logtostderr", + "--v=5", + ] + + # all CSI node plugins will need to run as privileged tasks + # so they can mount volumes to the host. controller plugins + # do not need to be privileged. 
+ privileged = true + } + + csi_plugin { + id = "aws-efs0" + type = "node" + mount_dir = "/csi" # this path /csi matches the --endpoint + # argument for the container + } + } + } +} +``` + +[csi]: https://github.com/container-storage-interface/spec +[csi_volumes]: /docs/job-specification/volume +[system]: /docs/schedulers/#system diff --git a/content/nomad/v0.11.x/content/docs/job-specification/device.mdx b/content/nomad/v0.11.x/content/docs/job-specification/device.mdx new file mode 100644 index 0000000000..2c4c6b83a6 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/device.mdx @@ -0,0 +1,304 @@ +--- +layout: docs +page_title: device Stanza - Job Specification +sidebar_title: device +description: |- + The "device" stanza is used to require a certain device be made available + to the task. +--- + +# `device` Stanza + + + +The `device` stanza is used to create both a scheduling and runtime requirement +that the given task has access to the specified devices. A device is a hardware +device that is attached to the node and may be made available to the task. +Examples are GPUs, FPGAs, and TPUs. + +When a `device` stanza is added, Nomad will schedule the task onto a node that +contains the set of device(s) that meet the specified requirements. The `device` stanza +allows the operator to specify as little as just the type of device required, +such as `gpu`, all the way to specifying arbitrary constraints and affinities. +Once the scheduler has placed the allocation on a suitable node, the Nomad +Client will invoke the device plugin to retrieve information on how to mount the +device and what environment variables to expose. For more information on the +runtime environment, please consult the individual device plugin's documentation. + +See the [device plugin's documentation][devices] for a list of supported devices. 
+
+```hcl
+job "docs" {
+  group "example" {
+    task "server" {
+      resources {
+        device "nvidia/gpu" {
+          count = 2
+
+          constraint {
+            attribute = "${device.attr.memory}"
+            operator  = ">="
+            value     = "2 GiB"
+          }
+
+          affinity {
+            attribute = "${device.attr.memory}"
+            operator  = ">="
+            value     = "4 GiB"
+            weight    = 75
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+In the above example, the task is requesting two GPUs, from the Nvidia vendor,
+but is not specifying the specific model required. Instead it is placing a hard
+constraint that the device has at least 2 GiB of memory and that it would prefer
+to use GPUs that have at least 4 GiB. This example shows how expressive the
+`device` stanza can be.
+
+~> Device support is currently limited to Linux and container-based drivers,
+due to the need to isolate devices to specific tasks.
+
+## `device` Parameters
+
+- `name` `(string: "")` - Specifies the device required. The following inputs
+  are valid:
+
+  - ``: If a single value is given, it is assumed to be the device
+    type, such as "gpu", or "fpga".
+
+  - `/`: If two values are given separated by a `/`, the
+    given device type will be selected, constraining on the provided vendor.
+    Examples include "nvidia/gpu" or "amd/gpu".
+
+  - `//`: If three values are given separated by a `/`, the
+    given device type will be selected, constraining on the provided vendor, and
+    model name. Examples include "nvidia/gpu/1080ti" or "nvidia/gpu/2080ti".
+
+- `count` `(int: 1)` - Specifies the number of instances of the given device
+  that are required.
+
+- `constraint` ([Constraint][]: nil) - Constraints to restrict
+  which devices are eligible. This can be provided multiple times to define
+  additional constraints. See below for available attributes.
+
+- `affinity` ([Affinity][]: nil) - Affinity to specify a preference
+  for which devices get selected. This can be provided multiple times to define
+  additional affinities. See below for available attributes.
+
+## `device` Constraint and Affinity Attributes
+
+The set of attributes available for use in a `constraint` or `affinity` is as
+follows:
+| Variable                    | Description            | Example Value                    |
+| --------------------------- | ---------------------- | -------------------------------- |
+| `${device.type}`            | The type of device     | `"gpu"`, `"tpu"`, `"fpga"`       |
+| `${device.vendor}`          | The device's vendor    | `"amd"`, `"nvidia"`, `"intel"`   |
+| `${device.model}`           | The device's model     | `"1080ti"`                       |
+| `${device.attr.<property>}` | Property of the device | `${device.attr.memory} => 8 GiB` |
+
+For the set of attributes available, please see the individual [device plugin's
+documentation][devices].
+
+### Attribute Units and Conversions
+
+Devices report their attributes with strict types and can also provide unit
+information. For example, when a GPU is reporting its memory, it can report that
+it is "4096 MiB". Since Nomad has the associated unit information, a constraint
+that requires greater than "3.5 GiB" can match since Nomad can convert between
+these units.
+
+The units Nomad supports are as follows:
+| Base Unit  | Values                                                                                                                                                          |
+| ---------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Byte       | **Base 2**: `KiB`, `MiB`, `GiB`, `TiB`, `PiB`, `EiB`<br/>**Base 10**: `kB`, `KB` (equivalent to `kB`), `MB`, `GB`, `TB`, `PB`, `EB`                             |
+| Byte Rates | **Base 2**: `KiB/s`, `MiB/s`, `GiB/s`, `TiB/s`, `PiB/s`, `EiB/s`<br/>**Base 10**: `kB/s`, `KB/s` (equivalent to `kB/s`), `MB/s`, `GB/s`, `TB/s`, `PB/s`, `EB/s` |
+| Hertz      | `MHz`, `GHz`                                                                                                                                                    |
+| Watts      | `mW`, `W`, `kW`, `MW`, `GW`                                                                                                                                     |
+
+Conversion is only possible within the same base unit.
+
+## `device` Examples
+
+The following examples only show the `device` stanzas. Remember that the
+`device` stanza is only valid in the placements listed above.
+
+### Single Nvidia GPU
+
+This example schedules a task with a single Nvidia GPU made available.
+
+```hcl
+device "nvidia/gpu" {}
+```
+
+### Multiple Nvidia GPU
+
+This example schedules a task with two Nvidia GPUs made available.
+
+```hcl
+device "nvidia/gpu" {
+  count = 2
+}
+```
+
+### Single Nvidia GPU with Specific Model
+
+This example schedules a task with a single Nvidia GPU made available and uses
+the name to specify the exact model to be used.
+
+```hcl
+device "nvidia/gpu/1080ti" {}
+```
+
+This is a simplification of the following:
+
+```hcl
+device "gpu" {
+  count = 1
+
+  constraint {
+    attribute = "${device.vendor}"
+    value     = "nvidia"
+  }
+
+  constraint {
+    attribute = "${device.model}"
+    value     = "1080ti"
+  }
+}
+```
+
+### Affinity with Unit Conversion
+
+This example uses an affinity to tell the scheduler it would prefer if the GPU
+had at least 1.5 GiB of memory. The following are both equivalent as Nomad can
+do unit conversions.
+
+Specified in `GiB`:
+
+```hcl
+device "nvidia/gpu" {
+  affinity {
+    attribute = "${device.attr.memory}"
+    operator  = ">="
+    value     = "1.5 GiB"
+    weight    = 75
+  }
+}
+```
+
+Specified in `MiB`:
+
+```hcl
+device "nvidia/gpu" {
+  affinity {
+    attribute = "${device.attr.memory}"
+    operator  = ">="
+    value     = "1500 MiB"
+    weight    = 75
+  }
+}
+```
+
+[affinity]: /docs/job-specification/affinity 'Nomad affinity Job Specification'
+[constraint]: /docs/job-specification/constraint 'Nomad constraint Job Specification'
+[devices]: /docs/devices 'Nomad Device Plugins'
diff --git a/content/nomad/v0.11.x/content/docs/job-specification/dispatch_payload.mdx b/content/nomad/v0.11.x/content/docs/job-specification/dispatch_payload.mdx
new file mode 100644
index 0000000000..15c20b5d5f
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/dispatch_payload.mdx
@@ -0,0 +1,55 @@
+---
+layout: docs
+page_title: dispatch_payload Stanza - Job Specification
+sidebar_title: dispatch_payload
+description: |-
+  The "dispatch_payload" stanza allows a task to access dispatch payloads.
+---
+
+# `dispatch_payload` Stanza
+
+
+
+The `dispatch_payload` stanza is used in conjunction with a [`parameterized`][parameterized] job
+that expects a payload. When the job is dispatched with a payload, the payload
+will be made available to any task that has a `dispatch_payload` stanza. The
+payload will be written to the configured file before the task is started. This
+allows the task to use the payload as input or configuration.
+
+```hcl
+job "docs" {
+  group "example" {
+    task "server" {
+      dispatch_payload {
+        file = "config.json"
+      }
+    }
+  }
+}
+```
+
+## `dispatch_payload` Parameters
+
+- `file` `(string: "")` - Specifies the file name to write the content of
+  dispatch payload to. The file is written relative to the [task's local
+  directory][localdir].
+
+## `dispatch_payload` Examples
+
+The following examples only show the `dispatch_payload` stanzas.
Remember that the
+`dispatch_payload` stanza is only valid in the placements listed above.
+
+### Write Payload to a File
+
+This example shows a `dispatch_payload` block in a parameterized job that writes
+the payload to a `config.json` file.
+
+```hcl
+dispatch_payload {
+  file = "config.json"
+}
+```
+
+[localdir]: /docs/runtime/environment#local 'Task Local Directory'
+[parameterized]: /docs/job-specification/parameterized 'Nomad parameterized Job Specification'
diff --git a/content/nomad/v0.11.x/content/docs/job-specification/env.mdx b/content/nomad/v0.11.x/content/docs/job-specification/env.mdx
new file mode 100644
index 0000000000..bff811ed19
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/env.mdx
@@ -0,0 +1,78 @@
+---
+layout: docs
+page_title: env Stanza - Job Specification
+sidebar_title: env
+description: |-
+  The "env" stanza configures a list of environment variables to populate the
+  task's environment before starting.
+---
+
+# `env` Stanza
+
+
+
+The `env` stanza configures a list of environment variables to populate the
+task's environment before starting.
+
+```hcl
+job "docs" {
+  group "example" {
+    task "server" {
+      env {
+        my_key = "my-value"
+      }
+    }
+  }
+}
+```
+
+## `env` Parameters
+
+The "parameters" for the `env` stanza can be any key-value pair. The keys and
+values are both of type `string`, but they can be specified as other types.
+They will automatically be converted to strings. Invalid characters such as
+dashes (`-`) will be converted to underscores.
+
+## `env` Examples
+
+The following examples only show the `env` stanzas. Remember that the
+`env` stanza is only valid in the placements listed above.
+
+### Coercion
+
+This example shows the different ways to specify key-value pairs. Internally,
+these values will be stored as their string representation. No type information
+is preserved.
+ +```hcl +env { + key = 1.4 + key = "1.4" + "key" = 1.4 + "key" = "1.4" + + key = true + key = "1" + key = 1 +} +``` + +### Interpolation + +This example shows using [Nomad interpolation][interpolation] to populate +environment variables. + +```hcl +env { + NODE_CLASS = "${node.class}" +} +``` + +### Dynamic Environment Variables + +Nomad also supports populating dynamic environment variables from data stored in +HashiCorp Consul and Vault. To use this feature please see the documentation on +the [`template` stanza][template-env]. + +[interpolation]: /docs/runtime/interpolation 'Nomad interpolation' +[template-env]: /docs/job-specification/template#environment-variables 'Nomad template Stanza' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/ephemeral_disk.mdx b/content/nomad/v0.11.x/content/docs/job-specification/ephemeral_disk.mdx new file mode 100644 index 0000000000..6645886bbb --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/ephemeral_disk.mdx @@ -0,0 +1,63 @@ +--- +layout: docs +page_title: ephemeral_disk Stanza - Job Specification +sidebar_title: ephemeral_disk +description: |- + The "ephemeral_disk" stanza describes the ephemeral disk requirements of the + group. Ephemeral disks can be marked as sticky and support live data + migrations. +--- + +# `ephemeral_disk` Stanza + + + +The `ephemeral_disk` stanza describes the ephemeral disk requirements of the +group. Ephemeral disks can be marked as sticky and support live data migrations. +All tasks in this group will share the same ephemeral disk. + +```hcl +job "docs" { + group "example" { + ephemeral_disk { + migrate = true + size = "500" + sticky = true + } + } +} +``` + +## `ephemeral_disk` Parameters + +- `migrate` `(bool: false)` - When `sticky` is true, this specifies that the + Nomad client should make a best-effort attempt to migrate the data from a + remote machine if placement cannot be made on the original node. 
During data + migration, the task will block starting until the data migration has + completed. Migration is atomic and any partially migrated data will be + removed if an error is encountered. + +- `size` `(int: 300)` - Specifies the size of the ephemeral disk in MB. The + current Nomad ephemeral storage implementation does not enforce this limit; + however, it is used during job placement. + +- `sticky` `(bool: false)` - Specifies that Nomad should make a best-effort + attempt to place the updated allocation on the same machine. This will move + the `local/` and `alloc/data` directories to the new allocation. + +## `ephemeral_disk` Examples + +The following examples only show the `ephemeral_disk` stanzas. Remember that the +`ephemeral_disk` stanza is only valid in the placements listed above. + +### Sticky Volumes + +This example shows enabling sticky volumes with Nomad using ephemeral disks: + +```hcl +ephemeral_disk { + sticky = true +} +``` + +[resources]: /docs/job-specification/resources 'Nomad resources Job Specification' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/expose.mdx b/content/nomad/v0.11.x/content/docs/job-specification/expose.mdx new file mode 100644 index 0000000000..3e51efced2 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/expose.mdx @@ -0,0 +1,221 @@ +--- +layout: docs +page_title: expose Stanza - Job Specification +sidebar_title: expose +description: |- + The "expose" stanza allows specifying options for configuring Envoy expose + paths used in Consul Connect integration +--- + +# `expose` Stanza + + + +The `expose` stanza allows configuration of additional listeners for the default Envoy sidecar +proxy managed by Nomad for [Consul Connect](/guides/integrations/consul-connect). These +listeners create a bypass of the Connect TLS and network namespace isolation, enabling +non-Connect enabled services to make requests to specific HTTP paths through the sidecar proxy. 
+ +The `expose` configuration is valid within the context of a `proxy` stanza. Additional +information about Expose Path configurations for Envoy can be found in Consul's +[Expose Paths Configuration Reference](https://www.consul.io/docs/connect/registration/service-registration.html#expose-paths-configuration-reference). + +Service [check](https://nomadproject.io/docs/job-specification/service/#check-parameters) +configurations can use their [expose](/docs/job-specification/service#expose) +parameter to automatically generate expose path configurations for HTTP and gRPC checks. + +```hcl +job "expose-check-example" { + datacenters = ["dc1"] + + group "api" { + network { + mode = "bridge" + } + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + + check { + expose = true + name = "api-health" + type = "http" + path = "/health" + interval = "10s" + timeout = "3s" + } + } + + task "web" { + driver = "docker" + + config { + image = "hashicorpnomad/counter-api:v2" + } + } + } +} +``` + +For uses other than Consul service checks, use the `expose` configuration in the +`proxy` stanza. The example below effectively demonstrates exposing the `/health` +endpoint similar to the example above, but using the fully flexible `expose` +configuration. + +```hcl +job "expose-example" { + datacenters = ["dc1"] + + group "api" { + network { + mode = "bridge" + + port "api_expose_healthcheck" { + to = -1 + } + } + + service { + name = "count-api" + port = "9001" + + connect { + sidecar_service { + proxy { + expose { + path { + path = "/health" + protocol = "http" + local_path_port = 9001 + listener_port = "api_expose_healthcheck" + } + } + } + } + } + + check { + name = "api-health" + type = "http" + path = "/health" + port = "api_expose_healthcheck" + interval = "10s" + timeout = "3s" + } + } + + task "web" { + driver = "docker" + + config { + image = "hashicorpnomad/counter-api:v2" + } + + # e.g. 
reference ${NOMAD_PORT_api_expose_healthcheck} for other uses + } + } +} +``` + +## `expose` Parameters + +- `path` ([Path]: nil) - A list of [Envoy Expose Path Configurations](/docs/job-specification/path) + to expose through Envoy. + +### `path` Parameters + +- `path` `(string: required)` - The HTTP or gRPC path to expose. The path must be prefixed + with a slash. +- `protocol` `(string: required)` - Sets the protocol of the listener. Must be + `http` or `http2`. For gRPC use `http2`. +- `local_path_port` `(int: required)` - The port the service is listening to for connections to + the configured `path`. Typically this will be the same as the `service.port` value, but + could be different if for example the exposed path is intended to resolve to another task + in the task group. +- `listener_port` ([Port]: required) - The name of the port to use + for the exposed listener. The port should be configured to [map inside](/docs/job-specification/network#to) + the task's network namespace. + + +## `expose` Examples + +The following example is configured to expose the `/metrics` endpoint of the Connect-enabled +`count-dashboard` service, using the `HTTP` protocol. `count-dashboard` is expected +to listen inside its namespace to port `9001`, and external services will be able to +reach its `/metrics` endpoint by connecting to the [network interface](https://nomadproject.io/docs/configuration/client/#network_interface) +of the node on the allocated `metrics` [Port](/docs/job-specification/network#port-parameters). + +```hcl +service { + name = "count-dashboard" + port = "9001" + + connect { + sidecar_service { + proxy { + expose { + path { + path = "/metrics" + protocol = "http" + local_path_port = 9001 + listener_port = "metrics" + } + } + } + } + } +} +``` + +## `path` Examples + +The following example is an expose configuration that exposes a `/metrics` endpoint +using the `http2` protocol (typical for gRPC), and an HTTP `/v2/health` endpoint. 
+ +```hcl +proxy { + expose { + path { + path = "/metrics" + protocol = "http2" + local_path_port = 9001 + listener_port = "expose" + } + path { + path = "/v2/health" + protocol = "http" + local_path_port = 9001 + listener_port = "expose" + } + } +} +``` + +### Exposing Service Checks + +A common use case for `expose` is for exposing endpoints used in Consul service check +definitions. For these cases the [expose](/docs/job-specification/service#expose) +parameter in the service check stanza can be used to automatically generate the +expose path configuration. Configuring a port for use by the check is optional, +as a dynamic port will be automatically generated if not provided. + +```hcl +check { + expose = true + type = "http" + name = "dashboard-health" + path = "/health" + interval = "10s" + timeout = "3s" +} +``` + +[path]: /docs/job-specification/expose#path-parameters 'Nomad Expose Path Parameters' +[port]: /docs/job-specification/network#port-parameters 'Nomad Port Parameters' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/group.mdx b/content/nomad/v0.11.x/content/docs/job-specification/group.mdx new file mode 100644 index 0000000000..8949949754 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/group.mdx @@ -0,0 +1,200 @@ +--- +layout: docs +page_title: group Stanza - Job Specification +sidebar_title: group +description: |- + The "group" stanza defines a series of tasks that should be co-located on the + same Nomad client. Any task within a group will be placed on the same client. +--- + +# `group` Stanza + + + +The `group` stanza defines a series of tasks that should be co-located on the +same Nomad client. Any [task][] within a group will be placed on the same +client. + +```hcl +job "docs" { + group "example" { + # ... + } +} +``` + +## `group` Parameters + +- `constraint` ([Constraint][]: nil) - + This can be provided multiple times to define additional constraints. 
+ +- `affinity` ([Affinity][]: nil) - This can be provided + multiple times to define preferred placement criteria. + +- `spread` ([Spread][spread]: nil) - This can be provided + multiple times to define criteria for spreading allocations across a + node attribute or metadata. See the + [Nomad spread reference](/docs/job-specification/spread) for more details. + +- `count` `(int: 1)` - Specifies the number of the task groups that should + be running under this group. This value must be non-negative. + +- `ephemeral_disk` ([EphemeralDisk][]: nil) - Specifies the + ephemeral disk requirements of the group. Ephemeral disks can be marked as + sticky and support live data migrations. + +- `meta` ([Meta][]: nil) - Specifies a key-value map that annotates + with user-defined metadata. + +- `migrate` ([Migrate][]: nil) - Specifies the group strategy for + migrating off of draining nodes. Only service jobs with a count greater than + 1 support migrate stanzas. + +- `reschedule` ([Reschedule][]: nil) - Allows to specify a + rescheduling strategy. Nomad will then attempt to schedule the task on another + node if any of the group allocation statuses become "failed". + +- `restart` ([Restart][]: nil) - Specifies the restart policy for + all tasks in this group. If omitted, a default policy exists for each job + type, which can be found in the [restart stanza documentation][restart]. + +- `shutdown_delay` `(string: "0s")` - Specifies the duration to wait when + stopping a group's tasks. The delay occurs between Consul deregistration + and sending each task a shutdown signal. Ideally, services would fail + healthchecks once they receive a shutdown signal. Alternatively + `shutdown_delay` may be set to give in flight requests time to complete + before shutting down. A group level `shutdown_delay` will run regardless + if there are any defined group services. 
In addition, tasks may have their
+  own [`shutdown_delay`](/docs/job-specification/task#shutdown_delay)
+  which waits between deregistering task services and stopping the task.
+
+- `stop_after_client_disconnect` `(string: "")` - Specifies a duration
+  after which a Nomad client that cannot communicate with the servers
+  will stop allocations based on this task group. By default, a client
+  will not stop an allocation until explicitly told to by a server. A
+  client that fails to heartbeat to a server within the
+  `heartbeat_grace` window will be marked "lost", along with any
+  allocations running on it, and Nomad will schedule replacement
+  allocations. However, these replaced allocations will continue to
+  run on the non-responsive client; an operator may desire that these
+  replaced allocations are also stopped in this case — for example,
+  allocations requiring exclusive access to an external resource. When
+  specified, the Nomad client will stop them after this duration. The
+  Nomad client process must be running for this to occur.
+
+- `task` ([Task][]: <required>) - Specifies one or more tasks to run
+  within this group. This can be specified multiple times, to add a task as part
+  of the group.
+
+- `vault` ([Vault][]: nil) - Specifies the set of Vault policies
+  required by all tasks in this group. Overrides a `vault` block set at the
+  `job` level.
+
+- `volume` ([Volume][]: nil) - Specifies the volumes that are
+  required by tasks within the group.
+
+## `group` Examples
+
+The following examples only show the `group` stanzas. Remember that the
+`group` stanza is only valid in the placements listed above.
+
+### Specifying Count
+
+This example specifies that 5 instances of the tasks within this group should be
+running:
+
+```hcl
+group "example" {
+  count = 5
+}
+```
+
+### Tasks with Constraint
+
+This example shows two abbreviated tasks with a constraint on the group. This
+will restrict the tasks to 64-bit operating systems.
+ +```hcl +group "example" { + constraint { + attribute = "${attr.cpu.arch}" + value = "amd64" + } + + task "cache" { + # ... + } + + task "server" { + # ... + } +} +``` + +### Metadata + +This example show arbitrary user-defined metadata on the group: + +```hcl +group "example" { + meta { + "my-key" = "my-value" + } +} +``` + +### Stop After Client Disconnect + +This example shows how `stop_after_client_disconnect` interacts with +other stanzas. For the `first` group, after the default 10 second +[`heartbeat_grace`] window expires and 90 more seconds passes, the +server will reschedule the allocation. The client will wait 90 seconds +before sending a stop signal (`SIGTERM`) to the `first-task` +task. After 15 more seconds because of the task's `kill_timeout`, the +client will send `SIGKILL`. The `second` group does not have +`stop_after_client_disconnect`, so the server will reschedule the +allocation after the 10 second [`heartbeat_grace`] expires. It will +not be stopped on the client, regardless of how long the client is out +of touch. + +Note that if the server's clocks are not closely synchronized with +each other, the server may reschedule the group before the client has +stopped the allocation. Operators should ensure that clock drift +between servers is as small as possible. + +Note also that a group using this feature will be stopped on the +client if the Nomad server cluster fails, since the client will be +unable to contact any server in that case. Groups opting in to this +feature are therefore exposed to an additional runtime dependency and +potential point of failure. 
+ +```hcl +group "first" { + stop_after_client_disconnect = "90s" + + task "first-task" { + kill_timeout = "15s" + } +} + +group "second" { + + task "second-task" { + kill_timeout = "5s" + } +} +``` + +[task]: /docs/job-specification/task 'Nomad task Job Specification' +[job]: /docs/job-specification/job 'Nomad job Job Specification' +[constraint]: /docs/job-specification/constraint 'Nomad constraint Job Specification' +[spread]: /docs/job-specification/spread 'Nomad spread Job Specification' +[affinity]: /docs/job-specification/affinity 'Nomad affinity Job Specification' +[ephemeraldisk]: /docs/job-specification/ephemeral_disk 'Nomad ephemeral_disk Job Specification' +[`heartbeat_grace`]: /docs/configuration/server/#heartbeat_grace +[meta]: /docs/job-specification/meta 'Nomad meta Job Specification' +[migrate]: /docs/job-specification/migrate 'Nomad migrate Job Specification' +[reschedule]: /docs/job-specification/reschedule 'Nomad reschedule Job Specification' +[restart]: /docs/job-specification/restart 'Nomad restart Job Specification' +[vault]: /docs/job-specification/vault 'Nomad vault Job Specification' +[volume]: /docs/job-specification/volume 'Nomad volume Job Specification' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/index.mdx b/content/nomad/v0.11.x/content/docs/job-specification/index.mdx new file mode 100644 index 0000000000..37ac6ff1be --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/index.mdx @@ -0,0 +1,148 @@ +--- +layout: docs +page_title: Job Specification +sidebar_title: Job Specification +description: Learn about the Job specification used to submit jobs to Nomad. +--- + +# Job Specification + +The Nomad job specification (or "jobspec" for short) defines the schema for +Nomad jobs. Nomad jobs are specified in [HCL][], which aims to strike a balance +between human readable and editable, and machine-friendly. 
+ +The job specification is broken down into smaller pieces, which you will find +expanded in the navigation menu. We recommend getting started at the [job][] +stanza. Alternatively, you can keep reading to see a few examples. + +For machine-friendliness, Nomad can also read JSON-equivalent configurations. In +general, we recommend using the HCL syntax. + +The general hierarchy for a job is: + +```text +job + \_ group + \_ task +``` + +Each job file has only a single job, however a job may have multiple groups, and +each group may have multiple tasks. Groups contain a set of tasks that are +co-located on a machine. + +## Example + +This example shows a sample job file. We tried to keep it as simple as possible, +while still showcasing the power of Nomad. For a more detailed explanation of +any of these fields, please use the navigation to dive deeper. + +```hcl +# This declares a job named "docs". There can be exactly one +# job declaration per job file. +job "docs" { + # Specify this job should run in the region named "us". Regions + # are defined by the Nomad servers' configuration. + region = "us" + + # Spread the tasks in this job between us-west-1 and us-east-1. + datacenters = ["us-west-1", "us-east-1"] + + # Run this job as a "service" type. Each job type has different + # properties. See the documentation below for more examples. + type = "service" + + # Specify this job to have rolling updates, two-at-a-time, with + # 30 second intervals. + update { + stagger = "30s" + max_parallel = 2 + } + + # A group defines a series of tasks that should be co-located + # on the same client (host). All tasks within a group will be + # placed on the same host. + group "webs" { + # Specify the number of these tasks we want. + count = 5 + + # Create an individual task (unit of work). This particular + # task utilizes a Docker container to front a web application. + task "frontend" { + # Specify the driver to be "docker". Nomad supports + # multiple drivers. 
+ driver = "docker" + + # Configuration is specific to each driver. + config { + image = "hashicorp/web-frontend" + } + + # The service block tells Nomad how to register this service + # with Consul for service discovery and monitoring. + service { + # This tells Consul to monitor the service on the port + # labelled "http". Since Nomad allocates high dynamic port + # numbers, we use labels to refer to them. + port = "http" + + check { + type = "http" + path = "/health" + interval = "10s" + timeout = "2s" + } + } + + # It is possible to set environment variables which will be + # available to the task when it runs. + env { + "DB_HOST" = "db01.example.com" + "DB_USER" = "web" + "DB_PASS" = "loremipsum" + } + + # Specify the maximum resources required to run the task, + # include CPU, memory, and bandwidth. + resources { + cpu = 500 # MHz + memory = 128 # MB + + network { + mbits = 100 + + # This requests a dynamic port named "http". This will + # be something like "46283", but we refer to it via the + # label "http". + port "http" {} + + # This requests a static port on 443 on the host. This + # will restrict this task to running once per host, since + # there is only one port 443 on each host. + port "https" { + static = 443 + } + } + } + } + } +} +``` + +Note that starting with Nomad 0.10, the `service` stanza can also be specified at the group level. This +allows job specification authors to create and register services with Consul Connect support. A service +stanza specified at the group level must include a [connect][] stanza, like the following snippet. 
+ +```hcl +service { + name = "count-api" + port = "9001" + + connect { + sidecar_service {} + } + } +``` + +[hcl]: https://github.com/hashicorp/hcl 'HashiCorp Configuration Language' +[job]: /docs/job-specification/job 'Nomad job Job Specification' +[connect]: /docs/job-specification/connect 'Connect Stanza Specification' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/job.mdx b/content/nomad/v0.11.x/content/docs/job-specification/job.mdx new file mode 100644 index 0000000000..2a00ee83ba --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/job.mdx @@ -0,0 +1,253 @@ +--- +layout: docs +page_title: job Stanza - Job Specification +sidebar_title: job +description: |- + The "job" stanza is the top-most configuration option in the job + specification. A job is a declarative specification of tasks that Nomad + should run. +--- + +# `job` Stanza + + + +The `job` stanza is the top-most configuration option in the job specification. +A job is a declarative specification of tasks that Nomad should run. Jobs have +one or more task groups, which are themselves collections of one or more tasks. +Job names are unique per [region][region] or [namespace][namespace] (if Nomad +Enterprise is used). + +```hcl +job "docs" { + constraint { + # ... + } + + datacenters = ["us-east-1"] + + group "example" { + # ... + } + + meta { + "my-key" = "my-value" + } + + parameterized { + # ... + } + + periodic { + # ... + } + + priority = 100 + + region = "north-america" + + task "docs" { + # ... + } + + update { + # ... + } +} +``` + +## `job` Parameters + +- `all_at_once` `(bool: false)` - Controls whether the scheduler can make + partial placements if optimistic scheduling resulted in an oversubscribed + node. This does not control whether all allocations for the job, where all + would be the desired count for each task group, must be placed atomically. + This should only be used for special circumstances. 
+
+- `constraint` ([Constraint][constraint]: nil) -
+  This can be provided multiple times to define additional constraints. See the
+  [Nomad constraint reference][constraint] for more
+  details.
+
+- `affinity` ([Affinity][affinity]: nil) -
+  This can be provided multiple times to define preferred placement criteria. See the
+  [Nomad affinity reference][affinity] for more
+  details.
+
+- `spread` ([Spread][spread]: nil) - This can be provided multiple times
+  to define criteria for spreading allocations across a node attribute or metadata.
+  See the [Nomad spread reference][spread] for more details.
+
+- `datacenters` `(array<string>: <required>)` - A list of datacenters in the region which are eligible
+  for task placement. This must be provided, and does not have a default.
+
+- `group` ([Group][group]: <required>) - Specifies the start of a
+  group of tasks. This can be provided multiple times to define additional
+  groups. Group names must be unique within the job file.
+
+- `meta` ([Meta][]: nil) - Specifies a key-value map that annotates
+  the job with user-defined metadata.
+
+- `migrate` ([Migrate][]: nil) - Specifies the group's strategy for
+  migrating off of draining nodes. If omitted, a default migration strategy is
+  applied. Only service jobs with a count greater than 1 support migrate stanzas.
+
+- `namespace` `(string: "default")` - The namespace in which to execute the job.
+  Values other than default are not allowed in non-Enterprise versions of Nomad.
+
+- `parameterized` ([Parameterized][parameterized]: nil) - Specifies
+  the job as a parameterized job such that it can be dispatched against.
+
+- `periodic` ([Periodic][]: nil) - Allows the job to be scheduled
+  at fixed times, dates, or intervals.
+
+- `priority` `(int: 50)` - Specifies the job priority which is used to
+  prioritize scheduling and access to resources. Must be between 1 and 100
+  inclusive, with a larger value corresponding to a higher priority.
+ +- `region` `(string: "global")` - The region in which to execute the job. + +- `reschedule` ([Reschedule][]: nil) - Allows to specify a + rescheduling strategy. Nomad will then attempt to schedule the task on another + node if any of its allocation statuses become "failed". + +- `type` `(string: "service")` - Specifies the [Nomad scheduler][scheduler] to + use. Nomad provides the `service`, `system` and `batch` schedulers. + +- `update` ([Update][update]: nil) - Specifies the task's update + strategy. When omitted, rolling updates are disabled. + +- `vault` ([Vault][]: nil) - Specifies the set of Vault policies + required by all tasks in this job. + +- `vault_token` `(string: "")` - Specifies the Vault token that proves the + submitter of the job has access to the specified policies in the + [`vault`][vault] stanza. This field is only used to transfer the token and is + not stored after job submission. + + !> It is **strongly discouraged** to place the token as a configuration + parameter like this, since the token could be checked into source control + accidentally. Users should set the `VAULT_TOKEN` environment variable when + running the job instead. + +- `consul_token` `(string: "")` - Specifies the Consul token that proves the + submitter of the job has access to the Service Identity policies associated + with the job's Consul Connect enabled services. This field is only used to + transfer the token and is not stored after job submission. + + !> It is **strongly discouraged** to place the token as a configuration + parameter like this, since the token could be checked into source control + accidentally. Users should set the `CONSUL_HTTP_TOKEN` environment variable when + running the job instead. + +## `job` Examples + +The following examples only show the `job` stanzas. Remember that the +`job` stanza is only valid in the placements listed above. + +### Docker Container + +This example job starts a Docker container which runs as a service. 
Even though +the type is not specified as "service", that is the default job type. + +```hcl +job "docs" { + datacenters = ["default"] + + group "example" { + task "server" { + driver = "docker" + config { + image = "hashicorp/http-echo" + args = ["-text", "hello"] + } + + resources { + memory = 128 + } + } + } +} +``` + +### Batch Job + +This example job executes the `uptime` command on 10 Nomad clients in the fleet, +restricting the eligible nodes to Linux machines. + +```hcl +job "docs" { + datacenters = ["default"] + + type = "batch" + + constraint { + attribute = "${attr.kernel.name}" + value = "linux" + } + + group "example" { + count = 10 + task "uptime" { + driver = "exec" + config { + command = "uptime" + } + } + } +} +``` + +### Consuming Secrets + +This example shows a job which retrieves secrets from Vault and writes those +secrets to a file on disk, which the application then consumes. Nomad handles +all interactions with Vault. + +```hcl +job "docs" { + datacenters = ["default"] + + group "example" { + task "cat" { + driver = "exec" + + config { + command = "cat" + args = ["local/secrets.txt"] + } + + template { + data = "{{ secret \"secret/data\" }}" + destination = "local/secrets.txt" + } + + vault { + policies = ["secret-readonly"] + } + } + } +} +``` + +When submitting this job, you would run: + +```shell-session +$ VAULT_TOKEN="..." 
nomad job run example.nomad +``` + +[affinity]: /docs/job-specification/affinity 'Nomad affinity Job Specification' +[constraint]: /docs/job-specification/constraint 'Nomad constraint Job Specification' +[group]: /docs/job-specification/group 'Nomad group Job Specification' +[meta]: /docs/job-specification/meta 'Nomad meta Job Specification' +[migrate]: /docs/job-specification/migrate 'Nomad migrate Job Specification' +[namespace]: https://learn.hashicorp.com/nomad/governance-and-policy/namespaces +[parameterized]: /docs/job-specification/parameterized 'Nomad parameterized Job Specification' +[periodic]: /docs/job-specification/periodic 'Nomad periodic Job Specification' +[region]: https://learn.hashicorp.com/nomad/operating-nomad/federation +[reschedule]: /docs/job-specification/reschedule 'Nomad reschedule Job Specification' +[scheduler]: /docs/schedulers 'Nomad Scheduler Types' +[spread]: /docs/job-specification/spread 'Nomad spread Job Specification' +[task]: /docs/job-specification/task 'Nomad task Job Specification' +[update]: /docs/job-specification/update 'Nomad update Job Specification' +[vault]: /docs/job-specification/vault 'Nomad vault Job Specification' diff --git a/content/nomad/v0.11.x/content/docs/job-specification/lifecycle.mdx b/content/nomad/v0.11.x/content/docs/job-specification/lifecycle.mdx new file mode 100644 index 0000000000..60aad0d2a8 --- /dev/null +++ b/content/nomad/v0.11.x/content/docs/job-specification/lifecycle.mdx @@ -0,0 +1,70 @@ +--- +layout: docs +page_title: lifecycle Stanza - Job Specification +sidebar_title: lifecycle +description: |- + The "lifecycle" stanza configures when a task is run within the lifecycle of a + task group +--- + +# `lifecycle` Stanza + + + +The `lifecycle` stanza is used to express task dependencies in Nomad and +configures when a task is run within the lifecycle of a task group. + +Main tasks are tasks that do not have a `lifecycle` stanza. Lifecycle task hooks +are run in relation to the main tasks. 
Tasks can be run as Prestart Hooks, which +ensures the prestart task is run before the main tasks are run. + +Tasks can be run with an additional parameter to indicate that they are +sidecars, which ensures that they are running over the duration of the whole +task group. This will allow you to run a long-lived task in a task group for a +batch job. The absence of the sidecar flag indicates that the task is ephemeral +and the task will not be restarted if it completes successfully. This allows you +to run an ephemeral prestart task in a task group for a service job, which can +serve as initialization that occurs before the main services are started. + +Learn more about Nomad's task dependencies on the [HashiCorp Learn website][learn-taskdeps]. + + +```hcl +job "docs" { + group "example" { + + task "init" { + lifecycle { + hook = "prestart" + } + ... + } + + task "logging" { + lifecycle { + hook = "prestart" + sidecar = true + } + ... + } + + task "main" { + ... + } + + } +} +``` + +## `lifecycle` Parameters + +- `hook` `(string: "prestart")` - Specifies when a task should be run within + the lifecycle of a group. Currently only Prestart Hooks are supported. + +- `sidecar` `(bool: false)` - Controls whether or not a task is ephemeral or + long-lived within the task group. If a lifecycle task is ephemeral (`sidecar = + false`), the task will not be restarted after it completes successfully. If a + lifecycle task is long-lived (`sidecar = true`) and it terminates, it will be + restarted as long as the task group is running in its allocation. 
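+
+As a sketch of what the parameters above allow, the following hypothetical
+batch job (task names and commands are illustrative, not from this guide)
+combines an ephemeral prestart task with a long-lived prestart sidecar:
+
+```hcl
+job "nightly-batch" {
+  type = "batch"
+
+  group "example" {
+    # Ephemeral prestart task: runs to completion before "main" starts
+    # and is not restarted after it exits successfully.
+    task "init" {
+      lifecycle {
+        hook = "prestart"
+      }
+
+      driver = "exec"
+
+      config {
+        command = "/usr/local/bin/prepare-input"
+      }
+    }
+
+    # Prestart sidecar: started before "main" and kept running
+    # (restarted on exit) for as long as the allocation is running.
+    task "log-shipper" {
+      lifecycle {
+        hook    = "prestart"
+        sidecar = true
+      }
+
+      driver = "exec"
+
+      config {
+        command = "/usr/local/bin/ship-logs"
+      }
+    }
+
+    # Main task: has no lifecycle stanza.
+    task "main" {
+      driver = "exec"
+
+      config {
+        command = "/usr/local/bin/process-batch"
+      }
+    }
+  }
+}
+```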
+
+[learn-taskdeps]: https://learn.hashicorp.com/nomad?track=task-deps&utm_source=WEBSITE&utm_medium=WEB_IO&utm_offer=ARTICLE_PAGE&utm_content=DOCS#task-deps
diff --git a/content/nomad/v0.11.x/content/docs/job-specification/logs.mdx b/content/nomad/v0.11.x/content/docs/job-specification/logs.mdx
new file mode 100644
index 0000000000..1d262aa49d
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/logs.mdx
@@ -0,0 +1,84 @@
+---
+layout: docs
+page_title: logs Stanza - Job Specification
+sidebar_title: logs
+description: |-
+  The "logs" stanza configures the log rotation policy for a task's stdout and
+  stderr. Logging is enabled by default with sane defaults. The "logs" stanza
+  allows for finer-grained control over how Nomad handles log files.
+---
+
+# `logs` Stanza
+
+
+
+The `logs` stanza configures the log rotation policy for a task's `stdout` and
+`stderr`. Logging is enabled by default with sane defaults (provided in the
+parameters section below), and there is currently no way to disable logging for
+tasks. The `logs` stanza allows for finer-grained control over how Nomad handles
+log files.
+
+Nomad's log rotation works by writing stdout/stderr output from tasks to a file
+inside the `alloc/logs/` directory with the following format:
+`<task-name>.<stdout/stderr>.<index>`. Output is written to a particular index,
+starting at zero, until that log file reaches the configured `max_file_size`.
+After that, a new file is created at `index + 1` and logs will then be written
+there. A log file is never rolled over; instead, Nomad will keep up to
+`max_files` worth of logs and once that is exceeded, the log file with the
+lowest index is deleted.
+
+```hcl
+job "docs" {
+  group "example" {
+    task "server" {
+      logs {
+        max_files     = 10
+        max_file_size = 10
+      }
+    }
+  }
+}
+```
+
+For information on how to interact with logs after they have been configured,
+please see the [`nomad alloc logs`][logs-command] command.
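+
+For illustration, under the naming scheme described above, a task named
+`server` that has rotated each stream once might leave a listing like the
+following in `alloc/logs/` (hypothetical file names):
+
+```text
+server.stdout.0
+server.stdout.1
+server.stderr.0
+server.stderr.1
+```
+
+Once `max_files` is exceeded for a stream, the lowest-numbered file for that
+stream (`server.stdout.0` here) is deleted first.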
## `logs` Parameters

- `max_files` `(int: 10)` - Specifies the maximum number of rotated files Nomad
  will retain for `stdout` and `stderr`. Each stream is tracked individually,
  so specifying a value of 2 will create 4 files: 2 for `stdout` and 2 for
  `stderr`.

- `max_file_size` `(int: 10)` - Specifies the maximum size of each rotated file
  in `MB`. If the amount of disk resource requested for the task is less than
  the total amount of disk space needed to retain the rotated set of files,
  Nomad will return a validation error when a job is submitted.

## `logs` Examples

The following examples only show the `logs` stanzas. Remember that the
`logs` stanza is only valid in the placements listed above.

### Configure Defaults

This example shows a default logging configuration. Yes, it is empty on purpose.
Nomad automatically enables logging with sane defaults as described in the
parameters section above.

```hcl

```

### Customization

This example asks Nomad to retain 3 rotated files for each of `stderr` and
`stdout`, with a maximum size of 5 MB per file. The minimum disk space this
would require is 30 MB ((3 `stderr` + 3 `stdout`) × 5 MB = 30 MB).

```hcl
logs {
  max_files     = 3
  max_file_size = 5
}
```

[logs-command]: /docs/commands/alloc/logs 'Nomad logs command'

diff --git a/content/nomad/v0.11.x/content/docs/job-specification/meta.mdx b/content/nomad/v0.11.x/content/docs/job-specification/meta.mdx
new file mode 100644
index 0000000000..9067713e33
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/meta.mdx
---
layout: docs
page_title: meta Stanza - Job Specification
sidebar_title: meta
description: The "meta" stanza allows for user-defined arbitrary key-value pairs.
---

# `meta` Stanza

The `meta` stanza allows for user-defined arbitrary key-value pairs. It is
possible to use the `meta` stanza at the [job][], [group][], or [task][] level.
```hcl
job "docs" {
  meta {
    my-key = "my-value"
  }

  group "example" {
    meta {
      my-key = "my-value"
    }

    task "server" {
      meta {
        my-key = "my-value"
      }
    }
  }
}
```

Metadata is merged up the job specification, so metadata defined at the job
level applies to all groups and tasks within that job. Metadata defined at the
group layer applies to all tasks within that group.

## `meta` Parameters

The "parameters" for the `meta` stanza can be any key-value pair. The keys and
values are both of type `string`, but they can be specified as other types.
They will automatically be converted to strings.

## `meta` Examples

The following examples only show the `meta` stanzas. Remember that the
`meta` stanza is only valid in the placements listed above.

### Coercion

This example shows the different ways to specify key-value pairs. Internally,
these values will be stored as their string representation. No type information
is preserved.

```hcl
meta {
  key = "true"
  key = true

  "key" = true

  key = 1.4
  key = "1.4"
}
```

### Interpolation

This example shows using [Nomad interpolation][interpolation] to populate
environment variables.

```hcl
meta {
  class = "${node.class}"
}
```

[job]: /docs/job-specification/job 'Nomad job Job Specification'
[group]: /docs/job-specification/group 'Nomad group Job Specification'
[task]: /docs/job-specification/task 'Nomad task Job Specification'
[interpolation]: /docs/runtime/interpolation 'Nomad interpolation'

diff --git a/content/nomad/v0.11.x/content/docs/job-specification/migrate.mdx b/content/nomad/v0.11.x/content/docs/job-specification/migrate.mdx
new file mode 100644
index 0000000000..f82b3bf0a6
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/migrate.mdx
---
layout: docs
page_title: migrate Stanza - Job Specification
sidebar_title: migrate
description: |-
  The "migrate" stanza specifies the group's migrate strategy.
  The migrate
  strategy is used to control the job's behavior when it is being migrated off
  of a draining node.
---

# `migrate` Stanza

The `migrate` stanza specifies the group's strategy for migrating off of
[draining][drain] nodes. If omitted, a default migration strategy is applied.
If specified at the job level, the configuration will apply to all groups
within the job. Only service jobs with a count greater than 1 support migrate
stanzas.

```hcl
job "docs" {
  migrate {
    max_parallel     = 1
    health_check     = "checks"
    min_healthy_time = "10s"
    healthy_deadline = "5m"
  }
}
```

When one or more nodes are draining, only `max_parallel` allocations will be
stopped at a time. Node draining will not continue until replacement
allocations have been healthy for their `min_healthy_time` or
`healthy_deadline` is reached.

Note that a node's drain [deadline][deadline] will override the `migrate`
stanza for allocations on that node. The `migrate` stanza is for job authors to
define how their services should be migrated, while the node drain deadline is
for system operators to put hard limits on how long a drain may take.

See the [Workload Migration Guide](https://learn.hashicorp.com/nomad/operating-nomad/node-draining) for details
on node draining.

## `migrate` Parameters

- `max_parallel` `(int: 1)` - Specifies the number of allocations that can be
  migrated at the same time. This number must be less than the total
  [`count`][count] for the group, as `count - max_parallel` will be left
  running during migrations.

- `health_check` `(string: "checks")` - Specifies the mechanism by which an
  allocation's health is determined. The potential values are:

  - "checks" - Specifies that the allocation should be considered healthy when
    all of its tasks are running and their associated [checks][checks] are
    healthy, and unhealthy if any of the tasks fail or not all checks become
    healthy. This is a superset of "task_states" mode.
  - "task_states" - Specifies that the allocation should be considered healthy
    when all its tasks are running and unhealthy if tasks fail.

- `min_healthy_time` `(string: "10s")` - Specifies the minimum time the
  allocation must be in the healthy state before it is marked as healthy and
  unblocks further allocations from being migrated. This is specified using a
  label suffix like "30s" or "15m".

- `healthy_deadline` `(string: "5m")` - Specifies the deadline by which the
  allocation must be marked as healthy, after which it is automatically
  transitioned to unhealthy. This is specified using a label suffix like "2m"
  or "1h".

[checks]: /docs/job-specification/service#check-parameters
[count]: /docs/job-specification/group#count
[drain]: /docs/commands/node/drain
[deadline]: /docs/commands/node/drain#deadline

diff --git a/content/nomad/v0.11.x/content/docs/job-specification/network.mdx b/content/nomad/v0.11.x/content/docs/job-specification/network.mdx
new file mode 100644
index 0000000000..c6a9c1b17c
--- /dev/null
+++ b/content/nomad/v0.11.x/content/docs/job-specification/network.mdx
---
layout: docs
page_title: network Stanza - Job Specification
sidebar_title: network
description: |-
  The "network" stanza specifies the networking requirements for the task,
  including the minimum bandwidth and port allocations. The network stanza
  can be specified at the task group level to enable all tasks in the task
  group to share the same network namespace.
---

# `network` Stanza

The `network` stanza specifies the networking requirements for the task,
including the minimum bandwidth and port allocations. When scheduling jobs in
Nomad, they are provisioned across your fleet of machines along with other jobs
and services. Because you don't know in advance what host your job will be
provisioned on, Nomad will provide your tasks with network configuration when
they start up.
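For instance, a task can request a dynamically allocated port by declaring a
port label with no static value; at runtime the chosen port is exposed to the
task through environment variables such as `NOMAD_PORT_<label>`. A minimal
sketch (the `http` label is illustrative):

```hcl
task "server" {
  resources {
    network {
      # No static value: the scheduler picks a free port on whatever host
      # the task lands on and exposes it as NOMAD_PORT_http.
      port "http" {}
    }
  }
}
```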
Nomad 0.10 enables support for the `network` stanza at the task group level.
When the `network` stanza is defined at the group level with `bridge` as the
networking mode, all tasks in the task group share the same network namespace.
This is a prerequisite for
[Consul Connect](/docs/integrations/consul-connect). Tasks running within a
network namespace are not visible to applications outside the namespace on the
same host. This allows [Connect][] enabled applications to bind only to
localhost within the shared network stack, and use the proxy for ingress and
egress traffic.

Note that this document only applies to services that want to _listen_ on a
port. Batch jobs or services that only make outbound connections do not need to
allocate ports, since they will use any available interface to make an outbound
connection.

```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        network {
          mbits = 200
          port "http" {}
          port "https" {}
          port "lb" {
            static = 8889
          }
        }
      }
    }
  }
}
```

## `network` Parameters

- `mbits` `(int: 10)` - Specifies the bandwidth required in MBits.

- `port` ([Port](#port-parameters): nil) - Specifies a TCP/UDP port
  allocation and can be used to specify both dynamic ports and reserved ports.

- `mode` `(string: "host")` - Mode of the network. The following modes are
  available:

  - `none` - Task group will have an isolated network without any network
    interfaces.
  - `bridge` - Task group will have an isolated network namespace with an
    interface that is bridged with the host. Note that bridge networking is
    only currently supported for the `docker`, `exec`, `raw_exec`, and `java`
    task drivers.
  - `host` - Each task will join the host network namespace and a shared
    network namespace is not created. This matches the current behavior in
    Nomad 0.9.

### `port` Parameters

- `static` `(int: nil)` - Specifies the static TCP/UDP port to allocate. If
  omitted, a dynamic port is chosen.
  We **do not recommend** using static ports, except for `system` or
  specialized jobs like load balancers.

- `to` `(string: nil)` - Applicable when using "bridge" mode to configure the
  port to map to inside the task's network namespace. `-1` sets the mapped
  port equal to the dynamic port allocated by the scheduler. The `NOMAD_PORT_