Consul tutorial
## Setting up a local Consul cluster for testing
You can test Consul with a single node, but to exercise Consul's Raft consensus algorithm we will set up four agents: three servers and one client.
A Consul agent runs in either server mode or client mode. Server agents maintain the state of the cluster. You want three to five server agents per datacenter, and the count should be odd to facilitate leader election. Client agents run on the machines whose services you want to monitor, and they report back to the Consul servers.
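The odd-number rule comes from Raft's quorum math: the cluster can only make progress while a majority of servers is up, so an even count raises the quorum without raising fault tolerance. A quick sketch of the arithmetic:

```bash
# Raft needs a majority (quorum) of servers to elect a leader.
# quorum = floor(n/2) + 1; failures tolerated = floor((n-1)/2)
for n in 1 3 5; do
  echo "servers=$n quorum=$(( n/2 + 1 )) tolerates=$(( (n-1)/2 ))"
done
# servers=1 quorum=1 tolerates=0
# servers=3 quorum=2 tolerates=1
# servers=5 quorum=3 tolerates=2
```

So three servers survive one failure, five survive two; four servers still only survive one.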
The goal is to get a complete handle on how to recover a cluster: the difference between ctrl-C (graceful shutdown) and kill -9 (I have fallen and can't get up), and when you need to bootstrap and when you do not.
To reduce the number of command-line arguments, I will use a config file to start each server.
Here is the server 1 config.

#### Server 1 configuration: server1.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul1",
  "log_level": "INFO",
  "node_name": "server1",
  "server": true,
  "bootstrap": true,
  "ports": {
    "dns": -1,
    "http": 9500,
    "rpc": 9400,
    "serf_lan": 9301,
    "serf_wan": 9302,
    "server": 9300
  }
}
```
When you are starting up a new server cluster, you typically put one of the servers in bootstrap mode. This tells Consul that this server is allowed to elect itself as leader. It is like when you ask Dick Cheney to head the VP selection committee and he nominates himself. He was in bootstrap mode.
Consul servers are boring if they have no one to talk to so let's add two more server config files to the mix.
#### Server 2 configuration: server2.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul2",
  "log_level": "INFO",
  "node_name": "server2",
  "server": true,
  "ports": {
    "dns": -1,
    "http": 10500,
    "rpc": 10400,
    "serf_lan": 10301,
    "serf_wan": 10302,
    "server": 10300
  },
  "start_join": ["127.0.0.1:9301", "127.0.0.1:11301"]
}
```
Notice that server 2 points to servers 1 and 3 in the start_join config option.
#### Server 3 configuration: server3.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul3",
  "log_level": "INFO",
  "node_name": "server3",
  "server": true,
  "ports": {
    "dns": -1,
    "http": 11500,
    "rpc": 11400,
    "serf_lan": 11301,
    "serf_wan": 11302,
    "server": 11300
  },
  "start_join": ["127.0.0.1:9301", "127.0.0.1:10301"]
}
```
And server 3 points to servers 1 and 2. We are assigning different ports so we can run all of the agents on one box; on separate machines you could just use the default ports. In production you would never run all the servers on the same box, since that would defeat the purpose of replicating for reliability.
To start up the three servers, use the following command lines.

```bash
consul agent -config-file=server1.json -ui-dir=/opt/consul/web
```

We installed the web files for the Consul UI in /opt/consul/web. You can download the UI from Consul UI.

```bash
consul agent -config-file=server2.json
consul agent -config-file=server3.json
```
Go ahead and start up the servers.
You should have the following files.
```bash
$ tree
.
├── server1.json
├── server1.sh
├── server2.json
├── server2.sh
├── server3.json
└── server3.sh
```
Run chmod +x on the .sh files so you can run them. Then run them.
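The .sh wrappers are just the commands above in script form. A minimal sketch of what server1.sh might contain (the GOMAXPROCS value is my assumption, chosen to quiet the warning you will see below; adjust paths to your install):

```bash
#!/bin/bash
# server1.sh -- wrapper sketch, not shown in full in this tutorial
export GOMAXPROCS=4
consul agent -config-file=server1.json -ui-dir=/opt/consul/web
```

server2.sh and server3.sh would be identical except for the config file name.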
The log for server 1 should look like this:
```bash
$ ./server1.sh
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'server1'
    Datacenter: 'dc1'
    Server: true (bootstrap: true)
    Client Addr: 127.0.0.1 (HTTP: 9500, HTTPS: -1, DNS: -1, RPC: 9400)
    Cluster Addr: 10.0.0.162 (LAN: 9301, WAN: 9302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/04/14 10:43:04 [INFO] raft: Node at 10.0.0.162:9300 [Follower] entering Follower state
2015/04/14 10:43:04 [INFO] serf: EventMemberJoin: server1 10.0.0.162
2015/04/14 10:43:04 [INFO] serf: EventMemberJoin: server1.dc1 10.0.0.162
2015/04/14 10:43:04 [ERR] agent: failed to sync remote state: No cluster leader
2015/04/14 10:43:04 [INFO] consul: adding server server1 (Addr: 10.0.0.162:9300) (DC: dc1)
2015/04/14 10:43:04 [INFO] consul: adding server server1.dc1 (Addr: 10.0.0.162:9300) (DC: dc1)
2015/04/14 10:43:05 [WARN] raft: Heartbeat timeout reached, starting election
2015/04/14 10:43:05 [INFO] raft: Node at 10.0.0.162:9300 [Candidate] entering Candidate state
2015/04/14 10:43:05 [INFO] raft: Election won. Tally: 1
2015/04/14 10:43:05 [INFO] raft: Node at 10.0.0.162:9300 [Leader] entering Leader state
2015/04/14 10:43:05 [INFO] consul: cluster leadership acquired
2015/04/14 10:43:05 [INFO] consul: New leader elected: server1
2015/04/14 10:43:05 [INFO] raft: Disabling EnableSingleNode (bootstrap)
2015/04/14 10:43:05 [INFO] consul: member 'server1' joined, marking health alive
2015/04/14 10:43:07 [INFO] agent: Synced service 'consul'
```
It warns us that we should not start server 1 in bootstrap mode unless we know what we are doing. Since we are just learning Consul, let's leave it as is.
Then it does a Dick Cheney: it cannot find a leader, so it makes itself leader.
Now we start up server 2 in another terminal window. Server 1's log shows the new member:

```bash
2015/04/14 10:45:56 [INFO] serf: EventMemberJoin: server2 10.0.0.162
2015/04/14 10:45:56 [INFO] consul: adding server server2 (Addr: 10.0.0.162:10300) (DC: dc1)
2015/04/14 10:45:56 [INFO] raft: Added peer 10.0.0.162:10300, starting replication
2015/04/14 10:45:56 [WARN] raft: AppendEntries to 10.0.0.162:10300 rejected, sending older logs (next: 1)
2015/04/14 10:45:56 [INFO] raft: pipelining replication to peer 10.0.0.162:10300
2015/04/14 10:45:56 [INFO] consul: member 'server2' joined, marking health alive
```
It sees server 2 and marks it as alive. Whoot!
Going back to server 2's output, we get:

```bash
$ ./server2.sh
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Joining cluster...
    Join completed. Synced with 1 initial agents
==> Consul agent running!
    Node name: 'server2'
    Datacenter: 'dc1'
    Server: true (bootstrap: false)
    Client Addr: 127.0.0.1 (HTTP: 10500, HTTPS: -1, DNS: -1, RPC: 10400)
    Cluster Addr: 10.0.0.162 (LAN: 10301, WAN: 10302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/04/14 10:45:56 [INFO] raft: Node at 10.0.0.162:10300 [Follower] entering Follower state
2015/04/14 10:45:56 [INFO] serf: EventMemberJoin: server2 10.0.0.162
2015/04/14 10:45:56 [INFO] serf: EventMemberJoin: server2.dc1 10.0.0.162
2015/04/14 10:45:56 [INFO] agent: (LAN) joining: [127.0.0.1:9301 127.0.0.1:11301]
2015/04/14 10:45:56 [INFO] consul: adding server server2 (Addr: 10.0.0.162:10300) (DC: dc1)
2015/04/14 10:45:56 [INFO] consul: adding server server2.dc1 (Addr: 10.0.0.162:10300) (DC: dc1)
2015/04/14 10:45:56 [INFO] serf: EventMemberJoin: server1 10.0.0.162
2015/04/14 10:45:56 [INFO] consul: adding server server1 (Addr: 10.0.0.162:9300) (DC: dc1)
2015/04/14 10:45:56 [INFO] agent: (LAN) joined: 1 Err: <nil>
2015/04/14 10:45:56 [ERR] agent: failed to sync remote state: No cluster leader
2015/04/14 10:45:56 [WARN] raft: Failed to get previous log: 6 log not found (last: 0)
2015/04/14 10:46:20 [INFO] agent: Synced service 'consul'
```
It is a bit cryptic: we get some error messages about there being no leader, and then it says it synced. Whoot!
Server 3's startup is a little smoother because the other two servers are already alive!
```bash
$ ./server3.sh
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Joining cluster...
    Join completed. Synced with 2 initial agents
==> Consul agent running!
    Node name: 'server3'
    Datacenter: 'dc1'
    Server: true (bootstrap: false)
    Client Addr: 127.0.0.1 (HTTP: 11500, HTTPS: -1, DNS: -1, RPC: 11400)
    Cluster Addr: 10.0.0.162 (LAN: 11301, WAN: 11302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/04/14 10:48:45 [INFO] serf: EventMemberJoin: server3 10.0.0.162
2015/04/14 10:48:45 [INFO] raft: Node at 10.0.0.162:11300 [Follower] entering Follower state
2015/04/14 10:48:45 [INFO] consul: adding server server3 (Addr: 10.0.0.162:11300) (DC: dc1)
2015/04/14 10:48:45 [INFO] serf: EventMemberJoin: server3.dc1 10.0.0.162
2015/04/14 10:48:45 [INFO] agent: (LAN) joining: [127.0.0.1:9301 127.0.0.1:10301]
2015/04/14 10:48:45 [INFO] consul: adding server server3.dc1 (Addr: 10.0.0.162:11300) (DC: dc1)
2015/04/14 10:48:45 [INFO] serf: EventMemberJoin: server1 10.0.0.162
2015/04/14 10:48:45 [INFO] serf: EventMemberJoin: server2 10.0.0.162
2015/04/14 10:48:45 [INFO] consul: adding server server1 (Addr: 10.0.0.162:9300) (DC: dc1)
2015/04/14 10:48:45 [INFO] consul: adding server server2 (Addr: 10.0.0.162:10300) (DC: dc1)
2015/04/14 10:48:45 [INFO] agent: (LAN) joined: 2 Err: <nil>
2015/04/14 10:48:45 [ERR] agent: failed to sync remote state: No cluster leader
2015/04/14 10:48:45 [WARN] raft: Failed to get previous log: 12 log not found (last: 0)
2015/04/14 10:49:07 [INFO] agent: Synced service 'consul'
```
Now all three servers are up and their state is synced.
Back in the server 1 log we have messages about both peers joining and syncing:
```bash
2015/04/14 10:45:56 [INFO] serf: EventMemberJoin: server2 10.0.0.162
2015/04/14 10:45:56 [INFO] consul: adding server server2 (Addr: 10.0.0.162:10300) (DC: dc1)
2015/04/14 10:45:56 [INFO] raft: Added peer 10.0.0.162:10300, starting replication
2015/04/14 10:45:56 [WARN] raft: AppendEntries to 10.0.0.162:10300 rejected, sending older logs (next: 1)
2015/04/14 10:45:56 [INFO] raft: pipelining replication to peer 10.0.0.162:10300
2015/04/14 10:45:56 [INFO] consul: member 'server2' joined, marking health alive
2015/04/14 10:48:45 [INFO] serf: EventMemberJoin: server3 10.0.0.162
2015/04/14 10:48:45 [INFO] consul: adding server server3 (Addr: 10.0.0.162:11300) (DC: dc1)
2015/04/14 10:48:45 [INFO] raft: Added peer 10.0.0.162:11300, starting replication
2015/04/14 10:48:45 [INFO] consul: member 'server3' joined, marking health alive
2015/04/14 10:48:45 [WARN] raft: AppendEntries to 10.0.0.162:11300 rejected, sending older logs (next: 1)
2015/04/14 10:48:45 [INFO] raft: pipelining replication to peer 10.0.0.162:11300
```
All is well. We have three healthy nodes.
If we go to the server 2 terminal and shut it down with ctrl-C, it shuts down gracefully.
Now watch the server 1 output: it keeps trying to reconnect to server 2.
Start server 2 back up. Notice how it reconnects, then look at the logs for server 1 and server 3.
Now do the same with server 3: shut it down and watch the logs of server 1 and server 2.
When server 3 reconnects, you will see a log like this:

#### Server 3 log on reconnect

```bash
2015/04/14 11:01:53 [WARN] raft: Failed to get previous log: 23 log not found (last: 21)
2015/04/14 11:01:53 [INFO] raft: Removed ourself, transitioning to follower
2015/04/14 11:02:09 [INFO] agent: Synced service 'consul'
```

The warning means server 3 could not find log entry 23, so it has to catch up and replicate the data after index 21. Consul keeps a version number of the data called an index; as servers come online, they look at their own index and ask for everything that happened after it so they can sync changes.
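You can see this index from the outside too: Consul's HTTP API returns the current index with every read in the X-Consul-Index response header. A quick way to peek at it against our cluster (9500 is server 1's HTTP port in our configs; requires the cluster to be running):

```bash
# -i prints response headers, including X-Consul-Index,
# the same index the Raft log messages above are talking about.
curl -i http://localhost:9500/v1/catalog/nodes
```

The header is also what Consul's blocking queries use to wait for changes after a given index.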
You can shut down any two servers and then bring them back up; leadership will move to the server that stayed up. Do not shut down all three, though. Then it gets a bit trickier to get them bootstrapped again.
Each server maintains its own state. Look at the files in each server's data directory.
```bash
$ pwd
/opt/consul1
$ tree
.
├── checkpoint-signature
├── raft
│   ├── mdb
│   │   ├── data.mdb
│   │   └── lock.mdb
│   ├── peers.json
│   └── snapshots
├── serf
│   ├── local.snapshot
│   └── remote.snapshot
└── tmp
    └── state506459124
        ├── data.mdb
        └── lock.mdb

6 directories, 8 files
```
Look at the snapshot files and the contents of peers.json.
Now shut down all three with ctrl-C, then start them up again. They will not be able to elect a leader. Why? Ctrl-C triggers a graceful leave, so each server removed itself from the peer set on the way out; on restart, no server believes there is a quorum left to join. The fix is to delete the peers.json files so the servers can form the cluster from scratch.
I went ahead and changed the configuration files so the servers can be restarted.
#### server1.json with retry_join

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul1",
  "log_level": "INFO",
  "node_name": "server1",
  "server": true,
  "ports": {
    "dns": -1,
    "http": 9500,
    "rpc": 9400,
    "serf_lan": 9301,
    "serf_wan": 9302,
    "server": 9300
  },
  "retry_join": [
    "127.0.0.1:9301",
    "127.0.0.1:10301",
    "127.0.0.1:11301"
  ]
}
```

#### server2.json with retry_join

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul2",
  "log_level": "INFO",
  "node_name": "server2",
  "server": true,
  "ports": {
    "dns": -1,
    "http": 10500,
    "rpc": 10400,
    "serf_lan": 10301,
    "serf_wan": 10302,
    "server": 10300
  },
  "retry_join": [
    "127.0.0.1:9301",
    "127.0.0.1:10301",
    "127.0.0.1:11301"
  ]
}
```

#### server3.json with retry_join

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul3",
  "log_level": "INFO",
  "node_name": "server3",
  "server": true,
  "ports": {
    "dns": -1,
    "http": 11500,
    "rpc": 11400,
    "serf_lan": 11301,
    "serf_wan": 11302,
    "server": 11300
  },
  "retry_join": [
    "127.0.0.1:9301",
    "127.0.0.1:10301",
    "127.0.0.1:11301"
  ]
}
```
In order to start up clean, you have to start one of the servers in bootstrap mode (like we had server1.json set up before) and delete all of the peers.json files.
```bash
$ pwd
/opt
$ find . -name "peers.json"
./consul1/raft/peers.json
./consul2/raft/peers.json
./consul3/raft/peers.json
$ find . -name "peers.json" | xargs rm
```
Now restart them.
If none of the servers starts up in bootstrap mode, you will get this all day long:
```bash
2015/04/14 13:02:31 [ERR] agent: failed to sync remote state: rpc error: No cluster leader
2015/04/14 13:02:32 [ERR] agent: failed to sync remote state: rpc error: No cluster leader
```
I have another set of bootstrap JSON files, one per server, each with a startup script. You need to delete the peers.json files and then start up one of the servers in bootstrap mode using these scripts. See the discussion at issue 526, and then re-read outage recovery.
#### server1boot.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul1",
  "log_level": "INFO",
  "node_name": "server1",
  "server": true,
  "bootstrap": true,
  "ports": {
    "dns": -1,
    "http": 9500,
    "rpc": 9400,
    "serf_lan": 9301,
    "serf_wan": 9302,
    "server": 9300
  }
}
```

#### server2boot.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul2",
  "log_level": "INFO",
  "node_name": "server2",
  "server": true,
  "bootstrap": true,
  "ports": {
    "dns": -1,
    "http": 10500,
    "rpc": 10400,
    "serf_lan": 10301,
    "serf_wan": 10302,
    "server": 10300
  }
}
```
#### server3boot.json

```json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul3",
  "log_level": "INFO",
  "node_name": "server3",
  "server": true,
  "bootstrap": true,
  "ports": {
    "dns": -1,
    "http": 11500,
    "rpc": 11400,
    "serf_lan": 11301,
    "serf_wan": 11302,
    "server": 11300
  }
}
```
```bash
$ cat server1boot.sh
export GOMAXPROCS=10
consul agent -config-file=server1boot.json \
    -retry-interval=3s \
    -ui-dir=/opt/consul/web
```
There is a startup script per server.
Remember to delete the peers.json file.
```bash
$ pwd
/opt
$ find . -name "peers.json" | xargs rm
```
Now you are set. Pick any server and run it in bootstrap mode.

```bash
$ ./server3boot.sh
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'server3'
    Datacenter: 'dc1'
    Server: true (bootstrap: true)
    Client Addr: 127.0.0.1 (HTTP: 11500, HTTPS: -1, DNS: -1, RPC: 11400)
    Cluster Addr: 10.0.0.162 (LAN: 11301, WAN: 11302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
....
```
```bash
$ ./server2.sh
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'server2'
    Datacenter: 'dc1'
    Server: true (bootstrap: false)
    Client Addr: 127.0.0.1 (HTTP: 10500, HTTPS: -1, DNS: -1, RPC: 10400)
    Cluster Addr: 10.0.0.162 (LAN: 10301, WAN: 10302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
...
```
```bash
$ ./server1.sh
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'server1'
    Datacenter: 'dc1'
    Server: true (bootstrap: false)
    Client Addr: 127.0.0.1 (HTTP: 9500, HTTPS: -1, DNS: -1, RPC: 9400)
    Cluster Addr: 10.0.0.162 (LAN: 9301, WAN: 9302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
```
To force a failure, pick a server process and kill -9 it.
If you kill all of the servers at once with kill -9, as in:

```bash
$ pkill -9 consul
```

you do not have to bootstrap any of them. Thus you can simply run:

```bash
./server1.sh
./server2.sh
./server3.sh
```
## Setting up a client to test our local cluster
The agent is the center of the Consul world. An agent must run on every node that is part of a Consul cluster, and it comes in two types: client and server. The server agents are the information hub: they store the data for the cluster and replicate it among the server nodes. The client agents are lightweight instances that sit on every service box and rely on the server agents for most of their state.
To start up Consul in client mode, we will use the following config file.

```bash
$ cat client1.json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consulclient",
  "log_level": "INFO",
  "node_name": "client1",
  "server": false,
  "ports": {
    "dns": -1,
    "http": 8500,
    "rpc": 8400,
    "serf_lan": 8301,
    "serf_wan": 8302,
    "server": 8300
  },
  "start_join": [
    "127.0.0.1:9301",
    "127.0.0.1:10301",
    "127.0.0.1:11301"
  ]
}
```
You will notice that we did not enable server mode, which puts the agent in client mode. The client is addressable on the 8XXX ports. We told it where to find the servers using the start_join key.
Now when we start up Consul, we just specify the client1.json file:

```bash
consul agent -config-file=client1.json
```
Now we can get info about our cluster.

```bash
$ consul info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 0c7ca91c
    version = 0.5.0
consul:
    known_servers = 3
    server = false
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 34
    max_procs = 1
    os = darwin
    version = go1.4.2
serf_lan:
    encrypted = false
    event_queue = 0
    event_time = 6
    failed = 0
    intent_queue = 0
    left = 0
    member_time = 42
    members = 4
    query_queue = 0
    query_time = 1
```
We can list the members in this cluster:

```bash
$ consul members
Node     Address           Status  Type    Build  Protocol
client1  10.0.0.162:8301   alive   client  0.5.0  2
server2  10.0.0.162:10301  alive   server  0.5.0  2
server1  10.0.0.162:9301   alive   server  0.5.0  2
server3  10.0.0.162:11301  alive   server  0.5.0  2
```
You can also use the HTTP interface to see what members are in the cluster.
```bash
$ curl http://localhost:8500/v1/agent/members
[
  {
    "Name": "server2",
    "Addr": "10.0.0.162",
    "Port": 10301,
    "Tags": {
      "build": "0.5.0:0c7ca91c",
      "dc": "dc1",
      "port": "10300",
      "role": "consul",
      "vsn": "2",
      "vsn_max": "2",
      "vsn_min": "1"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 2,
    "DelegateMax": 4,
    "DelegateCur": 4
  },
  {
    "Name": "server1",
    "Addr": "10.0.0.162",
    "Port": 9301,
    "Tags": {
      "build": "0.5.0:0c7ca91c",
      "dc": "dc1",
      "port": "9300",
      "role": "consul",
      "vsn": "2",
      "vsn_max": "2",
      "vsn_min": "1"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 2,
    "DelegateMax": 4,
    "DelegateCur": 4
  },
  {
    "Name": "server3",
    "Addr": "10.0.0.162",
    "Port": 11301,
    "Tags": {
      "build": "0.5.0:0c7ca91c",
      "dc": "dc1",
      "port": "11300",
      "role": "consul",
      "vsn": "2",
      "vsn_max": "2",
      "vsn_min": "1"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 2,
    "DelegateMax": 4,
    "DelegateCur": 4
  },
  {
    "Name": "client1",
    "Addr": "10.0.0.162",
    "Port": 8301,
    "Tags": {
      "build": "0.5.0:0c7ca91c",
      "dc": "dc1",
      "role": "node",
      "vsn": "2",
      "vsn_max": "2",
      "vsn_min": "1"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 2,
    "DelegateMax": 4,
    "DelegateCur": 4
  }
]
```
You can try out other agent HTTP calls by looking at the Agent HTTP API.
You can use the HTTP API from any client or server:
```bash
$ curl http://localhost:8500/v1/catalog/datacenters
["dc1"]
$ curl http://localhost:9500/v1/catalog/datacenters
["dc1"]
$ curl http://localhost:10500/v1/catalog/datacenters
["dc1"]
$ curl http://localhost:10500/v1/catalog/nodes
[{"Node":"client1","Address":"10.0.0.162"},
 {"Node":"server1","Address":"10.0.0.162"},
 {"Node":"server2","Address":"10.0.0.162"},
 {"Node":"server3","Address":"10.0.0.162"}]
$ curl http://localhost:8500/v1/catalog/services
{"consul":[]}
```
```bash
$ curl --upload-file register_service.json \
    http://localhost:8500/v1/agent/service/register
```

Here is register_service.json:

```json
{
  "ID": "myservice1",
  "Name": "myservice",
  "Address": "127.0.0.1",
  "Port": 8080,
  "Check": {
    "Interval": "10s",
    "TTL": "15s"
  }
}
```
The above registers a new service called myservice. `Name` is the name of the service, while `ID` identifies a specific instance of that service. The `Check` we installed has a 15s TTL, so the service is expected to check in with the agent before each 15-second window expires.
Once you register the service, then you can see it from the agent as follows:
```bash
$ curl http://localhost:8500/v1/agent/services
{
  "myservice1": {
    "ID": "myservice1",
    "Service": "myservice",
    "Tags": null,
    "Address": "127.0.0.1",
    "Port": 8080
  }
}
```
To check this service's health, we can use this endpoint.
```bash
$ curl http://localhost:8500/v1/health/service/myservice
[
  {
    "Node": {
      "Node": "client1",
      "Address": "10.0.0.162"
    },
    "Service": {
      "ID": "myservice1",
      "Service": "myservice",
      "Tags": null,
      "Address": "127.0.0.1",
      "Port": 8080
    },
    "Checks": [
      {
        "Node": "client1",
        "CheckID": "service:myservice1",
        "Name": "Service 'myservice' check",
        "Status": "critical",
        "Notes": "",
        "Output": "TTL expired",
        "ServiceID": "myservice1",
        "ServiceName": "myservice"
      },
      {
        "Node": "client1",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": ""
      }
    ]
  }
]
```
Here we can see that the health status is critical because the TTL expired.
To tell Consul that our fictional service is passing, we need to hit this endpoint at least once every 15 seconds:
```bash
$ curl http://localhost:8500/v1/agent/check/pass/service:myservice1
$ curl http://localhost:8500/v1/health/service/myservice
[
  {
    "Node": {
      "Node": "client1",
      "Address": "10.0.0.162"
    },
    "Service": {
      "ID": "myservice1",
      "Service": "myservice",
      "Tags": null,
      "Address": "127.0.0.1",
      "Port": 8080
    },
    "Checks": [
      {
        "Node": "client1",
        "CheckID": "service:myservice1",
        "Name": "Service 'myservice' check",
        "Status": "passing",
        "Notes": "",
        "Output": "",
        "ServiceID": "myservice1",
        "ServiceName": "myservice"
      },
      {
        "Node": "client1",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": ""
      }
    ]
  }
]
```
Notice that our status went from `"Status": "critical"` to `"Status": "passing"`.
You can also mark a service as warn or critical using an HTTP call.
```bash
$ curl http://localhost:8500/v1/agent/check/warn/service:myservice1
$ curl http://localhost:8500/v1/agent/check/fail/service:myservice1
```
Note that you can query the health status from any node, server or client alike: every node in the cluster knows where myservice instances are running.

```bash
$ curl http://localhost:9500/v1/health/service/myservice
```
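Because the TTL check goes critical after 15 seconds of silence, a real service would run a heartbeat loop against the pass endpoint. A minimal sketch, assuming the client1 agent on port 8500 and a 10-second interval (chosen to stay comfortably inside the 15s TTL):

```bash
# Keep the TTL check green by checking in faster than the 15s TTL.
while true; do
  curl -s http://localhost:8500/v1/agent/check/pass/service:myservice1
  sleep 10
done
```

In a real deployment this loop (or an equivalent timer) would live inside the service process itself, so the check fails exactly when the service does.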
Let's start up another client and install a service in it. We created another client startup script.
```bash
$ cat client2.json
{
  "datacenter": "dc1",
  "data_dir": "/opt/consulclient2",
  "log_level": "INFO",
  "node_name": "client2",
  "server": false,
  "ports": {
    "dns": -1,
    "http": 7500,
    "rpc": 7400,
    "serf_lan": 7301,
    "serf_wan": 7302,
    "server": 7300
  },
  "start_join": [
    "127.0.0.1:9301",
    "127.0.0.1:10301",
    "127.0.0.1:11301"
  ]
}
$ cat client2.sh
consul agent -config-file=client2.json
```
Then we will create another register-service JSON file.

```bash
$ cat register_service2.json
{
  "ID": "myservice2",
  "Name": "myservice",
  "Address": "127.0.0.1",
  "Port": 9090,
  "Check": {
    "Interval": "10s",
    "TTL": "15s"
  }
}
```
#### Running service registration against the client2 agent

```bash
curl --upload-file register_service2.json \
    http://localhost:7500/v1/agent/service/register
```

#### Make both services healthy

```bash
curl http://localhost:8500/v1/agent/check/pass/service:myservice1
curl http://localhost:7500/v1/agent/check/pass/service:myservice2
```

#### Query them

```bash
$ curl http://localhost:9500/v1/health/service/myservice
[
  {
    "Node": {
      "Node": "client2",
      "Address": "10.0.0.162"
    },
    "Service": {
      "ID": "myservice2",
      "Service": "myservice",
      "Tags": null,
      "Address": "127.0.0.1",
      "Port": 9090
    },
    "Checks": [
      {
        "Node": "client2",
        "CheckID": "service:myservice2",
        "Name": "Service 'myservice' check",
        "Status": "passing",
        "Notes": "",
        "Output": "",
        "ServiceID": "myservice2",
        "ServiceName": "myservice"
      },
      {
        "Node": "client2",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": ""
      }
    ]
  },
  {
    "Node": {
      "Node": "client1",
      "Address": "10.0.0.162"
    },
    "Service": {
      "ID": "myservice1",
      "Service": "myservice",
      "Tags": null,
      "Address": "127.0.0.1",
      "Port": 8080
    },
    "Checks": [
      {
        "Node": "client1",
        "CheckID": "service:myservice1",
        "Name": "Service 'myservice' check",
        "Status": "passing",
        "Notes": "",
        "Output": "",
        "ServiceID": "myservice1",
        "ServiceName": "myservice"
      },
      {
        "Node": "client1",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": ""
      }
    ]
  }
]
```
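A service-discovery client usually just wants the address and port of each healthy instance. One way to pull those out of the health endpoint's JSON is with jq (assumed installed); here we run it against a trimmed-down copy of the response rather than the live API:

```bash
# Save a trimmed copy of the /v1/health/service/myservice response.
cat > /tmp/health.json <<'EOF'
[{"Service": {"ID": "myservice2", "Address": "127.0.0.1", "Port": 9090}},
 {"Service": {"ID": "myservice1", "Address": "127.0.0.1", "Port": 8080}}]
EOF
# Emit one address:port per instance.
jq -r '.[] | "\(.Service.Address):\(.Service.Port)"' /tmp/health.json
# 127.0.0.1:9090
# 127.0.0.1:8080
```

Against the live cluster you would pipe `curl -s http://localhost:9500/v1/health/service/myservice` into the same jq filter.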