# Nomad, Consul and Prometheus
By using Nomad to start and stop our services, each service is registered in a central key-value store (Consul) with a specific tag we configured.
Prometheus queries that key-value store and automatically picks up every service carrying the 'metrics' tag as a scrape target.
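To illustrate the mechanism, here is a minimal sketch of how the Prometheus configuration could be rendered from within the Prometheus Nomad job, keeping only Consul services tagged 'metrics'. The job name and scrape interval are illustrative; only the 'metrics' tag convention comes from this setup, and the actual Prometheus job file in /opt/nomad may differ:

```hcl
# Sketch: a template stanza inside the Prometheus task (assumed layout).
template {
  destination = "local/prometheus.yml"

  data = <<EOH
---
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: 'consul_services'
    consul_sd_configs:
      # ask the local Consul agent for all registered services
      - server: 'localhost:8500'
    relabel_configs:
      # keep only the services that were registered with the 'metrics' tag
      - source_labels: ['__meta_consul_tags']
        regex: '.*,metrics,.*'
        action: keep
EOH
}
```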
The following steps should make that clear:

Bring up the environment using Vagrant, which starts a CentOS 7 VirtualBox machine or LXC container. It uses the initialize.sh bash script to install both Nomad and Consul and to start a Nomad job for our Prometheus instance.
The Vagrant providers proven to work on an Arch Linux system are:

```
$ vagrant up --provider lxc
$ vagrant up --provider libvirt
$ vagrant up --provider virtualbox
```
Once it has finished, you should be able to connect to the Vagrant environment through SSH and interact with Nomad:
```
$ vagrant ssh
[vagrant@nomad ~]$ nomad status
ID          Type     Priority  Status   Submit Date
consul      service  50        running  2019-04-03T14:52:25Z
prometheus  service  50        running  2019-04-03T14:52:25Z
```
As you can see, we have a Consul container running next to the Prometheus one.
The Consul web interface is accessible at http://localhost:8500.
When you browse to http://localhost:9090/targets you should see the Prometheus target configuration.
Two exporters are available as Nomad jobs; when you start one of them, it should show up in the target configuration after a few moments.
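The exact contents of /opt/nomad/node-exporter.hcl are not reproduced here, but reconstructed from the status output below it could look roughly like this sketch (Nomad 0.8-era syntax); the essential part is the service stanza that registers the exporter in Consul with the 'metrics' tag:

```hcl
job "node-exporter" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    task "node-exporter" {
      driver = "docker"

      config {
        image = "prom/node-exporter:v0.16.0"
        port_map {
          http = 9100
        }
      }

      resources {
        cpu    = 50  # MHz
        memory = 100 # MiB
        network {
          port "http" {}
        }
      }

      service {
        name = "node-exporter"
        port = "http"
        # the tag Prometheus filters on, see the scrape configuration above
        tags = ["metrics"]
      }
    }
  }
}
```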
```
[vagrant@nomad ~]$ nomad run /opt/nomad/node-exporter.hcl
==> Monitoring evaluation "310cc1b6"
    Evaluation triggered by job "node-exporter"
    Allocation "4cf3aff6" created: node "a52cf97d", group "app"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "310cc1b6" finished with status "complete"

[vagrant@nomad ~]$ nomad status node-exporter
ID            = node-exporter
Name          = node-exporter
Submit Date   = 2018-06-04T20:24:13Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
app         0       0         1        0       0         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
4cf3aff6  a52cf97d  app         0        run      running  29s ago  8s ago

[vagrant@nomad ~]$ nomad alloc-status 4cf3aff6
ID                  = 4cf3aff6
Eval ID             = 310cc1b6
Name                = node-exporter.app
Node ID             = a52cf97d
Job ID              = node-exporter
Job Version         = 0
Client Status       = running
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created             = 52s ago
Modified            = 14s ago

Task "node-exporter" is "running"

Task Resources
CPU       Memory           Disk     IOPS  Addresses
0/50 MHz  4.4 MiB/100 MiB  300 MiB  0     http: 127.0.0.1:9100

Task Events:
Started At     = 2018-06-04T20:24:34Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2018-06-04T20:24:34Z  Started     Task started by client
2018-06-04T20:24:13Z  Driver      Downloading image prom/node-exporter:v0.16.0
2018-06-04T20:24:13Z  Task Setup  Building Task Directory
2018-06-04T20:24:13Z  Received    Task received by client
```
If you are too quick, the new target shows as unknown at first; after refreshing a few times it should come up as UP.
A second exporter can be started as well, for example cAdvisor:
```
[vagrant@nomad ~]$ nomad run /opt/nomad/cadvisor.hcl
==> Monitoring evaluation "d65d4fa8"
    Evaluation triggered by job "cadvisor"
    Allocation "f43d6b44" created: node "a52cf97d", group "app"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "d65d4fa8" finished with status "complete"
```
When we now stop the node-exporter, it should automatically be removed from the target configuration:
```
[vagrant@nomad ~]$ nomad stop -purge node-exporter
==> Monitoring evaluation "7f683028"
    Evaluation triggered by job "node-exporter"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "7f683028" finished with status "complete"
```
All the Nomad jobs send the application logs from their Docker containers to the systemd journal, so you can follow the progress there too.
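Following those logs could look like the lines below; note that the CONTAINER_NAME match relies on Docker's journald logging driver being in use, which is an assumption about this environment:

```
# follow all journal entries, including forwarded container logs
[vagrant@nomad ~]$ journalctl -f

# or filter on a single container via the field set by the journald driver
[vagrant@nomad ~]$ journalctl -f CONTAINER_NAME=<container name>
```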