S01E07: How to monitor RabbitMQ? #18

Merged
39 commits merged into master from S01E07
Nov 10, 2020

@gerhard gerhard commented Oct 26, 2020

TGIR S01E07: How to monitor RabbitMQ?

You have a few RabbitMQ deployments running (on Kubernetes). How do you monitor them?
You have heard of the great Grafana dashboards that team RabbitMQ maintains, maybe from this RabbitMQ Summit 2019 talk or from the official Monitoring with Prometheus & Grafana guide. But how do you actually set them up?

For speed and convenience, we spin up a K3S instance on a Linux host and do the following:

  • integrate K3S with Prometheus & Grafana, all running inside K3S
  • deploy a few RabbitMQ clusters together with workloads
  • cover the most important Grafana dashboards that we maintain by looking at the above workloads

You may follow along on any Linux host, including a VM running on your macOS or Windows host.
We had some credits with Equinix Metal that we wanted to put to good use.

Closes #17

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
k3sup will work with any Linux host; simply specify a K3SUP_IP variable
when running the make target.

@displague there are a few areas where the Equinix Metal CLI needs obvious
improvements. The most obvious ones are getting the instance IP and ID, and
instance deletion not picking up the PROJECT_ID env var. Let me know
if you want to continue this conversation.

@alexellis why is k3sup installing traefik by default? It felt awkward
to have to disable it so that I would end up with a more vanilla K3S
install. Everything else worked as advertised, thanks for a smooth
experience.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
To force rebuild, run: make -B kubeconfig

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
There seems to be a permissions issue in k3s:

    Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""

I found this, which seems to be a simpler way of getting Prometheus &
Grafana onto k3s: https://github.com/cablespaghetti/k3s-monitoring

Going to try that next as the current approach feels like a rabbit hole
based on what we care about. Committing it to capture what we have so
far, even though it may end up discarded.
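
For context, the error above says that the `prometheus-k8s` service account in the `monitoring` namespace may not list Services in the `default` namespace. A minimal sketch of the Role and RoleBinding that would typically grant this is below. It is not necessarily what this PR does: the role name `prometheus-k8s-read` and the function name are hypothetical, and including `endpoints` and `pods` is an assumption about what Prometheus service discovery needs.

```python
import json


def services_reader_manifests(namespace="default",
                              sa_name="prometheus-k8s",
                              sa_namespace="monitoring",
                              role_name="prometheus-k8s-read"):
    """Build a Role and RoleBinding as plain dicts.

    role_name is a hypothetical name chosen for this sketch; the service
    account and namespaces come from the error message.
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [{
            # "" is the core API group named in the error message.
            "apiGroups": [""],
            # endpoints and pods are an assumption, see the lead-in.
            "resources": ["services", "endpoints", "pods"],
            "verbs": ["get", "list", "watch"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
        "subjects": [{
            "kind": "ServiceAccount",
            "name": sa_name,
            "namespace": sa_namespace,
        }],
    }
    return [role, binding]


if __name__ == "__main__":
    # Emit a JSON List that can be piped into `kubectl apply -f -`.
    manifest = {"apiVersion": "v1", "kind": "List",
                "items": services_reader_manifests()}
    print(json.dumps(manifest, indent=2))
```

The binding lives in the namespace being scraped (`default`), while the subject points back at the service account in `monitoring`, which is the pairing the error message describes.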

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Most likely an intermediary step...

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
gerhard and others added 21 commits October 29, 2020 09:34
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Otherwise they will use the system hostname, which is the first part of
the FQDN. More info:
https://github.com/rabbitmq/rabbitmq-website/blob/stream-queue/site/stream.md#advertised-host-port

re https://github.com/rabbitmq/rabbitmq-server/issues/2486
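
To illustrate "the first part of the FQDN": a minimal sketch, assuming a hypothetical pod FQDN (the names below are made up for illustration). The short hostname is only the first DNS label, which a client elsewhere in the cluster may not be able to resolve, hence the explicit advertised host.

```python
def short_hostname(fqdn: str) -> str:
    """Return the first DNS label of a fully qualified domain name."""
    return fqdn.split(".", 1)[0]


# Hypothetical FQDN of a RabbitMQ pod behind a headless service.
fqdn = "rmq-server-0.rmq-nodes.default.svc.cluster.local"

print("system hostname:", short_hostname(fqdn))
print("name to advertise instead:", fqdn)
```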

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
3-4 CPUs & 1-2 CPUs are enough to push just over 1mil msg/s with this
setup.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
That results in 1.4mil msg/s

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
+ share a single server
+ do not create server if it exists

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This makes the benefits of minimal or prometheus metrics more obvious.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
If rates_mode is anything other than none, collect_statistics gets set
to fine:
https://github.com/rabbitmq/rabbitmq-management-agent/blob/77aac8f4985559b2a660610938ee0567588e0422/src/rabbit_mgmt_db_handler.erl#L57-L72

We don't use detailed rates_mode, so the setting is irrelevant.
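
The rule described above can be sketched as a small function. This is a Python paraphrase, not the Erlang code itself: the function name is hypothetical, and the fallback to `coarse` when `rates_mode` is `none` is my reading of the linked handler, so treat that branch as an assumption and verify it against the source.

```python
def effective_collect_statistics(rates_mode: str, collect_statistics: str) -> str:
    """Return the statistics level the node ends up with.

    rates_mode: "none", "basic" or "detailed"
    collect_statistics: "none", "coarse" or "fine" (the configured value)
    """
    if rates_mode != "none":
        # Any rates_mode other than none forces fine-grained statistics,
        # regardless of the configured collect_statistics value.
        return "fine"
    if collect_statistics == "none":
        # Assumption from reading the linked handler: with rates disabled,
        # a configured "none" is still raised to "coarse".
        return "coarse"
    return collect_statistics


for mode in ("none", "basic", "detailed"):
    for level in ("none", "coarse", "fine"):
        result = effective_collect_statistics(mode, level)
        print(f"rates_mode={mode:<8} collect_statistics={level:<6} -> {result}")
```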

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
mkuratczyk and others added 6 commits November 4, 2020 17:02
Otherwise RabbitMQ will fail to boot due to missing
sample_retention_policies. By the way, this property is not used because
we don't have detailed rates_mode set, but there must be a map merge
happening in Cuttlefish that results in an incomplete config which
RabbitMQ fails to handle and crashes.
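
A minimal sketch of the kind of merge that could produce such an incomplete config. It only illustrates the hypothesis above and says nothing about how Cuttlefish actually merges; the retention values are placeholders, and both merge functions are hypothetical.

```python
# Placeholder defaults: three policy groups, each a list of
# (retention_seconds, sample_interval_seconds) pairs.
DEFAULTS = {
    "sample_retention_policies": {
        "global": [(605, 5)],
        "basic": [(605, 5)],
        "detailed": [(605, 5)],
    }
}


def shallow_merge(defaults, overrides):
    """Top-level merge: an overridden key replaces the whole nested map."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


def deep_merge(defaults, overrides):
    """Recursive merge: nested maps keep the defaults that were not overridden."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# The user sets only one of the three policy groups.
overrides = {"sample_retention_policies": {"global": [(60, 5)]}}

print(sorted(shallow_merge(DEFAULTS, overrides)["sample_retention_policies"]))
print(sorted(deep_merge(DEFAULTS, overrides)["sample_retention_policies"]))
```

With the shallow merge, only the overridden group survives, so any code that expects all three groups would find the others missing.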

cc @lukebakken @michaelklishin

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Memory was the problem, CPUs are sufficient.

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Otherwise descriptions will be misaligned for targets that have longer
names.
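
The alignment idea, sketched in Python rather than in the Makefile itself: size the name column from the longest target name instead of using a fixed width. The function name, the `k3s` target, and the descriptions are hypothetical; only `kubeconfig` appears in this PR.

```python
def format_help(targets):
    """Return help lines with descriptions starting in the same column.

    targets: mapping of target name -> one-line description.
    """
    # Column width follows the longest name, so long names never
    # push their description out of line.
    width = max((len(name) for name in targets), default=0)
    return [f"{name.ljust(width)}  {desc}" for name, desc in targets.items()]


for line in format_help({
    "kubeconfig": "Fetch the kubeconfig for the K3S instance",
    "k3s": "Create the K3S instance",
}):
    print(line)
```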

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
`kubectl example rmq` would have been a sweet idea, but the currently
available version of the plugin doesn't support generating examples from
CRD schemas.

seredot/kubectl-example#2

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
@gerhard gerhard merged commit 8344215 into master Nov 10, 2020
@gerhard gerhard deleted the S01E07 branch November 10, 2020 01:53
gerhard commented Nov 10, 2020

Scheduled to go live in a number of hours, after it gets converted to HD and maybe even 4K.
