Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added useful variables to override the default #125

Merged
merged 5 commits into from
Apr 8, 2020

Conversation

stefangusa
Copy link
Contributor

@stefangusa stefangusa commented Apr 2, 2020

Closes #124

@pull-request-size pull-request-size bot added the size/XS PR that changes 0-9 lines. Quick fix/merge. label Apr 2, 2020
values.yaml Outdated
@@ -440,6 +441,11 @@ rabbitmq-ha:
rabbitmqUsername: admin
# TODO: Use default random 24 character password, but need to fetch this string for use by downstream services
rabbitmqPassword: 9jS+w1u07NbHtZke1m+jW4Cj
# RabbitMQ available memory
rabbitmqMemoryHighWatermark: "0.8"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very custom value which I'd argue makes sense for everyone.
Can you clarify the practical use case and why it's important in K8s environment and is absolutely recommended for any environment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That value should not be fixed necessarily but the stackstorm-ha deployment starts working erroneously when the default memory watermark attached to the RabbitMQ Pods is full (to me, this happened after less than one day) RabbitMQ Issue #12730.

This is why I insist on adding these attributes to the values file of stackstorm-ha and maybe you can help me with a more suitable default value.

Copy link
Member

@arm4b arm4b Apr 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefangusa Thanks for the explanation.

If no resources limits set for the RabbitMQ Deployment/Pods, how the relative memory watermark setting

rabbitmqMemoryHighWatermark: "0.8"
rabbitmqMemoryHighWatermarkType: relative

behaves in a real-world K8s deployment and how much memory RabbitMQ cluster acquires in that situation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the RabbitMQ docs,

The default value of 0.4 stands for 40% of availalbe (detected) RAM or 40% of available virtual address space, whichever is smaller. E.g. on a 32-bit platform with 4 GiB of RAM installed, 40% of 4 GiB is 1.6 GiB, but 32-bit Windows normally limits processes to 2 GiB, so the threshold is actually to 40% of 2 GiB (which is 820 MiB).

That setting is something I have chosen heuristically based on my needs, I haven't made a deep analysis to potentially lower the value as I haven't had to so far.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value in RabbitMQ Helm chart is

rabbitmqMemoryHighWatermark: 256MB
rabbitmqMemoryHighWatermarkType: absolute

https://github.com/helm/charts/blob/4b82dae876d95e87b05b52d157cd6866169f992a/stable/rabbitmq-ha/values.yaml#L134-L138

I'm wondering it was set that way to an absolute value by a reason and what's happening with the relative settings if we recommend it to user, considering there is no Pod limit is set.

Will the RabbitMQ take entire possible cluster memory or what in that scenario?
Depending on how it behaves results in what we can suggest to the users.

Copy link
Contributor Author

@stefangusa stefangusa Apr 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs:

Default:
vm_memory_high_watermark.relative = 0.4

I don't really figure out why that absolute value is set but I found an issue #17089 where that default value solved the issue. I can also see other issues regarding this problem in the GitHub Charts repo issues.

Regarding the last question, according to the docs:

rabbitmqctl set_vm_memory_high_watermark fraction

When using the absolute mode, it is possible to use one of the following memory units:
M, MiB for mebibytes (2^20 bytes)
MB for megabytes (10^6 bytes)
G, GiB for gibibytes (2^30 bytes)
GB for gigabytes (10^9 bytes)

it is revealed that both rabbitmqMemoryHighWatermark and rabbitmqMemoryHighWatermarkType parameters need to be set correctly (fraction for relative type and value + memory units for absolute type) in order that RabbitMQ can start.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to RabbitMQ chart defaults we're using which relies on absolute rabbitmqMemoryHighWatermarkType and hard memory limit by default. This probably happens by a reason as we're not in VM environment anymore where entire memory could be potentially dedicated to a single app like RabbitMQ.

It's important to make sure safety settings before we suggest our users to rely on relative memory in a highly distributed K8s environment. So can you please verify/experiment that behavior in K8s cluster?

What I mean, specifically:
If RabbitMQ-HA is deployed on K8s cluster that has for example 100GB of memory pool, with * 0.8 rabbitmqMemoryHighWatermark setting how much memory would RabbitMQ cluster may take before enforcing connection throttling, considering Pods have no memory limits set? This could be a huge and non-practical value that can affect entire K8s cluster.

What if K8s cluster consists from a nodes with different memory resources. And so for example 1 Pod of RabbitMQ HA cluster is deployed on a node with 5GB of memory, but another is deployed by a scheduler on a node with 128GB of memory, how * 0.8 memory watermark will behave in reality with this relative setting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, now I see the problem and watching it from this point of view I agree that there is an issue and thank you for explaining it!

The RabbitMQ with relative high watermark takes a percent of the memory available from the machine where it runs on. An if three replicas run on three machines with different memory resources and there is no bound, each replica is going to have another amount of memory available.

For this reason, I suggest we keep both of the fields but with other values: rabbitmqMemoryHighWatermarkType: absolute and rabbitmqMemoryHighWatermark: 512MB.

This way, along with the comment in code, people are made aware of these fields and are given a short description of what they mean and in addition to this, a "safer" value (according to the issues mentioned in the code comment and above in this discussion) for the memory available is set. An if this turns out not to be enough for somebody, he would know precisely what to overwrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, agree with that 👍

Just left a few code recommendations so we make these suggestions clearer in code comments.

values.yaml Outdated Show resolved Hide resolved
@stefangusa stefangusa requested a review from arm4b April 7, 2020 12:43
values.yaml Outdated Show resolved Hide resolved
values.yaml Show resolved Hide resolved
stefangusa and others added 2 commits April 8, 2020 22:05
Co-Authored-By: Eugen C. <armab@users.noreply.github.com>
Co-Authored-By: Eugen C. <armab@users.noreply.github.com>
@stefangusa stefangusa requested a review from arm4b April 8, 2020 19:06
Copy link
Member

@arm4b arm4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! 👍

@pull-request-size pull-request-size bot added size/S PR that changes 10-29 lines. Very easy to review. and removed size/XS PR that changes 0-9 lines. Quick fix/merge. labels Apr 8, 2020
@arm4b arm4b merged commit 9d3fade into StackStorm:master Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/S PR that changes 10-29 lines. Very easy to review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add custom attributes to third party chart dependencies
2 participants