Contributor

@jovial jovial commented Nov 24, 2022

Nodes with a lot of RAM can trigger this low memory alert even when they still have a significant amount of free memory. For example, on a node with 512GiB of RAM, when the current alert fires (it is configured at 80% usage) about 102.4GiB is still free.
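As a quick check of the arithmetic, here is the free memory left when a percentage-based alert fires, next to the usage level at which an absolute threshold would fire. The 5 GiB value is the default proposed for the new threshold variable in this PR; everything else follows from the 512 GiB example.

```latex
\begin{align*}
\text{free at 80\% usage} &= 512\,\text{GiB} \times (1 - 0.80) = 102.4\,\text{GiB} \\
\text{usage at 5 GiB free} &= 1 - \frac{5}{512} \approx 99.0\%
\end{align*}
```

So on a large node the absolute threshold only fires once memory is nearly exhausted, while the percentage threshold fires with roughly 100 GiB to spare.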

@jovial jovial requested a review from a team as a code owner November 24, 2022 15:34
Member

@priteau priteau left a comment

In commit message: alot -> a lot

Add a newline at the end of globals.yml

es_heap_size: 8g
prometheus_cmdline_extras: "--storage.tsdb.retention.time=30d"
# Threshold to trigger a LowMemory alert in Gibibytes (GiB). When the
# amount of free memory is lower that this value an alert will
Member

that -> than

Contributor Author

Done, thanks.

# Threshold to trigger a LowMemory alert in Gibibytes (GiB). When the
# amount of free memory is lower that this value an alert will
# be triggered.
alertmanager_low_memory_threshold_gb: 5
Member

Should we use gib instead of gb?

Contributor Author

Good idea, I've changed it.

Member

@dougszumski dougszumski left a comment

Good change. The CI failure is unrelated. It would be helpful to squash on merge.

@markgoddard markgoddard merged commit c5fa564 into stackhpc/xena Nov 28, 2022
@markgoddard markgoddard deleted the xena/absolute-memory-alert branch November 28, 2022 09:30
@markgoddard markgoddard mentioned this pull request Nov 28, 2022
Member

bbezak commented Nov 29, 2022

We're using a raw statement at the beginning of the file, so the new variable is never templated and this change is not enough:

ts=2022-11-29T13:20:50.654Z caller=manager.go:973 level=error component="rule manager" msg="loading groups failed" err="/etc/prometheus/system.rules: 19:11: group \"Node\", rule 2, \"LowMemory\": could not parse expression: 1:46: parse error: unexpected left brace '{'"

Member

bbezak commented Nov 29, 2022

We could probably add endraw/raw around this new variable, @jovial.

But then it should probably be a kayobe variable, not a kolla-ansible one.
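A minimal sketch of the endraw/raw suggestion, assuming the rules file is a Jinja template wrapped in a single raw block. The group and alert names come from the error message above. The variable name `alertmanager_low_memory_threshold_gib`, the `node_memory_MemAvailable_bytes` expression, and the annotation are assumptions for illustration, not the contents of the actual file.

```yaml
{% raw %}
groups:
- name: Node
  rules:
  - alert: LowMemory
    # Leave the raw block only around the Ansible variable, then re-enter it,
    # so the threshold is rendered while Prometheus templates stay literal.
    # The variable name and the expression are hypothetical.
    expr: node_memory_MemAvailable_bytes / 1024^3 < {% endraw %}{{ alertmanager_low_memory_threshold_gib }}{% raw %}
    annotations:
      summary: "Low memory on {{ $labels.instance }}"
{% endraw %}
```

Without the inner endraw/raw pair, the `{{ ... }}` in the expression would reach Prometheus unrendered, which is consistent with the "unexpected left brace" parse error. This sketch does not address whether the variable should live in kayobe or kolla-ansible.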

@markgoddard markgoddard mentioned this pull request Sep 21, 2023