Updated RabbitMQ-Chart to 1.46.1 & improved Reboot-Resilience #158

moonrail · 2020-11-20T08:03:18Z

Hello everyone,

This potentially helps, at least in part, with #11.

This Pull Request updates the RabbitMQ chart dependency to 1.46.1, because on our Kubernetes installations (1.17, 1.18, 1.19) version 1.44.1 would not run: it complained about disk space for no apparent reason, even though it was correctly assigned to a PersistentVolume and could read from and write to it.

Because this new version enables Prometheus monitoring by default, most current installations would fail, so this PR disables it by default.
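
For illustration, the override could look roughly like the fragment below. This is only a sketch: the key path `prometheus.operator.enabled` is an assumption based on how the `rabbitmq-ha` chart usually names this toggle, so check the chart's own values.yaml before relying on it.

```yaml
# Sketch only. Assumption: the rabbitmq-ha chart exposes the toggle as
# prometheus.operator.enabled; verify against the chart's values.yaml.
rabbitmq-ha:
  prometheus:
    operator:
      # Keep the newly introduced Prometheus Operator integration off by default
      enabled: false
```

Clusters that do run the Prometheus Operator could presumably set the same key back to `true` in their own values file.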

Enabled rabbitmqErlangCookie, because otherwise cluster data is not reusable after a RabbitMQ deployment restart or rebuild (or a short period with 0 replicas).
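
Anyone who prefers not to rely on a shared default cookie could pin their own value instead. A minimal sketch, assuming a user-supplied values file (the file name `custom-values.yaml` and the placeholder string are hypothetical):

```yaml
# custom-values.yaml (hypothetical file name)
rabbitmq-ha:
  # Placeholder: replace with your own secret string (up to 255 characters).
  # Keep it identical across re-deployments that reuse the same PV/PVC,
  # otherwise the existing cluster data cannot be reused.
  rabbitmqErlangCookie: "<your-own-long-random-string>"
```

Such a file would typically be passed to Helm with the `-f` flag on install or upgrade.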

Added forceBoot, because otherwise cluster data is likewise not reusable after a RabbitMQ deployment restart or rebuild (or a short period with 0 replicas): Mnesia tables are not cleaned up and prevent RabbitMQ from booting. See helm/charts#13485
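
Since forcing a boot favors availability over integrity, some installations may want to opt out. A minimal sketch of such an override in a user-supplied values file, assuming the key sits under `rabbitmq-ha` as it does in this chart's values.yaml:

```yaml
rabbitmq-ha:
  # This PR sets forceBoot to true by default so RabbitMQ can boot after an
  # unclean cluster restart. Setting it to false restores the stricter
  # behavior if integrity matters more to you than availability.
  forceBoot: false
```

With `forceBoot: false`, an unclean restart may again require manual cleanup, as described in helm/charts#13485.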

This should improve the user experience, since PersistentVolumes no longer have to be discarded after RabbitMQ redeployments.

I'm not entirely sure about this, but if StackStorm holds queued executions in RabbitMQ, this would also help in disaster cases by not losing all running executions.

Tested with 3.3dev on Kubernetes 1.17, 1.18 & 1.19.

Please let me know if there is anything to improve. :)

arm4b · 2020-11-23T18:43:26Z

values.yaml

+  # On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
+  forceBoot: true


As every option is a trade-off, it's worth adding a comment for this new default:

Suggested change

-  # On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
-  forceBoot: true
+  # On unclean cluster restarts forceBoot is required to cleanup Mnesia tables (see: https://github.com/helm/charts/issues/13485)
+  # Use it only if you prefer availability over integrity.
+  forceBoot: true

moonrail · 2020-11-23

The suggested comment has been added.

arm4b

Thanks for the PR! 👍

I just left a few minor comments to address, and then we're good to merge.

arm4b · 2020-11-23T18:54:27Z

values.yaml

@@ -458,7 +460,7 @@ rabbitmq-ha:
  #rabbitmqMemoryHighWatermark: 512MB
  #rabbitmqMemoryHighWatermarkType: absolute
  # Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)


As we're setting a default rabbitmqErlangCookie value for everyone, let's at least include a warning comment recommending that users change the default.

Suggested change

-  # Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)
+  # Up to 255 character string, should be fixed so that re-deploying the chart does not fail (see: https://github.com/helm/charts/issues/12371)
+  # NB! It's highly recommended to change the default insecure rabbitmqErlangCookie value!

moonrail · 2020-11-23

The warning has been added as well.

arm4b · 2020-11-23T18:55:44Z

CHANGELOG.md

@@ -1,7 +1,10 @@
 # Changelog

 ## In Development
-
+* Update `rabbitmq-ha` 3rd party chart from `1.44.1` to `1.46.1` (#158) (by @moonrail)
+* Disable newly introduced `rabbitmq-ha` prometheus operator by default (#158) (by @moonrail)


This line could be omitted: the changelog is too verbose for a single PR, and we're sticking with the previous default anyway.

Suggested change

-* Disable newly introduced `rabbitmq-ha` prometheus operator by default (#158) (by @moonrail)

moonrail · 2020-11-23

This line is now removed. I wasn't sure about three changelog lines for one PR either when writing it.

arm4b · 2020-11-23T18:59:23Z

values.yaml

-  #rabbitmqErlangCookie: 8MrqQdCQ6AQ8U3MacSubHE5RqkSfvNaRHzvxuFcG
+  rabbitmqErlangCookie: 8MrqQdCQ6AQ8U3MacSubHE5RqkSfvNaRHzvxuFcG


This was, in fact, proposed in an older PR, and to me it's a questionable, insecure default.

I realize that it could be useful for someone re-deploying (destroying/creating) rabbitmq-ha many times in a row. It can also take time to understand that the same rabbitmqErlangCookie is needed for a re-deployment with the same PV/PVC not to fail.

But instead of forcing this RabbitMQ cookie on everyone, we previously decided to keep the value commented out and include a recommendation explaining why it could be useful or important. That way, someone experiencing the re-deployment issue could consult this hint in the Helm values.

Considering this is the second PR to enable rabbitmqErlangCookie by default, let's do it 👍

arms11 · 2020-11-23

This was enabled in my cluster. I ran into this issue when we upgraded the Kubernetes version of our EKS clusters (twice). AWS drains pods from the old node groups and shifts them to the new node groups. Even though all pods shifted fine, the rabbitmqErlangCookie was not set in the Helm chart and therefore mismatched, so I ran into the issue, which required me to delete the PVCs and then run helm upgrade to reconstruct the rabbitmq-ha cluster. Of course, all of that means downtime. But since enabling it, I could see all pods shift to the new nodes just fine, and the app was online the whole time! I would recommend it. Thanks!

arm4b · 2020-11-23T22:48:32Z

Looks good!
Can you please also fix the CHANGELOG.md git conflict?

…bernetes installations. Enabled Erlang-Cookie & Force-Boot, as otherwise Cluster-Data is not reusable after Deployment-Restart. Due to new RabbitMQ-Version enabling Prometheus-Monitoring by default, most installations would fail, therefore disabled it by default.

moonrail · 2020-11-23T22:52:53Z

Rebased, conflict should be gone now

arm4b

👍

pull-request-size bot added the size/XS (PR that changes 0-9 lines) and size/S (PR that changes 10-29 lines) labels and removed the size/XS label on Nov 20, 2020

arm4b reviewed Nov 23, 2020

arm4b suggested changes Nov 23, 2020

arm4b added the enhancement New feature or request label Nov 23, 2020

arm4b reviewed Nov 23, 2020

arm4b approved these changes Nov 23, 2020

arm4b merged commit 70e861a into StackStorm:master Nov 23, 2020

moonrail deleted the update_rabbitmq_chart branch November 23, 2020 23:16

arm4b mentioned this pull request Jun 16, 2021

Add @arms11 as a Contributor StackStorm/st2#5288

Merged
