Cluster/HA Install of AWX #26

Closed
MrMEEE opened this Issue Jun 4, 2018 · 4 comments

Owner

MrMEEE commented Jun 4, 2018

Moved from here:
subuk/awx-rpm#11

MrMEEE added the configuration label Jun 4, 2018

MrMEEE self-assigned this Jun 4, 2018

MrMEEE added this to Configuration Issues in AWX Jul 19, 2018

Aglidic commented Aug 9, 2018

Hello, we have a successful HA deployment thanks to your RPM.

Here is what we did:
set up RabbitMQ clustering
disabled the celery-beat service
modified the celery-worker ExecStart command

We have run a lot of tests on it and everything seems fine.

MrMEEE commented Aug 10, 2018

That is great to hear, thanks for your feedback.

If you have a more detailed installation description, I would love to add it to the documentation.

MrMEEE closed this Aug 10, 2018

Aglidic commented Aug 13, 2018

OK, so here is the process:
Install the DB on an external server following your install guide.
Install the 1st AWX server following your install guide (connect it to the DB).
Install the 2nd and 3rd AWX servers following your install guide (connect them to the DB, but do not run these two commands on them):
echo "from django.contrib.auth.models import User; User.objects.create_superuser('admin', 'root@localhost', 'password')" | sudo -u awx /opt/awx/bin/awx-manage shell
sudo -u awx /opt/awx/bin/awx-manage create_preload_data

When all nodes are installed, we can build the RabbitMQ cluster.
Connect to node 1 and copy the Erlang cookie to nodes 2 and 3:
/var/lib/rabbitmq/.erlang.cookie
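
As a rough sketch of the cookie copy, assuming root SSH access from node 1 and placeholder hostnames node2 and node3 (neither is stated above), something like the following could work. The `run` helper and `copy_cookie` function are hypothetical names used only for illustration. `DRY_RUN` defaults to 1, so the script only prints what it would run; set `DRY_RUN=0` to actually execute it. RabbitMQ expects the cookie file to be owned by the rabbitmq user and readable only by that user, which is why the ownership and mode are reset after the copy.

```shell
#!/bin/sh
# Sketch only: copy the Erlang cookie from node 1 to the other nodes.
# Hostnames (node2, node3) and root SSH access are assumptions.
COOKIE=/var/lib/rabbitmq/.erlang.cookie

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_cookie() {
    for node in "$@"; do
        run scp -p "$COOKIE" "root@${node}:${COOKIE}"
        # The cookie must be owned by rabbitmq and not readable by others.
        run ssh "root@${node}" "chown rabbitmq:rabbitmq ${COOKIE} && chmod 400 ${COOKIE}"
    done
}

copy_cookie node2 node3
```

Run it on node 1 before touching nodes 2 and 3.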

Connect to nodes 2 and 3.
Restart the app so that it sees the new cookie:
rabbitmqctl stop_app
rabbitmqctl start_app
Join the RabbitMQ cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@node1
rabbitmqctl start_app
Set the HA policy:
rabbitmq-plugins enable rabbitmq_management
rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
systemctl restart rabbitmq-server

RabbitMQ is now clustered.
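
To check that the cluster and the policy are really in place, a minimal sketch could look like this. It was not part of our original procedure; `rabbitmqctl cluster_status` and `rabbitmqctl list_policies` are standard rabbitmqctl subcommands, while the `run` helper and `verify_cluster` function are hypothetical names. `DRY_RUN` defaults to 1 and only prints the commands; set `DRY_RUN=0` to run them on a node.

```shell
#!/bin/sh
# Sketch only: verify cluster membership and the HA policy on any node.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

verify_cluster() {
    # Should list all three nodes as running.
    run rabbitmqctl cluster_status
    # Should show the ha-all policy with ha-mode "all".
    run rabbitmqctl list_policies
}

verify_cluster
```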

The second step is Celery.
First, disable and stop celery beat on all servers.
Second, modify the exec command of the celery worker service:
/etc/systemd/system/multi-user.target.wants/awx-celery-worker.service ->
ExecStart=/opt/awx/bin/celery worker -A awx -l info --autoscale=50,4 -Ofair -Q tower_scheduler,tower,%(ENV_HOSTNAME)s -n celery@%(ENV_HOSTNAME)s
Restart the celery services on all nodes.
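
The Celery steps could be sketched roughly as follows on each node. The unit name awx-celery-beat is an assumption (only awx-celery-worker appears in the path above), and `run` and `apply_celery_changes` are hypothetical helper names. The `daemon-reload` is there because systemd has to re-read a unit file after it has been edited. `DRY_RUN` defaults to 1 and only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: apply the Celery changes on one node.
# awx-celery-beat is an assumed unit name; check yours with
# "systemctl list-units 'awx-*'".

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_celery_changes() {
    # Stop celery beat now and keep it from starting at boot.
    run systemctl stop awx-celery-beat
    run systemctl disable awx-celery-beat
    # Make systemd pick up the edited ExecStart line.
    run systemctl daemon-reload
    run systemctl restart awx-celery-worker
}

apply_celery_changes
```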

We also saw that at this step it can be better to reboot all 3 nodes, but one by one, to keep the RabbitMQ cluster in good shape.

Hope that helps.

Aglidic commented Aug 13, 2018

I forgot, but of course the final step is to go to the web interface and create the instance with all 3 nodes.
