Cluster/HA Install of AWX #26

Closed · MrMEEE opened this issue Jun 4, 2018 · 49 comments

MrMEEE (Owner) commented Jun 4, 2018

Moved from here:
subuk/awx-rpm#11

MrMEEE self-assigned this Jun 4, 2018
Aglidic commented Aug 9, 2018

Hello, we have a successful HA deployment thanks to your RPM.

Here is what we have done:
RabbitMQ clustering
disabled the celery-beat service
modified the celery-worker ExecStart command

We have run a lot of tests on it and everything seems fine.

MrMEEE (Owner) commented Aug 10, 2018

That is great to hear... thanks for your feedback..

If you have a more detailed installation description, I would love to add it to the documentation..

MrMEEE closed this as completed Aug 10, 2018
Aglidic commented Aug 13, 2018

OK, so here is the process:
Install the DB on an external server with your install guide.
Install the 1st AWX server with your install guide (connect it to the DB).
Install the 2nd and 3rd AWX servers with your install guide (connect them to the DB, and don't run these commands:
echo "from django.contrib.auth.models import User; User.objects.create_superuser('admin', 'root@localhost', 'password')" | sudo -u awx /opt/awx/bin/awx-manage shell
sudo -u awx /opt/awx/bin/awx-manage create_preload_data)

When all nodes are installed, we can now build the RabbitMQ cluster.
Connect on node 1 and copy the Erlang cookie to nodes 2 and 3:
/var/lib/rabbitmq/.erlang.cookie
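
For example, a minimal sketch (the hostnames node2/node3 and copying over root SSH are assumptions):
# on node1: copy the Erlang cookie to the other nodes
scp /var/lib/rabbitmq/.erlang.cookie root@node2:/var/lib/rabbitmq/.erlang.cookie
scp /var/lib/rabbitmq/.erlang.cookie root@node3:/var/lib/rabbitmq/.erlang.cookie
# on nodes 2 and 3: the cookie must keep its ownership and strict permissions
chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
chmod 400 /var/lib/rabbitmq/.erlang.cookie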

Connect to nodes 2 and 3:
Restart the app to make it see the new cookie:
rabbitmqctl stop_app
rabbitmqctl start_app
Create the RabbitMQ cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@node1
rabbitmqctl start_app
Set the HA policy:
rabbitmq-plugins enable rabbitmq_management
rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
systemctl restart rabbitmq-server

RabbitMQ is now clustered.
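
You can verify the cluster from any node (a standard check, not part of the original steps):
# all three nodes should appear under running_nodes
rabbitmqctl cluster_status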

The second step is Celery:
First, disable and stop celery-beat on all servers.
Second, modify the ExecStart command of the Celery worker service:
/etc/systemd/system/multi-user.target.wants/awx-celery-worker.service ->
ExecStart=/opt/awx/bin/celery worker -A awx -l info --autoscale=50,4 -Ofair -Q tower_scheduler,tower,%(ENV_HOSTNAME)s -n celery@%(ENV_HOSTNAME)s
Restart the Celery services on all nodes.
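
A minimal sketch of those two service steps, assuming the RPM's unit names awx-celery-beat and awx-celery-worker:
# on every node: stop and disable the beat scheduler
systemctl disable --now awx-celery-beat
# after editing the worker unit, reload systemd and restart the worker
systemctl daemon-reload
systemctl restart awx-celery-worker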

We also saw that at this step it can be better to reboot all 3 nodes, but one by one, to keep the RabbitMQ cluster in good shape.

Hope that can help.

Aglidic commented Aug 13, 2018

I forgot, but of course the final step is to go to the web interface and create the instance with all 3 nodes.

sujiar37 commented Apr 23, 2019

@MrMEEE, this is a bit off-topic, but those who wish to explore and automate the HA / instance group setup using the official Docker standalone method can access it from my repository. Would you be able to add this piece of info to your wiki? It might be helpful to some people out there. Thanks

MrMEEE (Owner) commented Apr 23, 2019

@sujiar37

So, basically, everything that is needed for an HA setup is a standalone PostgreSQL (cluster) and a RabbitMQ (cluster),

and then frontends that connect to these??

Should be pretty simple to implement..

I have added a links section

sujiar37 commented Apr 24, 2019

@MrMEEE, thank you for adding this piece of info to your wiki.

The only requirement is to set up a standalone PostgreSQL; everything else, such as building and configuring the RabbitMQ cluster and enabling the Docker version of HA on all nodes, is taken care of by the playbook. And yes, it is pretty simple to implement now through my playbook.

powertim commented:

Hi guys,
Having worked on it with @Aglidic to build the first HA implementation of the RPM, we have a playbook which does the full setup automatically.
It's corporate-only currently, so I need to find some time to generalize it if you want to add it somewhere.

Best,

Tim.

MrMEEE (Owner) commented Apr 24, 2019

@powertim

I would love to include playbooks for installing in the RPM...

powertim commented:

OK so I'll add that to my TODO for the next days...

bufooo commented Jun 14, 2019

> OK so I'll add that to my TODO for the next days...

Did you have a chance to do it? I would love to try them.

dnc92301 commented Jun 24, 2019

Hi all, very much interested in the playbook. If the playbook is not available now, can someone highlight what's needed for pointing to an external Postgres server using the RPM installation method, please?
Something like this -

# Set pg_hostname if you have an external postgres server, otherwise
# a new postgres service will be created

pg_hostname=hostname
pg_username=awx
pg_password=xxxxx
pg_database=awx
pg_port=5432

Thanks, everyone, for the great efforts!

MrMEEE (Owner) commented Jun 24, 2019

In regards to the external Postgres, you basically only need to set up an external Postgres (cluster?) and change the configuration in /etc/tower/settings.py to point to that server, before running the database initialization..
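
For reference, the database section of /etc/tower/settings.py follows the standard Django DATABASES layout; a rough sketch with placeholder values (keep whatever ENGINE value the shipped file already uses):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',  # leave the shipped ENGINE as-is
        'NAME': 'awx',
        'USER': 'awx',
        'PASSWORD': 'xxxxx',
        'HOST': 'pg.example.com',  # the external Postgres server (or cluster VIP)
        'PORT': '5432',
    }
}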

dnc92301 commented:

Thanks very much for the quick response. Yes, it will be a Postgres 2-node cluster with streaming replication. And yes, I see there's a section for configuring USER/PASSWORD/HOST/PORT in settings.py. So initializing the DB / all the steps listed in awx.wiki/multi-section-page/configuration are still required?

dnc92301 commented:

No issues setting up the external Postgres DB; the issues are with setting up the cluster. I followed the previous comments on setting up clustering and got to the point of enabling the RabbitMQ cluster across 2 nodes, but AWX didn't detect the additional node. The endpoint api/v2/ping only displays one active node. Also, there's no awx-celery-worker - this service appears to have been deprecated? Thanks.

MrMEEE (Owner) commented Jun 26, 2019

Hi..

I think you have to enable each of the awx nodes with the command:

sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage register_queue --queuename=tower --hostnames=$(hostname)"

and yes, the celery worker is deprecated...

dnc92301 commented:

Thanks very much, it worked! I had to run this command first - sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage provision_instance --hostname=$(hostname)"

before running your command -
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage register_queue --queuename=tower --hostnames=$(hostname)"

dnc92301 commented:

Also, as far as upgrading to the latest AWX version: I presume it will still work, provided we upgrade all nodes within the cluster. Thanks again!

MrMEEE (Owner) commented Jun 27, 2019

Ah, yes.. of course you have to do the provision_instance first :)..

I will do a write-up on this and put it on awx.wiki as soon as possible.. I'm also planning a setup tool for simpler installation and configuration, which will also cover HA... Could you share the exact changes you have made to the systemd files??

Remember not to change the files themselves, but to overwrite them with copies in /etc/systemd/system.. otherwise they will get reverted to default on the next update...
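
A minimal sketch of that approach (the /usr/lib/systemd/system path and the awx-celery-worker unit name are assumptions based on a standard RPM layout):
# copy the shipped unit; units in /etc/systemd/system take precedence
cp /usr/lib/systemd/system/awx-celery-worker.service /etc/systemd/system/
# edit the copy in /etc/systemd/system, then:
systemctl daemon-reload
systemctl restart awx-celery-worker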

In regards to updating, I think you should update the ansible-awx (and dependencies) on all nodes before running the database migrations...

dnc92301 commented:

I spoke too soon :)
Yes, both nodes are within the cluster, but for some reason jobs couldn't execute on the newly added node. When attempting to run a job against the new node, it goes to a "Wait" state before timing out with the message -
Task was marked as running in Tower but was not present in the job queue, so it has been marked as failed..

I tried rabbitmqctl stop_app / rabbitmqctl start_app and systemctl restart rabbitmq-server on the server, and also bounced both nodes. In the web GUI, I had switched the node from OFF to ON, but USED CAPACITY eventually becomes "UNAVAILABLE."

dnc92301 commented:

Ignore that - the issue was that AWX was not running at startup on the new node :)
Still doing more testing. Thanks again..

dnc92301 commented:

So far so good. I didn't make any systemd changes since Celery has been deprecated.

dnc92301 commented:

One issue that has come up so far: when a job finishes running on the new node, the node's USED CAPACITY goes to "UNAVAILABLE." It is as though the node lost its heartbeat to the RabbitMQ cluster. I need to troubleshoot further.

MrMEEE (Owner) commented Jun 27, 2019

I'm in Prague for the week for a Red Hat event.. I will try to set up an HA environment when I get home, then we can debug together.

dnc92301 commented:

This is the error message I'm getting.

2019-06-27 14:01:34.390 [info] <0.1498.0> connection <0.1498.0> (127.0.0.1:42950 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
2019-06-27 14:01:44.556 [warning] <0.1498.0> closing AMQP connection <0.1498.0> (127.0.0.1:42950 -> 127.0.0.1:5672, vhost: '/', user: 'guest'):
client unexpectedly closed TCP connection

Thanks

dnc92301 commented:

Looks like the issue has to do with the fact that AWX requires the 'tower' vhost. Currently we're using the default vhost '/', so we're getting a bunch of closing AMQP connections.

2019-06-27 15:30:29.289 [info] <0.2130.0> connection <0.2130.0> (127.0.0.1:47012 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
2019-06-27 15:30:49.603 [info] <0.2139.0> accepting AMQP connection <0.2139.0> (127.0.0.1:47390 -> 127.0.0.1:5672)
2019-06-27 15:30:49.610 [info] <0.2139.0> connection <0.2139.0> (127.0.0.1:47390 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
2019-06-27 15:30:49.619 [info] <0.2139.0> closing AMQP connection <0.2139.0> (127.0.0.1:47390 -> 127.0.0.1:5672, vhost: '/', user: 'guest')
2019-06-27 15:30:49.687 [info] <0.2150.0> accepting AMQP connection <0.2150.0> (127.0.0.1:47394 -> 127.0.0.1:5672)
2019-06-27 15:30:49.695 [info] <0.2150.0> connection <0.2150.0> (127.0.0.1:47394 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
2019-06-27 15:30:49.710 [info] <0.2150.0> closing AMQP connection <0.2150.0> (127.0.0.1:47394 -> 127.0.0.1:5672, vhost: '/', user: 'guest')
2019-06-27 15:30:49.748 [info] <0.2161.0> accepting AMQP connection <0.2161.0> (127.0.0.1:47396 -> 127.0.0.1:5672)
2019-06-27 15:30:49.755 [info] <0.2161.0> connection <0.2161.0> (127.0.0.1:47396 -> 127.0.0.1:5672): user 'guest' authenticated and granted access to vhost '/'
2019-06-27 15:30:49.771 [info] <0.2161.0> closing AMQP connection <0.2161.0> (127.0.0.1:47396 -> 127.0.0.1:5672, vhost: '/', user: 'guest')
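
If that's the cause, the missing vhost can be created with standard rabbitmqctl commands (a sketch; the 'guest' user and blanket permissions are assumptions taken from the log above):
rabbitmqctl add_vhost tower
rabbitmqctl set_permissions -p tower guest ".*" ".*" ".*"
# then point AWX's broker settings at the 'tower' vhost and restart the AWX services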

MrMEEE reopened this Jun 28, 2019
MrMEEE closed this as completed Jun 28, 2019
MrMEEE (Owner) commented Jun 28, 2019

@dnc92301 Let's move the discussion to #121

powertim commented:

Hi guys,

Finally the playbook is here: https://github.com/powertim/deploy_awx-rpm
It is currently designed for RHEL7 x86_64 with Satellite repos.
I will try to update it with manual repos as described on https://awx.wiki/installation/repositories/rhel7-x86_64.
And why not, in the future, for the different OSes supported on awx.wiki...

Please try first to adapt the playbook before opening an issue.
I'll fill in the README soon.

Best,

Tim

gowthamakanthan commented Jul 24, 2019 via email

powertim commented:

Hi @gowthamakanthan,

It should work on CentOS 7 with a few changes:

  1. Add local repos with the module 'yum_repository', instead of the Satellite repos I'm using with the module 'rhsm_repository', in roles/db_prereqs/tasks/main.yml and roles/nodes_prereqs/tasks/main.yml.

  2. Maybe change line #26 of roles/nodes_prereqs/tasks/main.yml so that the installation of dependencies succeeds.

But I'll try to add this content when I find the time for it (hopefully soon).

dnc92301 commented:

@powertim - Thanks for the efforts! I've tested it and it works as expected. However, the previously reported issue still exists, where the 2nd node (I have a 2-node RabbitMQ cluster) goes into the "UNAVAILABLE" state as soon as a job finishes running. hostnameB is the 2nd node, which has a capacity of 0 because it's NOT available. The primary node I have DISABLED intentionally.

[root@hostnameA deploy_awx-rpm]# sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage list_instances"

[tower capacity=0]
hostnameB capacity=0 version=6.1.0
[DISABLED] hostnameA capacity=0 version=6.1.0

dnc92301 commented:

This was installed using the latest AWX, 6.1.0. This is an example of a run where the node becomes "unavailable" and the job no longer exists in the queue, with the below explanation.

EXPLANATION
Task was marked as running in Tower but was not present in the job queue, so it has been marked as failed.
STARTED
7/26/2019 1:18:34 PM
FINISHED
7/26/2019 1:20:25 PM

dnc92301 commented Aug 6, 2019

Hi all,
It looks like the problem no longer surfaces after setting up a new server. However, I'm hitting the following issue when starting up AWX.

Issue with - scl: RuntimeError: Django version other than 2.2.2 detected: 2.2.4.

Django is what comes by default - rh-python36-Django-2.2.4-1.noarch

Thanks.

Aug 6 18:40:19 hostnameA scl: Traceback (most recent call last):
Aug 6 18:40:19 hostnameA scl: File "/opt/rh/rh-python36/root/usr/bin/daphne", line 11, in
Aug 6 18:40:19 hostnameA scl: load_entry_point('daphne==1.3.0', 'console_scripts', 'daphne')()
Aug 6 18:40:19 hostnameA scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/daphne/cli.py", line 144, in entrypoint
Aug 6 18:40:19 hostnameA scl: cls().run(sys.argv[1:])
Aug 6 18:40:19 hostnameA scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/daphne/cli.py", line 174, in run
Aug 6 18:40:19 hostnameA scl: channel_layer = importlib.import_module(module_path)
Aug 6 18:40:19 hostnameA scl: File "/opt/rh/rh-python36/root/usr/lib64/python3.6/importlib/init.py", line 126, in import_module
Aug 6 18:40:19 hostnameA scl: return _bootstrap._gcd_import(name[level:], package, level)
Aug 6 18:40:19 hostnameA scl: File "", line 994, in _gcd_import
Aug 6 18:40:19 hostnameA scl: File "", line 971, in _find_and_load
Aug 6 18:40:19 hostnameA scl: File "", line 941, in _find_and_load_unlocked
Aug 6 18:40:19 hostnameA scl: File "", line 219, in _call_with_frames_removed
Aug 6 18:40:19 hostnameA scl: File "", line 994, in _gcd_import
Aug 6 18:40:19 hostnameA scl: File "", line 971, in _find_and_load
Aug 6 18:40:19 hostnameA scl: File "", line 955, in _find_and_load_unlocked
Aug 6 18:40:19 hostnameA scl: File "", line 665, in _load_unlocked
Aug 6 18:40:19 hostnameA scl: File "", line 678, in exec_module
Aug 6 18:40:19 hostnameA scl: File "", line 219, in _call_with_frames_removed
Aug 6 18:40:19 hostnameA scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/awx/init.py", line 49, in
Aug 6 18:40:19 hostnameA scl: current=django.version)
Aug 6 18:40:19 hostnameA scl: RuntimeError: Django version other than 2.2.2 detected: 2.2.4. Overriding names_digest is known to work for Django 2.2.2 and may not work in other Django versions.
Aug 6 18:40:19 hostnameA systemd: awx-daphne.service: main process exited, code=exited, status=1/FAILURE
Aug 6 18:40:19 hostnameA systemd: Unit awx-daphne.service entered failed state.
Aug 6 18:40:19 hostnameA systemd: awx-daphne.service failed.
Aug 6 18:40:21 hostnameA systemd: awx-cbreceiver.service holdoff time over, scheduling restart.
Aug 6 18:40:21 hostnameA systemd: awx-channels-worker.service holdoff time over, scheduling restart.
Aug 6 18:40:21 hostnameA systemd: awx-dispatcher.service holdoff time over, scheduling restart.
Aug 6 18:40:21 hostnameA systemd: Stopped AWX Dispatcher.
Aug 6 18:40:21 hostnameA systemd: Stopped AWX channels worker service.
Aug 6 18:40:21 hostnameA systemd: Stopping AWX web service...
Aug 6 18:40:21 hostnameA systemd: Stopped AWX cbreceiver service.
Aug 6 18:40:21 hostnameA systemd: awx-daphne.service holdoff time over, scheduling restart.
Aug 6 18:40:21 hostnameA systemd: Stopped AWX daphne service.
[root@hostnameA ~]#

powertim commented Aug 6, 2019 via email

MrMEEE (Owner) commented Aug 6, 2019

@dnc92301 Please create new issues, instead of reusing old ones...

Have you remembered to update the ansible-awx package???

MrMEEE (Owner) commented Aug 6, 2019

@powertim Maybe the playbook doesn't update the ansible-awx package??

dnc92301 commented Aug 6, 2019

@tim - yes, this happens after rerunning the playbook. After upgrading to the latest ansible-awx version, it worked!

powertim commented Aug 9, 2019

> @powertim Maybe the playbook doesn't update the ansible-awx package??

It's updated now!
See commit df571c0

powertim commented Aug 9, 2019

> @tim - yes, this happens after rerunning the playbook. After upgrading to the latest ansible-awx version, it worked!

Yeah, unfortunately re-running the playbook causes failures.
I need to improve that.

VJoshi0 commented Oct 18, 2019

Hello, I have offline VMs where I need to build AWX. As listed above, I saw about 160 rh-python36-* dependencies. Where can I find a tarball or URL for all the RPMs I need for AWX?
I'm not using Docker; I plan to use RHEL7 VMs to create the HA setup.
But I'm lost collecting all the rh-python36-* packages from mirror sites one by one, and would appreciate knowing in what order the RPMs need to be installed. Thanks.

cameronkerrnz commented:

@VJoshi0: yum install --downloadonly --downloaddir=/to/here/ 'rh-python36-*'

Further example at https://unix.stackexchange.com/questions/259640/how-to-use-yum-to-get-all-rpms-required-for-offline-use
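
Once downloaded, the directory can be turned into a local yum repo for the offline VMs (a sketch; createrepo is in the createrepo package, and all paths are placeholders):
# on a machine with internet access
yum install --downloadonly --downloaddir=/to/here/ 'rh-python36-*'
createrepo /to/here/
# copy /to/here/ to the offline VM and point a .repo file at it:
# [local-awx-deps]
# name=Local AWX dependencies
# baseurl=file:///to/here/
# gpgcheck=0
# yum resolves the install order itself, so there is no need to install the RPMs one by one.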

cs-laurentiuvasiescu commented:

So after having the 3 instances clustered, is a load balancer used at all?

What about manual projects that are on the local filesystem? Rsync them?

elstoncawley commented:

Hi all, thanks for the great work that you are doing. I was wondering if there is a step-by-step guide for the HA setup, similar to the standalone setup in this wiki guide: https://awx.wiki/installation/installation

powertim commented:

Hi @elstoncawley,

Unfortunately not, and I haven't worked on the HA setup for a long time, but you'll find the steps in the playbook here: https://github.com/powertim/deploy_awx-rpm.
The role names should help you find the steps for building the cluster.

Cheers,

Tim.

elstoncawley commented:

Thanks @powertim
I am actually installing on a CentOS 7 server and was wondering about the repo in the vars/nodes.yml file. Could I use https://awx.wiki/repository/ for the awx_repo variable?

powertim commented Apr 1, 2020

Yes, in theory you can use any repos you want, but you need to change the way you enable and call them, because I only provided a RHEL configuration with Satellite, so the subscription-manager command won't be available for you.

bryanasdev000 commented:

Hi everybody!

Did anyone get HA/clustering running with AWX 11.X.X and Redis?

bryanasdev000 commented:

> Hi everybody!
> Did anyone get HA/clustering running with AWX 11.X.X and Redis?

Responding to myself, and leaving reference material for those who need it: https://github.com/sujiar37/AWX-HA-InstanceGroup/issues/26 seems to shed some light.

I will test ASAP.

Nikkurer commented:

https://github.com/fitbeard/awx-ha-cluster - this playbook is working well. I've been using it for a while.
