Allow switching between running open source AWX and Ansible Tower #5
Follow-up to #1.
And in the logs:
So it seems something is different in the config between Tower 3.5 and AWX 9.x?
It looks like the database initialization is not done automatically for Tower, only for AWX. So I had to:
I'll have to add something to the operator that checks if this is a fresh install, and runs the migration if it needs to be run.
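A rough sketch of what that fresh-install check could look like (task names, variables, and the use of the `k8s_exec` module here are all illustrative, not the operator's actual code):

```yaml
# Hypothetical sketch: run migrations only when Django reports
# unapplied migrations, i.e. on a fresh install or after an upgrade.
- name: Check for unapplied database migrations.
  k8s_exec:
    namespace: '{{ meta.namespace }}'
    pod: '{{ tower_pod_name }}'
    command: awx-manage showmigrations --plan
  register: migration_plan

# Django's showmigrations output marks unapplied migrations with '[ ]'.
- name: Run database migrations if any are unapplied.
  k8s_exec:
    namespace: '{{ meta.namespace }}'
    pod: '{{ tower_pod_name }}'
    command: awx-manage migrate --noinput
  when: "'[ ]' in migration_plan.stdout"
```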
After that, it looks like the
RE: the above two comments; for Tower, the OpenShift setup playbook contains the following tasks (all of these seem to be done by AWX automatically when first setting it up, as long as your env vars and config are correct, so I'm not sure why it's not the same for Tower):

```yaml
- name: Migrate database
  shell: |
    {{ kubectl_or_oc }} -n {{ kubernetes_namespace }} exec ansible-tower-management -- \
      bash -c "awx-manage migrate --noinput"

- name: Check for Tower Super users
  shell: |
    {{ kubectl_or_oc }} -n {{ kubernetes_namespace }} exec ansible-tower-management -- \
      bash -c "echo 'from django.contrib.auth.models import User; nsu = User.objects.filter(is_superuser=True).count(); exit(0 if nsu > 0 else 1)' | awx-manage shell"
  register: super_check
  ignore_errors: yes
  changed_when: super_check.rc > 0

- name: create django super user if it does not exist
  shell: |
    {{ kubectl_or_oc }} -n {{ kubernetes_namespace }} exec ansible-tower-management -- \
      bash -c "echo \"from django.contrib.auth.models import User; User.objects.create_superuser('{{ admin_user }}', '{{ admin_email }}', '{{ admin_password }}')\" | awx-manage shell"
  no_log: yes
  when: super_check.rc > 0

- name: update django super user password
  shell: |
    {{ kubectl_or_oc }} -n {{ kubernetes_namespace }} exec ansible-tower-management -- \
      bash -c "awx-manage update_password --username='{{ admin_user }}' --password='{{ admin_password }}'"
  no_log: yes
  register: result
  changed_when: "'Password updated' in result.stdout"

- name: Create the default organization if it is needed.
  shell: |
    {{ kubectl_or_oc }} -n {{ kubernetes_namespace }} exec ansible-tower-management -- \
      bash -c "awx-manage create_preload_data"
  register: cdo
  changed_when: "'added' in cdo.stdout"
  when: create_preload_data | bool
```
After asking some Ansible devs about this, I found out that the automatic setup is part of the AWX Docker image's installation convenience script. For OpenShift/Kubernetes installs, it looks like that is the command used, and the default Dockerfile CMD is also set to it.

So... I guess I'll just have to detect if we're installing Tower or AWX, and from that decide whether to do the extra steps.
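As a sketch of that detection, the operator playbook could key the extra bootstrap steps off of the image being deployed. The variable names below (`tower_image`, `is_tower`, `tower_pod_name`) are hypothetical, not the operator's actual ones:

```yaml
# Hypothetical sketch: decide whether the Tower-only bootstrap steps
# (migrate, create superuser, preload data) need to run.
- name: Determine whether we are deploying Tower or AWX.
  set_fact:
    is_tower: "{{ 'ansible-tower' in tower_image }}"

- name: Run the Tower-only database migration.
  k8s_exec:
    namespace: '{{ meta.namespace }}'
    pod: '{{ tower_pod_name }}'
    command: awx-manage migrate --noinput
  when: is_tower | bool
```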
Looks like there is no user account (used
So I ran:
And now I'm on the license page, logged in. Nice!
To achieve everything automatically, I'm going to need the

I'll probably toss it into the
That module is giving me:
In the above commit, I split the example CRs, with one for AWX and one for Tower. That way I can continue using the AWX one in the CI tests (at least for now... eventually I'll want to test both AWX and local...).
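For illustration, the split CRs could look something like this (the apiVersion, kind, and field names here are hypothetical; check the repo's actual example files for the real schema):

```yaml
# tower_cr.yml -- illustrative only
apiVersion: tower.ansible.com/v1alpha1
kind: Tower
metadata:
  name: example-tower
spec:
  tower_image: "<your Tower image>"   # the AWX example CR would point at an AWX image instead
```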
At this point I'm getting:
Some testing: on the command line, I can run:
Testing in the operator playbook:

```yaml
- name: Test a simple command.
  k8s_exec:
    namespace: '{{ meta.namespace }}'
    pod: '{{ tower_pod_name }}'
    command: date
  register: date_result

- debug: var=date_result
```

It results in:
Digging a little bit, it seems that this can happen if you're hitting an endpoint that's not actually a websocket; see https://stackoverflow.com/a/40110656/100134. So maybe the module's not finding the right URL to hit when it's running inside the Operator? Could it be an Ansible 2.8 issue (I believe I'm running 2.9 externally)? Going to do some more digging...
Running the same task on my host against Minikube with
I ran
Inside the container I was hitting:
It looks like the Ansible Operator Dockerfile adds environment information for an
And the Ansible/Python getpwuid errors went away.
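For background on those getpwuid errors: they typically come from Python failing to map the container's UID to a passwd entry when OpenShift runs the pod as an arbitrary UID, which is why adding user/environment information to the image helps. A minimal reproduction of the underlying lookup failure (the chosen UID is just an illustration):

```python
import pwd

# Find a UID with no /etc/passwd entry, mimicking an OpenShift pod
# running as an arbitrary, unmapped UID.
existing_uids = {entry.pw_uid for entry in pwd.getpwall()}
missing_uid = max(existing_uids) + 1000

try:
    pwd.getpwuid(missing_uid)
except KeyError as err:
    # This KeyError is what surfaces as Ansible's getpwuid() error when
    # the running UID has no passwd entry (and thus no home directory).
    print(f"lookup failed: {err}")
```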
So this is fun. If I create the following playbook inside the running operator container:

```yaml
- hosts: localhost
  connection: local
  gather_facts: false

  tasks:
    - name: Get the Tower web pod information.
      # TODO: Change to k8s_info after Ansible 2.9.0 is available in Operator image.
      k8s_facts:
        kind: Pod
        namespace: example-tower
        label_selectors:
          - app=tower
      register: tower_pods

    - name: Set the tower pod name as a variable.
      set_fact:
        tower_pod_name: "{{ tower_pods['resources'][0]['metadata']['name'] }}"

    - name: Verify tower_pod_name is populated.
      assert:
        that: tower_pod_name != ''
        fail_msg: "Could not find the tower pod's name."

    - name: Test a simple command.
      k8s_exec:
        namespace: example-tower
        pod: '{{ tower_pod_name }}'
        command: date
      register: date_result

    - debug: var=date_result
```

Then I get the result:
So it seems that something is different when it runs through
@fabianvf and I were discussing this in the CoreOS slack, and it could be that the proxy set up inside the Ansible Operator between K8s and Ansible runs might be intercepting the websockets request and not proxying the connection cleanly... I was glancing through https://github.com/operator-framework/operator-sdk/tree/424a61d56000e6e3d91d352faa1bd4f7c814661f/internal/scaffold/ansible and will have to dig a little deeper.

One other possibility: Install
Opened an upstream issue operator-framework/operator-sdk#2204, as it does seem related to the ansible operator's proxy.
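The proxy angle is plausible because clients inside the operator decide per-host whether to route through a proxy based on the standard `HTTP_PROXY`/`NO_PROXY` environment variables. As a stdlib-level illustration of that bypass logic (this is generic Python behavior, not the operator's actual code; the hostnames are examples):

```python
import os
import urllib.request

# Simulate proxy-related environment variables like those an
# operator container might have set.
os.environ["http_proxy"] = "http://localhost:8888"
os.environ["no_proxy"] = "localhost,127.0.0.1,kubernetes.default.svc"

# getproxies_environment() collects the *_proxy variables.
proxies = urllib.request.getproxies_environment()
print(proxies["http"])  # → http://localhost:8888

# Hosts listed in no_proxy bypass the proxy entirely; anything else
# gets routed through it (and could be intercepted there).
print(bool(urllib.request.proxy_bypass_environment("kubernetes.default.svc")))  # → True
print(bool(urllib.request.proxy_bypass_environment("example.com")))             # → False
```

If websocket upgrade requests (like the ones `exec` uses) hit that proxy and it doesn't pass the upgrade through, you'd see exactly this kind of handshake failure.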
I have everything working (I think) to get Tower automatically installed and operating now, but using
Now when I run jobs they're never starting, and the logs on the task Pod instance seem to indicate there could be some issues:
And the last messages repeat over and over as it seems to be trying to kick off jobs but is not successful.
(For the first item, see #3).
It looks like the AWX/Tower OpenShift installer uses a sidecar pod to provide celery... or something strange like that. It's running the command

So I added the
And in the backend:
That seems to be related to the initial SCM sync job, which errored out with the following after I restarted the tower task container:
And now everything seems to be working, after manually re-running the SCM sync job for the Demo Project...
It takes about 10m for everything to come up on first run, but the task container still runs into the following when I run the first job on it:
If I delete the task pod, wait for its replacement, and then monitor it, it seems to at least bump jobs from 'Pending' to 'Waiting'... and then it takes some time for new jobs to be processed. Maybe just a weird first-time setup thing. But I'll probably take a deeper look at it later. Don't want to have to be restarting the task container all the time...

Side note: one other error that occurs on startup every time:
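Rather than restarting the task container by hand when the dispatcher wedges, one option would be a liveness probe on the task container. Purely illustrative; the probe command and timings are guesses (`awx-manage check` is just Django's generic system check command):

```yaml
# Illustrative only: restart the task container automatically if it
# stops responding, instead of deleting the pod manually.
livenessProbe:
  exec:
    command: ["awx-manage", "check"]
  initialDelaySeconds: 600   # first-time setup can take ~10 minutes
  periodSeconds: 60
  failureThreshold: 3
```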
If this next test passes, I'm going to test that AWX still works the same, and if so, close out this issue as complete.
Yay, test passed! Just need to test that AWX works similarly to Tower, then I'll close the issue. Day is wrapping up so it'll have to be later or tomorrow.
AWX worked just fine, but also needed the

CI tests are now passing, too, so I'm going to go ahead and merge to master and close out this issue. Yay!
Right now I'm building out everything using open source AWX, just for convenience's sake. But I'm working on building the operator in a way where users could choose between AWX and Tower (if they want support and a license, and all that).
See:
Docs for setup: