Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWX and Execution Environment not synchronized #10211

Closed
metabsd opened this issue May 17, 2021 · 18 comments
Closed

AWX and Execution Environment not synchronized #10211

metabsd opened this issue May 17, 2021 · 18 comments

Comments

@metabsd
Copy link

metabsd commented May 17, 2021

ISSUE TYPE
  • Bug Report
SUMMARY

When executing a Job Template. It seems that the execution environment continues its work in the background but AWX is no longer up to date and at the end of the execution the job template is in error.

ENVIRONMENT
  • AWX version: 19.1.0
  • AWX install method: awx-operator
  • Ansible version: ansible [core 2.12.0.dev0] from https://github.com/metabsd/awx-ee
  • Operating System: Linux
  • Web Browser: Firefox and Chromium
STEPS TO REPRODUCE

I only encounter this problem when I run a Job template with multiple hosts. In my case I run on 100+ hosts. I do not encounter this problem on a small amount of hosts.

My environments are functional and I am available to troubleshoot and find the exact cause.

EXPECTED RESULTS

I expect to receive the complete log of the execution to review and make sure it went well

ACTUAL RESULTS

The job template is in error and the stdout is incomplete. However, I have confirmed that the playbook continues by using the command kubectl logs awx-job-128-qmx5p -n ansible-awx and I see that the job template continues but the output does not return to AWX.

@AlanCoding
Copy link
Member

Could you give more details from the job? Visit the endpoint /api/v2/jobs/42/ (if the id was 42) and copy and paste the JSON data from that. Of particular interest are job_explanation, result_traceback and a few things like that.

@metabsd
Copy link
Author

metabsd commented May 17, 2021

Both are empty.

"status": "error",
"job_explanation": "",
"result_traceback": "",

I can't post the whole json. I have to cleanup some stuff if you really need it.

@metabsd
Copy link
Author

metabsd commented May 18, 2021

Maybe this is useful for you.

        "AWX_SERVICE_PORT_80_TCP_ADDR": "10.152.183.159",
        "HOSTNAME": "awx-job-412-rnpml",
        "AWX_SERVICE_PORT": "tcp://10.152.183.159:80",
        "AWX_SERVICE_PORT_80_TCP_PORT": "80",
        "KUBERNETES_PORT_443_TCP_PROTO": "tcp",
        "KUBERNETES_PORT_443_TCP_ADDR": "10.152.183.1",
        "AWX_SERVICE_PORT_80_TCP_PROTO": "tcp",
        "KUBERNETES_PORT": "tcp://10.152.183.1:443",
        "PWD": "/runner",
        "HOME": "/var/lib/awx",
        "AWX_SERVICE_PORT_80_TCP": "tcp://10.152.183.159:80",
        "KUBERNETES_SERVICE_PORT_HTTPS": "443",
        "KUBERNETES_PORT_443_TCP_PORT": "443",
        "AWX_SERVICE_SERVICE_PORT": "80",
        "AWX_SERVICE_SERVICE_HOST": "10.152.183.159",
        "KUBERNETES_PORT_443_TCP": "tcp://10.152.183.1:443",
        "SHLVL": "0",
        "KUBERNETES_SERVICE_PORT": "443",
        "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "AWX_SERVICE_SERVICE_PORT_HTTP": "80",
        "KUBERNETES_SERVICE_HOST": "10.152.183.1",
        "LC_CTYPE": "C.UTF-8",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "ANSIBLE_VENV_PATH": "/var/lib/awx/venv/ansible",
        "ANSIBLE_SSH_ARGS": "-C -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no",
        "AWX_PRIVATE_DATA_DIR": "/tmp/pdd_wrapper_412_aw5g5qc4/awx_412_gesws5po",
        "JOB_ID": "412",
        "INVENTORY_ID": "4",
        "PROJECT_REVISION": "65332bc405066b7e8b1227be62e84d4b1b5bbc51",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "MAX_EVENT_RES": "700000",
        "AWX_HOST": "https://awx.apps.k8sdev01.npr.isaq.app",
        "ANSIBLE_SSH_CONTROL_PATH_DIR": "/runner/cp",
        "ANSIBLE_COLLECTIONS_PATHS": "/runner/requirements_collections:~/.ansible/collections:/usr/share/ansible/collections",
        "ANSIBLE_ROLES_PATH": "/runner/requirements_roles:~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles",
        "PYTHONPATH": ":/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_CALLBACK_PLUGINS": "/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/runner/artifacts/412",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False",
        "RECEPTOR_UNIT_ID": "EsNltVwP"
    },

I also double check the content of /api/v2/jobs/ID/stdout/ and I miss a lot of stuff. Just stop for I don't know why.

Can you help me to find the reason so I can provide useful info?

FYI: English is not my principal language. I do my best ;)

Thx!

@shanemcd
Copy link
Member

This smells a lot like #9961 - we're actively working on it.

@metabsd
Copy link
Author

metabsd commented May 27, 2021

Can I help by doing something on my env ?

I plan to centralized our k8s logs to Elk and start a big playbook and collect everything and analyze it and share it.

That can be useful?

@shanemcd
Copy link
Member

shanemcd commented Jun 4, 2021

I believe this should be fixed now. Can you try on 19.2.0+ ?

@metabsd
Copy link
Author

metabsd commented Jun 25, 2021

Do I have to use the default EE coming with awx-operator or I can use my own one ?

There is a relation or it's principaly ansible.awx concern.

I will run the test next week with 19.2.1 deploy with awx-operator 0.11

Thx!

@metabsd
Copy link
Author

metabsd commented Jul 8, 2021

@shanemcd I deployed 19.2.2 with awx-operator 0.12.0. (Fresh Install)

The bug seems to persist.
I noticed that AWX always seems to stop receiving the output of the job after about 2900 lines.
Approximately 15 minutes of running time

If I can do more to help solve this problem, please do not hesitate.

Here are the details of the job from the API point of view.

HTTP 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-78b469c655-kwbcx
X-API-Product-Name: AWX
X-API-Product-Version: 19.2.2
X-API-Time: 0.165s

{
    "id": 61,
    "type": "job",
    "url": "/api/v2/jobs/61/",
    "related": {
        "created_by": "/api/v2/users/1/",
        "labels": "/api/v2/jobs/61/labels/",
        "inventory": "/api/v2/inventories/4/",
        "project": "/api/v2/projects/10/",
        "organization": "/api/v2/organizations/2/",
        "credentials": "/api/v2/jobs/61/credentials/",
        "unified_job_template": "/api/v2/job_templates/37/",
        "stdout": "/api/v2/jobs/61/stdout/",
        "execution_environment": "/api/v2/execution_environments/1/",
        "job_events": "/api/v2/jobs/61/job_events/",
        "job_host_summaries": "/api/v2/jobs/61/job_host_summaries/",
        "activity_stream": "/api/v2/jobs/61/activity_stream/",
        "notifications": "/api/v2/jobs/61/notifications/",
        "create_schedule": "/api/v2/jobs/61/create_schedule/",
        "job_template": "/api/v2/job_templates/37/",
        "cancel": "/api/v2/jobs/61/cancel/",
        "relaunch": "/api/v2/jobs/61/relaunch/"
    },
    "summary_fields": {
        "organization": {
            "id": 2,
            "name": "SUPERDUPER-ORG",
            "description": ""
        },
        "inventory": {
            "id": 4,
            "name": "ServiceNow Serveurs Unix non-production",
            "description": "Inventaire non-production serveurs Unix venant de ServiceNow",
            "has_active_failures": false,
            "total_hosts": 427,
            "hosts_with_active_failures": 0,
            "total_groups": 105,
            "has_inventory_sources": true,
            "total_inventory_sources": 1,
            "inventory_sources_with_failures": 0,
            "organization_id": 2,
            "kind": ""
        },
        "execution_environment": {
            "id": 1,
            "name": "AWX CUSTOM EE",
            "description": "",
            "image": "metabsd/awx-ee:latest"
        },
        "project": {
            "id": 10,
            "name": "ELK_AWX",
            "description": "",
            "status": "successful",
            "scm_type": "git",
            "allow_override": false
        },
        "job_template": {
            "id": 37,
            "name": "Deployer Metricbeat Redhat Linux",
            "description": "RHEL 7 et plus. Ce playbook sert a la mise en place de Metricbeat de base. Ne pas utiliser si Metricbeat est gerer par une soution plus custom."
        },
        "unified_job_template": {
            "id": 37,
            "name": "Deployer Metricbeat Redhat Linux",
            "description": "RHEL 7 et plus. Ce playbook sert a la mise en place de Metricbeat de base. Ne pas utiliser si Metricbeat est gerer par une soution plus custom.",
            "unified_job_type": "job"
        },
        "instance_group": {
            "id": 2,
            "name": "default",
            "is_container_group": true
        },
        "created_by": {
            "id": 1,
            "username": "admin",
            "first_name": "",
            "last_name": ""
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        },
        "labels": {
            "count": 0,
            "results": []
        },
        "credentials": [
            {
                "id": 5,
                "name": "dummyusername",
                "description": "",
                "kind": "ssh",
                "cloud": false
            },
            {
                "id": 19,
                "name": "Metricbeat Elasticsearch output_conf",
                "description": "",
                "kind": null,
                "cloud": true
            }
        ]
    },
    "created": "2021-07-08T13:36:18.515550Z",
    "modified": "2021-07-08T13:36:32.854899Z",
    "name": "Deployer Metricbeat Redhat Linux",
    "description": "RHEL 7 et plus. Ce playbook sert a la mise en place de Metricbeat de base. Ne pas utiliser si Metricbeat est gerer par une soution plus custom.",
    "job_type": "run",
    "inventory": 4,
    "project": 10,
    "playbook": "SUPERDUPER/Beats/metricbeat.yml",
    "scm_branch": "",
    "forks": 0,
    "limit": "",
    "verbosity": 0,
    "extra_vars": "{\"use_repository\": true, \"beats_add_repository\": false, \"beat\": \"metricbeat\", \"version_lock\": true, \"beats_version\": \"7.13.2\", \"beat_conf\": {\"metricbeat.modules\": [{\"module\": \"beat\", \"metricsets\": [\"stats\", \"state\"], \"period\": \"1m\", \"hosts\": [\"http://localhost:5066\", \"http://localhost:5067\"], \"xpack.enabled\": true}, {\"module\": \"system\", \"metricsets\": [\"cpu\", \"load\", \"memory\", \"network\", \"process\", \"diskio\", \"process_summary\", \"uptime\", \"socket_summary\", \"filesystem\", \"service\"], \"enabled\": true, \"period\": \"2m\", \"processes\": [\".*\"], \"cpu.metrics\": [\"percentages\", \"normalized_percentages\"], \"core.metrics\": [\"percentages\"], \"filesystem.ignore_types\": [\"nfs\", \"smbfs\", \"autofs\", \"sysfs\", \"rootfs\", \"ramfs\", \"bdev\", \"proc\", \"cpuset\", \"cgroup\", \"cgroup2\", \"tmpfs\", \"devtmpfs\", \"configfs\", \"debugfs\", \"tracefs\", \"securityfs\", \"sockfs\", \"bpf\", \"pipefs\", \"hugetlbfs\", \"devpts\", \"pstore\", \"mqueue\", \"fuse\", \"fusectl\", \"binfmt_misc\"], \"service.state_filter\": [\"active\"]}], \"setup\": {\"ilm\": {\"check_exists\": false}}, \"setup.template.settings\": {\"index.number_of_shards\": 3, \"index.codec\": \"best_compression\"}, \"tags\": [\"{{ hostvars[inventory_hostname].sn_company }}\", \"Linux\"], \"http.enable\": true, \"http.host\": \"localhost\", \"http.port\": 5066, \"monitoring.enabled\": false}, \"output_conf\": {\"elasticsearch\": {\"hosts\": \"{{ HOST }}\", \"username\": \"{{ BEAT_USERNAME }}\", \"password\": \"{{ BEAT_PASSWORD }}\", \"loadbalance\": true, \"protocol\": \"https\", \"ssl.enabled\": true, \"ssl.verification_mode\": \"full\", \"ssl.certificate_authorities\": [\"/etc/pki/dummy/chain-dummy.qc.ca.crt\"]}}, \"logging_conf\": {\"logging\": null, \"level\": \"info\", \"to_files\": true, \"files\": {\"rotateeverybytes\": 10485760, \"path\": \"/var/log/metricbeat\", \"name\": \"metricbeat\", \"keepfiles\": 3, \"permissions\": 420}}}",
    "job_tags": "",
    "force_handlers": false,
    "skip_tags": "",
    "start_at_task": "",
    "timeout": 0,
    "use_fact_cache": false,
    "organization": 2,
    "unified_job_template": 37,
    "launch_type": "manual",
    "status": "error",
    "execution_environment": 1,
    "failed": true,
    "started": "2021-07-08T13:36:32.958527Z",
    "finished": "2021-07-08T13:52:09.888863Z",
    "canceled_on": null,
    "elapsed": 936.93,
    "job_args": "[\"ssh-agent\", \"sh\", \"-c\", \"trap 'rm -f /runner/artifacts/61/ssh_key_data' EXIT && ssh-add /runner/artifacts/61/ssh_key_data && rm -f /runner/artifacts/61/ssh_key_data && ansible-playbook -u dummyusername --become --become-method sudo -e @/runner/env/tmphohk8g7z -i /runner/inventory/hosts -e @/runner/env/extravars dummy/Beats/metricbeat.yml\"]",
    "job_cwd": "/runner/project",
    "job_env": {
        "AWX_SERVICE_PORT_80_TCP_ADDR": "10.152.183.189",
        "HOSTNAME": "automation-job-61-7knrv",
        "AWX_SERVICE_PORT": "tcp://10.152.183.189:80",
        "AWX_SERVICE_PORT_80_TCP_PORT": "80",
        "KUBERNETES_PORT_443_TCP_PROTO": "tcp",
        "KUBERNETES_PORT_443_TCP_ADDR": "10.152.183.1",
        "AWX_SERVICE_PORT_80_TCP_PROTO": "tcp",
        "KUBERNETES_PORT": "tcp://10.152.183.1:443",
        "PWD": "/runner",
        "HOME": "/var/lib/awx",
        "AWX_SERVICE_PORT_80_TCP": "tcp://10.152.183.189:80",
        "KUBERNETES_SERVICE_PORT_HTTPS": "443",
        "AWX_SERVICE_SERVICE_PORT": "80",
        "KUBERNETES_PORT_443_TCP_PORT": "443",
        "AWX_SERVICE_SERVICE_HOST": "10.152.183.189",
        "KUBERNETES_PORT_443_TCP": "tcp://10.152.183.1:443",
        "SHLVL": "0",
        "KUBERNETES_SERVICE_PORT": "443",
        "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "AWX_SERVICE_SERVICE_PORT_HTTP": "80",
        "KUBERNETES_SERVICE_HOST": "10.152.183.1",
        "LC_CTYPE": "C.UTF-8",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "ANSIBLE_SSH_ARGS": "-C -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no",
        "AWX_PRIVATE_DATA_DIR": "/tmp/pdd_wrapper_61_3jkajwlj/awx_61_oj5yhyfm",
        "JOB_ID": "61",
        "INVENTORY_ID": "4",
        "PROJECT_REVISION": "187a40e84c978cb62759410c228d93dcf0705884",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "MAX_EVENT_RES": "700000",
        "AWX_HOST": "https://awx.apps.k8sdev01.npr.idummy.app",
        "ANSIBLE_SSH_CONTROL_PATH_DIR": "/runner/cp",
        "ANSIBLE_COLLECTIONS_PATHS": "/runner/requirements_collections:~/.ansible/collections:/usr/share/ansible/collections",
        "ANSIBLE_ROLES_PATH": "/runner/requirements_roles:~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles",
        "PYTHONPATH": ":/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_CALLBACK_PLUGINS": "/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/runner/artifacts/61",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False"
    },
    "job_explanation": "",
    "execution_node": "",
    "controller_node": "awx-78b469c655-kwbcx",
    "result_traceback": "",
    "event_processing_finished": true,
    "launched_by": {
        "id": 1,
        "name": "admin",
        "type": "user",
        "url": "/api/v2/users/1/"
    },
    "work_unit_id": "Z0NRGn3F",
    "job_template": 37,
    "passwords_needed_to_start": [],
    "allow_simultaneous": false,
    "artifacts": {},
    "scm_revision": "187a40e84c978cb62759410c228d93dcf0705884",
    "instance_group": 2,
    "diff_mode": false,
    "job_slice_number": 0,
    "job_slice_count": 1,
    "webhook_service": "",
    "webhook_credential": null,
    "webhook_guid": "",
    "host_status_counts": {},
    "playbook_counts": {
        "play_count": 1,
        "task_count": 19
    },
    "custom_virtualenv": null
}

Thank you and have a nice day!

@kzinas-adv
Copy link

kzinas-adv commented Sep 16, 2021

HI,

awx 19.3.0
awx-operator 0.13
awx-ee custom
digitalocean Kubernetes 1.21.3-do.0
4cpu 8ram

Exactly the same issue as described here. Job with 78 tasks 39 hosts, runs fine in GUI for about 10 minutes till line ~1780 after that halts in GUI, but keeps running in automation container.
If turn verbosity higher this comes sooner, with DEBUG on line ~600.
"kubectl logs -f pods/automation-job-3159-xzlm5" also halts output at the same with GUI, but if you restart "kubectl logs -f pods/automation-job-3159-xzlm5" it keeps running further.


HTTP 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-prod-76f94567ff-8drtf
X-API-Product-Name: AWX
X-API-Product-Version: 19.3.0
X-API-Time: 0.090s

{
    "id": 3158,
    "type": "job",
    "url": "/api/v2/jobs/3158/",
    "related": {
        "created_by": "/api/v2/users/2/",
        "labels": "/api/v2/jobs/3158/labels/",
        "inventory": "/api/v2/inventories/1/",
        "project": "/api/v2/projects/6/",
        "organization": "/api/v2/organizations/1/",
        "credentials": "/api/v2/jobs/3158/credentials/",
        "unified_job_template": "/api/v2/job_templates/13/",
        "stdout": "/api/v2/jobs/3158/stdout/",
        "execution_environment": "/api/v2/execution_environments/3/",
        "job_events": "/api/v2/jobs/3158/job_events/",
        "job_host_summaries": "/api/v2/jobs/3158/job_host_summaries/",
        "activity_stream": "/api/v2/jobs/3158/activity_stream/",
        "notifications": "/api/v2/jobs/3158/notifications/",
        "create_schedule": "/api/v2/jobs/3158/create_schedule/",
        "job_template": "/api/v2/job_templates/13/",
        "cancel": "/api/v2/jobs/3158/cancel/",
        "relaunch": "/api/v2/jobs/3158/relaunch/"
    },
    "summary_fields": {
        "organization": {
            "id": 1,
            "name": "bbdbet",
            "description": ""
        },
        "inventory": {
            "id": 1,
            "name": "bbd-dev",
            "description": "",
            "has_active_failures": false,
            "total_hosts": 39,
            "hosts_with_active_failures": 0,
            "total_groups": 12,
            "has_inventory_sources": true,
            "total_inventory_sources": 1,
            "inventory_sources_with_failures": 0,
            "organization_id": 1,
            "kind": ""
        },
        "execution_environment": {
            "id": 3,
            "name": "bbdbet-ee1",
            "description": "docker.io",
            "image": "domasbbd/awx-custom-ee:latest"
        },
        "project": {
            "id": 6,
            "name": "ansible-bbdbet",
            "description": "",
            "status": "successful",
            "scm_type": "git",
            "allow_override": false
        },
        "job_template": {
            "id": 13,
            "name": "bbd-dev ALL",
            "description": ""
        },
        "unified_job_template": {
            "id": 13,
            "name": "bbd-dev ALL",
            "description": "",
            "unified_job_type": "job"
        },
        "instance_group": {
            "id": 3,
            "name": "default",
            "is_container_group": true
        },
        "created_by": {
            "id": 2,
            "username": "domantas.sabonis",
            "first_name": "Domantas",
            "last_name": "Sabonis"
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        },
        "labels": {
            "count": 1,
            "results": [
                {
                    "id": 1,
                    "name": "vhosts"
                }
            ]
        },
        "credentials": [
            {
              "smth": "smth"
            }
        ]
    },
    "created": "2021-09-16T07:54:09.910255Z",
    "modified": "2021-09-16T07:54:10.429265Z",
    "name": "bbd-dev ALL",
    "description": "",
    "job_type": "run",
    "inventory": 1,
    "project": 6,
    "playbook": "bbd-dev.yaml",
    "scm_branch": "",
    "forks": 0,
    "limit": "",
    "verbosity": 0,
    "extra_vars": "{}",
    "job_tags": "",
    "force_handlers": false,
    "skip_tags": "",
    "start_at_task": "",
    "timeout": 0,
    "use_fact_cache": false,
    "organization": 1,
    "unified_job_template": 13,
    "launch_type": "relaunch",
    "status": "running",
    "execution_environment": 3,
    "failed": false,
    "started": "2021-09-16T07:54:10.599048Z",
    "finished": null,
    "canceled_on": null,
    "elapsed": 954.227314,
    "job_args": "[\"ssh-agent\", \"sh\", \"-c\", \"trap 'rm -f /runner/artifacts/3158/ssh_key_data' EXIT && ssh-add /runner/artifacts/3158/ssh_key_data && rm -f /runner/artifacts/3158/ssh_key_data && ansible-playbook -u root -e @/runner/env/tmpmegnhq9z -e @/runner/env/tmpeclhz1kt -i /runner/inventory/hosts -e @/runner/env/extravars bbd-dev.yaml\"]",
    "job_cwd": "/runner/project",
    "job_env": {
        "HOSTNAME": "automation-job-3158-fnssx",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_SERVICE_PORT_HTTPS": "443",
        "AWX_PROD_SERVICE_SERVICE_PORT": "80",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_443_TCP": "tcp://10.245.184.182:443",
        "AWX_PROD_SERVICE_PORT_80_TCP": "tcp://10.245.20.247:80",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_SERVICE_PORT": "80",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_80_TCP_PROTO": "tcp",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_SERVICE_HOST": "10.245.184.182",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_443_TCP_PROTO": "tcp",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT": "tcp://10.245.184.182:80",
        "KUBERNETES_PORT_443_TCP_PROTO": "tcp",
        "AWX_OPERATOR_METRICS_PORT_8383_TCP_PROTO": "tcp",
        "KUBERNETES_PORT_443_TCP_ADDR": "10.245.0.1",
        "AWX_OPERATOR_METRICS_PORT_8383_TCP_ADDR": "10.245.35.186",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_443_TCP_ADDR": "10.245.184.182",
        "AWX_PROD_SERVICE_SERVICE_HOST": "10.245.20.247",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_80_TCP_ADDR": "10.245.184.182",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_SERVICE_HOST": "10.245.13.104",
        "AWX_OPERATOR_METRICS_SERVICE_PORT": "8383",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_PORT_443_TCP_PORT": "443",
        "AWX_PROD_SERVICE_PORT_80_TCP_PROTO": "tcp",
        "KUBERNETES_PORT": "tcp://10.245.0.1:443",
        "AWX_OPERATOR_METRICS_SERVICE_HOST": "10.245.35.186",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_SERVICE_PORT_HTTP": "80",
        "PWD": "/runner",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_443_TCP_PORT": "443",
        "HOME": "/var/lib/awx",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_PORT": "tcp://10.245.13.104:443",
        "KUBERNETES_SERVICE_PORT_HTTPS": "443",
        "KUBERNETES_PORT_443_TCP_PORT": "443",
        "AWX_OPERATOR_METRICS_PORT_8383_TCP_PORT": "8383",
        "AWX_PROD_SERVICE_PORT_80_TCP_ADDR": "10.245.20.247",
        "AWX_PROD_SERVICE_SERVICE_PORT_HTTP": "80",
        "AWX_OPERATOR_METRICS_PORT_8686_TCP_ADDR": "10.245.35.186",
        "AWX_OPERATOR_METRICS_PORT_8686_TCP": "tcp://10.245.35.186:8686",
        "AWX_OPERATOR_METRICS_PORT_8383_TCP": "tcp://10.245.35.186:8383",
        "KUBERNETES_PORT_443_TCP": "tcp://10.245.0.1:443",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_SERVICE_PORT_HTTPS_WEBHOOK": "443",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_PORT_443_TCP_PROTO": "tcp",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_80_TCP": "tcp://10.245.184.182:80",
        "AWX_OPERATOR_METRICS_SERVICE_PORT_HTTP_METRICS": "8383",
        "AWX_PROD_SERVICE_PORT": "tcp://10.245.20.247:80",
        "AWX_OPERATOR_METRICS_PORT": "tcp://10.245.35.186:8383",
        "SHLVL": "0",
        "AWX_OPERATOR_METRICS_PORT_8686_TCP_PORT": "8686",
        "KUBERNETES_SERVICE_PORT": "443",
        "AWX_OPERATOR_METRICS_PORT_8686_TCP_PROTO": "tcp",
        "AWX_PROD_SERVICE_PORT_80_TCP_PORT": "80",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_SERVICE_PORT": "443",
        "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "KUBERNETES_SERVICE_HOST": "10.245.0.1",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_PORT_443_TCP_ADDR": "10.245.13.104",
        "AWX_OPERATOR_METRICS_SERVICE_PORT_CR_METRICS": "8686",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_ADMISSION_PORT_443_TCP": "tcp://10.245.13.104:443",
        "QUICKSTART_INGRESS_NGINX_CONTROLLER_PORT_80_TCP_PORT": "80",
        "LC_CTYPE": "C.UTF-8",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "AWX_PRIVATE_DATA_DIR": "/tmp/pdd_wrapper_3158_gj2qu4z0/awx_3158_82dvzw38",
        "JOB_ID": "3158",
        "INVENTORY_ID": "1",
        "PROJECT_REVISION": "5a795dbc82e0671dd9ca665c084b7b2649601fc9",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "MAX_EVENT_RES": "700000",
        "AWX_HOST": "https://awx-test.ab.zorik.ge/",
        "ANSIBLE_SSH_CONTROL_PATH_DIR": "/runner/cp",
        "ANSIBLE_COLLECTIONS_PATHS": "/runner/requirements_collections:~/.ansible/collections:/usr/share/ansible/collections",
        "ANSIBLE_ROLES_PATH": "/runner/requirements_roles:./roles:~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles",
        "AWS_ACCESS_KEY_ID": "AKIA4BOZGJQIO6JSM7TK",
        "AWS_SECRET_ACCESS_KEY": "**********",
        "PYTHONPATH": ":/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_CALLBACK_PLUGINS": "/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/runner/artifacts/3158",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False"
    },
    "job_explanation": "",
    "execution_node": "",
    "controller_node": "awx-prod-76f94567ff-8drtf",
    "result_traceback": "",
    "event_processing_finished": false,
    "launched_by": {
        "id": 2,
        "name": "domantas.sabonis",
        "type": "user",
        "url": "/api/v2/users/2/"
    },
    "work_unit_id": "5zwZhmnr",
    "job_template": 13,
    "passwords_needed_to_start": [],
    "allow_simultaneous": false,
    "artifacts": {},
    "scm_revision": "5a795dbc82e0671dd9ca665c084b7b2649601fc9",
    "instance_group": 3,
    "diff_mode": false,
    "job_slice_number": 0,
    "job_slice_count": 1,
    "webhook_service": "",
    "webhook_credential": null,
    "webhook_guid": "",
    "host_status_counts": {},
    "playbook_counts": {
        "play_count": 2,
        "task_count": 78
    },
    "custom_virtualenv": null
}

@kukacz
Copy link

kukacz commented Dec 10, 2021

Same issue here. We basically cannot run any long jobs with external postgres cluster:

  • Limit is 15 minutes. All jobs running longer than that eventually end up in error state with message Task was marked as running but was not present in the job queue, so it has been marked as failed. The Ansible task itself ends with success.
  • Occurs everytime on all awx deployments configured with external database - we are testing it by running a "sleep" task.
  • Never occurs on deployments with awx - managed database.
  • Different namespace or colocation of awx with database on same K8s node makes no difference.
  • External postgres cluster managed by Crunchydata PGO.
  • No scaling or evictions observed around that issue occurence.
  • awx-operator 0.14
  • Kubernetes v1.19.9 (kubespray), calico, ipvs

Any ideas to try for further troubleshooting? This problem is now very old (first reported in March #9594). Seems to be well outlined and reproducible. There are duplicate candidates in #9594, #11027.

@vibinm
Copy link

vibinm commented Jan 21, 2022

From my experience, there are couple of scenarios in which the problem occurs only with awx-web UI.

OPERATOR_VERSION: 0.10.0
AWX_IMAGE: 19.2.0
AWX-EE_IMAGE: 0.3.0
External Database
K8s version: v1.20.2, calico

In any case, all the long running jobs get completed which can be confirmed from the awx-task logs. And, the error is also same for all of them. The weird part is the logs expose the ssh_password as you can see from the awx_task logs.

  1. Job gets completed and all the job output is there in the UI console but the job is stuck in "running" state. This happens to all jobs running more than 30 minutes exactly. I tested for 45/40/35 minutes and that is the case.

2022-01-21 08:09:06,836 DEBUG [929d30f54a6344f995ca21e6e4a0b419] awx.main.tasks job 4670 (running) finished running, producing 1869 events.
2022-01-21 08:09:06,846 ERROR [929d30f54a6344f995ca21e6e4a0b419] awx.main.tasks job 4670 (running) Post run hook errored.
2022-01-21 08:09:06,860 ERROR [929d30f54a6344f995ca21e6e4a0b419] awx.main.dispatch Worker failed to run task awx.main.tasks.RunJob([4670], **{'ssh_password': 'xxxxxxxxxxx'}
2022-01-21 08:09:06,868 DEBUG [929d30f54a6344f995ca21e6e4a0b419] awx.main.tasks Executing error task id 6d766db1-6407-4d99-b032-3fb947931dcb, subtasks: [{'type': 'job', 'id': 4670}]
2022-01-21 08:09:06,872 ERROR [929d30f54a6344f995ca21e6e4a0b419] awx.main.dispatch Worker failed to run task awx.main.tasks.handle_work_error(
['6d766db1-6407-4d99-b032-3fb947931dcb'], **{'subtasks': [{'type': 'job', 'id': 4670}]}
2022-01-21 08:09:06,880 ERROR [929d30f54a6344f995ca21e6e4a0b419] awx.main.dispatch Worker failed to run task awx.main.tasks.handle_work_success([], **{'task_actual': {'type': 'job', 'id': 4670}}
2022-01-21 08:09:11,758 WARNING [9e42ec1aa1184d8d8fce525ba235372d] awx.main.tasks Failed to even try to send notifications for job '2022-01-21 07:23:52.365456+00:00-4670-running' due to job not being in finished state.
2022-01-21 08:22:04,898 DEBUG [fec59d37b236474881b90c185776613f] awx.main.dispatch task 73293f05-f265-4670-a505-6e55e4f2de18 starting awx.main.tasks.awx_periodic_scheduler(
[])

  1. Job gets competed but the job output and the status get stuck. This happens to all jobs running more than a couple of hours (don't know how long exactly). I tested for 4 hours and that is the case.

2022-01-20 11:35:00,500 ERROR [034450d53ea44ad4b00fcefa2ff1b261] awx.main.tasks job 4654 (running) Post run hook errored.
2022-01-20 11:35:00,515 ERROR [034450d53ea44ad4b00fcefa2ff1b261] awx.main.dispatch Worker failed to run task awx.main.tasks.RunJob([4654], **{'ssh_password': 'xxxxxxxxx'}
2022-01-20 11:35:00,525 DEBUG [034450d53ea44ad4b00fcefa2ff1b261] awx.main.tasks Executing error task id 32937d1a-caea-47f9-b59e-27a4e6fcec47, subtasks: [{'type': 'job', 'id': 4654}]
2022-01-20 11:35:00,530 ERROR [034450d53ea44ad4b00fcefa2ff1b261] awx.main.dispatch Worker failed to run task awx.main.tasks.handle_work_error(
['32937d1a-caea-47f9-b59e-27a4e6fcec47'], **{'subtasks': [{'type': 'job', 'id': 4654}]}
2022-01-20 11:35:00,540 ERROR [034450d53ea44ad4b00fcefa2ff1b261] awx.main.dispatch Worker failed to run task awx.main.tasks.handle_work_success([], **{'task_actual': {'type': 'job', 'id': 4654}}
2022-01-20 11:49:21,168 DEBUG [b67d880694654334a3a0fdc50d7f4b81] awx.main.dispatch task 3b105bd2-1b84-4363-a4bb-1ed74439e405 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])

Appreciate, if any fix has been identified or under progress.

@chicoraf
Copy link

chicoraf commented Feb 7, 2022

K8s (kubespray)
OPERATOR_VERSION: 0.15.0
AWX_IMAGE: 19.5.0
AWX-EE_IMAGE: 0.6.0
External Postgres 14.1

I´m facing a similar problem. We use a playbook on an inventory with 460 hosts. If we limit the execution to fewer hosts, everything works. If we run on the entire inventory, after a few lines the Output on the AWX interface hangs. It recovers after aprox. 10 min, but when the job finishes, the output is incomplete (without the play recap) and there is missing information on the job via api.

I found something intresting in the container log, may help to find the cause: the line "awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(x)" apears contantly, until something happens and the line stops showing, at the same time that the Output stops in AWX. After aprox. 10 minutes, it starts to appear again, but this time with a much higher number of events (awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(45)), like it's trying to compensate.

Anyone knows who generate this lines? Looks like the task container is receiving, but who is sending? The logs of the pod automation-job-xxx-yyyyy never stops, but show no errors.

Thanks for the attention, and sorry for the bad english.

Obs: It's not the entire log for the job, it's just to show the lines I talked about

...
...
2022-02-07 19:53:03,146 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:03,146 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:03,147 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:03,147 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:06,150 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:06,150 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:06,153 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:06,153 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:07,905 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:07,905 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:07,906 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:07,906 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:09,910 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:09,910 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:09,915 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:09,915 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 19:53:12,754 DEBUG [493dc5ea8ead49d1ac16c14c1b8bb4e5] awx.main.dispatch task 879f6970-31c1-4e9d-8b3b-765ffa02e3f9 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:53:12,764 DEBUG [493dc5ea8ead49d1ac16c14c1b8bb4e5] awx.main.tasks Starting periodic scheduler
2022-02-07 19:53:12,766 DEBUG [493dc5ea8ead49d1ac16c14c1b8bb4e5] awx.main.tasks Last scheduler run was: 2022-02-07 19:52:42.718160+00:00
2022-02-07 19:53:22,774 DEBUG [0bf7ca29854649c0afb2f71c0901a83e] awx.main.dispatch task 0c2e94b3-72a0-4297-804f-3a11ea57f3d7 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:53:22,775 DEBUG [0bf7ca29854649c0afb2f71c0901a83e] awx.main.scheduler Running task manager.
2022-02-07 19:53:22,784 DEBUG [0bf7ca29854649c0afb2f71c0901a83e] awx.main.scheduler Starting Scheduler
2022-02-07 19:53:22,794 DEBUG [0bf7ca29854649c0afb2f71c0901a83e] awx.main.dispatch task fcc3e379-5b3c-401d-8d54-a50ce965104b starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:53:22,830 DEBUG [0bf7ca29854649c0afb2f71c0901a83e] awx.main.scheduler Finishing Scheduler
2022-02-07 19:53:29,600 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:53:29,600 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:53:42,820 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 84860c82-2fd0-4fba-a1e7-afb95d1686ac starting awx.main.tasks.cluster_node_heartbeat(
[])
2022-02-07 19:53:42,820 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.tasks Cluster node heartbeat task.
2022-02-07 19:53:42,846 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 07e958b5-72c8-4493-9b3b-35308ebe7bd7 starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 19:53:42,845 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 1faf0a32-a2c4-462a-b83a-5918a8e340d4 starting awx.main.tasks.awx_receptor_workunit_reaper(
[])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:53:42,849 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:53:42,850 WARNING [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch scaling up worker pid:1386
2022-02-07 19:53:42,854 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 1e6ac34c-724f-440a-ab85-fcb8eef354e0 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:53:42,855 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 58c50621-4fb0-4251-af09-f68aaf42a648 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:53:42,858 WARNING [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch scaling up worker pid:1388
2022-02-07 19:53:42,860 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.scheduler Running task manager.
2022-02-07 19:53:42,866 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:53:42,862 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch task 8c9d765d-fb16-4ac5-b06f-0a7fd2a5a2fa starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:53:42,869 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.tasks Starting periodic scheduler
2022-02-07 19:53:42,870 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.scheduler Starting Scheduler
2022-02-07 19:53:42,870 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.tasks Last scheduler run was: 2022-02-07 19:53:12.765174+00:00
2022-02-07 19:53:42,925 DEBUG [cd035ae175424b6e8ef035bb1e185a93] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:54:02,870 WARNING [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.dispatch scaling down worker pid:1377
2022-02-07 19:54:02,871 DEBUG [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.dispatch task 0ce7f4ee-6217-4b36-be85-d232aad71c54 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:54:02,871 WARNING [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.dispatch scaling down worker pid:1379
2022-02-07 19:54:02,872 DEBUG [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.scheduler Running task manager.
2022-02-07 19:54:02,872 WARNING [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch worker exiting gracefully pid:1379
2022-02-07 19:54:02,872 WARNING [cd035ae175424b6e8ef035bb1e185a93] awx.main.dispatch worker exiting gracefully pid:1377
2022-02-07 19:54:02,884 DEBUG [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.scheduler Starting Scheduler
2022-02-07 19:54:02,918 DEBUG [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.dispatch task b22fb2fc-6627-45c8-8394-877d37ead647 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:54:02,940 DEBUG [33a5d3c364f8472a8695cd5d0a7b78e5] awx.main.scheduler Finishing Scheduler
2022-02-07 19:54:12,903 DEBUG [ee40feef4ee2430abebf9e517adccfa1] awx.main.dispatch task a1bf446a-6ffd-4a7c-8254-32f27624f862 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 19:54:12,913 DEBUG [ee40feef4ee2430abebf9e517adccfa1] awx.main.tasks Starting periodic scheduler
2022-02-07 19:54:12,915 DEBUG [ee40feef4ee2430abebf9e517adccfa1] awx.main.tasks Last scheduler run was: 2022-02-07 19:53:42.869541+00:00
2022-02-07 19:54:22,923 DEBUG [1c49c5cf5c424aa2bf9f152ca5494f03] awx.main.dispatch task 380c8c5c-620a-4a9b-847f-b75116f18ca4 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:54:22,924 DEBUG [1c49c5cf5c424aa2bf9f152ca5494f03] awx.main.scheduler Running task manager.
2022-02-07 19:54:22,936 DEBUG [1c49c5cf5c424aa2bf9f152ca5494f03] awx.main.scheduler Starting Scheduler
2022-02-07 19:54:22,950 DEBUG [1c49c5cf5c424aa2bf9f152ca5494f03] awx.main.dispatch task 4469acd9-d458-4e81-bdf2-eb1ad147d402 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:54:22,983 DEBUG [1c49c5cf5c424aa2bf9f152ca5494f03] awx.main.scheduler Finishing Scheduler
2022-02-07 19:54:29,684 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:54:29,684 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:54:42,980 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task 32b202d6-d5b3-4126-8652-ed42e95e56e9 starting awx.main.tasks.cluster_node_heartbeat([])
2022-02-07 19:54:42,981 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.tasks Cluster node heartbeat task.
2022-02-07 19:54:42,999 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task 713d5969-c6bb-4aa2-9fb6-59c76f827007 starting awx.main.tasks.awx_k8s_reaper(
[])
2022-02-07 19:54:42,999 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task d80fe429-b46c-47c3-bbda-a62c247d0469 starting awx.main.tasks.awx_receptor_workunit_reaper([])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:54:43,001 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:54:43,004 WARNING [d271db3861584a84918b2b322eda033c] awx.main.dispatch scaling up worker pid:1392
2022-02-07 19:54:43,012 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:54:43,013 WARNING [d271db3861584a84918b2b322eda033c] awx.main.dispatch scaling up worker pid:1394
2022-02-07 19:54:43,016 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task b5bc06d7-fb2f-4698-837e-264b0c61f1a0 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 19:54:43,017 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task 0452e77c-6d59-4f45-aa65-0f1ea0dd0ad7 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:54:43,018 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.scheduler Running task manager.
2022-02-07 19:54:43,019 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.dispatch task 74dde63b-7b54-4b75-807e-3ea0a2ce2045 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:54:43,030 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.scheduler Starting Scheduler
2022-02-07 19:54:43,030 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.tasks Starting periodic scheduler
2022-02-07 19:54:43,034 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.tasks Last scheduler run was: 2022-02-07 19:54:12.914083+00:00
2022-02-07 19:54:43,079 DEBUG [d271db3861584a84918b2b322eda033c] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:55:03,030 DEBUG [bac9e2f11e8f444a99870600eefe673e] awx.main.dispatch task 8f926cbd-4cad-4b0f-b0d9-2581f2f2f5ef starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:55:03,029 WARNING [bac9e2f11e8f444a99870600eefe673e] awx.main.dispatch scaling down worker pid:1381
2022-02-07 19:55:03,031 DEBUG [bac9e2f11e8f444a99870600eefe673e] awx.main.scheduler Running task manager.
2022-02-07 19:55:03,032 WARNING [bac9e2f11e8f444a99870600eefe673e] awx.main.dispatch scaling down worker pid:1388
2022-02-07 19:55:03,032 WARNING [d271db3861584a84918b2b322eda033c] awx.main.dispatch worker exiting gracefully pid:1381
2022-02-07 19:55:03,033 WARNING [d271db3861584a84918b2b322eda033c] awx.main.dispatch worker exiting gracefully pid:1388
2022-02-07 19:55:03,043 DEBUG [bac9e2f11e8f444a99870600eefe673e] awx.main.scheduler Starting Scheduler
2022-02-07 19:55:03,070 DEBUG [bac9e2f11e8f444a99870600eefe673e] awx.main.dispatch task f25e6dc3-2353-44f4-87c5-da4477a71b70 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:55:03,088 DEBUG [bac9e2f11e8f444a99870600eefe673e] awx.main.scheduler Finishing Scheduler
2022-02-07 19:55:13,049 DEBUG [a68d3618d9d84268b84b2c8faef0a7cf] awx.main.dispatch task 26a26cea-6473-4f63-8761-5929372a9a74 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:55:13,059 DEBUG [a68d3618d9d84268b84b2c8faef0a7cf] awx.main.tasks Starting periodic scheduler
2022-02-07 19:55:13,061 DEBUG [a68d3618d9d84268b84b2c8faef0a7cf] awx.main.tasks Last scheduler run was: 2022-02-07 19:54:43.031284+00:00
2022-02-07 19:55:23,066 DEBUG [e2341ffc07c84a7790654b032fa3b2c5] awx.main.dispatch task a0ee3410-534c-4ff2-9e7e-34953a94ebaa starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:55:23,067 DEBUG [e2341ffc07c84a7790654b032fa3b2c5] awx.main.scheduler Running task manager.
2022-02-07 19:55:23,076 DEBUG [e2341ffc07c84a7790654b032fa3b2c5] awx.main.scheduler Starting Scheduler
2022-02-07 19:55:23,086 DEBUG [e2341ffc07c84a7790654b032fa3b2c5] awx.main.dispatch task fdeacb1b-1c50-4754-a1a9-50cb7491d8d4 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:55:23,128 DEBUG [e2341ffc07c84a7790654b032fa3b2c5] awx.main.scheduler Finishing Scheduler
2022-02-07 19:55:29,761 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:55:29,761 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:55:43,118 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task c4a3b4c6-957b-438e-90dd-b9ce4956e752 starting awx.main.tasks.cluster_node_heartbeat(
[])
2022-02-07 19:55:43,119 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.tasks Cluster node heartbeat task.
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:55:43,136 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task 34549039-b771-4ad1-b3dc-b6427b32c43d starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 19:55:43,136 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task ce500e10-44ba-49d1-9946-d509bfc78c35 starting awx.main.tasks.awx_receptor_workunit_reaper(
[])
2022-02-07 19:55:43,138 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:55:43,141 WARNING [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch scaling up worker pid:1398
2022-02-07 19:55:43,148 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task ba7dcdfc-0b7e-4d9e-9441-00979e54e4ea starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:55:43,150 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:55:43,150 WARNING [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch scaling up worker pid:1400
2022-02-07 19:55:43,155 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task ec40f615-9598-4a21-b4f4-9312f554c3f5 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:55:43,156 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch task a4da5bd8-d809-4cea-8d07-0d5c4bb03889 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:55:43,161 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.tasks Starting periodic scheduler
2022-02-07 19:55:43,161 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.scheduler Running task manager.
2022-02-07 19:55:43,164 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.tasks Last scheduler run was: 2022-02-07 19:55:13.060078+00:00
2022-02-07 19:55:43,175 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.scheduler Starting Scheduler
2022-02-07 19:55:43,228 DEBUG [2288be3500f34935b66b114c6c2ccd7f] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:56:03,176 DEBUG [db7ada6be8f844b88be6055adff8477f] awx.main.dispatch task f4caef53-02cd-4894-9051-f29efd7c833b starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:56:03,175 WARNING [db7ada6be8f844b88be6055adff8477f] awx.main.dispatch scaling down worker pid:1386
2022-02-07 19:56:03,177 DEBUG [db7ada6be8f844b88be6055adff8477f] awx.main.scheduler Running task manager.
2022-02-07 19:56:03,178 WARNING [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch worker exiting gracefully pid:1386
2022-02-07 19:56:03,178 WARNING [db7ada6be8f844b88be6055adff8477f] awx.main.dispatch scaling down worker pid:1394
2022-02-07 19:56:03,179 WARNING [2288be3500f34935b66b114c6c2ccd7f] awx.main.dispatch worker exiting gracefully pid:1394
2022-02-07 19:56:03,189 DEBUG [db7ada6be8f844b88be6055adff8477f] awx.main.scheduler Starting Scheduler
2022-02-07 19:56:03,217 DEBUG [db7ada6be8f844b88be6055adff8477f] awx.main.dispatch task 6f8b05fa-22c1-486f-a8ab-bfa9d86ec56a starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:56:03,234 DEBUG [db7ada6be8f844b88be6055adff8477f] awx.main.scheduler Finishing Scheduler
2022-02-07 19:56:13,204 DEBUG [d6ea53630fd44206be3648d1d7ddf410] awx.main.dispatch task 16987962-0580-41e3-aeb7-d121b58d7a88 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 19:56:13,213 DEBUG [d6ea53630fd44206be3648d1d7ddf410] awx.main.tasks Starting periodic scheduler
2022-02-07 19:56:13,215 DEBUG [d6ea53630fd44206be3648d1d7ddf410] awx.main.tasks Last scheduler run was: 2022-02-07 19:55:43.161899+00:00
2022-02-07 19:56:23,223 DEBUG [e5449c0168034d4293c4071f2e83f8c5] awx.main.dispatch task 59a8c54e-868e-434c-812d-23094068df90 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:56:23,224 DEBUG [e5449c0168034d4293c4071f2e83f8c5] awx.main.scheduler Running task manager.
2022-02-07 19:56:23,233 DEBUG [e5449c0168034d4293c4071f2e83f8c5] awx.main.scheduler Starting Scheduler
2022-02-07 19:56:23,244 DEBUG [e5449c0168034d4293c4071f2e83f8c5] awx.main.dispatch task 193d3b61-082e-4608-9b53-4ea30b758816 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:56:23,282 DEBUG [e5449c0168034d4293c4071f2e83f8c5] awx.main.scheduler Finishing Scheduler
2022-02-07 19:56:29,819 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:56:29,819 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:56:43,276 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task c7977c24-0dad-4dae-99b0-486385125ae1 starting awx.main.tasks.cluster_node_heartbeat([])
2022-02-07 19:56:43,277 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.tasks Cluster node heartbeat task.
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:56:43,298 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task 0717eeca-51c0-41b7-9237-752956a6a8fc starting awx.main.tasks.awx_k8s_reaper(
[])
2022-02-07 19:56:43,299 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task 29849b05-987f-4a08-8a0f-a563cebe8069 starting awx.main.tasks.awx_receptor_workunit_reaper([])
2022-02-07 19:56:43,302 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:56:43,303 WARNING [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch scaling up worker pid:1404
2022-02-07 19:56:43,309 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task 0c15c07a-2ec7-4dc1-ab58-c42c617f316f starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:56:43,311 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.scheduler Running task manager.
2022-02-07 19:56:43,312 WARNING [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch scaling up worker pid:1406
2022-02-07 19:56:43,315 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task 8a19840d-3cf7-4c6a-baca-0c26e6764e6b starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:56:43,318 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:56:43,318 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch task 15027cba-d268-46a2-9b1d-6f73298f9829 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:56:43,326 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.scheduler Starting Scheduler
2022-02-07 19:56:43,328 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.tasks Starting periodic scheduler
2022-02-07 19:56:43,331 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.tasks Last scheduler run was: 2022-02-07 19:56:13.213949+00:00
2022-02-07 19:56:43,383 DEBUG [59a35afede8f49dab6e274f31fcccb31] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:57:03,334 WARNING [6770d6b217c243359e0b3332aab76977] awx.main.dispatch scaling down worker pid:1392
2022-02-07 19:57:03,335 DEBUG [6770d6b217c243359e0b3332aab76977] awx.main.dispatch task 8422ec84-d17a-4b94-bda7-47de584cbad6 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:57:03,336 DEBUG [6770d6b217c243359e0b3332aab76977] awx.main.scheduler Running task manager.
2022-02-07 19:57:03,336 WARNING [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch worker exiting gracefully pid:1392
2022-02-07 19:57:03,336 WARNING [6770d6b217c243359e0b3332aab76977] awx.main.dispatch scaling down worker pid:1398
2022-02-07 19:57:03,341 WARNING [59a35afede8f49dab6e274f31fcccb31] awx.main.dispatch worker exiting gracefully pid:1398
2022-02-07 19:57:03,350 DEBUG [6770d6b217c243359e0b3332aab76977] awx.main.scheduler Starting Scheduler
2022-02-07 19:57:03,373 DEBUG [6770d6b217c243359e0b3332aab76977] awx.main.dispatch task 549bb497-c390-4bc4-86b0-251bbf1f3f8a starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:57:03,394 DEBUG [6770d6b217c243359e0b3332aab76977] awx.main.scheduler Finishing Scheduler
2022-02-07 19:57:13,357 DEBUG [cd978bd53a3e4333931abc3e8f7a0d27] awx.main.dispatch task 1e496c09-db5a-4fbe-a1b7-86ea6472f87c starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:57:13,368 DEBUG [cd978bd53a3e4333931abc3e8f7a0d27] awx.main.tasks Starting periodic scheduler
2022-02-07 19:57:13,370 DEBUG [cd978bd53a3e4333931abc3e8f7a0d27] awx.main.tasks Last scheduler run was: 2022-02-07 19:56:43.329539+00:00
2022-02-07 19:57:23,377 DEBUG [954208731db745aba718082aca95c826] awx.main.dispatch task 499e059c-f300-44f2-938f-e01ee36231a6 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:57:23,378 DEBUG [954208731db745aba718082aca95c826] awx.main.scheduler Running task manager.
2022-02-07 19:57:23,388 DEBUG [954208731db745aba718082aca95c826] awx.main.scheduler Starting Scheduler
2022-02-07 19:57:23,398 DEBUG [954208731db745aba718082aca95c826] awx.main.dispatch task 6dadc514-0335-4949-afbb-e763e5f83682 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:57:23,438 DEBUG [954208731db745aba718082aca95c826] awx.main.scheduler Finishing Scheduler
2022-02-07 19:57:29,853 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:57:29,853 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:57:43,407 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task 5c228ecc-75ee-4832-bd34-f0bcf586a72f starting awx.main.tasks.gather_analytics(
[])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:57:43,439 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task 3a204065-674e-4e96-b18b-e645eb4eaa0c starting awx.main.tasks.awx_receptor_workunit_reaper([])
2022-02-07 19:57:43,440 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:57:43,439 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task 792ca145-0693-4a62-95a4-49ba4766539c starting awx.main.tasks.cluster_node_heartbeat(
[])
2022-02-07 19:57:43,440 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task ea51d94c-5686-4d70-bfac-548e415d710a starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 19:57:43,441 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.tasks Cluster node heartbeat task.
2022-02-07 19:57:43,444 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch scaling up worker pid:1410
2022-02-07 19:57:43,453 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:57:43,453 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task 64d92076-4e8b-48c0-9e23-982077fb437c starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 19:57:43,455 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch scaling up worker pid:1412
2022-02-07 19:57:43,464 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch scaling up worker pid:1414
2022-02-07 19:57:43,470 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.tasks Starting periodic scheduler
2022-02-07 19:57:43,469 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task b9a8aaea-b6ca-4b8b-af7c-2cf8147316cb starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:57:43,471 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.scheduler Running task manager.
2022-02-07 19:57:43,471 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.dispatch task 2f9dbb42-0466-4d41-a48c-6e3d9b37f579 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:57:43,474 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.tasks Last scheduler run was: 2022-02-07 19:57:13.368770+00:00
2022-02-07 19:57:43,481 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.scheduler Starting Scheduler
2022-02-07 19:57:43,536 DEBUG [05e494902893460dbe39403b7725e63f] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:58:03,486 DEBUG [02f9d7e695394fbf93af7e891daf1cfe] awx.main.dispatch task 24ab9bf2-feb7-4b83-b24c-d85a56f9db5f starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:58:03,485 WARNING [02f9d7e695394fbf93af7e891daf1cfe] awx.main.dispatch scaling down worker pid:1404
2022-02-07 19:58:03,487 DEBUG [02f9d7e695394fbf93af7e891daf1cfe] awx.main.scheduler Running task manager.
2022-02-07 19:58:03,487 WARNING [02f9d7e695394fbf93af7e891daf1cfe] awx.main.dispatch scaling down worker pid:1406
2022-02-07 19:58:03,489 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch worker exiting gracefully pid:1406
2022-02-07 19:58:03,490 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch worker exiting gracefully pid:1404
2022-02-07 19:58:03,489 WARNING [02f9d7e695394fbf93af7e891daf1cfe] awx.main.dispatch scaling down worker pid:1410
2022-02-07 19:58:03,492 WARNING [05e494902893460dbe39403b7725e63f] awx.main.dispatch worker exiting gracefully pid:1410
2022-02-07 19:58:03,497 DEBUG [02f9d7e695394fbf93af7e891daf1cfe] awx.main.scheduler Starting Scheduler
2022-02-07 19:58:03,537 DEBUG [02f9d7e695394fbf93af7e891daf1cfe] awx.main.dispatch task f09ee0b5-5013-447e-bae1-ec78f5c8e8bb starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:58:03,546 DEBUG [02f9d7e695394fbf93af7e891daf1cfe] awx.main.scheduler Finishing Scheduler
2022-02-07 19:58:13,505 DEBUG [ebcc6d3fb47a47e09e917f54438fda5f] awx.main.dispatch task 7a778ac2-6227-4a10-b1e5-5268a9cd42b7 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:58:13,515 DEBUG [ebcc6d3fb47a47e09e917f54438fda5f] awx.main.tasks Starting periodic scheduler
2022-02-07 19:58:13,517 DEBUG [ebcc6d3fb47a47e09e917f54438fda5f] awx.main.tasks Last scheduler run was: 2022-02-07 19:57:43.471361+00:00
2022-02-07 19:58:23,520 DEBUG [68c9498dd1294224931f0440b7ca3e5a] awx.main.dispatch task e1f62790-3af7-4245-a67d-795f7da37f35 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:58:23,522 DEBUG [68c9498dd1294224931f0440b7ca3e5a] awx.main.scheduler Running task manager.
2022-02-07 19:58:23,531 DEBUG [68c9498dd1294224931f0440b7ca3e5a] awx.main.scheduler Starting Scheduler
2022-02-07 19:58:23,540 DEBUG [68c9498dd1294224931f0440b7ca3e5a] awx.main.dispatch task f479cdaa-bb18-4766-ad10-68a27875c04f starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:58:23,577 DEBUG [68c9498dd1294224931f0440b7ca3e5a] awx.main.scheduler Finishing Scheduler
2022-02-07 19:58:29,947 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:58:29,947 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:58:43,572 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task 20d5bc9b-872b-4280-9ffd-43896ec343da starting awx.main.tasks.cluster_node_heartbeat(
[])
2022-02-07 19:58:43,573 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.tasks Cluster node heartbeat task.
2022-02-07 19:58:43,591 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task cc1a8833-0298-45d4-a0a9-c3cf2410c7d4 starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 19:58:43,592 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task e002456d-823c-4553-801a-184b7a879604 starting awx.main.tasks.awx_receptor_workunit_reaper(
[])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:58:43,594 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 19:58:43,596 WARNING [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch scaling up worker pid:1419
2022-02-07 19:58:43,601 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task 457b195a-6330-48dc-9d19-b49839b3e34e starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:58:43,602 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task 51972b5e-8615-4083-bb68-466d27e117b0 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:58:43,604 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.scheduler Running task manager.
2022-02-07 19:58:43,605 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:58:43,606 WARNING [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch scaling up worker pid:1421
2022-02-07 19:58:43,610 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.tasks Starting periodic scheduler
2022-02-07 19:58:43,612 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.tasks Last scheduler run was: 2022-02-07 19:58:13.516122+00:00
2022-02-07 19:58:43,611 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch task b006aeda-bbb7-42cf-9d26-0754127bc4db starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:58:43,624 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.scheduler Starting Scheduler
2022-02-07 19:58:43,679 DEBUG [2c228c1f58e7439e991ccf0cb8988da7] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 19:59:03,627 DEBUG [02ab0c5f49d64e86a5ce48a14febe114] awx.main.dispatch task 25612b1d-962b-4783-9dda-d14fdc16c9fa starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:59:03,627 WARNING [02ab0c5f49d64e86a5ce48a14febe114] awx.main.dispatch scaling down worker pid:1400
2022-02-07 19:59:03,629 DEBUG [02ab0c5f49d64e86a5ce48a14febe114] awx.main.scheduler Running task manager.
2022-02-07 19:59:03,629 WARNING [02ab0c5f49d64e86a5ce48a14febe114] awx.main.dispatch scaling down worker pid:1412
2022-02-07 19:59:03,629 WARNING [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch worker exiting gracefully pid:1400
2022-02-07 19:59:03,631 WARNING [2c228c1f58e7439e991ccf0cb8988da7] awx.main.dispatch worker exiting gracefully pid:1412
2022-02-07 19:59:03,643 DEBUG [02ab0c5f49d64e86a5ce48a14febe114] awx.main.scheduler Starting Scheduler
2022-02-07 19:59:03,671 DEBUG [02ab0c5f49d64e86a5ce48a14febe114] awx.main.dispatch task a5b739a5-4898-4bab-a797-86752f63664f starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 19:59:03,694 DEBUG [02ab0c5f49d64e86a5ce48a14febe114] awx.main.scheduler Finishing Scheduler
2022-02-07 19:59:13,647 DEBUG [4fec13383e0f4ec394e0175447c7fe11] awx.main.dispatch task ad0b67f1-19d6-49e9-b093-86ae28cfb379 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 19:59:13,657 DEBUG [4fec13383e0f4ec394e0175447c7fe11] awx.main.tasks Starting periodic scheduler
2022-02-07 19:59:13,659 DEBUG [4fec13383e0f4ec394e0175447c7fe11] awx.main.tasks Last scheduler run was: 2022-02-07 19:58:43.610735+00:00
2022-02-07 19:59:23,664 DEBUG [37b98c9625aa4f7ebe6b63caa989847b] awx.main.dispatch task 563e73e6-d3af-4cbc-8b29-2b004df0eb68 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 19:59:23,665 DEBUG [37b98c9625aa4f7ebe6b63caa989847b] awx.main.scheduler Running task manager.
2022-02-07 19:59:23,678 DEBUG [37b98c9625aa4f7ebe6b63caa989847b] awx.main.scheduler Starting Scheduler
2022-02-07 19:59:23,688 DEBUG [37b98c9625aa4f7ebe6b63caa989847b] awx.main.dispatch task e8f11ac5-1afb-4247-8da6-f14989e85210 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:59:23,723 DEBUG [37b98c9625aa4f7ebe6b63caa989847b] awx.main.scheduler Finishing Scheduler
2022-02-07 19:59:30,024 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:59:30,024 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 19:59:43,708 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task 761f4718-192a-4915-b6d1-a74b60aa49c3 starting awx.main.tasks.cluster_node_heartbeat([])
2022-02-07 19:59:43,708 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.tasks Cluster node heartbeat task.
2022-02-07 19:59:43,732 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task 56eaab4a-ced6-47d2-9005-b2ee51af383c starting awx.main.tasks.awx_receptor_workunit_reaper(
[])
2022-02-07 19:59:43,733 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.tasks Checking for unreleased receptor work units
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 19:59:43,733 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task fbfdaf0a-80f1-45ad-90ef-40f9efb73440 starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 19:59:43,737 WARNING [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch scaling up worker pid:1425
2022-02-07 19:59:43,741 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task 6e6e743f-ea7a-46a2-bc86-fd58ee0db13c starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 19:59:43,742 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.scheduler Running task manager.
2022-02-07 19:59:43,743 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task b5a81df3-a4b8-43fc-b4f2-4977d1cffa4e starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 19:59:43,746 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 19:59:43,745 WARNING [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch scaling up worker pid:1427
2022-02-07 19:59:43,751 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.scheduler Starting Scheduler
2022-02-07 19:59:43,751 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch task 17c85587-fcd5-46bf-96c7-7fc605b0815f starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 19:59:43,759 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.tasks Starting periodic scheduler
2022-02-07 19:59:43,763 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.tasks Last scheduler run was: 2022-02-07 19:59:13.658104+00:00
2022-02-07 19:59:43,797 DEBUG [b23ffddd738e4547a017704be7b4c1a6] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 20:00:03,759 WARNING [3125bb17cc2b42fca037729433f4d695] awx.main.dispatch scaling down worker pid:1419
2022-02-07 20:00:03,760 DEBUG [3125bb17cc2b42fca037729433f4d695] awx.main.dispatch task 06513daf-e0dd-4bdd-aa71-e2b012e01ad0 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:00:03,762 DEBUG [3125bb17cc2b42fca037729433f4d695] awx.main.scheduler Running task manager.
2022-02-07 20:00:03,761 WARNING [3125bb17cc2b42fca037729433f4d695] awx.main.dispatch scaling down worker pid:1421
2022-02-07 20:00:03,761 WARNING [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch worker exiting gracefully pid:1419
2022-02-07 20:00:03,765 WARNING [b23ffddd738e4547a017704be7b4c1a6] awx.main.dispatch worker exiting gracefully pid:1421
2022-02-07 20:00:03,772 DEBUG [3125bb17cc2b42fca037729433f4d695] awx.main.scheduler Starting Scheduler
2022-02-07 20:00:03,803 DEBUG [3125bb17cc2b42fca037729433f4d695] awx.main.dispatch task ea62d83f-cae8-4015-9ced-df0a5463ca6d starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:00:03,818 DEBUG [3125bb17cc2b42fca037729433f4d695] awx.main.scheduler Finishing Scheduler
2022-02-07 20:00:13,785 DEBUG [7fc207952aed4c7e9c97653d4b420c69] awx.main.dispatch task 045f9bd1-814c-48fb-a386-bb1bd5c61a67 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 20:00:13,797 DEBUG [7fc207952aed4c7e9c97653d4b420c69] awx.main.tasks Starting periodic scheduler
2022-02-07 20:00:13,800 DEBUG [7fc207952aed4c7e9c97653d4b420c69] awx.main.tasks Last scheduler run was: 2022-02-07 19:59:43.760424+00:00
2022-02-07 20:00:23,805 DEBUG [9b64619aeefa453ebf74e6ce630c2efe] awx.main.dispatch task 6e37a9c4-428a-4e80-b5b2-137604697b8c starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 20:00:23,805 DEBUG [9b64619aeefa453ebf74e6ce630c2efe] awx.main.scheduler Running task manager.
2022-02-07 20:00:23,815 DEBUG [9b64619aeefa453ebf74e6ce630c2efe] awx.main.scheduler Starting Scheduler
2022-02-07 20:00:23,826 DEBUG [9b64619aeefa453ebf74e6ce630c2efe] awx.main.dispatch task 6674b2d0-d90b-4556-80b7-e87265f6a67f starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 20:00:23,867 DEBUG [9b64619aeefa453ebf74e6ce630c2efe] awx.main.scheduler Finishing Scheduler
2022-02-07 20:00:30,114 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:00:30,114 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:00:43,853 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task 717863c4-f618-4f70-a80b-b048bbaea683 starting awx.main.tasks.cluster_node_heartbeat(
[])
2022-02-07 20:00:43,854 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.tasks Cluster node heartbeat task.
2022-02-07 20:00:43,875 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task 18272f99-b9d3-4ddc-a7b6-619850afc346 starting awx.main.tasks.awx_receptor_workunit_reaper([])
2022-02-07 20:00:43,875 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task 6bddeeb0-975f-43dc-8115-f67bbdca9065 starting awx.main.tasks.awx_k8s_reaper(
[])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 20:00:43,876 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 20:00:43,880 WARNING [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch scaling up worker pid:1431
2022-02-07 20:00:43,884 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task a5e1ebac-78ac-4cc6-9d72-d58e61782eb9 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 20:00:43,886 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task a6d69491-7390-4eb0-bb03-ecc371dc4165 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 20:00:43,889 WARNING [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch scaling up worker pid:1433
2022-02-07 20:00:43,893 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.scheduler Running task manager.
2022-02-07 20:00:43,894 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.tasks Starting periodic scheduler
2022-02-07 20:00:43,895 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 20:00:43,894 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch task 36b15a2a-7759-4c50-9777-f93500b2735f starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 20:00:43,896 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.tasks Last scheduler run was: 2022-02-07 20:00:13.797980+00:00
2022-02-07 20:00:43,904 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.scheduler Starting Scheduler
2022-02-07 20:00:43,960 DEBUG [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 20:01:03,911 WARNING [e31d501910024f31a1a49c063499daaf] awx.main.dispatch scaling down worker pid:1414
2022-02-07 20:01:03,912 DEBUG [e31d501910024f31a1a49c063499daaf] awx.main.dispatch task dd744bcb-43fc-403a-9ad9-0292203aea36 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 20:01:03,913 DEBUG [e31d501910024f31a1a49c063499daaf] awx.main.scheduler Running task manager.
2022-02-07 20:01:03,913 WARNING [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch worker exiting gracefully pid:1414
2022-02-07 20:01:03,912 WARNING [e31d501910024f31a1a49c063499daaf] awx.main.dispatch scaling down worker pid:1427
2022-02-07 20:01:03,915 WARNING [a2f39c8af3f042c5919cbf870a96c5b9] awx.main.dispatch worker exiting gracefully pid:1427
2022-02-07 20:01:03,925 DEBUG [e31d501910024f31a1a49c063499daaf] awx.main.scheduler Starting Scheduler
2022-02-07 20:01:03,954 DEBUG [e31d501910024f31a1a49c063499daaf] awx.main.dispatch task 748dd67a-ad75-4a64-ad2d-537687f417ae starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 20:01:03,977 DEBUG [e31d501910024f31a1a49c063499daaf] awx.main.scheduler Finishing Scheduler
2022-02-07 20:01:13,934 DEBUG [3fe09f64d4c7431291f0d2999b2699c0] awx.main.dispatch task afdd391f-4834-49ab-b0dd-6ea30751d9c6 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 20:01:13,946 DEBUG [3fe09f64d4c7431291f0d2999b2699c0] awx.main.tasks Starting periodic scheduler
2022-02-07 20:01:13,948 DEBUG [3fe09f64d4c7431291f0d2999b2699c0] awx.main.tasks Last scheduler run was: 2022-02-07 20:00:43.894589+00:00
2022-02-07 20:01:23,954 DEBUG [956939520c5d458ea8d22f5320da4cc6] awx.main.dispatch task 6124af48-0694-4331-9236-28a24d69aef8 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:01:23,955 DEBUG [956939520c5d458ea8d22f5320da4cc6] awx.main.scheduler Running task manager.
2022-02-07 20:01:23,967 DEBUG [956939520c5d458ea8d22f5320da4cc6] awx.main.scheduler Starting Scheduler
2022-02-07 20:01:23,977 DEBUG [956939520c5d458ea8d22f5320da4cc6] awx.main.dispatch task 80570583-a24a-46a8-9fec-6067c5f564e8 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:01:24,017 DEBUG [956939520c5d458ea8d22f5320da4cc6] awx.main.scheduler Finishing Scheduler
2022-02-07 20:01:30,210 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:01:30,210 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:01:43,988 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task 4f8a897c-07d8-45d8-ae10-6d0e6978290e starting awx.main.tasks.cluster_node_heartbeat([])
2022-02-07 20:01:43,989 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.tasks Cluster node heartbeat task.
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 20:01:44,006 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task 5bcf0e30-a5f2-4b7a-a0ca-d98420e521dc starting awx.main.tasks.awx_k8s_reaper(
[])
2022-02-07 20:01:44,007 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task 800dee8d-b63a-4cec-8179-b1546b3e6880 starting awx.main.tasks.awx_receptor_workunit_reaper([])
2022-02-07 20:01:44,010 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 20:01:44,011 WARNING [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch scaling up worker pid:1437
2022-02-07 20:01:44,019 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task 2213d088-1528-438d-8e2b-c59d27ee4f4b starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 20:01:44,023 WARNING [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch scaling up worker pid:1439
2022-02-07 20:01:44,026 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 20:01:44,026 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task bb46ce5b-afc7-49c4-875e-94d5a31eba13 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:01:44,028 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.scheduler Running task manager.
2022-02-07 20:01:44,029 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.tasks Starting periodic scheduler
2022-02-07 20:01:44,028 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch task 464f1129-05ed-4fb6-bf5c-1f67d3bd24db starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:01:44,031 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.tasks Last scheduler run was: 2022-02-07 20:01:13.946921+00:00
2022-02-07 20:01:44,041 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.scheduler Starting Scheduler
2022-02-07 20:01:44,096 DEBUG [cd88835fd71e453dbdc9b97409e20450] awx.main.scheduler Finishing Scheduler
RESULT 2
OKREADY
2022-02-07 20:02:04,048 WARNING [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.dispatch scaling down worker pid:1425
2022-02-07 20:02:04,049 DEBUG [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.dispatch task a6ff65ae-6ae0-4dbe-bd63-4932111f3b5f starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:02:04,050 WARNING [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.dispatch scaling down worker pid:1431
2022-02-07 20:02:04,051 DEBUG [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.scheduler Running task manager.
2022-02-07 20:02:04,050 WARNING [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch worker exiting gracefully pid:1425
2022-02-07 20:02:04,052 WARNING [cd88835fd71e453dbdc9b97409e20450] awx.main.dispatch worker exiting gracefully pid:1431
2022-02-07 20:02:04,060 DEBUG [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.scheduler Starting Scheduler
2022-02-07 20:02:04,086 DEBUG [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.dispatch task ce1e29ee-6b28-4336-af71-6a0752297902 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:02:04,107 DEBUG [03c5aea2749d4e67b68f268c0b88e8a4] awx.main.scheduler Finishing Scheduler
2022-02-07 20:02:14,083 DEBUG [cc1eefd81df84fa3882474a24d08b836] awx.main.dispatch task fa47c345-e105-4875-9ea5-c5ce6542faa3 starting awx.main.tasks.awx_periodic_scheduler([])
2022-02-07 20:02:14,095 DEBUG [cc1eefd81df84fa3882474a24d08b836] awx.main.tasks Starting periodic scheduler
2022-02-07 20:02:14,098 DEBUG [cc1eefd81df84fa3882474a24d08b836] awx.main.tasks Last scheduler run was: 2022-02-07 20:01:44.030124+00:00
2022-02-07 20:02:24,102 DEBUG [72d3f4cfdf8f44979f9a3de3bfa27ae5] awx.main.dispatch task 7169a33a-359e-41a5-916a-683b9a516685 starting awx.main.scheduler.tasks.run_task_manager(
[])
2022-02-07 20:02:24,104 DEBUG [72d3f4cfdf8f44979f9a3de3bfa27ae5] awx.main.scheduler Running task manager.
2022-02-07 20:02:24,116 DEBUG [72d3f4cfdf8f44979f9a3de3bfa27ae5] awx.main.scheduler Starting Scheduler
2022-02-07 20:02:24,129 DEBUG [72d3f4cfdf8f44979f9a3de3bfa27ae5] awx.main.dispatch task 1b89b065-d769-4dea-b029-92cd524a9591 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics([])
2022-02-07 20:02:24,162 DEBUG [72d3f4cfdf8f44979f9a3de3bfa27ae5] awx.main.scheduler Finishing Scheduler
2022-02-07 20:02:30,304 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:02:30,304 DEBUG [-] awx.main.commands.run_callback_receiver 85 is alive
2022-02-07 20:02:44,140 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task 6dfdd84f-5445-420d-aefb-56452c5cbfc1 starting awx.main.tasks.gather_analytics(
[])
2022-02-07 20:02:44,174 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task a39127a3-8825-460c-a3fc-d4048a0caaf7 starting awx.main.tasks.awx_k8s_reaper([])
2022-02-07 20:02:44,175 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task aca3e481-0eda-4195-b37d-f3b8b761e604 starting awx.main.tasks.cluster_node_heartbeat(
[])
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
Instance Group already registered controlplane
Instance Group already registered default
2022-02-07 20:02:44,176 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.tasks Cluster node heartbeat task.
2022-02-07 20:02:44,176 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task 083ddc0c-a8a8-4824-b03f-f0ba9d055603 starting awx.main.tasks.awx_receptor_workunit_reaper([])
2022-02-07 20:02:44,178 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.tasks Checking for unreleased receptor work units
2022-02-07 20:02:44,179 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch scaling up worker pid:1443
2022-02-07 20:02:44,186 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task 95104c6a-0773-4f93-8987-8e21220c52a8 starting awx.main.tasks.awx_periodic_scheduler(
[])
2022-02-07 20:02:44,192 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.tasks Checking for orphaned k8s pods for default-2.
2022-02-07 20:02:44,190 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch scaling up worker pid:1445
2022-02-07 20:02:44,198 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.tasks Starting periodic scheduler
2022-02-07 20:02:44,199 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch scaling up worker pid:1447
2022-02-07 20:02:44,202 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task 863a7261-af22-4bf3-8143-1925d74ddb64 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:02:44,203 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.tasks Last scheduler run was: 2022-02-07 20:02:14.096001+00:00
2022-02-07 20:02:44,204 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.scheduler Running task manager.
2022-02-07 20:02:44,204 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch task 2dd4cbe5-0846-4042-b1a1-7045c11da3d9 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:02:44,217 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.scheduler Starting Scheduler
2022-02-07 20:02:44,280 DEBUG [713b4264230746fba6a5263ffe8e1947] awx.main.scheduler Finishing Scheduler
2022-02-07 20:02:52,462 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,462 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,462 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,462 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,464 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,464 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,467 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,467 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:52,582 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(45)
2022-02-07 20:02:52,582 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(45)
2022-02-07 20:02:52,585 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(38)
2022-02-07 20:02:52,585 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(38)
2022-02-07 20:02:52,587 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(42)
2022-02-07 20:02:52,587 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(42)
2022-02-07 20:02:52,596 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(35)
2022-02-07 20:02:52,596 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(35)
2022-02-07 20:02:52,740 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(91)
2022-02-07 20:02:52,740 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(91)
2022-02-07 20:02:52,766 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(91)
2022-02-07 20:02:52,766 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(91)
2022-02-07 20:02:52,786 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(85)
2022-02-07 20:02:52,786 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(85)
2022-02-07 20:02:52,790 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(77)
2022-02-07 20:02:52,790 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(77)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(49)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(48)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(104)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(43)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(43)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(48)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(104)
2022-02-07 20:02:53,982 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(49)
2022-02-07 20:02:56,189 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:56,189 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:56,189 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:56,189 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,173 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,173 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,175 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,175 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,422 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,422 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,424 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,424 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,673 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,673 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,674 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:57,674 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,678 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,678 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,679 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,679 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,927 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,927 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,929 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:02:58,929 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:01,933 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:01,933 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
RESULT 2
OKREADY
2022-02-07 20:03:01,935 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:01,935 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:04,219 DEBUG [07691192c0f1436398d8244572d8d969] awx.main.dispatch task de141708-b002-48bc-a005-1d7ca794ec86 starting awx.main.scheduler.tasks.run_task_manager([])
2022-02-07 20:03:04,218 WARNING [07691192c0f1436398d8244572d8d969] awx.main.dispatch scaling down worker pid:1433
2022-02-07 20:03:04,220 DEBUG [07691192c0f1436398d8244572d8d969] awx.main.scheduler Running task manager.
2022-02-07 20:03:04,221 WARNING [07691192c0f1436398d8244572d8d969] awx.main.dispatch scaling down worker pid:1437
2022-02-07 20:03:04,222 WARNING [07691192c0f1436398d8244572d8d969] awx.main.dispatch scaling down worker pid:1439
2022-02-07 20:03:04,224 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch worker exiting gracefully pid:1437
2022-02-07 20:03:04,224 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch worker exiting gracefully pid:1439
2022-02-07 20:03:04,225 WARNING [713b4264230746fba6a5263ffe8e1947] awx.main.dispatch worker exiting gracefully pid:1433
2022-02-07 20:03:04,233 DEBUG [07691192c0f1436398d8244572d8d969] awx.main.scheduler Starting Scheduler
2022-02-07 20:03:04,262 DEBUG [07691192c0f1436398d8244572d8d969] awx.main.dispatch task 7be27447-ac5e-4b69-a165-17a659dc5315 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(
[])
2022-02-07 20:03:04,284 DEBUG [07691192c0f1436398d8244572d8d969] awx.main.scheduler Finishing Scheduler
2022-02-07 20:03:05,693 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:05,693 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:05,693 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:05,693 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,442 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,442 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,443 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,443 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,694 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,694 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,698 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:06,698 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:07,945 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:07,945 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:07,947 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:07,947 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:08,948 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:08,948 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:08,949 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:08,949 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:10,951 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:10,951 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:10,953 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:10,953 DEBUG [-] awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(1)
2022-02-07 20:03:14,246 DEBUG [faee2e32a1064a58b7383604173f7aa6] awx.main.dispatch task a385903f-fac7-4856-b3a2-c017f9265c3b starting awx.main.tasks.awx_periodic_scheduler(*[])
...
...

@chicoraf
Copy link

chicoraf commented Mar 4, 2022

K8s (kubespray) OPERATOR_VERSION: 0.15.0 AWX_IMAGE: 19.5.0 AWX-EE_IMAGE: 0.6.0 External Postgres 14.1

I´m facing a similar problem. We use a playbook on an inventory with 460 hosts. If we limit the execution to fewer hosts, everything works. If we run on the entire inventory, after a few lines the Output on the AWX interface hangs. It recovers after aprox. 10 min, but when the job finishes, the output is incomplete (without the play recap) and there is missing information on the job via api.

I found something intresting in the container log, may help to find the cause: the line "awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(x)" apears contantly, until something happens and the line stops showing, at the same time that the Output stops in AWX. After aprox. 10 minutes, it starts to appear again, but this time with a much higher number of events (awx.main.commands.run_callback_receiver JobEvent.objects.bulk_create(45)), like it's trying to compensate.

Anyone knows who generate this lines? Looks like the task container is receiving, but who is sending? The logs of the pod automation-job-xxx-yyyyy never stops, but show no errors.

Thanks for the attention, and sorry for the bad english.

Obs: It's not the entire log for the job, it's just to show the lines I talked about

I made a test in another k8s cluster, this time using docker, and the problem did not occur. Could this problem be related to containerd?

@chris93111
Copy link
Contributor

@chicoraf try container-log-max-size=500Mi in kubelet arg see #11338

@vibinm
Copy link

vibinm commented Mar 18, 2022

Just wanted to report this, I tried the latest version of AWX 20.0.1 and AWX-Operator 0.18.0 and the issue still persists.

OPERATOR_VERSION: 0.18.0
AWX_IMAGE: 20.0.1
AWX-EE_IMAGE: 0.6.0
External Database
K8s version: v1.20.2, calico

All jobs running more than two hours is showing as FAILED (in earlier version it was RUNNING not getting FAILED or SUCCESS), even though the playbook run is successful. The incomplete log output issue is fixed by applying the log container size on the K8s.

Log output of awx-task container: FAILED JOB - 2 hours 15 minutes

2022-03-18 09:53:03,196 DEBUG [3a5ec3009d054c0d9b4ca736c74f464b] awx.main.dispatch task 510b6172-a41b-41fe-b919-00171ed24b7c starting awx.main.tasks.system.handle_success_and_failure_notifications([164])
2022-03-18 09:53:03,901 INFO [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.commands.run_callback_receiver Event processing is finished for Job 164, sending notifications
2022-03-18 09:53:03,901 INFO [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.commands.run_callback_receiver Event processing is finished for Job 164, sending notifications
2022-03-18 09:53:04,318 DEBUG [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.tasks.jobs job 164 (running) finished running, producing 17080 events.
2022-03-18 09:53:04,382 ERROR [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.tasks.jobs job 164 (running) Post run hook errored.
File "/var/lib/awx/venv/awx/lib64/python3.9/site-packages/django/db/models/fields/related_descriptors.py", line 164, in get
2022-03-18 09:53:04,416 ERROR [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.dispatch Worker failed to run task awx.main.tasks.jobs.RunJob(
[164], **{'ssh_password': 'xxxxxxxxx'}
2022-03-18 09:53:04,442 DEBUG [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.tasks.system Executing error task id b105fb89-6eb2-40b2-a800-7c930a78776c, subtasks: [{'type': 'job', 'id': 164}]
2022-03-18 09:53:04,457 ERROR [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.dispatch Worker failed to run task awx.main.tasks.system.handle_work_error(['b105fb89-6eb2-40b2-a800-7c930a78776c'], **{'subtasks': [{'type': 'job', 'id': 164}]}
2022-03-18 09:53:04,485 ERROR [f3f0dd3efa1343b9a95eeaee588d06d4] awx.main.dispatch Worker failed to run task awx.main.tasks.system.handle_work_success(
[], **{'task_actual': {'type': 'job', 'id': 164}}
2022-03-18 09:53:08,365 WARNING [3a5ec3009d054c0d9b4ca736c74f464b] awx.main.tasks.system Failed to even try to send notifications for job '2022-03-18 07:37:36.403319+00:00-164-running' due to job not being in finished state.
2022-03-18 09:53:15,668 ERROR [53e3d79abf514edbb586227ddb205a04] awx.main.dispatch job 164 (failed) is no longer running; reaping
2022-03-18 09:53:34,873 DEBUG [e8e6e201f9094454ad356132e4b33481] awx.main.tasks.system job 164 (failed) is no longer active, reaping orphaned k8s pod
2022-03-18 09:53:34,988 DEBUG [e8e6e201f9094454ad356132e4b33481] awx.main.tasks.system job 164 (failed) is no longer active, reaping orphaned k8s pod

Log output of awx-task container: Successful Job - 2 hours

2022-03-17 09:19:57,357 DEBUG [9aabd791dc484662ad15bfc38b59c82f] awx.main.dispatch task c3e572e3-521d-4f6c-90c3-b73407e3bd27 starting awx.main.tasks.system.handle_success_and_failure_notifications(*[160])
2022-03-17 09:19:57,890 INFO [9390c05dcf4d413e84b65af6cca48cfd] awx.main.commands.run_callback_receiver Event processing is finished for Job 160, sending notifications
2022-03-17 09:19:57,890 INFO [9390c05dcf4d413e84b65af6cca48cfd] awx.main.commands.run_callback_receiver Event processing is finished for Job 160, sending notifications
2022-03-17 09:19:58,511 DEBUG [9390c05dcf4d413e84b65af6cca48cfd] awx.main.tasks.jobs job 160 (running) finished running, producing 15378 events.
2022-03-17 09:20:02,533 WARNING [9aabd791dc484662ad15bfc38b59c82f] awx.main.tasks.system Failed to even try to send notifications for job '2022-03-17 07:19:08.895432+00:00-160-running' due to job not being in finished state.
2022-03-17 09:35:27,495 ERROR [9390c05dcf4d413e84b65af6cca48cfd] awx.main.tasks.jobs job 160 (running) Post run hook errored.
2022-03-17 09:35:32,855 DEBUG [9390c05dcf4d413e84b65af6cca48cfd] awx.analytics.job_lifecycle job-160 finalize run

@chicoraf
Copy link

@chicoraf try container-log-max-size=500Mi in kubelet arg see #11338

@chris93111 , I found the solution just before your reply, problem solved! Thanks a lot!!

@shanemcd
Copy link
Member

Closing this in favor of #11338

@vibinm
Copy link

vibinm commented Oct 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants