Playbooks running longer than 4 hours are terminated unexpectedly #11594

Closed
3 tasks done
FedorRub opened this issue Jan 24, 2022 · 14 comments

FedorRub commented Jan 24, 2022

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I might not receive a timely response.

Summary

Playbooks running longer than 4 hours are terminated unexpectedly. The jobs finish in an error state in the GUI. This matters to us because we have some long-running playbooks (Windows server patching, backups, etc.).

Please help me understand whether some timeout inside AWX terminates the container, and whether that limit can be adjusted.
A similar issue has been reported in the awx-operator repo: ansible/awx-operator#622

AWX version

19.5.0

Installation method

kubernetes

Modifications

yes

Ansible version

core 2.11.7.post0

Operating system

centos (awx-ee)

Web browser

Chrome

Steps to reproduce

The issue can be reproduced by running the following playbook

---
- hosts: frontend
  gather_facts: no

  tasks:
    - name: test
      win_shell: Start-Sleep -s 18000

Expected results

Playbook completes successfully

Actual results

Container running the job is terminated after running for 4 hours

Additional information

The automation job container exited with the following error:

 lastState:
      terminated:
        exitCode: 137
        finishedAt: null
        message: The container could not be located when the pod was deleted.  The
          container used to be Running
        reason: ContainerStatusUnknown
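
For readers unfamiliar with the `exitCode: 137` above: by the usual shell convention, exit statuses above 128 encode 128 plus the number of the signal that ended the process, so 137 corresponds to signal 9 (SIGKILL). The sketch below decodes that. The helper name `describe_exit_code` is made up for illustration, and it assumes a POSIX system where signal 9 is SIGKILL. Because the reported reason is `ContainerStatusUnknown`, the code is consistent with a forced kill but should not be read as proof of what sent it.

```python
import signal

def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"

print(describe_exit_code(137))
```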

awx-task container logs

"timestamp","source","message"
"2022-01-20T05:56:50.294Z","fluentd-rspdg","2022-01-20 05:56:50,293 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 acknowledged"
"2022-01-20T05:56:50.330Z","fluentd-rspdg","2022-01-20 05:56:50,329 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 controller node chosen"
"2022-01-20T05:56:50.336Z","fluentd-rspdg","2022-01-20 05:56:50,335 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 waiting"
"2022-01-20T05:56:50.922Z","fluentd-rspdg","2022-01-20 05:56:50,921 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.main.tasks Skipping project sync for job 19932 (running) because commit is locally available"
"2022-01-20T05:56:51.521Z","fluentd-rspdg","2022-01-20 05:56:51,520 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 running playbook"
"2022-01-20T05:56:50.648Z","fluentd-rspdg","2022-01-20 05:56:50,647 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.main.dispatch task 765be65b-0df5-4604-b60a-752680e542a6 starting awx.main.tasks.RunJob(*[19932])"
"2022-01-20T05:56:50.336Z","fluentd-rspdg","2022-01-20 05:56:50,335 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.main.scheduler job 19932 (waiting) consumed 201 capacity units from default with prior total of 0"
"2022-01-20T05:56:52.086Z","fluentd-rspdg","2022-01-20 05:56:52,085 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 work unit id received"
"2022-01-20T05:56:50.911Z","fluentd-rspdg","2022-01-20 05:56:50,910 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 pre run"
"2022-01-20T05:56:52.124Z","fluentd-rspdg","2022-01-20 05:56:52,123 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 work unit id assigned"
"2022-01-20T05:56:51.236Z","fluentd-rspdg","2022-01-20 05:56:51,235 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 preparing playbook"
"2022-01-20T09:57:09.605Z","fluentd-rspdg","2022-01-20 09:57:09,603 WARNING  [897e5ca4a1c74f7c924756f746410640] awx.main.dispatch job 19932 (error) encountered an error (rc=None), please see task stdout for details."
"2022-01-20T09:57:08.908Z","fluentd-rspdg","2022-01-20 09:57:08,907 INFO     [897e5ca4a1c74f7c924756f746410640] awx.main.commands.run_callback_receiver Event processing is finished for Job 19932, sending notifications"
"2022-01-20T09:57:09.495Z","fluentd-rspdg","2022-01-20 09:57:09,494 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 finalize run"
"2022-01-20T09:57:09.416Z","fluentd-rspdg","2022-01-20 09:57:09,415 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.analytics.job_lifecycle job-19932 post run"
"2022-01-20T09:57:09.606Z","fluentd-rspdg","2022-01-20 09:57:09,605 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.main.tasks Executing error task id 765be65b-0df5-4604-b60a-752680e542a6, subtasks: [{'type': 'job', 'id': 19932}]"
"2022-01-20T09:57:08.908Z","fluentd-rspdg","2022-01-20 09:57:08,907 INFO     [897e5ca4a1c74f7c924756f746410640] awx.main.commands.run_callback_receiver Event processing is finished for Job 19932, sending notifications"
"2022-01-20T09:57:09.409Z","fluentd-rspdg","2022-01-20 09:57:09,408 DEBUG    [897e5ca4a1c74f7c924756f746410640] awx.main.tasks job 19932 (running) finished running, producing 40 events."

Job stdout

HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: text/plain ;utf-8
Vary: Accept
X-API-Node: awx--6bcfb8865f-vfc6z
X-API-Product-Name: AWX
X-API-Product-Version: 19.5.0
X-API-Time: 0.046s

ansible-playbook [core 2.11.7.post0] 
  config file = None
  configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.8/site-packages/ansible
  ansible collection location = /runner/requirements_collections:/home/runner/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible-playbook
  python version = 3.8.12 (default, Sep 21 2021, 00:10:52) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)]
  jinja version = 2.10.3
  libyaml = True
No config file found; using defaults
SSH password: 
setting up inventory plugins
host_list declined parsing /runner/inventory/hosts as it did not pass its verify_file() method
Parsed /runner/inventory/hosts inventory source with script plugin
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
Loading collection ansible.windows from /usr/share/ansible/collections/ansible_collections/ansible/windows
Loading callback plugin awx_display of type stdout, v2.0 from /usr/local/lib/python3.8/site-packages/ansible_runner/callbacks/awx_display.py
Skipping callback 'awx_display', as we already have a stdout callback.
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.

PLAYBOOK: test.yml *************************************************************
Positional arguments: test.yml
verbosity: 4
ask_pass: True
remote_user: Ansible
connection: smart
timeout: 10
become_method: sudo
tags: ('all',)
inventory: ('/runner/inventory/hosts',)
subset: test*
extra_vars: ('@/runner/env/extravars',)
forks: 200
1 plays in test.yml

PLAY [frontend] ****************************************************************
META: ran handlers

TASK [test] ********************************************************************
task path: /runner/project/test.yml:6
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<test1> ESTABLISH WINRM CONNECTION FOR USER: Ansible on PORT 5985 TO test1
calling kinit with pexpect for principal AnsibleWinSvc
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<test2> ESTABLISH WINRM CONNECTION FOR USER: Ansible on PORT 5985 TO test2
calling kinit with pexpect for principal AnsibleWinSvc
EXEC (via pipeline wrapper)
EXEC (via pipeline wrapper)

We use awx-ee:0.6.0 with slight additions (Galaxy collections, Python packages, etc.).

job details

GET /api/v2/jobs/19932/
HTTP 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-csa-6bcfb8865f-vfc6z
X-API-Product-Name: AWX
X-API-Product-Version: 19.5.0
X-API-Time: 0.060s

{
    "id": 19932,
    "type": "job",
    "url": "/api/v2/jobs/19932/",
    "related": {
        "created_by": "/api/v2/users/19/",
        "labels": "/api/v2/jobs/19932/labels/",
        "inventory": "/api/v2/inventories/3/",
        "project": "/api/v2/projects/39/",
        "organization": "/api/v2/organizations/1/",
        "credentials": "/api/v2/jobs/19932/credentials/",
        "unified_job_template": "/api/v2/job_templates/102/",
        "stdout": "/api/v2/jobs/19932/stdout/",
        "execution_environment": "/api/v2/execution_environments/4/",
        "job_events": "/api/v2/jobs/19932/job_events/",
        "job_host_summaries": "/api/v2/jobs/19932/job_host_summaries/",
        "activity_stream": "/api/v2/jobs/19932/activity_stream/",
        "notifications": "/api/v2/jobs/19932/notifications/",
        "create_schedule": "/api/v2/jobs/19932/create_schedule/",
        "job_template": "/api/v2/job_templates/102/",
        "cancel": "/api/v2/jobs/19932/cancel/",
        "relaunch": "/api/v2/jobs/19932/relaunch/"
    },
    "summary_fields": {
        "organization": {
            "id": 1,
            "description": ""
        },
        "inventory": {
            "id": 3,
            "name": "Advanced Inventory",
            "description": "",
            "has_active_failures": true,
            "total_hosts": 3775,
            "hosts_with_active_failures": 161,
            "total_groups": 81,
            "has_inventory_sources": true,
            "total_inventory_sources": 3,
            "inventory_sources_with_failures": 0,
            "organization_id": 1,
            "kind": ""
        },
        "execution_environment": {
            "id": 4,
            "name": "awx-ee-custom:0.6.0",
            "description": "",
            "image": "repo-server.net:8083/awx-ee-custom:0.6.0"
        },
        "project": {
            "id": 39,
            "name": "Dev",
            "description": "",
            "status": "successful",
            "scm_type": "git",
            "allow_override": false
        },
        "job_template": {
            "id": 102,
            "name": "test_frontend",
            "description": ""
        },
        "unified_job_template": {
            "id": 102,
            "name": "test_frontend",
            "description": "",
            "unified_job_type": "job"
        },
        "instance_group": {
            "id": 4,
            "name": "default",
            "is_container_group": true
        },
        "created_by": {
            "id": 19
        },
        "user_capabilities": {
            "delete": true,
            "start": true
        },
        "labels": {
            "count": 0,
            "results": []
        },
        "credentials": [
            {
                "id": 42,
                "kind": "ssh",
                "cloud": false
            }
        ]
    },
    "created": "2022-01-20T05:56:49.980422Z",
    "modified": "2022-01-20T05:56:50.332277Z",
    "name": "test_frontend",
    "description": "",
    "job_type": "run",
    "inventory": 3,
    "project": 39,
    "playbook": "test.yml",
    "scm_branch": "",
    "forks": 200,
    "limit": "test*",
    "verbosity": 4,
    "extra_vars": "{\"maintenance_comment\": \"testing\"}",
    "job_tags": "",
    "force_handlers": false,
    "skip_tags": "",
    "start_at_task": "",
    "timeout": 14400,
    "use_fact_cache": false,
    "organization": 1,
    "unified_job_template": 102,
    "launch_type": "manual",
    "status": "error",
    "execution_environment": 4,
    "failed": true,
    "started": "2022-01-20T05:56:50.665944Z",
    "finished": "2022-01-20T09:57:09.423498Z",
    "canceled_on": null,
    "elapsed": 14418.758,
    "job_args": "[\"ansible-playbook\", \"-u\", \"Ansible\", \"--ask-pass\", \"--forks=200\", \"-l\", \"wss152*\", \"-vvvv\", \"-i\", \"/runner/inventory/hosts\", \"-e\", \"@/runner/env/extravars\", \"test.yml\"]",
    "job_cwd": "/runner/project",
    "job_env": {
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT": "tcp://10:8443",
        "AWX_CSA_SERVICE_SERVICE_HOST": "10",
        "AWX_CSA_SERVICE_SERVICE_PORT_HTTP": "80",
        "HOSTNAME": "automation-job-19932-2qthp",
        "AWX_CSA_SERVICE_PORT_80_TCP_PORT": "80",
        "KUBERNETES_PORT_443_TCP_PROTO": "tcp",
        "KUBERNETES_PORT_443_TCP_ADDR": "10",
        "container": "oci",
        "AWX_CSA_SERVICE_SERVICE_PORT": "80",
        "KUBERNETES_PORT": "tcp://10:443",
        "PWD": "/runner",
        "HOME": "/var/lib/awx",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PORT": "8443",
        "KUBERNETES_SERVICE_PORT_HTTPS": "443",
        "KUBERNETES_PORT_443_TCP_PORT": "443",
        "KUBERNETES_PORT_443_TCP": "tcp://10:443",
        "AWX_CSA_SERVICE_PORT_80_TCP": "tcp://10:80",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP": "tcp://10:8443",
        "AWX_CSA_SERVICE_PORT_80_TCP_ADDR": "10",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT_HTTPS": "8443",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT": "8443",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PROTO": "tcp",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_HOST": "10.",
        "AWX_CSA_SERVICE_PORT": "tcp://10.:80",
        "SHLVL": "0",
        "KUBERNETES_SERVICE_PORT": "443",
        "AWX_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_ADDR": "10",
        "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "KUBERNETES_SERVICE_HOST": "10.",
        "AWX_CSA_SERVICE_PORT_80_TCP_PROTO": "tcp",
        "LC_CTYPE": "C.UTF-8",
        "ANSIBLE_FACT_CACHE_TIMEOUT": "0",
        "ANSIBLE_FORCE_COLOR": "True",
        "ANSIBLE_HOST_KEY_CHECKING": "False",
        "ANSIBLE_INVENTORY_UNPARSED_FAILED": "True",
        "ANSIBLE_PARAMIKO_RECORD_HOST_KEYS": "False",
        "AWX_PRIVATE_DATA_DIR": "/tmp/awx_19932_w8trytsb",
        "JOB_ID": "19932",
        "INVENTORY_ID": "3",
        "PROJECT_REVISION": "3d184fe08f57337e4454fe63ceb3073a94ace181",
        "ANSIBLE_RETRY_FILES_ENABLED": "False",
        "MAX_EVENT_RES": "700000",
        "AWX_HOST": "http://awxk8s",
        "ANSIBLE_SSH_CONTROL_PATH_DIR": "/runner/cp",
        "ANSIBLE_COLLECTIONS_PATHS": "/runner/requirements_collections:~/.ansible/collections:/usr/share/ansible/collections",
        "ANSIBLE_ROLES_PATH": "/runner/requirements_roles:~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles",
        "PYTHONPATH": ":/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_CALLBACK_PLUGINS": "/usr/local/lib/python3.8/site-packages/ansible_runner/config/../callbacks",
        "ANSIBLE_STDOUT_CALLBACK": "awx_display",
        "AWX_ISOLATED_DATA_DIR": "/runner/artifacts/19932",
        "RUNNER_OMIT_EVENTS": "False",
        "RUNNER_ONLY_FAILED_EVENTS": "False"
    },
    "job_explanation": "",
    "execution_node": "",
    "controller_node": "awx-csa-6bcfb8865f-vfc6z",
    "result_traceback": "",
    "event_processing_finished": true,
    "launched_by": {
        "id": 19,
        "url": "/api/v2/users/19/"
    },
    "work_unit_id": "icGT5DcT",
    "job_template": 102,
    "passwords_needed_to_start": [],
    "allow_simultaneous": true,
    "artifacts": {},
    "scm_revision": "3d184fe08f57337e4454fe63ceb3073a94ace181",
    "instance_group": 4,
    "diff_mode": false,
    "job_slice_number": 0,
    "job_slice_count": 1,
    "webhook_service": "",
    "webhook_credential": null,
    "webhook_guid": "",
    "host_status_counts": {},
    "playbook_counts": {
        "play_count": 1,
        "task_count": 1
    },
    "custom_virtualenv": null
}
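
As a quick arithmetic check on the job detail above, the sketch below compares the `started` and `finished` timestamps with the reported `elapsed` and `timeout` fields. All values are copied from the JSON. The numbers show the run ending roughly 18.8 seconds past the 14400-second (4-hour) mark, but they do not by themselves show which component ended the job.

```python
from datetime import datetime

# Values copied from the job detail JSON above.
job = {
    "started": "2022-01-20T05:56:50.665944Z",
    "finished": "2022-01-20T09:57:09.423498Z",
    "elapsed": 14418.758,
    "timeout": 14400,
}

def parse_ts(value: str) -> datetime:
    """Parse the ISO-style UTC timestamps used in the AWX API response."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")

wall_clock = (parse_ts(job["finished"]) - parse_ts(job["started"])).total_seconds()
overrun = job["elapsed"] - job["timeout"]

print(f"wall clock: {wall_clock:.3f} s")
print(f"elapsed - timeout: {overrun:.3f} s")
print(f"timeout in hours: {job['timeout'] / 3600}")
```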
@aussielunix

@FedorRub Have you seen #11451 and this mail thread? https://groups.google.com/g/awx-project/c/qFK-RyLB2Ws

#10366 (comment) worked for us as a workaround.

@FedorRub
Author

Hi @aussielunix, thanks for the comment. As suggested in the threads, I have already increased the kubelet log max size, but it does not appear to have any effect in my case. For the test job, the log file is about 165K and is not rotated.

From what I have read, my issue looks similar to a possible issue with idle_timeout in ansible-runner: ansible/awx-ee#80

@shanemcd
Member

You can now configure the idle timeout via the settings page: #10906
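
For anyone who prefers the API to the settings page, here is a minimal sketch of what such a change might look like. It only builds the request and does not send it. The host and token are placeholders, and the setting key `DEFAULT_JOB_IDLE_TIMEOUT` is an assumption based on the "Default Job Idle Timeout" label; check `GET /api/v2/settings/jobs/` on your own instance for the exact name. Later comments in this thread report that changing this value did not alter the 4-hour behavior.

```python
import json
import urllib.request

AWX_URL = "https://awx.example.com"  # hypothetical host
TOKEN = "REPLACE_ME"                 # hypothetical OAuth2 token

def build_idle_timeout_request(seconds: int) -> urllib.request.Request:
    """Build (but do not send) a PATCH to the AWX jobs settings endpoint."""
    # Assumed setting key; verify it against your instance before use.
    body = json.dumps({"DEFAULT_JOB_IDLE_TIMEOUT": seconds}).encode()
    return urllib.request.Request(
        f"{AWX_URL}/api/v2/settings/jobs/",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )

req = build_idle_timeout_request(18000)
print(req.get_method(), req.full_url, req.data.decode())
# To apply the change, pass the request to urllib.request.urlopen(req).
```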

skbki commented Feb 1, 2022

We have the same issue. Neither disabling log rotation in the kubelet nor changing the timeout configuration helped.

spireob commented Feb 2, 2022

We also have the same issue. Modifying the idle timeout did not take effect, and job containers are being killed after exactly 4 hours.

dr-duke commented Feb 4, 2022

@spireob @skbki Confirmed. The pod dies after 4 hours. Log rotation is disabled.
Logs:

2022-02-02 10:39:09,267 DEBUG    [45c6fc1d444b4690a6b18d5449a15084] awx.main.tasks job 1392 (running) finished running, producing 248 events.

Feb 2, 2022 @ 15:39:09.327	awx-b96579f5b-c5fnq	 - 	2022-02-02 10:39:08,432 INFO     [45c6fc1d444b4690a6b18d5449a15084] awx.main.commands.run_callback_receiver Event processing is finished for Job 1392, sending notifications

Feb 2, 2022 @ 15:39:09.327	awx-b96579f5b-c5fnq	 - 	2022-02-02 10:39:09,280 DEBUG    [45c6fc1d444b4690a6b18d5449a15084] awx.analytics.job_lifecycle job-1392 post run

Feb 2, 2022 @ 15:39:10.334	awx-b96579f5b-c5fnq	 - 	2022-02-02 10:39:09,534 DEBUG    [45c6fc1d444b4690a6b18d5449a15084] awx.main.tasks Executing error task id 18b5c2d8-db2e-4869-b00e-4d363d2ec3e2, subtasks: [{'type': 'job', 'id': 1392}]

Feb 2, 2022 @ 15:39:10.334	awx-b96579f5b-c5fnq	 - 	2022-02-02 10:39:09,475 DEBUG    [45c6fc1d444b4690a6b18d5449a15084] awx.analytics.job_lifecycle job-1392 finalize run

Feb 2, 2022 @ 15:39:10.334	awx-b96579f5b-c5fnq	 - 	2022-02-02 10:39:09,531 WARNING  [45c6fc1d444b4690a6b18d5449a15084] awx.main.dispatch job 1392 (error) encountered an error (rc=None), please see task stdout for details.

Please re-open the issue.

kzinas-adv commented Feb 4, 2022

Fresh, very basic install of the newest AWX (19.5.1), nothing custom, running an AWX "sleep" task: same 4-hour issue.
With "containerLogMaxSize: 906Mi" and "Default Job Idle Timeout": 7200 seconds.
The task produces output every 5 minutes; the last output was at " 3:55:27.994".
I hit the same problem earlier, and this was specifically reproduced on a fresh install of Kubernetes.
Please re-open the issue:

2022-02-03 16:15:20,401 DEBUG    [b79b1e2f560b4bf0ade3c54b6365daad] awx.main.dispatch task 5074d1bd-ed1e-4f5d-a44f-d0c22539be15 starting awx.main.scheduler.tasks.run_task_manager(*[])
2022-02-03 16:15:20,403 DEBUG    [b79b1e2f560b4bf0ade3c54b6365daad] awx.main.scheduler Running task manager.
2022-02-03 16:15:20,431 DEBUG    [b79b1e2f560b4bf0ade3c54b6365daad] awx.main.dispatch task 9395aea8-dee1-4e0c-aa97-842b359c7ebc starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(*[])
2022-02-03 16:15:20,433 DEBUG    [b79b1e2f560b4bf0ade3c54b6365daad] awx.main.scheduler Starting Scheduler
2022-02-03 16:15:20,496 DEBUG    [b79b1e2f560b4bf0ade3c54b6365daad] awx.main.scheduler Finishing Scheduler
2022-02-03 16:15:20,516 INFO     [b680680f766c4c2ca9b8af91e68412a0] awx.main.commands.run_callback_receiver Event processing is finished for Job 2, sending notifications
2022-02-03 16:15:20,516 INFO     [b680680f766c4c2ca9b8af91e68412a0] awx.main.commands.run_callback_receiver Event processing is finished for Job 2, sending notifications
2022-02-03 16:15:20,558 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch task dc9c1945-aa75-4e43-a609-c8db2a173fd1 starting awx.main.tasks.system.handle_success_and_failure_notifications(*[2])
2022-02-03 16:15:20,788 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.tasks.jobs project_update 2 (running) finished running, producing 6 events.
2022-02-03 16:15:20,793 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.analytics.job_lifecycle projectupdate-2 post run
2022-02-03 16:15:20,838 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.analytics.job_lifecycle projectupdate-2 finalize run
2022-02-03 16:15:20,842 WARNING  [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch project_update 2 (failed) encountered an error (rc=None), please see task stdout for details.
2022-02-03 16:15:20,843 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch task 5dde8cc7-0425-4985-a113-a5f891f933a9 starting awx.main.tasks.system.handle_work_error(*['5dde8cc7-0425-4985-a113-a5f891f933a9'])
2022-02-03 16:15:20,844 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.tasks.system Executing error task id 5dde8cc7-0425-4985-a113-a5f891f933a9, subtasks: [{'type': 'project_update', 'id': 2}]
2022-02-03 16:15:20,856 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch task 5dde8cc7-0425-4985-a113-a5f891f933a9 starting awx.main.tasks.system.handle_work_success(*[])
2022-02-03 16:15:20,856 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch task ebd1fa56-08f1-4f4a-87c1-afebf12736ea starting awx.main.scheduler.tasks.run_task_manager(*[])
2022-02-03 16:15:20,857 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.scheduler Running task manager.
2022-02-03 16:15:20,867 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.scheduler Starting Scheduler
2022-02-03 16:15:20,869 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.dispatch task 10e8cd22-c959-4bf8-894f-8b06b25a9c2a starting awx.main.scheduler.tasks.run_task_manager(*[])
2022-02-03 16:15:20,870 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.scheduler Running task manager.
2022-02-03 16:15:20,879 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.scheduler Not running scheduler, another task holds lock
2022-02-03 16:15:20,896 DEBUG    [b680680f766c4c2ca9b8af91e68412a0] awx.main.scheduler Finishing Scheduler
2022-02-03 16:15:30,436 DEBUG    [0613a5a078c04d6bbe25317ca431f08f] awx.main.dispatch task 4a0cc521-95cb-4414-ba2e-86b41f55ecac starting awx.main.tasks.system.awx_periodic_scheduler(*[])

spireob commented Feb 9, 2022

We have also deployed a fresh installation and the problem still occurs. Please re-open the issue.

nerijus commented Feb 9, 2022

We have the same issue. Ping @FedorRub @shanemcd - could you please reopen?

kzinas-adv commented Feb 9, 2022

Also checked with "Default Job Idle Timeout" set to 18000 seconds (5 hours) in GUI Settings -> Jobs.
Still fails after 4 hours as expected.
Please reopen.

FedorRub commented Feb 9, 2022

> We have the same issue. Ping @FedorRub @shanemcd - could you please reopen?

I am not able to do that. I believe one of the mods needs to reopen the issue.

spireob commented Feb 15, 2022

Has there been any progress on this issue? Is it possible to reopen it?

felipe4334 commented Mar 21, 2022

I'm still having the same issue in my report, #11484.
@shanemcd

@TheRealHaoLiu
Member

fixed in ansible/receptor#683
