Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible regression - there is no tailed log from a failed unit #531

Open
nobuto-m opened this issue Aug 10, 2022 · 3 comments
Open

Possible regression - there is no tailed log from a failed unit #531

nobuto-m opened this issue Aug 10, 2022 · 3 comments
Labels

Comments

@nobuto-m
Copy link
Contributor

When #518 was merged, I'm pretty sure we were able to see the unit log from a failed unit (zaza.model.run_on_unit and tail).

However, I've realized that there is no tailed log from recent CI jobs. For example, the following one has [ERROR] Juju log for octavia/2 as expected but there is no following lines of the actual log. But we see trace back from zaza instead.

https://openstack-ci-reports.ubuntu.com/artifacts/22c/848297/1/check/focal-xena-ha/22cd3cb/job-output.txt

2022-08-10 01:47:53.195019 | focal-medium | 2022-08-10 01:47:53 [INFO] Deploy of bundle completed.
2022-08-10 01:47:53.707547 | focal-medium | 2022-08-10 01:47:53 [INFO] Waiting for environment to settle
2022-08-10 01:47:53.708863 | focal-medium | 2022-08-10 01:47:53 [INFO] Timeout for deployment to settle set to: 7200
2022-08-10 01:47:53.709183 | focal-medium | 2022-08-10 01:47:53 [INFO] Maximum resolve count for deployment set to: 3
2022-08-10 01:47:53.710657 | focal-medium | 2022-08-10 01:47:53 [INFO] Waiting for application states to reach targeted states.
2022-08-10 01:47:53.789756 | focal-medium | 2022-08-10 01:47:53 [WARNING] unknown facade EnvironUpgrader
2022-08-10 01:47:53.789871 | focal-medium | 2022-08-10 01:47:53 [WARNING] unexpected facade EnvironUpgrader found, unable to decipher version to use
2022-08-10 01:47:54.363949 | focal-medium | 2022-08-10 01:47:54 [INFO] Waiting for an application to be present
2022-08-10 01:47:54.364242 | focal-medium | 2022-08-10 01:47:54 [INFO] Now checking workload status and status messages
2022-08-10 01:51:45.821781 | focal-medium | 2022-08-10 01:51:45 [INFO] Application rabbitmq-server is ready.
2022-08-10 01:52:59.849512 | focal-medium | 2022-08-10 01:52:59 [WARNING] Unit octavia/2 is in error state. Attempt number 1 to resolve
2022-08-10 01:52:59.850338 | focal-medium | 2022-08-10 01:52:59 [INFO] Resolving unit: octavia/2
2022-08-10 01:54:37.322669 | focal-medium | 2022-08-10 01:54:37 [INFO] Application mysql-innodb-cluster is ready.
2022-08-10 01:55:07.064651 | focal-medium | 2022-08-10 01:55:07 [INFO] Application glance-simplestreams-sync is ready.
2022-08-10 01:55:08.769919 | focal-medium | 2022-08-10 01:55:08 [WARNING] Unit octavia/2 is in error state. Attempt number 2 to resolve
2022-08-10 01:55:08.771078 | focal-medium | 2022-08-10 01:55:08 [INFO] Resolving unit: octavia/2
2022-08-10 01:55:34.084292 | focal-medium | 2022-08-10 01:55:34 [WARNING] Unit octavia/2 is in error state. Attempt number 3 to resolve
2022-08-10 01:55:34.085231 | focal-medium | 2022-08-10 01:55:34 [INFO] Resolving unit: octavia/2
2022-08-10 01:55:42.918623 | focal-medium | 2022-08-10 01:55:42 [WARNING] Unit octavia/1 is in error state. Attempt number 1 to resolve
2022-08-10 01:55:42.919040 | focal-medium | 2022-08-10 01:55:42 [INFO] Resolving unit: octavia/1
2022-08-10 01:57:15.165771 | focal-medium | 2022-08-10 01:57:15 [INFO] Application keystone-mysql-router is ready.
2022-08-10 01:57:31.165681 | focal-medium | 2022-08-10 01:57:31 [INFO] Application neutron-mysql-router is ready.
2022-08-10 01:57:59.779868 | focal-medium | 2022-08-10 01:57:59 [WARNING] Unit octavia/0 is in error state. Attempt number 1 to resolve
2022-08-10 01:57:59.781388 | focal-medium | 2022-08-10 01:57:59 [INFO] Resolving unit: octavia/0
2022-08-10 01:58:16.313053 | focal-medium | 2022-08-10 01:58:16 [INFO] Application placement-mysql-router is ready.
2022-08-10 01:58:18.376355 | focal-medium | 2022-08-10 01:58:18 [INFO] Application glance-mysql-router is ready.
2022-08-10 01:58:24.951056 | focal-medium | 2022-08-10 01:58:24 [INFO] Application nova-mysql-router is ready.
2022-08-10 01:58:42.649698 | focal-medium | 2022-08-10 01:58:42 [ERROR] {'default_alias': 'zaza-dd75b148e4fc'}
2022-08-10 01:58:42.649809 | focal-medium | 2022-08-10 01:58:42 [ERROR] Model default_alias (zaza-dd75b148e4fc)
2022-08-10 01:58:42.942621 | focal-medium | 2022-08-10 01:58:42 [ERROR] Applications in error state: octavia
2022-08-10 01:58:42.942687 | focal-medium | 2022-08-10 01:58:42 [ERROR] Units in error state: octavia/2
2022-08-10 01:58:42.942699 | focal-medium | 2022-08-10 01:58:42 [ERROR] Juju log for octavia/2
2022-08-10 01:58:44.532324 | focal-medium | Traceback (most recent call last):
2022-08-10 01:58:44.532422 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 135, in run_env_deployment
2022-08-10 01:58:44.532462 | focal-medium |     deploy.deploy(
2022-08-10 01:58:44.532483 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/deploy.py", line 404, in deploy
2022-08-10 01:58:44.532498 | focal-medium |     zaza.model.wait_for_application_states(
2022-08-10 01:58:44.532571 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/__init__.py", line 221, in _wrapper
2022-08-10 01:58:44.532597 | focal-medium |     return future.result(timeout)
2022-08-10 01:58:44.532609 | focal-medium |   File "/usr/lib/python3.8/concurrent/futures/_base.py", line 444, in result
2022-08-10 01:58:44.532621 | focal-medium |     return self.__get_result()
2022-08-10 01:58:44.532632 | focal-medium |   File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
2022-08-10 01:58:44.532644 | focal-medium |     raise self._exception
2022-08-10 01:58:44.532655 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/__init__.py", line 206, in _runner
2022-08-10 01:58:44.532666 | focal-medium |     return await f(*args, **kwargs)
2022-08-10 01:58:44.532699 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/model.py", line 1635, in async_wait_for_application_states
2022-08-10 01:58:44.532712 | focal-medium |     ok = check_unit_workload_status(
2022-08-10 01:58:44.532724 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/model.py", line 1360, in check_unit_workload_status
2022-08-10 01:58:44.532736 | focal-medium |     check_model_for_hard_errors(model)
2022-08-10 01:58:44.532761 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/model.py", line 1338, in check_model_for_hard_errors
2022-08-10 01:58:44.532773 | focal-medium |     raise UnitError(errored_units)
2022-08-10 01:58:44.532784 | focal-medium | zaza.model.UnitError: Units octavia/2 in error state
2022-08-10 01:58:44.532803 | focal-medium |
2022-08-10 01:58:44.532816 | focal-medium | During handling of the above exception, another exception occurred:
2022-08-10 01:58:44.532827 | focal-medium |
2022-08-10 01:58:44.532838 | focal-medium | Traceback (most recent call last):
2022-08-10 01:58:44.532852 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/bin/functest-run-suite", line 8, in <module>
2022-08-10 01:58:44.532876 | focal-medium |     sys.exit(main())
2022-08-10 01:58:44.532905 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 383, in main
2022-08-10 01:58:44.532928 | focal-medium |     func_test_runner(
2022-08-10 01:58:44.532952 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 295, in func_test_runner
2022-08-10 01:58:44.532978 | focal-medium |     run_env_deployment(env_deployment, keep_model=preserve_model,
2022-08-10 01:58:44.533002 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 182, in run_env_deployment
2022-08-10 01:58:44.533090 | focal-medium |     failure_report(model_aliases)
2022-08-10 01:58:44.533116 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 82, in failure_report
2022-08-10 01:58:44.533127 | focal-medium |     log_output = zaza.model.run_on_unit(
2022-08-10 01:58:44.533139 | focal-medium | KeyError: 'stdout'
2022-08-10 01:58:44.717953 | focal-medium | 2022-08-10 01:58:44 [ERROR] Fatal error on SSL transport
2022-08-10 01:58:44.718024 | focal-medium | protocol: <asyncio.sslproto.SSLProtocol object at 0x7fd3e0aef8b0>
2022-08-10 01:58:44.718035 | focal-medium | transport: <_SelectorSocketTransport closing fd=7>
2022-08-10 01:58:44.718042 | focal-medium | Traceback (most recent call last):
2022-08-10 01:58:44.718049 | focal-medium |   File "/usr/lib/python3.8/asyncio/selector_events.py", line 910, in write
2022-08-10 01:58:44.718057 | focal-medium |     n = self._sock.send(data)
2022-08-10 01:58:44.718064 | focal-medium | OSError: [Errno 9] Bad file descriptor
2022-08-10 01:58:44.718073 | focal-medium |
2022-08-10 01:58:44.718113 | focal-medium | During handling of the above exception, another exception occurred:
2022-08-10 01:58:44.718129 | focal-medium |
2022-08-10 01:58:44.718137 | focal-medium | Traceback (most recent call last):
2022-08-10 01:58:44.718176 | focal-medium |   File "/usr/lib/python3.8/asyncio/sslproto.py", line 685, in _process_write_backlog
2022-08-10 01:58:44.718186 | focal-medium |     self._transport.write(chunk)
2022-08-10 01:58:44.718193 | focal-medium |   File "/usr/lib/python3.8/asyncio/selector_events.py", line 916, in write
2022-08-10 01:58:44.718200 | focal-medium |     self._fatal_error(exc, 'Fatal write error on socket transport')
2022-08-10 01:58:44.718206 | focal-medium |   File "/usr/lib/python3.8/asyncio/selector_events.py", line 711, in _fatal_error
2022-08-10 01:58:44.718213 | focal-medium |     self._force_close(exc)
2022-08-10 01:58:44.718220 | focal-medium |   File "/usr/lib/python3.8/asyncio/selector_events.py", line 723, in _force_close
2022-08-10 01:58:44.718226 | focal-medium |     self._loop.call_soon(self._call_connection_lost, exc)
2022-08-10 01:58:44.718233 | focal-medium |   File "/usr/lib/python3.8/asyncio/base_events.py", line 719, in call_soon
2022-08-10 01:58:44.718240 | focal-medium |     self._check_closed()
2022-08-10 01:58:44.718247 | focal-medium |   File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
2022-08-10 01:58:44.718253 | focal-medium |     raise RuntimeError('Event loop is closed')
2022-08-10 01:58:44.718260 | focal-medium | RuntimeError: Event loop is closed
2022-08-10 01:58:44.718586 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.718609 | focal-medium | task: <Task pending name='Task-137' coro=<WebSocketCommonProtocol.transfer_data() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py:674> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e0adc100>()]> cb=[<TaskWakeupMethWrapper object at 0x7fd3e0afa160>(), _wait.<locals>._on_completion() at /usr/lib/python3.8/asyncio/tasks.py:518]>
2022-08-10 01:58:44.718843 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.718867 | focal-medium | task: <Task pending name='Task-139' coro=<WebSocketCommonProtocol.close_connection() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py:1019> wait_for=<Task pending name='Task-137' coro=<WebSocketCommonProtocol.transfer_data() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py:674> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e0adc100>()]> cb=[<TaskWakeupMethWrapper object at 0x7fd3e0afa160>(), _wait.<locals>._on_completion() at /usr/lib/python3.8/asyncio/tasks.py:518]>>
2022-08-10 01:58:44.720619 | focal-medium | Exception ignored in: <generator object WebSocketCommonProtocol.close_connection at 0x7fd3e0af0ba0>
2022-08-10 01:58:44.720646 | focal-medium | Traceback (most recent call last):
2022-08-10 01:58:44.720656 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py", line 1056, in close_connection
2022-08-10 01:58:44.720664 | focal-medium |     if (yield from self.wait_for_connection_lost()):
2022-08-10 01:58:44.720671 | focal-medium |   File "/home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py", line 1077, in wait_for_connection_lost
2022-08-10 01:58:44.720678 | focal-medium |     yield from asyncio.wait_for(
2022-08-10 01:58:44.720685 | focal-medium |   File "/usr/lib/python3.8/asyncio/tasks.py", line 472, in wait_for
2022-08-10 01:58:44.720692 | focal-medium |     timeout_handle = loop.call_later(timeout, _release_waiter, waiter)
2022-08-10 01:58:44.720699 | focal-medium |   File "/usr/lib/python3.8/asyncio/base_events.py", line 687, in call_later
2022-08-10 01:58:44.720705 | focal-medium |     timer = self.call_at(self.time() + delay, callback, *args,
2022-08-10 01:58:44.720712 | focal-medium |   File "/usr/lib/python3.8/asyncio/base_events.py", line 698, in call_at
2022-08-10 01:58:44.720719 | focal-medium |     self._check_closed()
2022-08-10 01:58:44.720725 | focal-medium |   File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
2022-08-10 01:58:44.720732 | focal-medium |     raise RuntimeError('Event loop is closed')
2022-08-10 01:58:44.720738 | focal-medium | RuntimeError: Event loop is closed
2022-08-10 01:58:44.720938 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.720962 | focal-medium | task: <Task pending name='Task-138' coro=<WebSocketCommonProtocol.keepalive_ping() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py:977> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e00b8fa0>()]>>
2022-08-10 01:58:44.721201 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.721225 | focal-medium | task: <Task pending name='Task-145' coro=<Connection._pinger() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/juju/client/connection.py:578> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e0afaa90>()]>>
2022-08-10 01:58:44.721381 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.721400 | focal-medium | task: <Task pending name='Task-140' coro=<Connection._receiver() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/juju/client/connection.py:530> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e0af3490>()]>>
2022-08-10 01:58:44.721688 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.721711 | focal-medium | task: <Task pending name='Task-4330' coro=<WebSocketCommonProtocol.recv() running at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/websockets/protocol.py:416> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e00b32b0>()]> cb=[create_task_with_handler.<locals>._task_result_exp_handler(task_name='tmp', logger=<Logger juju....ection (INFO)>)() at /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/lib/python3.8/site-packages/juju/jasyncio.py:64]>
2022-08-10 01:58:44.721873 | focal-medium | 2022-08-10 01:58:44 [ERROR] Task was destroyed but it is pending!
2022-08-10 01:58:44.721893 | focal-medium | task: <Task pending name='Task-4331' coro=<Event.wait() running at /usr/lib/python3.8/asyncio/locks.py:309> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fd3e1bfdd60>()]>>
2022-08-10 01:58:44.958119 | focal-medium | ERROR: InvocationError for command /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/src/.tox/func-target/bin/functest-run-suite --keep-model --bundle focal-xena-ha (exited with code 1)
2022-08-10 01:58:44.961271 | focal-medium | ___________________________________ summary ____________________________________
2022-08-10 01:58:44.961319 | focal-medium | ERROR:   func-target: commands failed
2022-08-10 01:58:45.057241 | focal-medium | ERROR: InvocationError for command /home/ubuntu/src/review.opendev.org/openstack/charm-octavia/.tox/func-target/bin/tox -e func-target focal-xena-ha (exited with code 1)
2022-08-10 01:58:45.061239 | focal-medium | ___________________________________ summary ____________________________________
2022-08-10 01:58:45.061328 | focal-medium | ERROR:   func-target: commands failed
2022-08-10 01:58:45.412805 | focal-medium | ERROR
2022-08-10 01:58:45.413857 | focal-medium | {
2022-08-10 01:58:45.414049 | focal-medium |   "delta": "0:13:22.343763",
2022-08-10 01:58:45.414328 | focal-medium |   "end": "2022-08-10 01:58:45.156298",
2022-08-10 01:58:45.414453 | focal-medium |   "msg": "non-zero return code",
2022-08-10 01:58:45.414537 | focal-medium |   "rc": 1,
2022-08-10 01:58:45.414614 | focal-medium |   "start": "2022-08-10 01:45:22.812535"
2022-08-10 01:58:45.414687 | focal-medium | }
@nobuto-m
Copy link
Contributor Author

KeyError: 'stdout' from:

log_output = zaza.model.run_on_unit(
unit,
'tail -{} {}'.format(
error_lines,
os.path.join('/var/log/juju', unit_log)),
model_name=model_name)['stdout']

@ajkavanagh
Copy link
Contributor

KeyError: 'stdout' from:

log_output = zaza.model.run_on_unit(
unit,
'tail -{} {}'.format(
error_lines,
os.path.join('/var/log/juju', unit_log)),
model_name=model_name)['stdout']

That 'stdout' should now be 'Stdout'; I'm guessing that when zaza went fully async this was further harmonised across the whole code-base. Originally, juju used to respond with 'Stdout', and then this changed, but by then zaza was locked into 'Stdout', so it was that used everywhere. There's code that automatically adjusts 'stdout' to 'Stdout' and I guess it was finally enforced here.

@ajkavanagh ajkavanagh added the bug label Aug 10, 2022
@nobuto-m
Copy link
Contributor Author

I guess stdout should work still.

zaza/zaza/model.py

Lines 499 to 510 in bb59dfc

def _normalise_action_results(results):
"""Put action results in a consistent format.
:param results: Results dictionary to process.
:type results: Dict[str, str]
:returns: {
'Code': '',
'Stderr': '',
'Stdout': '',
'stderr': '',
'stdout': ''}
:rtype: Dict[str, str]

This can be from the whole libjuju drama so it can go away as it's pinned by now.
juju/python-libjuju#707

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants