Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests flakiness and some test seems to be locking up in CI #1266

Closed
carlosperate opened this issue Feb 1, 2021 · 37 comments
Closed

Tests flakiness and some test seems to be locking up in CI #1266

carlosperate opened this issue Feb 1, 2021 · 37 comments
Labels

Comments

@carlosperate
Copy link
Member

carlosperate commented Feb 1, 2021

The GitHub Actions jobs have been queuing since yesterday.
The grey icons are test runs I had to cancel as they were stuck there for 4h+:

image

Other CI runs with a similar lock up:

Update

There was originally 2 issues, one to do with dependencies in Python 3.8 (already fixed) and another problem not yet identified where some runs get stuck: #1266 (comment)

@carlosperate
Copy link
Member Author

Apart from that it looks like we are seeing test flakiness when the CI runs the tests twice for the same commit (one for the tests and one for the executable builds) and the tests fail in one of the runs:

image

image

image

However, due to the issues in the first comment we haven't run the full CI tests in the last few commits, so it is possible this could be fixed in the latest master. We'll have to look into it.

@carlosperate
Copy link
Member Author

The latest commit was failing due to formatting, so I've fixed that up and pushed (30ea2f8). Let's see what results we get now.

@carlosperate
Copy link
Member Author

Okay, I can confirm that the test flakiness is still there in test_editor_restore_session_existing_runtime, as that is the same failure from the three examples I linked before.

So far all the examples of this test failing have been in Python 3.8 in macOS, so something to consider.

@tjguk
Copy link
Collaborator

tjguk commented Feb 1, 2021

(Off the top of my head) I think this is reason that I didn't merge the 3.8 "upgrade" PR at the time -- there don't seem to be suitable wheels for that combination of Python, pygame & MacOS.

@tjguk
Copy link
Collaborator

tjguk commented Feb 1, 2021

(... and thanks for trawling through this, by the way, @carlosperate ...)

@carlosperate
Copy link
Member Author

carlosperate commented Feb 2, 2021

While the issue in some of the CI jobs I posted definitely was due to missing macOS wheels on 3.8. I'm still seeing this issue in other runs. For example:

All the rest stuck in tests\test_app.py.

@carlosperate
Copy link
Member Author

Also had a similar issue in Travis for Linux and Python 3.5:

In this case travis stops the job after 10min without stdout, but it also looks stock at tests/test_app.py.

@carlosperate
Copy link
Member Author

Added another entry to this comment: #1266 (comment)

@carlosperate
Copy link
Member Author

Adding -v -v -s flags to pytest captures this issue. What test specifically from app.py is running when it is thrown is not constant, but this could be expected as the tests are randomised:

tests/test_app.py::test_excepthook Exception: BANG
PASSED
tests/test_app.py::test_animated_splash_set_frame PASSED
tests/test_app.py::test_run Traceback (most recent call last):
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/app.py", line 98, in run
    venv.ensure()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 373, in ensure
    self.ensure_interpreter()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 384, in ensure_interpreter
    raise VirtualEnvironmentError(message)
mu.virtual_environment.VirtualEnvironmentError: Interpreter not found where expected at /Users/microbit-carlos/Library/Application Support/mu/mu_venv/bin/python
Fatal Python error: Aborted

Current thread 0x000070000ce54000 (most recent call first):
<no Python frame>

Thread 0x00000001155e65c0 (most recent call first):
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/app.py", line 223 in run
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/tests/test_app.py", line 152 in test_run
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/python.py", line 1641 in runtest
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 255 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 311 in from_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 215 in call_and_report
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 126 in runtestprotocol
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 323 in _main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 269 in wrap_session
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/config/__init__.py", line 162 in main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/config/__init__.py", line 185 in console_main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/bin/pytest", line 10 in <module>
make: *** [coverage] Abort trap: 6
tests/test_app.py::test_run Traceback (most recent call last):
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/app.py", line 98, in run
    venv.ensure()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 373, in ensure
    self.ensure_interpreter()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 384, in ensure_interpreter
    raise VirtualEnvironmentError(message)
mu.virtual_environment.VirtualEnvironmentError: Interpreter not found where expected at /Users/microbit-carlos/Library/Application Support/mu/mu_venv/bin/python
Fatal Python error: Aborted

Current thread 0x0000700009751000 (most recent call first):
<no Python frame>

Thread 0x000000010c09b5c0 (most recent call first):
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/app.py", line 223 in run
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/tests/test_app.py", line 152 in test_run
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/python.py", line 1641 in runtest
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 255 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 311 in from_call
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 215 in call_and_report
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 126 in runtestprotocol
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 323 in _main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 269 in wrap_session
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/config/__init__.py", line 162 in main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/lib/python3.8/site-packages/_pytest/config/__init__.py", line 185 in console_main
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/.venv/bin/pytest", line 10 in <module>
make: *** [coverage] Abort trap: 6
tests/test_app.py::test_animated_splash_set_frame PASSED
tests/test_app.py::test_worker_run PASSED
tests/test_app.py::test_close_splash_screen Traceback (most recent call last):
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/app.py", line 98, in run
    venv.ensure()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 373, in ensure
    self.ensure_interpreter()
  File "/Users/microbit-carlos/workspace/mine/mu-upstream/mu/virtual_environment.py", line 384, in ensure_interpreter
    raise VirtualEnvironmentError(message)
mu.virtual_environment.VirtualEnvironmentError: Interpreter not found where expected at /Users/microbit-carlos/Library/Application Support/mu/mu_venv/bin/python
...

@tjguk This looks similar to #1291 ?

@ntoll
Copy link
Member

ntoll commented Feb 16, 2021

@carlosperate top digital archaeology there... ;-) I wonder why the Python interpreter isn't copied over. :-/

@carlosperate
Copy link
Member Author

I think I found the issue in my specific case.

That venv was created from an app bundle, so the python executable in the venv was an alias to the interpreter inside that Mu installation. When I deleted that (testing) app bundle it it left an alias pointing to a file that didn't exist any more.

My error traces are from my environment, which might be a different issue from the CI error, even if it looks similar (getting stuck while running the test_app.py tests).

@carlosperate
Copy link
Member Author

carlosperate commented Feb 16, 2021

Got the following from the CI. However this crash doesn't fit the original pattern, where the tests get stuck until cancelled (or GH times out). In this case the error was thrown right away.

Ubuntu 20.04 Py3.7: https://github.com/mu-editor/mu/runs/1911461023?check_suite_focus=true

tests/virtual_environment/test_pip.py::test_pip_uninstall_several_packages_with_flag PASSED
tests/interface/test_main.py::test_Window_add_python3_plotter PASSED
tests/interface/test_main.py::test_Window_get_load_path_no_previous PASSED
tests/interface/test_main.py::test_Window_show_confirmation_default PASSED
tests/interface/test_main.py::test_Window_zoom_in PASSED
tests/interface/test_main.py::test_StatusBar_device_connected_microbit PASSED
tests/interface/test_main.py::test_Window_add_plotter PASSED
tests/interface/test_main.py::test_StatusBar_set_mode PASSED
tests/interface/test_main.py::test_Window_update_debug_inspector_with_exception PASSED
tests/interface/test_main.py::test_Window_toggle_comments PASSED
tests/interface/test_main.py::test_Window_select_mode_cancelled PASSED
tests/interface/test_main.py::test_Window_open_directory_from_os_windows PASSED
tests/interface/test_main.py::test_Window_remove_python_runner PASSED
tests/interface/test_main.py::test_Window_show_find_replace PASSED
tests/interface/test_main.py::test_Window_highlight_text PASSED
tests/interface/test_main.py::test_Window_stop_timer PASSED
tests/interface/test_main.py::test_Window_set_theme PASSED
tests/interface/test_main.py::test_ButtonBar_change_mode PASSED
tests/interface/test_main.py::test_Window_replace_text_not_global_missing PASSED
tests/interface/test_main.py::test_Window_set_zoom PASSED
tests/interface/test_main.py::test_Window_change_mode PASSED
tests/interface/test_main.py::test_Window_highlight_text_backward PASSED
tests/interface/test_main.py::test_Window_show_admin PASSED
tests/interface/test_main.py::test_Window_connect_find_replace PASSED
tests/interface/test_main.py::test_Window_setup PASSED
tests/interface/test_main.py::test_Window_add_filesystem PASSED
tests/interface/test_main.py::test_Window_get_microbit_path PASSED
tests/interface/test_main.py::test_Window_zoom_out PASSED
tests/interface/test_main.py::test_FileTabs_removeTab_ok PASSED
tests/interface/test_main.py::test_Window_set_checker_icon PASSED
tests/interface/test_main.py::test_ButtonBar_reset PASSED
tests/interface/test_main.py::test_Window_connect_find_again PASSED
tests/interface/test_main.py::test_FileTabs_change_tab PASSED
tests/interface/test_main.py::test_StatusBar_init PASSED
tests/interface/test_main.py::test_Window_get_load_path_with_previous PASSED
tests/interface/test_main.py::test_Window_open_directory_from_os_darwin PASSED
tests/interface/test_main.py::test_Window_highlight_text_no_tab PASSED
tests/interface/test_main.py::test_FileTabs_init PASSED
tests/interface/test_main.py::test_StatusBar_connect_logs PASSED
tests/interface/test_main.py::test_Window_update_title PASSED
tests/interface/test_main.py::test_Window_update_debug_inspector PASSED
tests/interface/test_main.py::test_Window_connect_tab_rename PASSED
tests/interface/test_main.py::test_Window_sync_packages PASSED
tests/interface/test_main.py::test_Window_open_file_event PASSED
tests/interface/test_main.py::test_Window_resizeEvent PASSED
tests/interface/test_main.py::test_Window_add_filesystem_open_signal Fatal Python error: Segmentation fault

Thread 0x00007f2d29a41740 (most recent call first):
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/unittest/mock.py", line 1966 in __init__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/unittest/mock.py", line 1925 in _mock_set_magics
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/unittest/mock.py", line 1901 in __init__
  File "/home/runner/work/mu/mu/tests/interface/test_main.py", line 803 in test_Window_add_filesystem_open_signal
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/python.py", line 1641 in runtest
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 255 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 311 in from_call
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 255 in call_runtest_hook
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 215 in call_and_report
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 126 in runtestprotocol
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/main.py", line 323 in _main
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/main.py", line 269 in wrap_session
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/config/__init__.py", line 163 in main
  File "/opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages/_pytest/config/__init__.py", line 185 in console_main
  File "/opt/hostedtoolcache/Python/3.7.9/x64/bin/pytest", line 8 in <module>
make: *** [Makefile:56: coverage] Segmentation fault (core dumped)
Error: Process completed with exit code 2.

@carlosperate
Copy link
Member Author

Repeating the CI tests until a "getting stuck" error was encountered. In this case I cancelled the run after an hour, and unfortunately in that case we don't get any additional output:

Windows 2016 Py3.7: https://github.com/mu-editor/mu/runs/1911754485?check_suite_focus=true

2021-02-16T15:35:56.9575181Z ============================= test session starts =============================
2021-02-16T15:35:56.9577215Z platform win32 -- Python 3.7.9, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
2021-02-16T15:35:56.9578510Z Using --random-order-bucket=module
2021-02-16T15:35:56.9579652Z Using --random-order-seed=720814
2021-02-16T15:35:56.9580456Z 
2021-02-16T15:35:56.9581361Z rootdir: D:\a\mu\mu, configfile: pytest.ini
2021-02-16T15:35:56.9582561Z plugins: cov-2.11.1, random-order-1.0.4
2021-02-16T15:35:56.9583564Z collected 1010 items
2021-02-16T15:35:56.9584268Z 
2021-02-16T15:36:05.5027060Z tests\interface\test_panes.py .......................................... [  4%]
2021-02-16T15:36:05.8526916Z ........................................................................ [ 11%]
2021-02-16T15:36:06.1074081Z .............................................s.......                    [ 16%]
2021-02-16T15:36:06.1395468Z tests\modes\test_pyboard.py .........                                    [ 17%]
2021-02-16T15:36:06.6680166Z tests\virtual_environment\test_process.py ......                         [ 18%]
2021-02-16T15:36:06.9005490Z tests\modes\test_python3.py ...........................                  [ 20%]
2021-02-16T15:36:07.0259743Z tests\interface\test_dialogs.py ...s...s.s.s..s...s...ss...sss...ss      [ 24%]
2021-02-16T15:36:07.1852008Z tests\modes\test_esp.py ........................                         [ 26%]
2021-02-16T15:36:07.1852969Z tests\modes\test_lego.py .                                               [ 26%]
2021-02-16T15:36:07.2611200Z tests\modes\test_web.py ....................                             [ 28%]
2021-02-16T15:36:09.0221527Z tests\modes\test_microbit.py ........................................... [ 32%]
2021-02-16T15:36:09.1112768Z .........                                                                [ 33%]
2021-02-16T15:36:09.1796862Z tests\modes\test_pygamezero.py ..................                        [ 35%]
2021-02-16T15:36:09.3721706Z tests\modes\test_base.py ............................................... [ 40%]
2021-02-16T15:36:09.4274239Z ...........s...                                                          [ 41%]
2021-02-16T15:36:09.4429824Z tests\interface\test_themes.py ....                                      [ 42%]
2021-02-16T15:36:09.6039506Z tests\modes\test_debug.py ........................................       [ 46%]
2021-02-16T15:36:09.9771833Z tests\interface\test_editor.py ......................................... [ 50%]
2021-02-16T15:36:10.0169513Z .......                                                                  [ 50%]
2021-02-16T17:12:18.3439979Z tests\test_app.py ......
2021-02-16T17:12:18.3832207Z ##[error]The operation was canceled.

@carlosperate
Copy link
Member Author

carlosperate commented Feb 16, 2021

This run with the extra pytes flags is still stuck after 4h, let's see if tomorrow morning it has timed-out or stopped somehow. I doubt it'll actually include the stdout, but is worth checking: https://github.com/mu-editor/mu/runs/1913382027


Update: After the GH CI timeout we don't get any additional info than we do when it is manually stopped:

2021-02-16T19:39:09.9144931Z ============================= test session starts =============================
2021-02-16T19:39:09.9145769Z platform win32 -- Python 3.7.9, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
2021-02-16T19:39:09.9146418Z Using --random-order-bucket=module
2021-02-16T19:39:09.9146969Z Using --random-order-seed=229623
2021-02-16T19:39:09.9147314Z 
2021-02-16T19:39:09.9147750Z rootdir: D:\a\mu\mu, configfile: pytest.ini
2021-02-16T19:39:09.9148729Z plugins: cov-2.11.1, random-order-1.0.4
2021-02-16T19:39:09.9149261Z collected 1010 items
2021-02-16T19:39:09.9149548Z 
2021-02-17T01:37:17.9574540Z tests\test_app.py .....
2021-02-17T01:37:17.9936852Z ##[error]The operation was canceled.

@ntoll
Copy link
Member

ntoll commented Feb 17, 2021

Random conjectures, that are probably wrong. Just throwing them out there in case they stick...

  • At some point during the tests Mu attempts to create itself a venv.
  • Since the tests are in a random order, the success of this creation of a venv depends on some sort of (FS/venv based?) state.
  • Could the venv be interfering with the test runner in some way..?
  • The errors look like they could be a problem with the GHA infrastructure..? We're running on a bad node/pod in a k8s cluster for example..?

🤷

@carlosperate
Copy link
Member Author

Do the tests run in parallel? Could we have two processes trying to do the same thing (like creating a venv) and interfering with each other?
We could also have a partially created venv by one process, that could still be installing things, while other test sees the venv and assumes everything needed is there? I guess in those cases we are more likely to see a crash from a missing module, not sure what would be the cause of the infinite lock up.

The Travis output provides a tiny bit more, which could be helpful:
https://travis-ci.org/github/mu-editor/mu/jobs/759194961

tests/debugger/test_runner.py::test_Debugger_do_step PASSED
tests/test_app.py::test_excepthook PASSED
tests/test_app.py::test_animated_splash_set_frame PASSED
tests/test_app.py::test_run Already using interpreter /home/travis/build/mu-editor/mu/venv/bin/python3

Path not in prefix '/home/travis/build/mu-editor/mu/venv/include/python3.5m' '/usr'

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

@ntoll
Copy link
Member

ntoll commented Feb 17, 2021

@carlosperate good point.... I guess we could do the following when in CI:

  • Ensure Mu's venv is always created before the tests run.
  • Ensure tests are run consecutively rather than in parallel.

@carlosperate
Copy link
Member Author

carlosperate commented Feb 17, 2021

Sounds like pytest runs test sequentially by default: https://www.tutorialspoint.com/pytest/pytest_run_tests_in_parallel.htm
I had a search in the current pytest docs and looks like this is still the case.

Ensure Mu's venv is always created before the tests run.

Yeah, we should definitely try running python -m mu.wheels before the tests and see if it still hangs.


I've been running the tests in my environment in a loop (and deleting the venv between iterations) and could not replicate the issue. However I've noticed that during these steps the tests do take a few seconds to continue at the same point and the PyQt warnings it prints are interesting:

tests/test_app.py::test_animated_splash_init PASSED
tests/test_app.py::test_debug_no_args PASSED
tests/test_app.py::test_run QObject: Cannot create children for a parent that is in a different thread.
(Parent is Process(0x7fe60ad000d0), parent's thread is QThread(0x7fe607d09350), current thread is QThread(0x7fe60d062e90)
QObject: Cannot create children for a parent that is in a different thread.
(Parent is Process(0x7fe60ad000d0), parent's thread is QThread(0x7fe607d09350), current thread is QThread(0x7fe60d062e90)
QObject: Cannot create children for a parent that is in a different thread.
(Parent is Process(0x7fe60ac11970), parent's thread is QThread(0x7fe607d09350), current thread is QThread(0x7fe60d062e90)
PASSED

Sounds quite plausible that this is related to the problem.

@tjguk
Copy link
Collaborator

tjguk commented Feb 22, 2021

Agree it's worth getting to grips with. I don't have bandwidth right now, but definitely want to get a better handle on the use of Qt explicit/implicit threads in a testing (non-GUI) context.

@carlosperate
Copy link
Member Author

I'm getting similar warnings on the macOS console when running Mu normally. Not sure yet of the cause:

$ python run.py
qt.qpa.drawing: Layer-backing cannot be explicitly controlled on 10.14 when built against the 10.14 SDK
QObject: Cannot create children for a parent that is in a different thread.
(Parent is Process(0x7faaabfdf990), parent's thread is QThread(0x7faaabc0d620), current thread is QThread(0x7faaabcde690)

@carlosperate
Copy link
Member Author

Trying to trigger the Cannot create children for a parent that is in a different thread while running Mu (not in the tests, but during a normal run) and I can only get it to print that when I delete the venv and Mu has to recreate it.

QObject: Cannot create children for a parent that is in a different thread.
(Parent is Process(0x7fad9ac16ae0), parent's thread is QThread(0x7fad9cc165a0), current thread is QThread(0x7fad9fa9b740)

Poking a bit on the code and it looks like it happens at the ipython kernel installation in the venv:

def install_jupyter_kernel(self):
kernel_name = '"Python/Mu ({})"'.format(self.name)
logger.info("Installing Jupyter Kernel %s", kernel_name)
return self.run_python(
"-m",
"ipykernel",
"install",
"--user",
"--name",
self.name,
"--display-name",
kernel_name,
)

return self.process.run_blocking(self.interpreter, args)

def run_blocking(self, command, args, wait_for_s=30.0, **envvars):
self._set_up_run(**envvars)

def _set_up_run(self, **envvars):
"""Run the process with the command and args"""
self.process = QProcess(self)

@tjguk
Copy link
Collaborator

tjguk commented Mar 11, 2021

Thanks, @carlosperate -- my understanding was that a Qt QProcess would not instantiate a new thread, but would use the underlying OS Process creation (in Windows: CreateProcess, elsewhere maybe fork). I'll try to read around the subject to understand a safe way forward.

We could obviously switch back to using a subprocess runner, but the advantage of the QProcess solution was that it's easy to use the signalling mechanism to keep a text box updated. However, it's certainly not impossible to achieve the same thing with a standard subprocess.Popen communication channel, so that might be a way to go.

@carlosperate
Copy link
Member Author

A couple of weeks ago the flakiness got worse, from a random job failing sometimes, to several jobs constantly failing on all test runs.
The latest commit in master 37b4b is the first full green CI run we had an a long time, but it looks like we still have some random lock ups.

Branch https://github.com/mu-editor/mu/tree/test-flakiness was created a while ago, I've rebased and created PR #1364 to collect test runs from the same commit and capture the current failure rate there


Correction: We also had this full green CI 5 days ago, so maybe the one from today could have been a fluke as well, but we'll find out soon.
https://github.com/mu-editor/mu/actions/runs/657627856
Apart from these two exceptions no other GH Action test run passed in the last 3 weeks.

@tjguk
Copy link
Collaborator

tjguk commented Mar 16, 2021

Thanks for tracking, @carlosperate. From where you are, does it seem more likely to be some change in the CI infrastructure, which might have shaken out some latent race condition? Or some change in our codebase, eg around the Splash screen (the most obvious candidate) which started causing the problem?

@carlosperate
Copy link
Member Author

carlosperate commented Mar 16, 2021

We can take an old commit and restart the tests, if they start to fail it might GH platform related.

Took this random commit, around this time all merges to master were fully green:
https://github.com/mu-editor/mu/actions/runs/525254673

Pushed as a new branch, let's see what the result is: https://github.com/mu-editor/mu/actions/runs/658293392

To be fair this might be too long ago (31st Jan), if this passes without issues I'll get a little bit closer to today.


Edit: Run https://github.com/mu-editor/mu/actions/runs/658293392 3 times and they all passed.

@carlosperate
Copy link
Member Author

Ahh, the conversation on Gitter reminded me that this issue was also present in TravisCI:

So it was likely introduced by a code change.

@tjguk
Copy link
Collaborator

tjguk commented Mar 20, 2021

Having just had a bunch of test runs hang -- I've just added a timeout of 30s to pytest. My point is simply to make it fail faster if it's going to hang (assuming that the pytest timeout has enough clout)

@tjguk
Copy link
Collaborator

tjguk commented Mar 20, 2021

The 30s timeout doesn't seem to achieve anything. I've applied pytest skip markers to progressively more tests. However, for the first time, just now I saw the [Ubuntu latest 3.8] run stall on "Install Mu dependencies" -- which suggests that it might not be our tests suite as such which is stalling, but the GH infrastructure.

@tjguk
Copy link
Collaborator

tjguk commented Mar 20, 2021

... ok; looks like that last run was a test stall. When I cancelled it, the logs eventually updated and failed after the tests dialogs.

https://github.com/mu-editor/mu/runs/2156931209?check_suite_focus=true

@tjguk
Copy link
Collaborator

tjguk commented Mar 20, 2021

After a lot of to-ing and fro-ing and skipping tests: it looks like it's something in test_app which is causing the hangs. I've added a 30s timeout to the checkout & test steps; and a 5 minute timeout overall. That's just to try to get a faster fail if we get one!

Trying now to narrow down the individual tests

@tjguk
Copy link
Collaborator

tjguk commented Mar 21, 2021

(Work happening over at #1388)

@tjguk
Copy link
Collaborator

tjguk commented Mar 21, 2021

Ok; merged #1388 into master -- but still more problems inside test_app:test_worker_run

https://github.com/mu-editor/mu/runs/2157592932?check_suite_focus=true

@tjguk
Copy link
Collaborator

tjguk commented Mar 21, 2021

Correction: in both cases (Ubuntu 16 & Ubuntu latest) test_app:test_worker_run is the last test which succeeds, so presumably either the next one's causing a hang, or some dangling thread or zombie process is preventing pytest from continuing.

@tjguk
Copy link
Collaborator

tjguk commented Mar 21, 2021

Work continuing in #1389 -- seeing if we can isolate a specific test or pattern of tests which cause the hang

@carlosperate
Copy link
Member Author

carlosperate commented Mar 22, 2021

... ok; looks like that last run was a test stall. When I cancelled it, the logs eventually updated and failed after the tests dialogs.

https://github.com/mu-editor/mu/runs/2156931209?check_suite_focus=true

Yeah, that happens sometimes, often it's really stalled in the pytest step, but the log show that until it times out or gets manually cancelled.

Sometimes the pip install or apt-install steps look like they get stuck, but in the end they always (as far as I've noticed) finish, it's just than instead taking less than a minute, they might take over 5min.
That doesn't happen often, so it's okay if the current timeout is too short for those cases, it usually works fine the next time, and we can keep an eye if that happens often.

After a lot of to-ing and fro-ing and skipping tests: it looks like it's something in test_app which is causing the hangs.

Yeah, the bast majority of the logs collected here and in #1364 all get stuck in test_app.

Work continuing in #1389 -- seeing if we can isolate a specific test or pattern of tests which cause the hang

So is the current conclusion or theory that the issue is in test_close_splash_screen (or caused somewhere else, but uncovered there)?

@tjguk
Copy link
Collaborator

tjguk commented Mar 22, 2021

So is the current conclusion or theory that the issue is in test_close_splash_screen (or caused somewhere else, but uncovered there)?

Basically, yes. At least, leaving that one test skipped seems to give us a (nearly) flawless set of runs. The only failure I've had since wasn't a timeout but a segfault. (I've got the logs for that one stored locally somewhere).

@carlosperate
Copy link
Member Author

I think the original issue causing the issues listed here has been resolved.
We still have some segmentation faults, but these are captured in #1454, so we can close this ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants