Skip to content

Conversation

juj
Copy link
Collaborator

@juj juj commented Oct 7, 2025

A major headache when working with the browser test runs in both the single-threaded and parallel harness on the CI are stray browser processes that stay alive across browser test runs.

This happens with Safari, and also with Firefox browser, on each of Windows, Linux and macOS systems.

It is easy to get especially MacOS systems to hang so that one cannot even remote-desktop in to the CI hardware anymore, since all system resources are taken up by the runaway browser processes.

One might argue that these are just "bugs", and after we fix those one by one, we would have a working test harness where browser processes are cleanly shut down so that stray browser processes would not remain.

That might be the case, though we need a system that is robust to these types of bugs.

To help this, add a failsafe mechanism in the form of a new --force-browser-process-termination flag to the test runner, to help tear down browser processes before and after a test suite run. This way even if the harness is misbehaving, there will be a backstop that aggressively attempts to clean up all browsers between test runs.

This way I can have the CI tasks force-shutdown all runners between suite runs, to ensure that even if things are still not working 100%, I will not need to go and physically keep pressing the power button on my Mac Minis to boot them back up.

juj added 3 commits October 7, 2025 03:28
… help tear down browser processes before and after a test suite run.
test/runner.py Outdated
common.terminate_list_of_processes(procs)

atexit.register(terminate_all_browser_processes)
terminate_all_browser_processes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish this kind of code could live in the setUpClass of the BrowserCore, but I guess that problem with that is we don't want to run this on each parallel worker?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I checked specifically there first, but the class setup function is run once for each parallel worker, so wouldn't cut it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its kind of unfortunate (and maybe a bug) that setUpClass is called more then once per per run.. maybe we should add some kind of setUpClassOncePerRun helper :)

test/runner.py Outdated
'Useful when combined with --failfast')
parser.add_argument('--force64', action='store_true')
parser.add_argument('--crossplatform-only', action='store_true')
parser.add_argument('--force-browser-process-termination', action='store_true', help='If true, a fail-safe method is used to ensure that all browser processes are terminated before and after the test suite run.')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to just do this on startup?

Seems like doing that way would also make it easier to debug any stalled processes and doing on shutdown doesn't really help any use cases does it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI might run other non-browser suites after browser suites.. so if there were stray browser processes, they would persist over to hogging resources in those next suites.

Debuggability is a non-concern.. when debugging, one will naturally just not pass this flag. This is intended specifically for unattended production runs, where one is not there to debug anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was think it might also be useful for my workflow which is:

  1. Always over ssh, so always headless.
  2. On a host where I never expect to have existing browsers running.
  3. I find myself often running pkill -f google-chrome , "just in case" before I try to run any browser tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to split to two flags, --force-browser-terminate-before and -after?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries the CI might run other non-browser suites after browser suites rational makes sense to me.

Do you want to make this an env var too so you can apply it universally to your CI bots?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, that would make setup a bit simpler.

@sbc100 sbc100 requested a review from brendandahl October 7, 2025 00:43
@juj juj merged commit 02c9e71 into emscripten-core:main Oct 7, 2025
3 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants