Address flakes from using temp directory to run tests #2649

Merged: 7 commits, Jul 14, 2025

srujzs (Contributor) commented Jul 9, 2025

2172ba7 added support to run tests in a temporary directory.

This results in two flaky issues:

  1. On Windows, build_daemon tests may fail to delete the
    temp directory because the process may not have been torn
    down yet, so it may still be accessing the file system.
    There was already a single retry after 1 second, but a recent
    test run suggests that is not enough. See https://github.com/dart-lang/webdev/actions/runs/16157459808/job/45602784802?pr=2649
    for an example. The deletion failure is printed as a warning.
  2. If a test times out, its tearDown may not be called. frontend_server_common
    relies on modifying the current directory via dart:io, so without a tearDown
    it never restores the directory the process was in. This
    leads to cascading failures in subsequent tests because the process is no
    longer in a path that contains 'webdev'. See
    https://github.com/dart-lang/webdev/actions/runs/15989286213/job/45099373212?pr=2641
    for an example. See also dart-lang/test#897 ("tearDown is not called if a
    test times out"), which tracks work to call tearDown on timeouts.

To address the above issues:

  1. Retry the deletion with exponential backoff, up to a maximum number of attempts. Ideally, we'd wait for some event instead, but the daemon doesn't appear to expose the process, and even that wouldn't guarantee that Windows no longer holds a lock on the directory.
  2. Migrate frontend_server_common so that it no longer changes the current working directory in the FileSystem and instead uses full URIs wherever needed.
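
The retry described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `deleteWithBackoff`, `backoffDelay`, the attempt count, and the initial delay are hypothetical names and values chosen for the example.

```dart
import 'dart:io';

/// Delay before the retry that follows attempt number [attempt] (0-based).
/// Doubles on every attempt, starting at [initial]. Hypothetical helper.
Duration backoffDelay(
  int attempt, {
  Duration initial = const Duration(seconds: 1),
}) =>
    initial * (1 << attempt);

/// Tries to delete [dir], retrying with exponential backoff.
/// Returns true if the directory is gone, false if every attempt failed.
Future<bool> deleteWithBackoff(
  Directory dir, {
  int maxAttempts = 5, // assumed limit, not taken from the PR
  Duration initial = const Duration(seconds: 1),
}) async {
  for (var attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      if (dir.existsSync()) dir.deleteSync(recursive: true);
      return true;
    } on FileSystemException catch (e) {
      // On Windows another process may still hold a handle inside the
      // directory, so warn and try again rather than failing the test.
      stderr.writeln(
        'Warning: attempt ${attempt + 1} failed to delete ${dir.path}: '
        '${e.message}',
      );
      if (attempt < maxAttempts - 1) {
        await Future<void>.delayed(backoffDelay(attempt, initial: initial));
      }
    }
  }
  return false;
}

Future<void> main() async {
  final temp = Directory.systemTemp.createTempSync('backoff_example_');
  final deleted = await deleteWithBackoff(
    temp,
    initial: const Duration(milliseconds: 100),
  );
  print('deleted: $deleted');
}
```

Returning a boolean instead of throwing lets the caller log a leaked directory without failing the test, at the cost of occasionally leaving the directory behind.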

srujzs added 2 commits July 8, 2025 17:34
dart-lang@2172ba7 added support to run tests in a temporary directory.

This results in two flaky issues:

1. On Windows, build_daemon tests may fail to delete the
temp directory because the process may not have been torn
down yet, so it may still be accessing the file system.
There was already a single retry after 1 second, but a recent
test run suggests that is not enough.
2. If a test times out, its tearDown may not be called. In
this case, the ResidentWebRunner in frontend_server may not
restore the current directory in the LocalFileSystem. This
leads to cascading failures in subsequent tests due to no
longer being in a path that contains 'webdev'. See
https://github.com/dart-lang/webdev/actions/runs/15989286213/job/45099373212?pr=2641
for an example. See dart-lang/test#897
as well for tracking work to call tearDown on timeouts.

To address the above issues:

1. Increase the delay between the two tries and assert this
only occurs on Windows.
2. Cache the current directory so that it can be used in
utilities.dart with the same (correct) value every time.
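
The idea of capturing the directory once and resolving every path against it, rather than reading or mutating the process-wide current directory, might look like the sketch below. `ProjectRoot` and the example paths are hypothetical and do not come from utilities.dart or frontend_server_common.

```dart
import 'dart:io';

/// Holds a fixed root URI and resolves relative paths against it, so later
/// changes to Directory.current do not affect lookups. Hypothetical class.
class ProjectRoot {
  ProjectRoot(this.uri);

  /// Captures [dir] as an absolute `file:` URI (always ends with '/').
  ProjectRoot.fromDirectory(Directory dir) : uri = dir.absolute.uri;

  final Uri uri;

  /// Returns the full URI for [relativePath] under the captured root.
  Uri resolve(String relativePath) => uri.resolve(relativePath);
}

void main() {
  final original = Directory.current;
  final root = ProjectRoot.fromDirectory(original);
  final before = root.resolve('test/fixtures/main.dart');

  // Simulate a test that changes the working directory and times out
  // before its tearDown can restore it.
  Directory.current = Directory.systemTemp;
  final after = root.resolve('test/fixtures/main.dart');

  // Both URIs are equal because the root was captured up front.
  print(before == after);

  Directory.current = original;
}
```

Because the root is captured eagerly, when the object is constructed, a later change to the working directory cannot alter what the paths resolve to.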
srujzs requested review from bkonyi and nshahan on July 14, 2025 16:38

nshahan (Contributor) commented Jul 14, 2025

Looks like we still see an existing failure on "unit_test; windows; Dart dev; PKG: webdev; dart test -j 1", but I think that failure isn't expected to be fixed by this change. Can you confirm?

srujzs (Contributor, Author) commented Jul 14, 2025

> Looks like we still see an existing failure on "unit_test; windows; Dart dev; PKG: webdev; dart test -j 1", but I think that failure isn't expected to be fixed by this change. Can you confirm?

Indeed, that's failing in the CI already and will require a separate PR: https://github.com/dart-lang/webdev/actions/runs/16243343947/job/45862639306

bkonyi (Collaborator) left a comment

LGTM!

Updating things so we're not changing the CWD is a big improvement. I'm fairly certain this is something that we've been burned by in Flutter Tools tests since there's a warning if you try to change the CWD in a test.

Windows is a PITA when it comes to cleaning up resources after a test. As long as we don't crash or cause the test to fail, it's fine to occasionally leak these resources since they'll only persist as long as the container is alive anyway.

nshahan (Contributor) left a comment

Thanks for the stability improvements!

srujzs (Contributor, Author) commented Jul 14, 2025

> Windows is a PITA when it comes to cleaning up resources after a test.

Indeed, and even with this PR, we're not guaranteed to free up the temp directory (as seen in the logs). But hopefully this reduces the number of instances where this occurs.

Thanks for the reviews!

srujzs merged commit 4dbddc8 into dart-lang:main on Jul 14, 2025
46 of 47 checks passed