Leaked flutter_tester processes are leaking memory and cause 3xhead flakiness #35549

Closed
mkustermann opened this issue Jan 3, 2019 · 8 comments

@mkustermann
Member

mkustermann commented Jan 3, 2019

We have a head-head-head integration builder, which builds head-flutter, head-engine, head-dart together and runs all tests. We continue to have a lot of flakiness on those bots.

Sometimes those bots run out of memory and need to be rebooted, sometimes compilation fails, sometimes they time out.

After investigating, I've found at least one possible cause: those machines are leaking flutter_tester processes. Each one takes up more than 60MB of RAM, and once around 500 of them have leaked (roughly 30GB in total), the flakiness starts.

The commands look like this:

/b/build/slave/flutter/build/flutter/bin/cache/artifacts/engine/linux-x64/flutter_tester \
    --run-forever --non-interactive --enable-dart-profiling --packages=.packages \
    --start-paused --flutter-assets-dir=build/flutter_assets build/flutter-tester-app.dill
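To check whether a machine is in this state, one option is to count the flutter_tester processes and add up their resident memory. The following is only a sketch: the function names `summarize_testers` and `report_live_testers` are made up for illustration, and it assumes a `ps` that accepts `-A -o rss=,comm=` (procps on Linux and the BSD/macOS `ps` both do), with RSS reported in kilobytes.

```shell
# Hypothetical helper, not part of any Flutter or Dart tooling.
# Reads "RSS_KB COMMAND" lines on stdin and reports how many of them are
# flutter_tester, plus their combined resident memory in MB.
summarize_testers() {
  awk '$2 ~ /(^|\/)flutter_tester$/ { n++; kb += $1 }
       END { printf "%d processes, %d MB resident\n", n, kb / 1024 }'
}

# Feeds the live process table into the summary above.
report_live_testers() {
  ps -A -o rss=,comm= | summarize_testers
}
```

Running `report_live_testers` on a bot before and after a test run would show whether the count keeps growing.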

Note the --run-forever flag, which comes from package:flutter_tools/src/tester/flutter_tester.dart:FlutterTesterDevice.startApp():

    final List<String> command = <String>[
      shellPath,
      '--run-forever',
      '--non-interactive',
      '--enable-dart-profiling',
      '--packages=${PackageMap.globalPackagesPath}',
    ];

@devoncarew @scheglov Are you guys the right people to look at this?

dart-bot pushed a commit that referenced this issue Jan 3, 2019
…ead builder work

Issue #35549

Change-Id: I3dd40b878d7c1c08a6f8c9edb520a276754e77b7
Reviewed-on: https://dart-review.googlesource.com/c/88328
Reviewed-by: Martin Kustermann <kustermann@google.com>
Commit-Queue: Martin Kustermann <kustermann@google.com>
Auto-Submit: Martin Kustermann <kustermann@google.com>
@devoncarew
Member

I believe we needed to add --run-forever so that the app didn't terminate immediately after the main() method finished. Perhaps the tests driving the flutter-tester test device aren't disposing of it after the test finished?

@scheglov, do you mind investigating? @DanTup, do you have a sense for where we may be leaking these devices?

@DanTup
Collaborator

DanTup commented Jan 3, 2019

> I believe we needed to add --run-forever so that the app didn't terminate immediately after the main() method finished

That's correct - without this it quits immediately which doesn't allow us to test hot reloads.

> @DanTup, do you have a sense for where we may be leaking these devices?

I found at least one place that they seem to leak, but I don't know if that's what's occurring here (since it's when you SIGINT before the signal handlers are set up):

flutter/flutter#20949

I've also seen some hangs which may or may not be resulting in orphaned processes too (normally they'd result in failed tests, but if they were just re-run then it's possible they leaked and weren't noticed):

flutter/flutter#21354
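The first case above (an interrupt arriving before the signal handlers are set up) is a general ordering problem, and it can be illustrated outside of flutter_tools. The sketch below is not the flutter_tools code; `with_child` is a hypothetical shell helper that installs its cleanup handlers before it starts the long-running child, so there is no window in which an interrupt can leave the child behind.

```shell
# Hypothetical helper: with_child BODY CHILD_COMMAND...
# Starts CHILD_COMMAND in the background, calls BODY with the child's PID,
# and kills the child when BODY finishes or the shell is interrupted.
with_child() {
  local body="$1"
  shift
  (
    child=""
    cleanup() {
      if [ -n "$child" ]; then
        kill "$child" 2>/dev/null
        wait "$child" 2>/dev/null   # reap it so no zombie is left behind
      fi
    }
    # Handlers go in first; only then is the child started.
    trap cleanup EXIT
    trap 'exit 130' INT TERM
    "$@" &
    child=$!
    "$body" "$child"
  )
}
```

If the `trap` lines came after the `"$@" &` line instead, a signal delivered between the two would terminate the wrapper without killing the child, which is the kind of leak described in the linked issue.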

@scheglov
Contributor

scheglov commented Jan 3, 2019

@devoncarew Realistically speaking, I will not be efficient investigating this.

@mkustermann
Member Author

For now we just kill all leaked flutter_tester processes in our integration builder and also ignore them (i.e. they do not turn the bot red; see infra/1394440).
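The actual builder change is not shown in this thread. As a rough illustration of what such a cleanup step could look like, here is a sketch of a hypothetical `terminate_pids` helper that sends SIGTERM, waits for a grace period, and falls back to SIGKILL for anything still alive.

```shell
# Hypothetical cleanup helper: terminate_pids GRACE_SECONDS PID...
# Asks the processes to exit, then force-kills any survivors.
terminate_pids() {
  local grace="$1" waited=0 alive pid
  shift
  [ "$#" -gt 0 ] || return 0
  kill -TERM "$@" 2>/dev/null
  while [ "$waited" -lt "$grace" ]; do
    alive=0
    for pid in "$@"; do
      # kill -0 sends no signal; it only checks that the PID still exists.
      if kill -0 "$pid" 2>/dev/null; then alive=1; fi
    done
    [ "$alive" -eq 0 ] && return 0
    sleep 1
    waited=$((waited + 1))
  done
  kill -KILL "$@" 2>/dev/null
  return 0
}
```

On a machine with `pgrep` available, a bot could call it as `terminate_pids 5 $(pgrep -x flutter_tester)` at the end of a build.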

Though I assume this can happen to anyone running those tests locally as well, which is pretty bad.

The mentioned flutter/flutter#20949 and flutter/flutter#21354 have no owners. It would be great if someone familiar with the flutter tester could take those.

@vsmenon
Member

vsmenon commented Jan 7, 2019

Is this an SDK issue or should it be filed elsewhere?

@DanTup
Collaborator

DanTup commented Jan 8, 2019

I'd guess it's a Flutter tools issue rather than Dart (though it's hard to be sure without tracking down the cause).

I don't know if I can make progress on either of the issues linked above. flutter/flutter#20949 raises the question of whether processes spawned in Dart should die when the parent does (Python seems to, flutter_tester does not; I don't know the rules or mechanisms for this). flutter/flutter#21354 seems to hang inside flutter_tester (which is native), so I don't know how to debug it.
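On the question of what happens to a spawned process when its parent goes away: on POSIX systems a child is generally not killed just because its parent exits; it is re-parented and keeps running unless something explicitly stops it. Whether a particular runtime adds its own cleanup on top of that is a separate matter. The sketch below (the name `orphan_demo` is made up) shows the baseline behaviour with a plain `sleep` standing in for flutter_tester.

```shell
# Hypothetical demonstration: a child that outlives the process that spawned it.
orphan_demo() {
  local child
  # The inner shell is the parent: it starts sleep in the background,
  # prints the child's PID, and exits straight away.
  child=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')
  if kill -0 "$child" 2>/dev/null; then
    echo "child still running after parent exited"
  else
    echo "child is gone"
  fi
  # Clean up so the demonstration itself does not leak a process.
  kill "$child" 2>/dev/null
}
```

On a typical Linux or macOS system this reports that the child is still running, which matches the behaviour observed for flutter_tester.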

It's also possible that this is neither of those. It looks like we're not giving each test a unique folder name. I'll fix this, and then we may be able to look at the cwds of the leaked processes to see whether one particular test is responsible or all of them are.
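Once each test runs in its own folder, the working directory of a leaked process points back at the test that started it. A minimal sketch of that lookup, assuming Linux (it reads the `/proc/<pid>/cwd` symlink) and using a hypothetical helper name `cwd_of_pids`:

```shell
# Hypothetical helper: cwd_of_pids PID...
# Prints "PID<TAB>working directory" for each PID, or "?" if it cannot be read.
cwd_of_pids() {
  local pid dir
  for pid in "$@"; do
    dir=$(readlink "/proc/$pid/cwd" 2>/dev/null) || dir="?"
    printf '%s\t%s\n' "$pid" "$dir"
  done
}
```

With `pgrep` available, `cwd_of_pids $(pgrep -x flutter_tester) | cut -f2 | sort | uniq -c` would group the leaked processes by folder and show whether a single test accounts for most of them.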

@dgrove
Contributor

dgrove commented Jan 10, 2019

@DanTup or @devoncarew can you please open this issue in the Flutter repo and close this one?

@DanTup
Collaborator

DanTup commented Jan 10, 2019

I've re-opened this at flutter/flutter#26350.

@DanTup DanTup closed this as completed Jan 10, 2019
DanTup added a commit to flutter/flutter that referenced this issue Jan 17, 2019
This will help track down any that aren't cleaning up and also may help track down leaked flutter_tester processes (dart-lang/sdk#35549).
kangwang1988 pushed a commit to XianyuTech/flutter that referenced this issue Feb 12, 2019
This will help track down any that aren't cleaning up and also may help track down leaked flutter_tester processes (dart-lang/sdk#35549).