Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the systrace recorder if systracing is enabled at startup, and enable systracing in release mode on Android #28903

Merged
merged 5 commits into from
Oct 7, 2021

Conversation

dnfield
Copy link
Contributor

@dnfield dnfield commented Sep 28, 2021

Fix for dart-lang/sdk#45143

This patch makes it such that if systrace is enabled when the Flutter application starts, it uses the systrace recorder for the rest of the run.

This is not quite as nice as if the tracing messages switched back and forth to systrace/timeline depending on whether systrace is enabled, but that approach would likely be confusing anyway - some timeline events would never show up in the observatory and it would not be obvious why, and this avoids overhead of checking ATrace_isEnabled on every tracing call (or every few) and trying to swap out the recording implementation dynamically.

This also enables systracing on Android in release mode. This is particularly useful on low end devices where the service isolate uses a non-trivial amount of resources and messes with tracing.

This still needs a test. My general idea is to write something that would run on an emulator with systrace running, and validate that the trace contains expected Dart/Flutter events. Before writing that test, I want to get some feedback on the approach. I'm doing this in the NDK code rather than the Java code to allow for wider reach on supported API levels - the NDK supports down to 23, whereas the Java call only works on 29+.

/cc @bkonyi fyi
/cc @xster @jiahaog fyi

@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat.

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

Copy link
Member

@zanderso zanderso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall approach sgtm

// This method is only called once at shell creation.
// Do this in C++ because the API is avialable at level 23 here, but only 29+ in
// Java.
static bool ATrace_isEnabled() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is already in an anonymous namespace, does it need to be static?

Also consider adding logging in here if dlopen or dlsym fail.

Should failure to find them be fatal? Should it be fatal in our testing environment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dlopen would be fatal but dlsym not because we could be on a lower api level

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please call this method name anything other than the name in libandroid.so to indicate to the caller that this is not from stock Android. Or to avoid potential issues when if/when the header for this is included in the translation unit.

Or, just follow the style guide for method names here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please use fml::NativeLibrary. It's probably exactly the same behavior but I'd rather have all interaction with native libraries go through one place in the engine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it and using NativeLibrary now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can't use ResolveSymbol since that doesn't really let you resolve functions)

Copy link
Member

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine forwarding to systrace in release. But how and when the forwarding happens is already non-trivially hard to reason about. And will become more so when as we add more cases to the list you mentioned.

How about something like tracing to systrace only when the --trace-systrace flag is available or if systrace is the only available trace recorder (release modes)? Folks can trace in profile mode overhead is too high. And, if the user species the trace-systrace flag, it will always trace to systrace no matter what. No need to check what other flags were specified.

// This method is only called once at shell creation.
// Do this in C++ because the API is avialable at level 23 here, but only 29+ in
// Java.
static bool ATrace_isEnabled() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please call this method name anything other than the name in libandroid.so to indicate to the caller that this is not from stock Android. Or to avoid potential issues when if/when the header for this is included in the translation unit.

Or, just follow the style guide for method names here.

// This method is only called once at shell creation.
// Do this in C++ because the API is avialable at level 23 here, but only 29+ in
// Java.
static bool ATrace_isEnabled() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please use fml::NativeLibrary. It's probably exactly the same behavior but I'd rather have all interaction with native libraries go through one place in the engine.

@@ -75,6 +92,23 @@ void FlutterMain::Init(JNIEnv* env,

auto settings = SettingsFromCommandLine(command_line);

// Turn systracing on if ATrace_isEnabled is true AND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried that as we add to these cases here, reasoning about when tracing to systrace will actually be enabled is going to be get hard to reason about. For instance, how is the user to know that sytraceing is incompatible with startup tracing? Seems like a reasonable thing to attempt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I can remove the trace_startup logic for sure. I thought we did something weird with the VM arguments in this case, but it's actually just the tool that causes the app to exit after startup, not the engine.

The main problem is that in release mode, none of these flags are passed typically - the Flutter tooling does not set any intents in release mode for this, and even if it did many applications strip out such debugging related intents in release mode.

Are you ok with this if it's just simplified down to a warning if you ask for the endless recorder and this happens, mentioning that the size of the buffer will be controlled by the systrace invocation rather than by us?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified this significantly. I'm still checking whether ATrace is enabled because I don't like the idea to defaulting it to on for release - it seems like it may impact performance unnecessarily.

@@ -37,6 +39,21 @@ extern const intptr_t kPlatformStrongDillSize;

namespace {

// This is only available on API23+, so dynamically look it up.
// This method is only called once at shell creation.
// Do this in C++ because the API is avialable at level 23 here, but only 29+ in
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: avialable -> available

@@ -37,6 +39,21 @@ extern const intptr_t kPlatformStrongDillSize;

namespace {

// This is only available on API23+, so dynamically look it up.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: API 23+

FML_CHECK(libandroid);
void* atrace_fn = ::dlsym(libandroid->GetHandle(), "ATrace_isEnabled");
if (atrace_fn) {
enabled = reinterpret_cast<bool (*)(void)>(atrace_fn)();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: return reinterpret_cast<bool (*)(void)>(atrace_fn)();

if (atrace_fn) {
enabled = reinterpret_cast<bool (*)(void)>(atrace_fn)();
}
return enabled;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: return false;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for all of these.

@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no and removed cla: yes labels Oct 6, 2021
@dnfield dnfield marked this pull request as ready for review October 6, 2021 23:40
@dnfield
Copy link
Contributor Author

dnfield commented Oct 6, 2021

This now has a test and is ready for review (/cc @goderbauer FYI)

Copy link
Member

@zanderso zanderso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm w/ style concern

@@ -31,7 +35,22 @@ class NativeLibrary : public fml::RefCountedThreadSafe<NativeLibrary> {

static fml::RefPtr<NativeLibrary> CreateForCurrentProcess();

const uint8_t* ResolveSymbol(const char* symbol);
template <typename T>
const std::optional<T> ResolveSymbol(const char* symbol) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely convinced that this overload is in agreement with the advice from the style guide: https://google.github.io/styleguide/cppguide.html#Function_Overloading. I think it will be clear which overload is chosen based on the presence of the template arguments, but the semantics of the return type is different. Could this just have a different name? (If this conflicts with earlier review feedback from others, then feel free to disregard this concern.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't entirely sure if I was being too clever here.

I can give this a different name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this to ResolveFunction.

The only reason I'm allowed to cheat on the return type is because of the template parameter. But It's kind of a weird case in the language and this should be more obvious.

I considered refactoring the uint8_t* version to return a std::optional, but it doesn't seem like it really helps that much to me in that case.

@dnfield dnfield added the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label Oct 7, 2021
@fluttergithubbot
Copy link
Contributor

This pull request is not suitable for automatic merging in its current state.

  • The status or check suite Linux Web Framework tests has failed. Please fix the issues identified (or deflake) before re-applying this label.

@fluttergithubbot fluttergithubbot removed the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label Oct 7, 2021
@zanderso
Copy link
Member

zanderso commented Oct 7, 2021

This may need a rebase.

@dnfield dnfield added the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label Oct 7, 2021
@dnfield dnfield merged commit 76c8115 into flutter:master Oct 7, 2021
@dnfield dnfield deleted the android_systrace branch October 7, 2021 22:59
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Oct 8, 2021
zanderso added a commit that referenced this pull request Oct 8, 2021
…, and enable systracing in release mode on Android (#28903)"

This reverts commit 76c8115.
zanderso added a commit that referenced this pull request Oct 8, 2021
…, and enable systracing in release mode on Android (#28903)" (#29071)

This reverts commit 76c8115.
dnfield added a commit to dnfield/engine that referenced this pull request Oct 8, 2021
…, and enable systracing in release mode on Android (flutter#28903)" (flutter#29071)"

This reverts commit a2773d5.
dnfield added a commit that referenced this pull request Oct 8, 2021
* Reland "Use the systrace recorder if systracing is enabled at startup, and enable systracing in release mode on Android (#28903)" (#29071)"

This reverts commit a2773d5.

* More logcat

* more logs

* Remove wait

* Avoid plugin registrar exception

* DEFAULT instead of LAUNCHER

* use am instead of monkey

* Update android_systrace_test.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes needs tests platform-android waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land.
Projects
None yet
5 participants