Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase thread priority before engine init #20922

Merged
merged 3 commits into from
Sep 9, 2020

Conversation

linxuebin1990
Copy link
Member

Description

We get huge ANRs in not pure flutter app.
We found that 0.15% of the engine initialization took more than 10 seconds, every 10 million starts.

Fellow code show, engine init in ui thread. So Increase thread priority before engine init can reduce time cost

// shell/common/shell.cc
fml::TaskRunner::RunNowOrPostTask(
      shell->GetTaskRunners().GetUITaskRunner(),
      fml::MakeCopyable([&engine_promise,                                 //
                         shell = shell.get(),                             //
                         &dispatcher_maker,                               //
                         &platform_data,                                  //
                         isolate_snapshot = std::move(isolate_snapshot),  //
                         vsync_waiter = std::move(vsync_waiter),          //
                         &weak_io_manager_future,                         //
                         &snapshot_delegate_future,                       //
                         &unref_queue_future                              //
  ]() mutable {
        TRACE_EVENT0("flutter", "ShellSetupUISubsystem");
        const auto& task_runners = shell->GetTaskRunners();

        // The animator is owned by the UI thread but it gets its vsync pulses
        // from the platform.
        auto animator = std::make_unique<Animator>(*shell, task_runners,
                                                   std::move(vsync_waiter));

        engine_promise.set_value(std::make_unique<Engine>(
            *shell,                         //
            dispatcher_maker,               //
            *shell->GetDartVM(),            //
            std::move(isolate_snapshot),    //
            task_runners,                   //
            platform_data,                  //
            shell->GetSettings(),           //
            std::move(animator),            //
            weak_io_manager_future.get(),   //
            unref_queue_future.get(),       //
            snapshot_delegate_future.get()  //
            ));
      }));

Tests

Checklist

Before you create this PR confirm that it meets all requirements listed below by checking the relevant checkboxes ([x]). This will ensure a smooth and quick review process.

  • I read the contributor guide and followed the process outlined there for submitting PRs.
  • I signed the CLA.
  • I read and followed the C++, Objective-C, Java style guides for the engine.
  • I read the tree hygiene wiki page, which explains my responsibilities.
  • I updated/added relevant documentation.
  • All existing and new tests are passing.
  • I am willing to follow-up on review comments in a timely manner.

Breaking Change

Did any tests fail when you ran them? Please read handling breaking changes.

We get huge ANRs in not pure flutter app.
We found that 0.15% of the engine initialization took more than 10 seconds, every 10 million starts.
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat.

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@linxuebin1990
Copy link
Member Author

@liyuqian cc ?

change shell->shell_->GetTaskRunners() to task_runners
@liyuqian liyuqian added perf: speed Performance issues related to (mostly rendering) speed severe: performance Relates to speed or footprint issues. labels Sep 2, 2020
@liyuqian liyuqian requested review from jason-simmons, zanderso, liyuqian and chinmaygarde and removed request for gw280 September 2, 2020 18:06
@liyuqian
Copy link
Contributor

liyuqian commented Sep 2, 2020

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

revert is_valid_ value
@linxuebin1990
Copy link
Member Author

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

OK, I will do measure it by ShellInitializationAndShutdown .

@linxuebin1990
Copy link
Member Author

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

We will verify this modification on Toutiao or Douyin in the near future.

Copy link
Member

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is a fine low-risk change and beyond the capabilities of our current benchmark harnesses to measure.

Copy link
Contributor

@liyuqian liyuqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rerunning the Mac Host Engine test to see if it's just flaky. Agree that this is low-risk so landing without a test is reasonable. Of course, a test will always be appreciated as it can prevent accidental regressions, or capture similar performance issues caused by threads contention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes perf: speed Performance issues related to (mostly rendering) speed platform-android severe: performance Relates to speed or footprint issues.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants