Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Android implementation of timers in eventhandler_android.cc was incorrect #54868

Closed
mraleph opened this issue Feb 9, 2024 · 0 comments
Closed
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.

Comments

@mraleph
Copy link
Member

mraleph commented Feb 9, 2024

Android implementation of timers was using epoll_wait timeout to drive timer wake ups:

  while (!handler_impl->shutdown_) {
    int64_t millis = handler_impl->GetTimeout();
    ASSERT((millis == kInfinityTimeout) || (millis >= 0));
    if (millis > kMaxInt32) {
      millis = kMaxInt32;
    }
    intptr_t result = TEMP_FAILURE_RETRY_NO_SIGNAL_BLOCKER(
        epoll_wait(handler_impl->epoll_fd_, events, kMaxEvents, millis));
    ASSERT(EAGAIN == EWOULDBLOCK);
    if (result == -1) {
      if (errno != EWOULDBLOCK) {
        perror("Poll failed");
      }
    } else {
      handler_impl->HandleTimeout();
      handler_impl->HandleEvents(events, result);
    } 

This code has a subtle bug: if syscall returns EINTR TEMP_FAILURE_RETRY_NO_SIGNAL_BLOCKER will retry syscall AS IS[ without adjusting timeout. This meant that each EINTR would effectively delay expiration of the timeout by millis.

This bug went unnoticed for a while because EINTR does not occur all that often in practice. However it was recently revealed due to interaction with Android's Cached Apps Freezer, which is implemented on top of cgroups freezer. When Android freezes and then subsequently thaws an app a pending epoll_wait completes with EINTR. This means that freezing an app which has a timer with large delay and then thawing it would effectively reset the timer, introducing delay affecting that timer and all other timers waiting in the queue after this timer (assuming that there is no other event handler activity in the application).

This was observed in the wild on an internal Flutter application where a Timer created by a UI interaction sometimes would fire with a huge delay causing UI to become stuck (b/311165013).

The Linux implementation of the event handler was migrated to timerfd 10 years ago (9a8edb6) but the same change was never ported to eventhandler_android.cc.

In reality there is no strong reason for eventhandler_android.cc to exist in the first place. It is almost line by line copy of eventhandler_linux.cc which slightly drifted away from Linux implementation over time. So I am going to fix this bug by removing all Android specific files in runtime/bin and use Linux implementation instead (with a couple of #ifdef-s where OSes differ in a meaningful way).

CL: https://dart-review.googlesource.com/c/sdk/+/350923

@mraleph mraleph added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Feb 9, 2024
@mraleph mraleph self-assigned this Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
Projects
None yet
Development

No branches or pull requests

1 participant