[vm/ffi] structs test on android arm32: gc relocation bug #37511

Closed
dcharkes opened this issue Jul 12, 2019 · 8 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash. gardening library-ffi

@dcharkes

dcharkes commented Jul 12, 2019

After landing support for structs on 32-bit (#36334), one of the tests now enabled for 32-bit ARM fails. However, it only fails in product mode (log).

/===========================================================\
| ffi/structs_test is new and failed (Crash, expected Pass) |
\===========================================================/

--- Command "vm_compile_to_kernel []" (took 251ms):
DART_CONFIGURATION=ProductAndroidARM /b/swarming/w/ir/pkg/vm/tool/gen_kernel --no-aot --platform=out/ProductAndroidARM/vm_platform_strong.dill -o /b/swarming/w/ir/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /b/swarming/w/ir/tests/ffi/structs_test.dart --packages=/b/swarming/w/ir/.packages -Ddart.developer.causal_async_stacks=true

exit code:
0

--- Command "adb_precompilation" (took 02.000750s):
Steps to push Dart VM and Dill file to an attached device. Uses (and requires) adb.

exit code:
-6

stdout:
Executing adb -s 06ad7f0e003b6c8b shell rm -Rf /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Stdout:
AdbShellExitCode: 0
ExitCode: 0
Time: 0:00:00.069053

Executing adb -s 06ad7f0e003b6c8b shell mkdir -p /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Stdout:
AdbShellExitCode: 0
ExitCode: 0
Time: 0:00:00.065481

Executing Skipped cached push
ExitCode: 0
Time: 0:00:00.000066

Executing adb -s 06ad7f0e003b6c8b push /b/swarming/w/ir/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /data/local/tmp/testing/test/out.dill
Stderr:
4235 KB/s (5706568 bytes in 1.315s)
ExitCode: 0
Time: 0:00:01.403831

Executing adb -s 06ad7f0e003b6c8b shell export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/testing/test;/data/local/tmp/testing/dart --deterministic --optimization-counter-threshold=50 --enable-inlining-annotations --ignore-unrecognized-flags --packages=/b/swarming/w/ir/.packages /data/local/tmp/testing/test/out.dill ; echo AdbShellExitCode:  $?
Stdout:
WARNING: linker: /data/local/tmp/testing/dart: unused DT entry: type 0x6ffffef5 arg 0xd37c
WARNING: linker: /data/local/tmp/testing/dart: unused DT entry: type 0x6ffffffe arg 0x12808
WARNING: linker: /data/local/tmp/testing/dart: unused DT entry: type 0x6fffffff arg 0x3
../../runtime/vm/object.cc: 18656: error: unreachable code
Aborted
AdbShellExitCode: 134
ExitCode: -6
Time: 0:00:01.141814

--- Re-run this test:
python tools/test.py -n dartk-android-product-arm ffi/structs_test

Unreachable code that is hit:

int64_t Integer::AsInt64Value() const {
  // Integer is an abstract class.
  UNREACHABLE();
  return 0;
}

dcharkes added the area-vm, gardening, crash, and library-ffi labels Jul 12, 2019
dcharkes self-assigned this Jul 12, 2019
@dcharkes

dcharkes commented Jul 12, 2019

I have trouble reproducing this locally.

Build command on bot:

tools/build.py -aarm -mproduct runtime_kernel '--os=android' --no-start-goma -j200

Build command locally:

rm -rf out/ProductAndroidARM
tools/build.py -a arm -m product '--os=android' runtime_kernel

I have the same flags, and I do not see a separate gn.py command on the bot.

Test command on bot:

DART_CONFIGURATION=ProductAndroidARM /b/swarming/w/ir/pkg/vm/tool/gen_kernel --no-aot --platform=out/ProductAndroidARM/vm_platform_strong.dill -o /b/swarming/w/ir/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /b/swarming/w/ir/tests/ffi/structs_test.dart --packages=/b/swarming/w/ir/.packages -Ddart.developer.causal_async_stacks=true

exit code:
0

--- Command "adb_precompilation" (took 02.000470s):
Steps to push Dart VM and Dill file to an attached device. Uses (and requires) adb.

exit code:
-6

stdout:
Executing adb -s 069a6592003ba933 shell rm -Rf /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Stdout:
AdbShellExitCode: 0
ExitCode: 0
Time: 0:00:00.071593

Executing adb -s 069a6592003ba933 shell mkdir -p /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Stdout:
AdbShellExitCode: 0
ExitCode: 0
Time: 0:00:00.051388

Executing Skipped cached push
ExitCode: 0
Time: 0:00:00.000034

Executing adb -s 069a6592003ba933 push /b/swarming/w/ir/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /data/local/tmp/testing/test/out.dill
Stderr:
5708 KB/s (5706568 bytes in 0.976s)
ExitCode: 0
Time: 0:00:01.070715

Executing adb -s 069a6592003ba933 shell export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/testing/test;/data/local/tmp/testing/dart --deterministic --optimization-counter-threshold=50 --enable-inlining-annotations --ignore-unrecognized-flags --packages=/b/swarming/w/ir/.packages /data/local/tmp/testing/test/out.dill ; echo AdbShellExitCode:  $?
Stdout:
../../runtime/vm/object.cc: 18706: error: unreachable code
Aborted 
AdbShellExitCode: 134
ExitCode: -6
Time: 0:00:01.212162

--- Re-run this test:
python tools/test.py -n dartk-android-product-arm ffi/structs_test

Test command locally (with an Android 64bit device):
(I'm unsure how to get the same amount of reporting, but if I add prints to the commands I get the following.)

$ tools/test.py -n dartk-android-product-arm -j1 --verbose ffi/structs_test
Test configuration:
    dartk-android-product-arm(architecture: arm, compiler: dartk, mode: product, runtime: vm, system: android, preview-dart-2)
Suites tested: ffi
Found 1 Android devices.
Running "vm_compile_to_kernel []" command: DART_CONFIGURATION=ProductAndroidARM /usr/local/google/home/dacoharkes/dart-sdk/sdk/pkg/vm/tool/gen_kernel --no-aot --platform=out/ProductAndroidARM/vm_platform_strong.dill -o /usr/local/google/home/dacoharkes/dart-sdk/sdk/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi/structs_test.dart --packages=/usr/local/google/home/dacoharkes/dart-sdk/sdk/.packages -Ddart.developer.causal_async_stacks=true
Running "vm_compile_to_kernel []" command: DART_CONFIGURATION=ProductAndroidARM /usr/local/google/home/dacoharkes/dart-sdk/sdk/pkg/vm/tool/gen_kernel --no-aot --platform=out/ProductAndroidARM/vm_platform_strong.dill -o /usr/local/google/home/dacoharkes/dart-sdk/sdk/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_function_structs_test/out.dill /usr/local/google/home/dacoharkes/dart-sdk/sdk/tests/ffi/function_structs_test.dart --packages=/usr/local/google/home/dacoharkes/dart-sdk/sdk/.packages -Ddart.developer.causal_async_stacks=true
Running "adb_precompilation" command: Steps to push Dart VM and Dill file to an attached device. Uses (and requires) adb.
Executing adb -s 21d9aecb shell rm -Rf /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Executing adb -s 21d9aecb shell mkdir -p /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Executing adb -s 21d9aecb push out/ProductAndroidARM/dart /data/local/tmp/testing/dart
Executing adb -s 21d9aecb push /usr/local/google/home/dacoharkes/dart-sdk/sdk/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_structs_test/out.dill /data/local/tmp/testing/test/out.dill
Executing adb -s 21d9aecb shell export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/testing/test;/data/local/tmp/testing/dart --deterministic --optimization-counter-threshold=50 --enable-inlining-annotations --ignore-unrecognized-flags --packages=/usr/local/google/home/dacoharkes/dart-sdk/sdk/.packages /data/local/tmp/testing/test/out.dill ; echo AdbShellExitCode:  $?
Done dartk-vm product_arm ffi/structs_test: pass
Running "adb_precompilation" command: Steps to push Dart VM and Dill file to an attached device. Uses (and requires) adb.
Executing adb -s 21d9aecb shell rm -Rf /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Executing adb -s 21d9aecb shell mkdir -p /data/local/tmp/testing/test ; echo AdbShellExitCode:  $?
Executing Skipped cached push
Executing adb -s 21d9aecb push /usr/local/google/home/dacoharkes/dart-sdk/sdk/out/ProductAndroidARM/generated_compilations/dartk/tests_ffi_function_structs_test/out.dill /data/local/tmp/testing/test/out.dill
Executing adb -s 21d9aecb push out/ProductAndroidARM/libffi_test_functions.so /data/local/tmp/testing/test/libffi_test_functions.so
Executing adb -s 21d9aecb shell export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/testing/test;/data/local/tmp/testing/dart --ignore-unrecognized-flags --packages=/usr/local/google/home/dacoharkes/dart-sdk/sdk/.packages /data/local/tmp/testing/test/out.dill ; echo AdbShellExitCode:  $?
Done dartk-vm product_arm ffi/function_structs_test: pass

The local run also passes the same flags (--deterministic --optimization-counter-threshold=50 --enable-inlining-annotations) through adb as the bot does.

/cc @sjindel-google Is there something I missed w.r.t. the ffi-product bot?

Otherwise it might be due to the specific hardware (my phone), and I can try to debug it on the bot.

Looking into the code without being able to reproduce it locally

There aren't any obvious calls to AsInt64Value that only happen under defined(PRODUCT). The FFI uses it in various places, but none of those differ by mode.

@dcharkes

This crash got fixed by enabling the constant update.

Blamelist: 9bbd319 Fri Jul 12 12:32 vegorov@google.com [dart] Enable constant-update-2018

ffi/structs_test Crash -> Pass ✔

on configuration dartk-android-product-arm

@sjindel-google

What was the original problem?

@dcharkes

I'm hoping to find that out now that there are two different behaviors to compare, but I'm not able to reproduce it locally. I'm open to any suggestions on how to reproduce it locally.

I suspect it's something with constants being treated differently in product mode before the constant update. But it might also be a spurious correlation.

@sjindel-google

One possibility is that we are incorrectly allocating an instance of Integer -- you could add an assert to check for this and see if it catches on the CQ.

@dcharkes

dcharkes commented Jul 16, 2019

RawObject* Object::Allocate(intptr_t cls_id, intptr_t size, Heap::Space space) {
  // ...
  // Temporary check: Integer is abstract and should never be allocated.
  ASSERT(cls_id != Integer::kClassId);
  // ...
}

The above does not get triggered on the bot, so it does not look like we actually allocate an Integer.

Commit 629f38c made product mode hit the unreachable code again, so the earlier fix by the constant update looks like a spurious correlation. Moreover, debug mode now segfaults.

I'll see if the debug-mode crash reproduces locally. (edit: No, it doesn't.) Otherwise, I'll continue on the build bot.

Edit: Copying my locally built SDK over to the bot does not trigger the crash. Neither does using the bot-built SDK on my machine. Only the combination of the bot-built SDK and the phone connected to the bot triggers this crash.

dcharkes changed the title from "[vm/ffi] structs test on android arm32 in product mode" to "[vm/ffi] structs test on android arm32" on Jul 16, 2019
@sjindel-google

sjindel-google commented Jul 17, 2019

Another thing you can try is adding an assert to Object::SetRaw:

DART_FORCE_INLINE void Object::SetRaw(RawObject* value) {
  NoSafepointScope no_safepoint_scope;
  raw_ = value;
  if ((reinterpret_cast<uword>(value) & kSmiTagMask) == kSmiTag) {
    set_vtable(Smi::handle_vtable_);
    return;
  }
  intptr_t cid = value->GetClassId();
  // Free-list elements cannot be wrapped in a handle.
  ASSERT(cid != kFreeListElement);
  ASSERT(cid != kForwardingCorpse);
  if (cid >= kNumPredefinedCids) {
    cid = kInstanceCid;
  }
  set_vtable(builtin_vtables_[cid]);
#if defined(DEBUG)
  if (FLAG_verify_handles) {
    Isolate* isolate = Isolate::Current();
    Heap* isolate_heap = isolate->heap();
    Heap* vm_isolate_heap = Dart::vm_isolate()->heap();
    uword addr = RawObject::ToAddr(raw_);
    if (!isolate_heap->Contains(addr) && !vm_isolate_heap->Contains(addr)) {
      ASSERT(FLAG_write_protect_code);
      addr = RawObject::ToAddr(HeapPage::ToWritable(raw_));
      ASSERT(isolate_heap->Contains(addr) || vm_isolate_heap->Contains(addr));
    }
  }
#endif
}
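
To make the suggestion concrete: SetRaw picks the handle's vtable from two pieces of information, the tag bit of the raw word and the class id read from the object header. The following is a minimal standalone C++ sketch of just that decision, not the VM's code. The tag values and the tiny class-id table are assumptions chosen for illustration, and `IsSmi` and `VtableCidFor` are hypothetical names.

```cpp
#include <cstdint>

// Assumed tag layout for this sketch: the lowest bit of a tagged word is 0
// for a small integer (Smi) and 1 for a pointer to a heap object.
constexpr std::uintptr_t kSmiTag = 0;
constexpr std::uintptr_t kSmiTagMask = 1;

// Hypothetical, heavily reduced class-id table (not the VM's real ids).
constexpr std::intptr_t kSmiCid = 1;
constexpr std::intptr_t kMintCid = 2;
constexpr std::intptr_t kInstanceCid = 3;
constexpr std::intptr_t kNumPredefinedCids = 4;

// Returns true if the tagged word encodes a Smi rather than a heap pointer.
constexpr bool IsSmi(std::uintptr_t tagged) {
  return (tagged & kSmiTagMask) == kSmiTag;
}

// Picks the class id whose vtable a handle would use. `header_cid` stands in
// for value->GetClassId(), which is only consulted for heap objects.
constexpr std::intptr_t VtableCidFor(std::uintptr_t tagged,
                                     std::intptr_t header_cid) {
  if (IsSmi(tagged)) return kSmiCid;
  // User-defined classes all share the generic instance vtable.
  if (header_cid >= kNumPredefinedCids) return kInstanceCid;
  return header_cid;
}
```

In this sketch the second branch trusts whatever class id the header holds, so an assert placed at that point might catch a handle being pointed at an object whose header no longer holds a sensible class id.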

dcharkes changed the title from "[vm/ffi] structs test on android arm32" to "[vm/ffi] structs test on android arm32: gc relocation bug" on Jul 19, 2019
@dcharkes

dcharkes commented Jul 19, 2019

The crash from the last failing debug build reproduces locally.

The test can be boiled down to this:

// VMOptions=--deterministic

import 'dart:ffi';

const highAddress32bit = 0xFFFFFFF0;
const highAddress64bit = 0xFFFFFFFFFFFFFFF0;

final int highAddress =
    sizeOf<IntPtr>() == 4 ? highAddress32bit : highAddress64bit;

final Pointer<Int64> c1 = Pointer.fromAddress(highAddress);

final double ten = 10.0;

void main() {
  for (int i = 0; i < 300000; i++) {
    if (i % 1000 == 0) print(i);

    Pointer<Double> field = c1.cast();
  }
}

The bug is triggered when a Pointer is allocated first and a Mint is allocated after it.
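
As a hedged aside on where a Mint can come from in this test: the high address is presumably chosen so that it does not fit in a Smi on a 32-bit target and has to be boxed. The sketch below checks that arithmetic in standalone C++. It assumes a Smi gives up one bit of the machine word to its tag; `FitsInSignedBits` and `FitsInSmi` are hypothetical helper names, not VM functions.

```cpp
#include <cstdint>

// Returns true if `value` fits in a signed integer that is `bits` bits wide.
constexpr bool FitsInSignedBits(std::int64_t value, int bits) {
  return value >= -(std::int64_t{1} << (bits - 1)) &&
         value < (std::int64_t{1} << (bits - 1));
}

// Assumption: one bit of the word is the tag, leaving 31 payload bits on a
// 32-bit target and 63 payload bits on a 64-bit target.
constexpr bool FitsInSmi(std::int64_t value, int word_bits) {
  return FitsInSignedBits(value, word_bits - 1);
}

// highAddress32bit from the test: 0xFFFFFFF0 == 4294967280.
constexpr std::int64_t kHighAddress32 = 0xFFFFFFF0;
// highAddress64bit from the test, 0xFFFFFFFFFFFFFFF0, read as a signed
// 64-bit value is -16.
constexpr std::int64_t kHighAddress64 = -16;

static_assert(!FitsInSmi(kHighAddress32, 32), "needs boxing on 32-bit");
static_assert(FitsInSmi(kHighAddress64, 64), "fits a Smi on 64-bit");
```

Under these assumptions only the 32-bit configuration needs a boxed integer for the address, which would be consistent with the failure showing up on arm32.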

The above regression test relies on many allocations to trigger the garbage collection bug. We should improve this test by triggering a garbage collection on the n-th allocation. I'll land the fix today, and make a proper regression test when I get back.
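
The idea of collecting on the n-th allocation can be sketched independently of the VM. The C++ below is a toy model only: `GcOnNthAllocation` and `CollectionsTriggered` are hypothetical names, and the "collection" is just a callback, so it shows the counting mechanism rather than any real VM flag or API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical test hook: counts allocations and runs a caller-supplied
// "collect" callback exactly when the n-th allocation happens.
class GcOnNthAllocation {
 public:
  GcOnNthAllocation(std::size_t n, std::function<void()> collect)
      : n_(n), collect_(std::move(collect)) {}

  // Meant to be called from the allocation path before memory is handed out.
  void OnAllocation() {
    ++count_;
    if (count_ == n_) collect_();
  }

  std::size_t count() const { return count_; }

 private:
  std::size_t n_;
  std::size_t count_ = 0;
  std::function<void()> collect_;
};

// Drives `total` allocations through the hook and reports how many
// collections the hook triggered.
inline int CollectionsTriggered(std::size_t n, std::size_t total) {
  int collections = 0;
  GcOnNthAllocation hook(n, [&collections] { ++collections; });
  for (std::size_t i = 0; i < total; ++i) hook.OnAllocation();
  assert(hook.count() == total);
  return collections;
}
```

With a hook like this, a regression test would not need 300000 iterations: it could sweep small values of n so that a collection lands between the Pointer allocation and the following allocation at least once.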

Edit: link to the specific commit for testing the regression test: 83d2aaa

dart-bot pushed a commit that referenced this issue Jul 19, 2019
Issue: #37511

Change-Id: Ibabe6a49b6fe38032da544a6520bdc398d496ba0
Cq-Include-Trybots: luci.dart.try:vm-ffi-android-debug-arm-try, app-kernel-linux-debug-x64-try, vm-kernel-linux-debug-simdbc64-try,vm-kernel-linux-debug-ia32-try,vm-dartkb-linux-debug-simarm64-try,vm-kernel-win-debug-x64-try,vm-kernel-win-debug-ia32-try,vm-dartkb-linux-debug-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-ffi-android-product-arm-try,vm-ffi-android-release-arm-try
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/108805
Auto-Submit: Daco Harkes <dacoharkes@google.com>
Reviewed-by: Samir Jindel <sjindel@google.com>
Commit-Queue: Samir Jindel <sjindel@google.com>