Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linear scan crash: divergence #36977

Closed
aartbik opened this issue May 15, 2019 · 10 comments
Closed

Linear scan crash: divergence #36977

aartbik opened this issue May 15, 2019 · 10 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.)

Comments

@aartbik
Copy link
Contributor

aartbik commented May 15, 2019

Last night Dartfuzz results with two crashes. The first looks like the typical "false" divergence due to allocating too much. We have some plans in our design doc to avoid these cases better.

The second, however, may be something else

Isolate (/b/s/w/itglMuvJ/dart_fuzzVQTSUI) JIT-DebugX64 - JIT-DEPOPTEVERY-ReleaseIA32: !DIVERGENCE! 1.13:348763544 (output=false)

fail1:
Exhausted heap space, trying to allocate 1176490000016 bytes.
Exhausted heap space, trying to allocate 16807000272 bytes.
Unhandled exception:
Out of Memory
#0      _StringBase._concatRangeNative (dart:core-patch/string_patch.dart:943:34)
#1      _StringBase._interpolate (dart:core-patch/string_patch.dart:857:16)
#2      main (file:///b/s/w/itglMuvJ/dart_fuzzVQTSUI/fuzz.dart:252:69)
#3      _startIsolate.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:300:19)
#4      _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:171:12)
#5      X1.run (file:///b/s/w/itglMuvJ/dart_fuzzVQTSUI/fuzz.dart:175:11)
#6      X2.run (file:///b/s/w/itglMuvJ/dart_fuzzVQTSUI/fuzz.dart:240:11)
#7      main (file:///b/s/w/itglMuvJ/dart_fuzzVQTSUI/fuzz.dart:248:14)
#8      _startIsolate.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:300:19)
#9      _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:171:12)



Isolate (/b/s/w/itglMuvJ/dart_fuzzGDBAGX) JIT-ReleaseSIMDBC64 - JIT-ReleaseSIMDBC: !DIVERGENCE! 1.13:2331337291 (output=false)

fail2:

===== CRASH =====
si_signo=Segmentation fault(11), si_code=1, si_addr=0xfe2fd10b
version=2.3.1-edge.6ac6d3ae7df25cfb5fca9030c6d26e831aad22c4 (Wed May 15 01:57:17 2019 +0000) on "linux_dbc"
thread=5620, isolate=main(0x3954700)
  pc 0x01a46fbf fp 0xf62fce78 dart::FlowGraphAllocator::BlockLocation(dart::TemplateLocation<short, dart::FpuRegister>, int, int)
  pc 0x05aa88e8 fp 0xf62fcfb0 Unknown symbol
-- End of DumpStackTrace

@aartbik aartbik added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.) labels May 15, 2019
@aartbik aartbik self-assigned this May 15, 2019
@aartbik
Copy link
Contributor Author

aartbik commented May 15, 2019

The second issue reproduces for SIMDBC, in debug mode with a better assert failure:

../../runtime/vm/compiler/backend/linearscan.cc: 379: error: expected: end < first_use_interval()->start()
version=2.3.1-edge.b0727bd6612ed13aa92e55d127e4b336f17de4b8 (Mon May 13 17:47:11 2019 +0000) on "linux_dbc"
thread=112319, isolate=main(0x3475080)
pc 0x01b7bd7c fp 0xf273d908 dart::Profiler::DumpStackTrace(void*)
-- End of DumpStackTrace

@aartbik aartbik changed the title Memory allocation divergences Linear scan crash: divergence May 20, 2019
@aartbik
Copy link
Contributor Author

aartbik commented May 20, 2019

Happened again in a nightly:

JIT-DebugSIMDBC - JIT-DebugIA32: !DIVERGENCE! 1.13:2028195801 (output=false)

fail1:
../../runtime/vm/compiler/backend/linearscan.cc: 379: error: expected: end < first_use_interval()->start()
version=2.3.1-edge.cc2d5adec943ef02b2c1741857fb3bac5386dee7 (Wed May 15 21:27:19 2019 +0000) on "linux_dbc"
thread=5215, isolate=main(0x2a0e700)
  pc 0x01b4466c fp 0xf62bcc18 dart::Profiler::DumpStackTrace(void*)
-- End of DumpStackTrace

@aartbik
Copy link
Contributor Author

aartbik commented Jun 4, 2019

Keeps coming back. This needs to be fixed soon!

Isolate (/b/s/w/itidxZjX/dart_fuzzGPURYM) JIT-DEPOPTEVERY-ReleaseSIMDBC - KBC-MIX-O3-ReleaseX64: !DIVERGENCE! 1.13:2594362257 (output=false)

fail1:

===== CRASH =====
si_signo=Segmentation fault(11), si_code=1, si_addr=0xfe2fe39b
version=2.3.2-edge.e31af09992382aeb8d5182f15d44104ddbf9c84e (Mon Jun 3 02:20:26 2019 +0000) on "linux_dbc"
thread=2753, isolate=main(0x37d4700)
  pc 0x01a58caf fp 0xf62fe108 dart::FlowGraphAllocator::BlockLocation(dart::TemplateLocation<short, dart::FpuRegister>, int, int)
  pc 0x05bcea60 fp 0xf62fe240 Unknown symbol
-- End of DumpStackTrace

We end up with end = 2147483647 and start() = 1979 in the following assert:

ASSERT(end < first_use_interval()->start());

@aartbik
Copy link
Contributor Author

aartbik commented Jun 5, 2019

What happens is that, for a parameter in ProcessInitialDefinition(), DBC calls BlockLocation() with the maximum position as end.

BlockLocation(Location::RegisterLocation(slot_index), 0, kMaxPosition);

Later, this blows up, either on the following assert in AddUseInterval() with end = 2147483647 and start() = 1979:

ASSERT(end < first_use_interval()->start());

or, without this assert, a bit later while accessing the BitField out of bounds during encode().

@aartbik
Copy link
Contributor Author

aartbik commented Jun 6, 2019

The assert happens for the R1-allocated parameter in a catch block in X0::foo0_0(). All registers see a similar range and, except for R1, an empty use interval. It blows up for R1, which already has a [1979, 1981) interval, which cannot be combined with [0, 2147483647).

B167[target catch try_idx -1 catch_try_idx 0] {
      v2 <- Parameter(0) T{X0}
      v3 <- Parameter(1) T{Map<int, String>?}
      v4 <- Parameter(2) T{int?}
      v5 <- Parameter(3)
      v6 <- Parameter(4)
      v7 <- Parameter(5)
      v8 <- SpecialParameter(kException) T{*}
      v9 <- SpecialParameter(kStackTrace)
      v10 <- Parameter(8)
      v11 <- Parameter(9)
      v12 <- Parameter(10)
      .....
      v32 <- Parameter(30)
      .....
      v40 <- Parameter(38)
      v41 <- Parameter(39)

range [1944, 1946) for v2 has been allocated to R31:
range [1944, 1946) for v3 has been allocated to R30:
range [1944, 1946) for v4 has been allocated to R29:
range [1944, 1945) for v5 has been allocated to R28:
range [1944, 1946) for v6 has been allocated to R27:
range [1944, 1945) for v7 has been allocated to R26:
range [1944, 1945) for v8 has been allocated to ExceptionReg:
range [1944, 1945) for v9 has been allocated to StackTraceReg:
range [1944, 1945) for v10 has been allocated to R23:
range [1944, 1945) for v11 has been allocated to R22:
range [1944, 1945) for v12 has been allocated to R21:
range [1944, 1945) for v13 has been allocated to R20:
.....
range [1944, 1945) for v27 has been allocated to R6:
range [1944, 1945) for v28 has been allocated to R5:
range [1944, 1945) for v29 has been allocated to R4:
range [1944, 1945) for v30 has been allocated to R3:
range [1944, 1945) for v31 has been allocated to R2:
range [1944, 1945) for v32 has been allocated to R1:   <= has a use interval
ASSERT FAIL!

@aartbik
Copy link
Contributor Author

aartbik commented Jun 6, 2019

@mkustermann would you mind having a look if something jumps out?

@aartbik aartbik assigned mraleph and unassigned mkustermann Jun 13, 2019
@aartbik
Copy link
Contributor Author

aartbik commented Jun 13, 2019

@mraleph any insights? Is it just because we run out of registers and that was never seen before (on DBC64, we start at R63), or is is something else?

@mraleph
Copy link
Member

mraleph commented Jun 16, 2019

Based on a brief look - yeah this is just out of registers situation. On DBC32 we have only 32 registers and we need a register for every Parameter() in the catch block entry.

I think code in the ProcessInitialDefinition that handles Parameter inside CatchBlockEntry for DBC there needs to be a check that we are not touching the space for normal parameters.

If we do - then we should just Bailout similar to how we do it when we attempt to spill.

[I actually don't remember why this number is set so low - because instruction encoding can accomodate up to 256 registers. Though if we set number of registers to 256 frame size for functions with try catch would balloon due to somewhat bizarre way of doing try-catch sync].

@mraleph mraleph removed their assignment Jun 16, 2019
@mraleph
Copy link
Member

mraleph commented Jun 16, 2019

@mkustermann Martin could you work with @aartbik and find somebody to fix this bug? Should be rather straightforward change. Let me know if you need to guidance from me.

@aartbik
Copy link
Contributor Author

aartbik commented Jun 17, 2019

I made a CL that fixes this. Please have a look if the solution makes sense.

https://dart-review.googlesource.com/c/sdk/+/106429

dart-bot pushed a commit that referenced this issue Jun 19, 2019
Rationale:
Code had no guard on running out of registers

#36977

Change-Id: Ifec1b7ef626e6a8720c437f0f25d119956dfb63a
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/106429
Commit-Queue: Aart Bik <ajcbik@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
@aartbik aartbik closed this as completed Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. dartfuzz Found with Dart fuzzing (DartFuzz, libFuzzer, etc.)
Projects
None yet
Development

No branches or pull requests

3 participants