org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::isBlocked failed for [operator: HashBuild, plan node ID: 2]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 7.0 MiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled).
Current config settings:
spark.gluten.memory.offHeap.size.in.bytes=40.0 GiB
spark.gluten.memory.task.offHeap.size.in.bytes=10.0 GiB
spark.gluten.memory.conservative.task.offHeap.size.in.bytes=5.0 GiB
spark.memory.offHeap.enabled=true
spark.gluten.memory.dynamic.offHeap.sizing.enabled=false
Memory consumer stats:
Task.105759: Current used bytes: 40.0 GiB, peak bytes: N/A
\- Gluten.Tree.180: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
   \- root.180: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      +- WholeStageIterator.180: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |  \- single: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |     +- WholeStageIterator_root: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |     |  +- task.Gluten_Stage_5_TID_105759: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |     |  |  +- node.2: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |     |  |  |  +- op.2.1.0.HashBuild: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
      |     |  |  |  \- op.2.0.0.HashProbe: Current used bytes: 481.9 KiB, peak bytes: 14.5 MiB
      |     |  |  +- node.1: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  |  |  \- op.1.1.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  |  +- node.3: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  |  |  \- op.3.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  |  +- node.4: Current used bytes: 0.0 B, peak bytes: 1024.0 KiB
      |     |  |  |  \- op.4.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 24.0 KiB
      |     |  |  \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  |     \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     |  \- WholeStageIterator_default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
      |     \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
      +- ShuffleReader.0: Current used bytes: 9.0 MiB, peak bytes: 24.0 MiB
      |  \- single: Current used bytes: 9.0 MiB, peak bytes: 24.0 MiB
      |     +- ShuffleReader_root: Current used bytes: 9.9 KiB, peak bytes: 16.0 MiB
      |     |  \- ShuffleReader_default_leaf: Current used bytes: 9.9 KiB, peak bytes: 15.4 MiB
      |     \- gluten::MemoryAllocator: Current used bytes: 128.0 B, peak bytes: 6.3 MiB
      +- ArrowContextInstance.172: Current used bytes: 8.0 MiB, peak bytes: 8.0 MiB
      +- ArrowContextInstance.173: Current used bytes: 0.0 B, peak bytes: 0.0 B
      +- ShuffleWriter.180: Current used bytes: 0.0 B, peak bytes: 1808.0 MiB
      |  \- single: Current used bytes: 0.0 B, peak bytes: 1808.0 MiB
      |     +- ShuffleWriter_root: Current used bytes: 0.0 B, peak bytes: 72.0 MiB
      |     |  \- ShuffleWriter_default_leaf: Current used bytes: 0.0 B, peak bytes: 64.8 MiB
      |     \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 1733.7 MiB
      +- OverAcquire.DummyTarget.532: Current used bytes: 0.0 B, peak bytes: 7.2 MiB
      +- OverAcquire.DummyTarget.533: Current used bytes: 0.0 B, peak bytes: 8.9 GiB
      \- OverAcquire.DummyTarget.534: Current used bytes: 0.0 B, peak bytes: 542.4 MiB
at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:105)
at org.apache.gluten.memory.nmm.ManagedReservationListener.reserve(ManagedReservationListener.java:43)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNextInternal(ColumnarBatchOutIterator.java:65)
at org.apache.gluten.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:37)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:43)
at org.apache.gluten.utils.InvocationFlowProtection.hasNext(Iterators.scala:135)
at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
at org.apache.gluten.utils.PayloadCloser.hasNext(Iterators.scala:35)
at org.apache.gluten.utils.PipelineTimeAccumulator.hasNext(Iterators.scala:98)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at org.apache.spark.shuffle.ColumnarShuffleWriter.internalWrite(ColumnarShuffleWriter.scala:135)
at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:242)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1471)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Retriable: False
Function: runInternal
File: /home/binweiyang/gluten/ep/build-velox/build/velox_ep/velox/exec/Driver.cpp
Line: 587
Stack trace:
# 0 _ZN8facebook5velox7process10StackTraceC1Ei
# 1 _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2 _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 3 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE.cold
# 4 _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
# 5 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
# 6 _ZN6gluten24WholeStageResultIterator4nextEv
# 7 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 8 0x00007f0dcbc89568
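For triage, the "Memory consumer stats" dump in the trace above can be ranked mechanically to see which operator holds the memory. The following is a minimal sketch, not part of Gluten: the helper names (`to_bytes`, `parse_stats`, `top_operators`) are hypothetical, and the line format is assumed to be exactly the `name: Current used bytes: X, peak bytes: Y` shape shown in this log.

```python
import re

# Assumed line shape, taken from the dump in this report:
#   "<tree prefix> <name>: Current used bytes: <size>, peak bytes: <size or N/A>"
LINE_RE = re.compile(
    r"^[\s|+\\-]*(?P<name>\S.*?): Current used bytes: (?P<used>[\d.]+ \w+), "
    r"peak bytes: (?P<peak>N/A|[\d.]+ \w+)$"
)

# Binary units as printed in the log.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def to_bytes(text):
    """Convert a size such as '14.5 MiB' to bytes; 'N/A' becomes None."""
    if text == "N/A":
        return None
    value, unit = text.split()
    return int(float(value) * UNITS[unit])


def parse_stats(lines):
    """Return (name, used_bytes, peak_bytes) for every line that matches."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            rows.append(
                (match["name"], to_bytes(match["used"]), to_bytes(match["peak"]))
            )
    return rows


def top_operators(lines, n=3):
    """Rank operator-level consumers (names starting with 'op.') by current usage."""
    ops = [row for row in parse_stats(lines) if row[0].startswith("op.")]
    return sorted(ops, key=lambda row: row[1], reverse=True)[:n]


# Three operator lines copied from the dump above.
SAMPLE = r"""
| | | | +- op.2.1.0.HashBuild: Current used bytes: 40.0 GiB, peak bytes: 40.0 GiB
| | | | \- op.2.0.0.HashProbe: Current used bytes: 481.9 KiB, peak bytes: 14.5 MiB
| | | | \- op.4.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 24.0 KiB
""".splitlines()

if __name__ == "__main__":
    for name, used, peak in top_operators(SAMPLE):
        print(name, used, peak)
```

Run against the full dump, this kind of ranking points at `op.2.1.0.HashBuild` as the consumer holding the 40.0 GiB, which matches the operator named in the `Reason:` line.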
Backend
VL (Velox)
Bug description
@zhztheplayer
Spark version
Spark-3.2.x
Spark configurations
No response
System information
No response
Relevant logs
No response