Closed
Description
Backend
VL (Velox)
Bug description
Error log:
26/02/01 17:48:45 ERROR Executor: Exception in task 1.0 in stage 13.0 (TID 2567)
org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::getOutput failed for [operator: HashProbe, plan node ID: 4]: Error during calling Java code from native code: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 1024.0 KiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled).
Current config settings:
spark.gluten.memory.offHeap.size.in.bytes=2.0 GiB
spark.gluten.memory.task.offHeap.size.in.bytes=2.0 GiB
spark.gluten.memory.conservative.task.offHeap.size.in.bytes=1024.0 MiB
spark.memory.offHeap.enabled=true
spark.gluten.memory.dynamic.offHeap.sizing.enabled=false
Memory consumer stats:
Task.2567: Current used bytes: 2.1 GiB, peak bytes: N/A
+- Gluten.Tree.3: Current used bytes: 2045.0 MiB, peak bytes: 2.0 GiB
| \- Capacity[8.0 EiB].3: Current used bytes: 2045.0 MiB, peak bytes: 2.0 GiB
| +- UniffleShuffleWriter.3: Current used bytes: 2000.0 MiB, peak bytes: 2008.0 MiB
| | \- single: Current used bytes: 2000.0 MiB, peak bytes: 2008.0 MiB
| | +- gluten::MemoryAllocator: Current used bytes: 1996.5 MiB, peak bytes: 2003.6 MiB
| | | +- VeloxShuffleWriter.partitionBufferPool: Current used bytes: 1996.5 MiB, peak bytes: 2003.6 MiB
| | | +- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | \- PartitionWriter.cached_payload: Current used bytes: 0.0 B, peak bytes: 5.9 MiB
| | \- root: Current used bytes: 0.0 B, peak bytes: 1024.0 KiB
| | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 896.0 B
| +- NativePlanEvaluator-3.0: Current used bytes: 30.0 MiB, peak bytes: 44.0 MiB
| | \- single: Current used bytes: 22.0 MiB, peak bytes: 44.0 MiB
| | +- root: Current used bytes: 20.2 MiB, peak bytes: 37.0 MiB
| | | +- task.Gluten_Stage_13_TID_2567_VTID_3: Current used bytes: 20.2 MiB, peak bytes: 37.0 MiB
| | | | +- node.4: Current used bytes: 20.1 MiB, peak bytes: 37.0 MiB
| | | | | +- op.4.0.0.HashProbe: Current used bytes: 20.0 MiB, peak bytes: 23.5 MiB
| | | | | \- op.4.1.0.HashBuild: Current used bytes: 148.0 KiB, peak bytes: 10.5 MiB
| | | | +- node.6: Current used bytes: 16.0 KiB, peak bytes: 1024.0 KiB
| | | | | \- op.6.0.0.FilterProject: Current used bytes: 16.0 KiB, peak bytes: 24.0 KiB
| | | | +- node.2: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | | \- op.2.1.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | +- node.5: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | | \- op.5.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | +- node.1: Current used bytes: 0.0 B, peak bytes: 1024.0 KiB
| | | | | \- op.1.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 24.3 KiB
| | | | +- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | | \- node.3: Current used bytes: 0.0 B, peak bytes: 1024.0 KiB
| | | | \- op.3.1.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 12.0 KiB
| | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- ArrowContextInstance.3: Current used bytes: 8.0 MiB, peak bytes: 8.0 MiB
| +- VeloxBatchResizer.3: Current used bytes: 7.0 MiB, peak bytes: 24.0 MiB
| | \- single: Current used bytes: 7.0 MiB, peak bytes: 24.0 MiB
| | +- root: Current used bytes: 6.9 MiB, peak bytes: 24.0 MiB
| | | \- default_leaf: Current used bytes: 6.9 MiB, peak bytes: 20.6 MiB
| | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- UniffleShuffleWriter.3.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 477.6 MiB
| +- IteratorMetrics.3.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- IteratorMetrics.3: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | +- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- root: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- ShuffleReader.3.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 4.8 MiB
| +- VeloxBatchResizer.3.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 7.2 MiB
| +- NativePlanEvaluator-3.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 13.2 MiB
| \- ShuffleReader.3: Current used bytes: 0.0 B, peak bytes: 16.0 MiB
| \- single: Current used bytes: 0.0 B, peak bytes: 16.0 MiB
| +- root: Current used bytes: 0.0 B, peak bytes: 1024.0 KiB
| | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 36.0 KiB
| \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 755.2 KiB
| \- default: Current used bytes: 0.0 B, peak bytes: 755.2 KiB
\- org.apache.spark.shuffle.writer.WriteBufferManager@211717c5: Current used bytes: 63.7 MiB, peak bytes: N/A
at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:104)
at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49)
at org.apache.gluten.vectorized.ShuffleWriterJniWrapper.reclaim(Native Method)
at org.apache.spark.shuffle.writer.VeloxUniffleColumnarShuffleWriter$1.spill(VeloxUniffleColumnarShuffleWriter.java:201)
at org.apache.gluten.memory.memtarget.Spillers$AppendableSpillerList.spill(Spillers.java:80)
at org.apache.gluten.memory.memtarget.Spillers$WithMinSpillSize.spill(Spillers.java:60)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:75)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:68)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:68)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:50)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer.spill(TreeMemoryConsumer.java:116)
at org.apache.spark.memory.TaskMemoryManager.trySpillAndAcquire(TaskMemoryManager.java:227)
at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:190)
at org.apache.spark.memory.MemoryConsumer.acquireMemory(MemoryConsumer.java:137)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer.borrow(TreeMemoryConsumer.java:66)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow0(TreeMemoryConsumer.java:196)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow(TreeMemoryConsumer.java:188)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow0(TreeMemoryConsumer.java:196)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow(TreeMemoryConsumer.java:188)
at org.apache.gluten.memory.memtarget.OverAcquire.borrow(OverAcquire.java:63)
at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:35)
at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57)
at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:154)
at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:66)
at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:38)
at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:95)
at org.apache.gluten.iterator.IteratorsV1$ReadTimeAccumulator.hasNext(IteratorsV1.scala:122)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:32)
at org.apache.gluten.vectorized.ColumnarBatchInIterator.hasNext(ColumnarBatchInIterator.java:36)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57)
at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:154)
at org.apache.gluten.iterator.IteratorsV1$ReadTimeAccumulator.hasNext(IteratorsV1.scala:122)
at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:38)
at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:66)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.shuffle.writer.VeloxUniffleColumnarShuffleWriter.writeImpl(VeloxUniffleColumnarShuffleWriter.java:151)
at org.apache.spark.shuffle.writer.RssShuffleWriter.write(RssShuffleWriter.java:344)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
at org.apache.gluten.vectorized.ShuffleWriterJniWrapper.reclaim(Native Method)
at org.apache.spark.shuffle.writer.VeloxUniffleColumnarShuffleWriter$1.spill(VeloxUniffleColumnarShuffleWriter.java:201)
at org.apache.gluten.memory.memtarget.Spillers$AppendableSpillerList.spill(Spillers.java:80)
at org.apache.gluten.memory.memtarget.Spillers$WithMinSpillSize.spill(Spillers.java:60)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:75)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:68)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:68)
at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:50)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer.spill(TreeMemoryConsumer.java:116)
at org.apache.spark.memory.TaskMemoryManager.trySpillAndAcquire(TaskMemoryManager.java:227)
at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:190)
at org.apache.spark.memory.MemoryConsumer.acquireMemory(MemoryConsumer.java:137)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer.borrow(TreeMemoryConsumer.java:66)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow0(TreeMemoryConsumer.java:196)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow(TreeMemoryConsumer.java:188)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow0(TreeMemoryConsumer.java:196)
at org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumer$Node.borrow(TreeMemoryConsumer.java:188)
at org.apache.gluten.memory.memtarget.OverAcquire.borrow(OverAcquire.java:63)
at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:35)
at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57)
at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:154)
at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:66)
at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:38)
at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:95)
at org.apache.gluten.iterator.IteratorsV1$ReadTimeAccumulator.hasNext(IteratorsV1.scala:122)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:32)
at org.apache.gluten.vectorized.ColumnarBatchInIterator.hasNext(ColumnarBatchInIterator.java:36)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57)
at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:154)
at org.apache.gluten.iterator.IteratorsV1$ReadTimeAccumulator.hasNext(IteratorsV1.scala:122)
at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:38)
at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:66)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.shuffle.writer.VeloxUniffleColumnarShuffleWriter.writeImpl(VeloxUniffleColumnarShuffleWriter.java:151)
at org.apache.spark.shuffle.writer.RssShuffleWriter.write(RssShuffleWriter.java:344)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
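The consumer stats in the log above are easier to read when ranked by current usage. The following is a minimal Python sketch for doing that, assuming every stat line follows the "name: Current used bytes: X UNIT, peak bytes: ..." shape seen in this log. `parse_line` and `top_consumers` are hypothetical helper names for illustration only, not part of Gluten or Spark.

```python
# Hypothetical helpers for reading the "Memory consumer stats" dump; not Gluten APIs.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "EiB": 1024**6}
MARKER = ": Current used bytes: "


def parse_line(line):
    """Return (consumer name, current used bytes), or None for non-stat lines."""
    if MARKER not in line:
        return None
    left, right = line.split(MARKER, 1)
    name = left.lstrip("|+\\- ")  # drop the tree-drawing prefix
    value, unit = right.split(",", 1)[0].split()
    return name, float(value) * UNITS[unit]


def top_consumers(lines, n=5):
    """Rank consumers by current usage, largest first."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    return sorted(parsed, key=lambda p: p[1], reverse=True)[:n]


# A few lines copied from the log above, plus one stack frame that is skipped.
sample = [
    "| | | +- VeloxShuffleWriter.partitionBufferPool: Current used bytes: 1996.5 MiB, peak bytes: 2003.6 MiB",
    "| | | | | +- op.4.0.0.HashProbe: Current used bytes: 20.0 MiB, peak bytes: 23.5 MiB",
    "| | | | | \\- op.4.1.0.HashBuild: Current used bytes: 148.0 KiB, peak bytes: 10.5 MiB",
    "at org.apache.gluten.memory.memtarget.OverAcquire.borrow(OverAcquire.java:63)",
]

for name, used in top_consumers(sample):
    print(f"{name}: {used / 1024**2:.1f} MiB")
```

Parent entries include the usage of their children, so on the full dump the top of the ranking traces a single path (Task.2567, Gluten.Tree.3, UniffleShuffleWriter.3, and so on) rather than independent consumers. Read that way, the log shows VeloxShuffleWriter.partitionBufferPool holding 1996.5 MiB at the time of the failure, while the HashProbe operator that raised the error holds only 20.0 MiB.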
Gluten version
Gluten-1.5
Spark version
Spark-3.5.x
Spark configurations
No response
System information
No response
Relevant logs