
[SUPPORT] Upgrade Hudi 0.10.1 to 0.11.0: Can not read value at 0 in block -1 in file ... #5575

@Guanpx

Description


Describe the problem you faced

After upgrading from Hudi 0.10.1 to Hudi 0.11.0, writing to a COW table with Flink fails with the exception below.

To Reproduce

Steps to reproduce the behavior:

  1. The exception occurs when the table schema contains a decimal column.

Environment Description

  • Hudi version : 0.11.0

  • Flink version : 1.13.2

  • Hive version : 2.1.1

  • Hadoop version : 3.0.0

  • Storage (HDFS/S3/GCS..) : HDFS

  • Running on Docker? (yes/no) : no


Stacktrace

Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: operation has failed
	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_202]
	at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_202]
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:154) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.FlinkMergeHelper.runMerge(FlinkMergeHelper.java:101) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.BaseFlinkCommitActionExecutor.handleUpdateInternal(BaseFlinkCommitActionExecutor.java:228) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.BaseFlinkCommitActionExecutor.handleUpdate(BaseFlinkCommitActionExecutor.java:219) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.BaseFlinkCommitActionExecutor.handleUpsertPartition(BaseFlinkCommitActionExecutor.java:190) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.BaseFlinkCommitActionExecutor.execute(BaseFlinkCommitActionExecutor.java:108) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.BaseFlinkCommitActionExecutor.execute(BaseFlinkCommitActionExecutor.java:70) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.FlinkWriteHelper.write(FlinkWriteHelper.java:73) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.action.commit.FlinkInsertCommitActionExecutor.execute(FlinkInsertCommitActionExecutor.java:49) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.insert(HoodieFlinkCopyOnWriteTable.java:136) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.client.HoodieFlinkWriteClient.insert(HoodieFlinkWriteClient.java:174) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.sink.StreamWriteFunction.lambda$initWriteFunction$0(StreamWriteFunction.java:181) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.sink.StreamWriteFunction.lambda$flushRemaining$7(StreamWriteFunction.java:461) ~[bigdata-hudi-dataware-CK.jar:?]
	at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608) ~[?:1.8.0_202]
	at org.apache.hudi.sink.StreamWriteFunction.flushRemaining(StreamWriteFunction.java:454) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.sink.StreamWriteFunction.snapshotState(StreamWriteFunction.java:131) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.sink.common.AbstractStreamWriteFunction.snapshotState(AbstractStreamWriteFunction.java:157) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:89) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:218) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:169) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:371) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:706) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:627) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:590) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:312) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$8(StreamTask.java:1089) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1073) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1029) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
	... 19 more
Caused by: org.apache.hudi.exception.HoodieException: operation has failed
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.throwExceptionIfFailed(BoundedInMemoryQueue.java:248) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.readNextRecord(BoundedInMemoryQueue.java:226) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.access$100(BoundedInMemoryQueue.java:52) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueue$QueueIterator.hasNext(BoundedInMemoryQueue.java:278) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:36) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:134) ~[bigdata-hudi-dataware-CK.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
	... 1 more
Caused by: org.apache.hudi.exception.HoodieException: unable to read next record from parquet file 
	at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:53) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:105) ~[bigdata-hudi-dataware-CK.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_202]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
	... 1 more
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs://ha/hudi/dw/rds.db/hudi_info/0623acec-7614-41fa-97cd-8ba44614ffff_0-2-16_20220512163722805.parquet
	at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:48) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:105) ~[bigdata-hudi-dataware-CK.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_202]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
	... 1 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at java.lang.System.arraycopy(Native Method) ~[?:1.8.0_202]
	at org.apache.hudi.org.apache.avro.generic.GenericData.createFixed(GenericData.java:1315) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.avro.AvroConverters$FieldFixedConverter.convert(AvroConverters.java:319) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.avro.AvroConverters$BinaryConverter.setDictionary(AvroConverters.java:75) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.column.impl.ColumnReaderBase.<init>(ColumnReaderBase.java:385) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:46) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:84) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:165) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:48) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45) ~[bigdata-hudi-dataware-CK.jar:?]
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:105) ~[bigdata-hudi-dataware-CK.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_202]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
	... 1 more
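The innermost frames show `java.lang.ArrayIndexOutOfBoundsException` thrown by `System.arraycopy` inside Avro's `GenericData.createFixed`, which copies the raw parquet fixed-length bytes of the decimal column into a buffer sized by the reader schema. A minimal stdlib sketch of that failure mode (the `createFixed` helper, byte widths, and the precision-change explanation here are illustrative assumptions, not Hudi's actual code):

```java
public class FixedWidthMismatch {
    // Hypothetical sketch: if the reader schema expects a narrower fixed
    // byte width for the decimal than the writer used (e.g. the decimal
    // precision-to-byte-width mapping changed between releases), the copy
    // overruns the destination array and System.arraycopy throws
    // ArrayIndexOutOfBoundsException, as in the stacktrace above.
    static byte[] createFixed(byte[] source, int readerFixedSize) {
        byte[] dest = new byte[readerFixedSize];
        // Same call pattern as the failing frame (GenericData.createFixed):
        System.arraycopy(source, 0, dest, 0, source.length);
        return dest;
    }

    public static void main(String[] args) {
        byte[] written = new byte[9]; // e.g. a decimal stored as a 9-byte fixed
        // Matching widths: the copy succeeds.
        byte[] ok = createFixed(written, 9);
        System.out.println("ok, length = " + ok.length);
        // Mismatched widths: reproduces the exception from the report.
        try {
            createFixed(written, 5);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException, as in the stacktrace");
        }
    }
}
```

If this is the cause, the fix is on the schema side (keeping the decimal's fixed width consistent across writer versions) rather than in the copy itself.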


    Labels

    engine:flink (Flink integration), priority:high (Significant impact; potential bugs)
