Parquet: Fix ParquetWriteAdapter NPE when accessing length/splitOffsets after close#15879

Open
jackylee-ch wants to merge 1 commit into apache:main from jackylee-ch:fix/parquet-write-adapter-npe-14310

@jackylee-ch
Contributor

Summary

  • Fix NPE in ParquetWriteAdapter.length() and splitOffsets() when called after close(): cache the data size before nullifying the internal writer, and compute split offsets from the already-cached footer
  • Add regression tests verifying the FileAppender post-close contract for the deprecated adapter
  • Clean up the pre-close length() workaround in ParquetWritingTestUtils that was needed because of this bug
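
The caching pattern described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeWriter` and `CachingAdapterSketch` are hypothetical stand-ins for the wrapped Hadoop ParquetWriter and for ParquetWriteAdapter, and the 4-byte offset is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the wrapped writer; not a real Parquet class.
final class FakeWriter {
  private long dataSize = 0L;
  private final List<Long> rowGroupOffsets = new ArrayList<>();

  void write(byte[] record) {
    if (rowGroupOffsets.isEmpty()) {
      rowGroupOffsets.add(4L); // assumption: the first row group starts at byte 4
    }
    dataSize += record.length;
  }

  long dataSize() {
    return dataSize;
  }

  // Closing returns footer-like metadata (here, just the row group offsets).
  List<Long> close() {
    return Collections.unmodifiableList(new ArrayList<>(rowGroupOffsets));
  }
}

// Hypothetical adapter showing the "cache before nullifying" idea.
final class CachingAdapterSketch {
  private FakeWriter writer = new FakeWriter();
  private long cachedLength = -1L;
  private List<Long> cachedSplitOffsets = null;

  void add(byte[] record) {
    if (writer == null) {
      throw new IllegalStateException("Cannot add records: already closed");
    }
    writer.write(record);
  }

  long length() {
    // Before close, ask the live writer; after close, use the cached value.
    return writer != null ? writer.dataSize() : cachedLength;
  }

  List<Long> splitOffsets() {
    if (cachedSplitOffsets == null) {
      throw new IllegalStateException("Cannot return split offsets until closed");
    }
    return cachedSplitOffsets;
  }

  void close() {
    if (writer != null) {
      cachedLength = writer.dataSize();    // cache first...
      cachedSplitOffsets = writer.close(); // ...keep the footer-derived data...
      writer = null;                       // ...then drop the writer reference
    }
  }
}
```

In this sketch close() is idempotent, and both accessors read only cached state once the writer reference has been dropped, so calling them after close cannot dereference null.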

Fixes #14310

Test Plan

  • TestParquetWriteAdapter.postCloseAccessors — verifies length(), metrics(), splitOffsets() work after close
  • TestParquetWriteAdapter.dataWriterCloseWithDeprecatedAdapter — end-to-end DataWriter.close() path
  • Full :iceberg-parquet:test suite — 642 tests pass, 0 failures
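
A post-close regression check of the kind listed above might look roughly like this. It is a sketch in plain Java rather than the PR's actual tests: `AppenderSketch`, `BuggyAppender`, `FixedAppender`, and `PostCloseContract` are hypothetical names, and the in-memory buffer stands in for a real file writer.

```java
import java.util.List;

// Hypothetical minimal appender interface, loosely modeled on the accessors named above.
interface AppenderSketch {
  void add(String row);
  long length();
  List<Long> splitOffsets();
  void close();
}

// Reproduces the bug shape: close() nulls the buffer, length() still dereferences it.
final class BuggyAppender implements AppenderSketch {
  private StringBuilder buffer = new StringBuilder();
  public void add(String row) { buffer.append(row); }
  public long length() { return buffer.length(); } // throws NPE after close
  public List<Long> splitOffsets() { return List.of(0L); }
  public void close() { buffer = null; }
}

// Caches the length before dropping the buffer.
final class FixedAppender implements AppenderSketch {
  private StringBuilder buffer = new StringBuilder();
  private long cachedLength;
  public void add(String row) { buffer.append(row); }
  public long length() { return buffer != null ? buffer.length() : cachedLength; }
  public List<Long> splitOffsets() { return List.of(0L); }
  public void close() {
    if (buffer != null) {
      cachedLength = buffer.length();
      buffer = null;
    }
  }
}

final class PostCloseContract {
  // Returns null when the accessors work after close, else a description of the violation.
  static String check(AppenderSketch appender) {
    appender.add("a");
    appender.close();
    try {
      if (appender.length() <= 0) {
        return "length() not positive after close";
      }
      if (appender.splitOffsets() == null) {
        return "splitOffsets() returned null after close";
      }
    } catch (RuntimeException e) {
      return "accessor threw after close: " + e.getClass().getSimpleName();
    }
    return null;
  }
}
```

Running `PostCloseContract.check` against both implementations distinguishes the two: the buggy one reports a NullPointerException, while the caching one passes.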

…ts after close

ParquetWriteAdapter.close() nullifies the internal Hadoop ParquetWriter,
but length() and splitOffsets() still dereference it, causing an NPE when
they are called after close (e.g. from DataWriter.close()).

Fix by caching data size before close and reusing the already-cached
footer for splitOffsets(). This restores FileAppender post-close
contract compliance.

Also removes the pre-close length() workaround in ParquetWritingTestUtils
that was needed because of this bug.

Fixes apache#14310
Development

Successfully merging this pull request may close these issues.

iceberg-core:io:DataWriter.close() NPE error

1 participant