Skip to content

Fix null-safety and contract violations in HadoopPinotFS#18457

Merged
yashmayya merged 2 commits into
apache:masterfrom
Akanksha-kedia:fix-hadoop-pinotfs-null-safety
May 11, 2026
Merged

Fix null-safety and contract violations in HadoopPinotFS#18457
yashmayya merged 2 commits into
apache:masterfrom
Akanksha-kedia:fix-hadoop-pinotfs-null-safety

Conversation

@Akanksha-kedia
Copy link
Copy Markdown
Contributor

@Akanksha-kedia Akanksha-kedia commented May 11, 2026

Summary

Four independent issues in HadoopPinotFS:

  1. visitFileStatus() — null return from listStatus()
    FileSystem.listStatus() can return null on some Hadoop versions when the path is empty or inaccessible. Iterating over a null array causes NPE. Added an early-return null guard.

  2. isDirectory() / lastModified()RuntimeException wrapping
    Both methods catch IOException and rethrow wrapped in RuntimeException. The PinotFS interface declares both as throws IOException, so the checked exception should propagate directly. Removed the wrapping and added throws IOException to the overrides.

  3. touch()FSDataOutputStream resource leak
    FSDataOutputStream fos = _hadoopFS.create(path); fos.close(); — if fos.close() throws, the stream leaks. Replaced with try-with-resources.

  4. close() — NPE when init() was never called
    If init() never ran or threw before assigning _hadoopFS, close() throws NPE. Added a null guard before calling _hadoopFS.close().

Test plan

  • Existing unit tests for HadoopPinotFS pass
  • listStatus returning null no longer causes NPE in listFiles / listFilesWithMetadata
  • isDirectory and lastModified propagate IOException to callers instead of RuntimeException

- visitFileStatus(): guard against null return value from FileSystem.listStatus() before iterating, avoiding NPE on empty or inaccessible directories
- isDirectory() / lastModified(): remove RuntimeException wrapping of IOException — both methods are declared in the PinotFS interface to throw IOException, so the checked exception should propagate directly to callers rather than being hidden in a RuntimeException
- touch(): use try-with-resources for FSDataOutputStream so the stream is closed even if fos.close() throws, preventing a resource leak
- close(): null-check _hadoopFS before calling close() to guard against the case where init() was never called or threw
@Akanksha-kedia
Copy link
Copy Markdown
Contributor Author

@xiangfu0 please review

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.66%. Comparing base (1881884) to head (c62dcb4).
⚠️ Report is 76 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18457      +/-   ##
============================================
+ Coverage     63.40%   63.66%   +0.25%     
- Complexity     1679     1684       +5     
============================================
  Files          3253     3256       +3     
  Lines        198767   199548     +781     
  Branches      30791    30986     +195     
============================================
+ Hits         126034   127047    +1013     
+ Misses        62659    62370     -289     
- Partials      10074    10131      +57     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (?)
java-21 63.66% <ø> (+0.25%) ⬆️
temurin 63.66% <ø> (+0.25%) ⬆️
unittests 63.66% <ø> (+0.25%) ⬆️
unittests1 55.73% <ø> (+0.37%) ⬆️
unittests2 34.98% <ø> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@xiangfu0 xiangfu0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one high-signal correctness issue; see inline comment.

…g on null listStatus

Silently returning when listStatus() returns null causes delete(uri, false) to see an
empty directory listing, skip the non-empty guard, and fall through to
_hadoopFS.delete(path, true) — recursively deleting a non-empty directory that should
have been preserved. Throw IOException instead so callers fail fast.
@Akanksha-kedia
Copy link
Copy Markdown
Contributor Author

@xiangfu0 recheck please

@yashmayya yashmayya merged commit 0a5d4fe into apache:master May 11, 2026
16 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants