@nisargthakkar
Contributor

Problem Statement

There is no official utility to detect whether a file is a SequenceFile. We currently rely on SequenceFile.Reader throwing an IOException, catch it, and treat every IOException as meaning the file is not a SequenceFile. This masks genuine I/O failures, and VPJ eventually fails with a NullPointerException through some sequence of events.

Solution

This PR adds an explicit function to check for a SequenceFile using the file headers.
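
As a rough illustration of the idea (not the code merged in this PR), a header check can compare the first bytes of the file against the SequenceFile magic, which is the three ASCII bytes `SEQ` followed by a version byte. The class and method names below are hypothetical, and the sketch reads from a plain `InputStream` rather than a Hadoop `FileSystem` so that it stays self-contained. A short or mismatching header returns `false`, while any other `IOException` propagates to the caller instead of being interpreted as "not a SequenceFile".

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper for illustration only; names are assumptions, not the PR's API. */
final class SequenceFileHeaderCheck {
  // Hadoop SequenceFiles start with the ASCII bytes 'S', 'E', 'Q' (a version byte follows).
  private static final byte[] MAGIC = {(byte) 'S', (byte) 'E', (byte) 'Q'};

  private SequenceFileHeaderCheck() {
  }

  /**
   * Returns true if the stream begins with the SequenceFile magic bytes.
   * A stream shorter than the magic, or one with different leading bytes, yields false.
   * Any other IOException (a genuine read failure) is propagated, not swallowed.
   */
  static boolean hasSequenceFileMagic(InputStream in) throws IOException {
    byte[] header = new byte[MAGIC.length];
    try {
      new DataInputStream(in).readFully(header);
    } catch (EOFException e) {
      // Fewer bytes than the magic: cannot be a SequenceFile.
      return false;
    }
    return Arrays.equals(header, MAGIC);
  }
}
```

With a check of this shape, the caller can decide on the input format from the boolean result and let real I/O errors surface where they occur, rather than discovering them later as a NullPointerException.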

Code changes

  • Added new code behind a config. If so, list the config names and their default values in the PR description.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

@nisargthakkar nisargthakkar enabled auto-merge (squash) April 30, 2025 01:05
KaiSernLim previously approved these changes Apr 30, 2025
@nisargthakkar nisargthakkar merged commit 365c406 into linkedin:main May 1, 2025
58 checks passed
@nisargthakkar nisargthakkar deleted the fixVsonFileInputIOException branch May 1, 2025 19:09
WhitneyDeng pushed a commit to WhitneyDeng/venice that referenced this pull request May 16, 2025
…din#1745)
