Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-30189] HsSubpartitionFileReader may load data that has been consumed from memory #21415

Closed
wants to merge 1 commit into from
Closed

[FLINK-30189] HsSubpartitionFileReader may load data that has been consumed from memory #21415

wants to merge 1 commit into from

Conversation

reswqa
Copy link
Member

@reswqa reswqa commented Nov 29, 2022

What is the purpose of the change

In order to solve the problem that data cannot be read from the disk correctly after failover, we changed the calculation logical of the buffer's readable state in FLINK-29238. Buffers that are greater than consumingOffset and have been released can be pre-load from file. However, the update of consumingOffset is asynchronous, If it lags behind the actual consumption progress, the buffer will have a chance to be load from the disk again.

Brief change log

  • HsSubpartitionFileReaderImpl will skip peek buffer that less than expected.

Verifying this change

This change is already covered by existing tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

@flinkbot
Copy link
Collaborator

flinkbot commented Nov 29, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@xintongsong xintongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @reswqa. LGTM. I have one minor comment, which I'll address myself during merging.

}

return Optional.of(peek);
return peek == null ? Optional.empty() : Optional.of(peek);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return peek == null ? Optional.empty() : Optional.of(peek);
return Optional.ofNullable(peek);

xintongsong pushed a commit that referenced this pull request Dec 7, 2022
chucheng92 pushed a commit to chucheng92/flink that referenced this pull request Feb 3, 2023
sergeitsar pushed a commit to fentik/flink that referenced this pull request Feb 8, 2023
sergeitsar pushed a commit to fentik/flink that referenced this pull request Feb 8, 2023
akkinenivijay pushed a commit to krisnaru/flink that referenced this pull request Feb 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants