[FLINK-24666][table-runtime] Add job level table.exec.state-stale.error-handling option and apply to related stateful stream operators #20051

Open, wants to merge 3 commits into base: master


lincoln-lil (Contributor) commented Jun 22, 2022

What is the purpose of the change

In stream processing, state records are deleted once they exceed the state TTL (if configured). When a later update for such a record arrives, the operator may not be able to handle it properly. We need a unified error handling mechanism for this situation, instead of each stateful operator handling it in its own way, as is currently the case.

Brief change log

  • add 'table.exec.state-stale.error-handling' to ExecutionConfigOptions
  • add utility class ErrorHandlingUtil for unified state stale error handling
  • apply the state stale error handling logic to related stateful streaming operators
  • update existing tests to ensure error handling is covered
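
To make the idea of a single, job-level handling strategy concrete, here is a minimal, self-contained sketch. It does not use Flink and is not the code from this PR: the `Strategy` enum, its values `ERROR` and `CONTINUE`, the `handleStateStale` helper, and the toy count operator are all hypothetical stand-ins for the option and for `ErrorHandlingUtil`, whose real signatures are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

public class StateStaleSketch {

    /** Hypothetical strategies; the real values of the option are an assumption here. */
    public enum Strategy { ERROR, CONTINUE }

    /**
     * Hypothetical stand-in for a shared helper: every operator calls this
     * one method instead of deciding on its own how to react to missing state.
     */
    public static void handleStateStale(Strategy strategy, String operator, Object key) {
        String msg = "State for key '" + key + "' in " + operator
                + " is missing, possibly expired by state TTL.";
        if (strategy == Strategy.ERROR) {
            // Strict mode: fail so that the inconsistency is not silently ignored.
            throw new IllegalStateException(msg);
        }
        // Lenient mode: report the problem and let the caller skip the record.
        System.err.println("WARN: " + msg);
    }

    /**
     * Toy stateful operator: applies a retraction to a per-key count.
     * Returns true if the retraction was applied, false if it was skipped.
     */
    public static boolean retract(Map<String, Long> state, String key, Strategy strategy) {
        Long count = state.get(key);
        if (count == null) {
            // The state entry is gone (e.g. removed after the TTL expired).
            handleStateStale(strategy, "ToyCountOperator", key);
            return false;
        }
        if (count <= 1) {
            state.remove(key);
        } else {
            state.put(key, count - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> state = new HashMap<>();
        state.put("a", 1L);
        retract(state, "a", Strategy.CONTINUE); // applied, entry removed
        retract(state, "a", Strategy.CONTINUE); // stale: warns and skips
        try {
            retract(state, "a", Strategy.ERROR); // stale: fails
        } catch (IllegalStateException e) {
            System.out.println("Failed as configured: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only the shape of the design: the decision between failing and continuing lives in one place and is driven by one configuration value, so individual operators no longer need their own ad hoc handling.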

Verifying this change

updated existing tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)

flinkbot commented Jun 22, 2022

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

lincoln-lil (Contributor, Author) commented

I found a bad case and have created another issue to follow up: FLINK-28242. Until it is resolved, the strict check in this PR may cause users' tasks to fail, so I am putting this PR on hold for a while.

3 participants