@featzhang featzhang commented Feb 10, 2026

What is the purpose of the change

This PR (FLINK-38857) adds a sequence ID auto-increment strategy to the Triton Inference Server integration in Flink, so that sequences stay isolated across job failovers and restarts.

Brief change log

  • Add sequence-id-auto-increment configuration option in TritonOptions
  • Implement auto-increment logic in TritonInferenceModelFunction
    • Generate unique sequence IDs: {sequence-id}-{subtask-index}-{counter}
    • Initialize AtomicLong counter in open() method
    • Increment counter for each inference request
  • Add validation to ensure sequence-id is configured when auto-increment is enabled
  • Add detailed logging for debugging sequence ID generation
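The auto-increment logic listed above can be sketched roughly as follows. This is an illustration only, not the code in TritonInferenceModelFunction: the class name SequenceIdGenerator and its constructor are hypothetical, and the subtask index is passed in directly rather than read from Flink's runtime context.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for illustration; not the class added by this PR. */
final class SequenceIdGenerator {
    private final String baseSequenceId;
    private final int subtaskIndex;
    // Created once per subtask instance, mirroring the counter initialized in open().
    private final AtomicLong counter = new AtomicLong(0);

    SequenceIdGenerator(String baseSequenceId, int subtaskIndex) {
        this.baseSequenceId = Objects.requireNonNull(baseSequenceId, "sequence-id");
        this.subtaskIndex = subtaskIndex;
    }

    /** Returns {sequence-id}-{subtask-index}-{counter}, then advances the counter. */
    String next() {
        return baseSequenceId + "-" + subtaskIndex + "-" + counter.getAndIncrement();
    }
}
```

Calling next() once per inference request on two generators built with subtask indexes 0 and 1 yields IDs that never collide, because the subtask index differs even when the counters are equal.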

Use Case

This feature is particularly useful for non-reentrant models where:

  • Duplicate inference requests must be avoided after failover
  • Sequence batching requires strict isolation between parallel instances
  • Models maintain stateful context that cannot handle repeated sequence IDs

Example Configuration

CREATE MODEL my_triton_model WITH (
  'provider' = 'triton',
  'endpoint' = 'https://triton-server:8000/v2/models',
  'model-name' = 'my_stateful_model',
  'sequence-id' = 'flink-job-123',
  'sequence-id-auto-increment' = 'true',  -- Enable auto-increment
  'sequence-start' = 'true',
  'sequence-end' = 'true'
);
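The validation rule behind this configuration (auto-increment requires a configured sequence-id) could look like the sketch below. It assumes the options arrive as a plain string map; the class TritonOptionsCheck and its validate method are hypothetical names, and the real check in the PR may use Flink's configuration types instead.

```java
import java.util.Map;

/** Hypothetical validation sketch; option keys are taken from the example above. */
final class TritonOptionsCheck {
    static final String SEQUENCE_ID = "sequence-id";
    static final String AUTO_INCREMENT = "sequence-id-auto-increment";

    /** Throws if auto-increment is enabled without a non-blank sequence-id. */
    static void validate(Map<String, String> options) {
        boolean autoIncrement =
                Boolean.parseBoolean(options.getOrDefault(AUTO_INCREMENT, "false"));
        String sequenceId = options.get(SEQUENCE_ID);
        if (autoIncrement && (sequenceId == null || sequenceId.trim().isEmpty())) {
            throw new IllegalArgumentException(
                    "'" + AUTO_INCREMENT + "' requires '" + SEQUENCE_ID + "' to be configured");
        }
    }
}
```

With the options from the CREATE MODEL statement above, validate returns normally; dropping 'sequence-id' while keeping auto-increment enabled raises an IllegalArgumentException.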

Sequence ID Format

When auto-increment is enabled, sequence IDs follow this pattern:

{base-sequence-id}-{subtask-index}-{counter}
Example: flink-job-123-0-0, flink-job-123-0-1, flink-job-123-1-0

This ensures:

  • Unique sequences across parallel subtasks (via subtask-index)
  • Monotonically increasing sequences per subtask (via counter)
  • Sequence isolation across job restarts (new counter starts from 0)
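Because the subtask index and counter are always the last two dash-separated fields, a generated ID can be split from the right even when the base sequence ID itself contains dashes (as flink-job-123 does). The parser below is a hypothetical debugging aid, not part of this PR, and assumes IDs were produced in exactly the format described above.

```java
/** Hypothetical parser for IDs shaped like {base-sequence-id}-{subtask-index}-{counter}. */
final class ParsedSequenceId {
    final String base;
    final int subtaskIndex;
    final long counter;

    private ParsedSequenceId(String base, int subtaskIndex, long counter) {
        this.base = base;
        this.subtaskIndex = subtaskIndex;
        this.counter = counter;
    }

    /** Splits on the last two dashes so that dashes inside the base ID are preserved. */
    static ParsedSequenceId parse(String id) {
        int lastDash = id.lastIndexOf('-');
        int prevDash = id.lastIndexOf('-', lastDash - 1);
        if (prevDash <= 0) {
            throw new IllegalArgumentException("Not an auto-increment sequence ID: " + id);
        }
        String base = id.substring(0, prevDash);
        int subtaskIndex = Integer.parseInt(id.substring(prevDash + 1, lastDash));
        long counter = Long.parseLong(id.substring(lastDash + 1));
        return new ParsedSequenceId(base, subtaskIndex, counter);
    }
}
```

For example, parsing flink-job-123-1-0 gives base flink-job-123, subtask index 1, and counter 0.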

Verifying this change

This change has no automated test coverage. It was verified as follows:

  • Manual testing with Triton Inference Server
  • Code formatting verified with Spotless

Does this pull request potentially affect one of the following parts

  • Dependencies: no
  • The public API: yes (adds new configuration option)
  • The serializers: no
  • The runtime per-record code paths: no
  • Anything that affects deployment or recovery: JobManager, Checkpoint Coordinator, State backends, etc.: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? JavaDocs

Related Issues

Closes #FLINK-38857

featzhang added 2 commits February 10, 2026 11:09
…n inference

- Add 'sequence-id-auto-increment' configuration option
- Enable automatic sequence ID generation: {sequence-id}-{subtask-index}-{counter}
- Support for non-reentrant models requiring unique sequence IDs
- Ensure sequence isolation across Flink task restarts and failovers
- Add validation: auto-increment requires sequence-id to be configured first

This feature is useful for stateful models that cannot handle duplicate
sequence IDs after job recovery or rescaling.
flinkbot commented Feb 10, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build
