
Add test to verify sequence name of Kafka task #15397

Merged
AmatyaAvadhanula merged 3 commits into apache:master from kfaraz:add_supervisor_sequence_test
Nov 21, 2023

@kfaraz kfaraz commented Nov 20, 2023

The sequence name of a streaming task is determined by hashing the following:

  • min message time
  • max message time
  • data schema
  • tuning config
  • start partition offsets

Thus, even if a task fails and a replacement task is created to ingest the same data, the replacement is assigned the same start offsets and therefore uses the same sequence_name for segment allocation.
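
To illustrate the point above, here is a minimal sketch of a name derived only from those five inputs. It is not Druid's actual implementation: the function `sequence_name`, the JSON serialization, the choice of SHA-1, the 15-character prefix, and all sample values are assumptions made for this example.

```python
import hashlib
import json


def sequence_name(base_name, data_schema, tuning_config, start_offsets,
                  min_message_time=None, max_message_time=None):
    """Hypothetical helper: hash the five inputs into a deterministic name.

    No task ID is passed in, so it cannot influence the result.
    """
    # Serialize with sorted keys so equal inputs always produce equal bytes.
    payload = json.dumps(
        {
            "data_schema": data_schema,
            "tuning_config": tuning_config,
            "start_offsets": start_offsets,
            "min_message_time": min_message_time,
            "max_message_time": max_message_time,
        },
        sort_keys=True,
    )
    # SHA-1 and the 15-character prefix are assumptions made for this sketch.
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()[:15]
    return f"{base_name}_{digest}"


# Sample inputs (all hypothetical).
schema = {"dataSource": "events", "timestampColumn": "ts"}
tuning = {"maxRowsPerSegment": 5000000}
offsets = {"0": 100, "1": 250}

# A failed task and its replacement have different task IDs, but the ID is
# not an input to the hash, so both derive the same sequence name.
names = {
    task_id: sequence_name("index_kafka_events", schema, tuning, offsets)
    for task_id in ("task_a", "task_b")
}
assert names["task_a"] == names["task_b"]

# Changing a start offset yields a different sequence name.
changed = sequence_name("index_kafka_events", schema, tuning, {"0": 101, "1": 250})
assert changed != names["task_a"]
```

The design point is that determinism comes from what is left out of the hash: anything unique to a task attempt, such as its ID, must not be an input if a retry is to reuse the same name.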

This PR only adds a simple test to verify that the sequence name does not depend on the task ID.
More tests can be added later to verify the other fields that affect the sequence name.

@AmatyaAvadhanula AmatyaAvadhanula merged commit 4ba3cf5 into apache:master Nov 21, 2023
@kfaraz kfaraz deleted the add_supervisor_sequence_test branch November 21, 2023 05:05
yashdeep97 pushed a commit to yashdeep97/druid that referenced this pull request Dec 1, 2023
* Add test to verify sequence name of Kafka and Kinesis tasks
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024