-
Notifications
You must be signed in to change notification settings - Fork 92
fix GEARPUMP-122, refactor kafka connector API #25
Conversation
override def open(context: TaskContext, startTime: TimeStamp): Unit = { | ||
import context.{parallelism, taskId} | ||
|
||
LOG.info("KafkaSource opened at start time {}", startTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding the topic list that are consumed to the log is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already log assigned partitions below. Will that be sufficient ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all right, I missed that part.
Documentation update? |
may I update doc as follow-up ? |
props.put(KafkaConfig.CHECKPOINT_STORE_NAME_PREFIX_CONFIG, appName) | ||
val source = new KafkaSource(sourceTopic, props) | ||
val checkpointStoreFactory = new KafkaStoreFactory(props) | ||
source.checkpoint(checkpointStoreFactory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if user doesn't call checkpoint
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then kafka offsets will not be checkpointed
Current coverage is 63.75%
@@ master #25 diff @@
==========================================
Files 180 177 -3
Lines 5956 5898 -58
Methods 5678 5626 -52
Messages 0 0
Branches 278 272 -6
==========================================
- Hits 3818 3760 -58
Misses 2138 2138
Partials 0 0
|
/** | ||
* for tests only | ||
*/ | ||
void addPartitionAndStore(TopicAndPartition tp, CheckpointStore store) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
protected
?
please do not merge. I need to verify whether integration test passes |
aac082b
to
ae6671a
Compare
I've verified related integration tests pass |
a48d95a
to
9c5052a
Compare
UT failed |
+1 |
This PR mainly refactor the Kafka connector API with the following changes:
OffsetStorage
andKafkaStorage
, and usesCheckpointStore
andKafkaStore
for both offset and state checkpoint stores.checkpoint(CheckpointStoreFactory)
method toTimeReplayableSource
which makes offset store configurable by user. Ifcheckpoint
is not called, then no offset checkpoint will be performed and applications run in at-most-once mode.KafkaSource
,KafkaSink
andKafkaStore
to Java and keeps their implementations in Scala, which prevents Scala package private methods being exposed in Java programs.KafkaSource
,KafkaSink
andKafkaStore
are all configurable through Java properties which is the standard Kafka way.