
Multithreaded KafkaItemReader on lines of JdbcPagingItemReader [BATCH-2855] #760

spring-issuemaster opened this issue Nov 6, 2019 · 1 comment



@spring-issuemaster spring-issuemaster commented Nov 6, 2019

Abhinav Nigam opened BATCH-2855 and commented

Hello Mike


Checking whether it is possible to have a multi-threaded item reader that reads from a Kafka topic/partition, similar to JdbcPagingItemReader.

No further details from BATCH-2855


@benas benas commented Feb 28, 2020

The KafkaItemReader is based on a KafkaConsumer which is not thread safe. Here is an excerpt from its Javadoc:

The Kafka consumer is NOT thread-safe.
All network I/O happens in the thread of the application making the call.
It is the responsibility of the user to ensure that multi-threaded access
is properly synchronized. 

Hence, the KafkaItemReader is in turn not thread-safe. If you want to use it in a multi-threaded scenario, you can decorate it with a SynchronizedItemStreamReader. In a partitioned scenario (for example, with a partitioner that creates a Spring Batch partition for each partition of a given Kafka topic), using a step-scoped reader makes it thread-safe as well.
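For a multi-threaded step, the decoration could look like the following Spring configuration sketch (the broker address, group id, topic name, and partitions are illustrative assumptions, not taken from the original issue):

```java
// Minimal sketch: wrap the non-thread-safe KafkaItemReader in a
// SynchronizedItemStreamReader so that read() calls from a multi-threaded
// step are serialized. Topic name and properties are illustrative.
@Bean
public SynchronizedItemStreamReader<String> synchronizedKafkaReader() {
    Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    consumerProperties.put(ConsumerConfig.GROUP_ID_CONFIG, "batch-group");
    consumerProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    consumerProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

    KafkaItemReader<String, String> delegate = new KafkaItemReaderBuilder<String, String>()
            .name("kafkaItemReader")
            .consumerProperties(consumerProperties)
            .topic("my-topic")   // illustrative topic name
            .partitions(0, 1)    // partitions assigned to this reader
            .build();

    SynchronizedItemStreamReader<String> reader = new SynchronizedItemStreamReader<>();
    reader.setDelegate(delegate);
    return reader;
}
```

The delegate still reads records one at a time from a single consumer; the wrapper only guarantees that concurrent read() calls are synchronized, so throughput is not improved, only correctness.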

That said, to answer the question about the "possibility to have a multithreaded item reader to read from a Kafka topic/partition", I will base my answer on the following section from the aforementioned Javadoc of the Kafka consumer:

We have intentionally avoided implementing a particular threading model for processing.
This leaves several options for implementing multi-threaded processing of records.

1. One Consumer Per Thread
A simple option is to give each thread its own consumer instance
2. Decouple Consumption and Processing
Another alternative is to have one or more consumer threads that do all data consumption
and hands off ConsumerRecords instances to a blocking queue consumed by
a pool of processor threads that actually handle the record processing.
  • To implement option 1: create multiple KafkaItemReaders (each with its own Kafka consumer), then have each thread use a different reader
  • To implement option 2: use a single KafkaItemReader and couple it with an AsyncItemProcessor/AsyncItemWriter (i.e. reading is single-threaded while processing/writing is multi-threaded)
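The hand-off at the heart of option 2 can be sketched in plain Java with a blocking queue and a processor pool (the in-memory record list stands in for a KafkaConsumer, and all names are illustrative):

```java
import java.util.List;
import java.util.concurrent.*;

// Sketch of "decouple consumption and processing": a single consumer thread
// is the only thread touching the (non-thread-safe) source; it hands records
// to a blocking queue drained concurrently by a pool of processor threads.
public class DecoupledConsumption {

    public static List<String> run(List<String> records) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new CopyOnWriteArrayList<>();
        ExecutorService processors = Executors.newFixedThreadPool(4);

        // Single "consumer" thread: in a real job this would poll a KafkaConsumer.
        Thread consumer = new Thread(() -> records.forEach(queue::add));
        consumer.start();
        consumer.join();

        // Processor pool: each worker takes a record off the queue concurrently.
        CountDownLatch latch = new CountDownLatch(records.size());
        for (int i = 0; i < records.size(); i++) {
            processors.submit(() -> {
                String record = queue.poll();
                if (record != null) {
                    processed.add(record.toUpperCase()); // stand-in for real processing
                }
                latch.countDown();
            });
        }
        latch.await();
        processors.shutdown();
        return processed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(List.of("a", "b", "c")));
    }
}
```

In Spring Batch terms, AsyncItemProcessor/AsyncItemWriter play the role of the processor pool here: reading stays on one thread while processing futures complete concurrently.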

I'm closing this issue for now, as I have explained why the KafkaItemReader cannot be made thread-safe and suggested some alternative options.
