Scaling In After a Scale Out With Kafka Binder Causes Uneven Work Distribution #4896
Comments
I guess if I had to re-phrase it in a better way, it would be that consumers are being assigned to partitions that will not be assigned messages. It's possible that consumers are assigned multiple partitions that will never receive messages. This happens because when you scale out, partitions are added, but when you scale in they are not removed. The producer assigns partitions based on deployer count, but the consumer isn't assigned partitions based on deployer count.
@fiidim The Kafka binder provisions ten partitions when you scale out your app to 10 instances. You cannot decrease the number of partitions once they are already added. Thus, when scaling down to 5 instances, as you correctly noted, you are still dealing with ten partitions. Your five sink instances will evenly distribute the available partitions from the Kafka topic. On the processor side, whatever value the producer's partitionCount resolves to is what the partition-selection logic uses.
I don't see how partitions >= 5 are ignored in this case unless that partitionCount is being lowered to 5. This is the standard Spring Cloud Stream behavior with partitions and is only applicable at the Spring Cloud Stream level. We will explore if SCDF has some particular logic to send to only those five partitions on the processor side.
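For context, the producer-side behavior described above comes down to two things: the producer's partitionCount (against which the partition-selection expression is evaluated) and the Kafka binder's topic provisioning. A minimal sketch of the relevant Spring Cloud Stream properties, assuming an output binding named output (the values shown are illustrative, not taken from this report):

```properties
# Producer-side partitioning: the binder picks a target partition roughly as
# hash(result of partitionKeyExpression) % partitionCount
spring.cloud.stream.bindings.output.producer.partitionKeyExpression=payload
spring.cloud.stream.bindings.output.producer.partitionCount=10

# Kafka binder topic provisioning: with autoAddPartitions enabled, the topic is
# grown to at least the larger of minPartitionCount and partitionCount, but
# existing partitions are never removed on scale-in
spring.cloud.stream.kafka.binder.autoAddPartitions=true
spring.cloud.stream.kafka.binder.minPartitionCount=1
```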
@sobychacko When the downstream instance count is lowered to 5, it behaves like the partition-selection algorithm is being applied as hash % 5 rather than hash % 10. This would explain why explicitly telling (or perhaps overriding) the value the producer uses for partitionCount works correctly: all ten partitions receive data again. I'm not intimate enough with the SCDF code base to peek at what happens when you set the sink's deployer count.
@fiidim Thanks for the feedback. We will look into the SCDF code to see if the sink count has any influence on the partition decision. cc @markpollack
Somewhere around here (Line 187 in bef804a) is where we should investigate.
As described above, one workaround is to set the producer's partitionCount explicitly. Generally speaking with Apache Kafka, it is better to leave partitioning to the middleware itself rather than using the partitioning support provided by the binder. See the important section under this part of the reference doc for more details.
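As a hedged sketch of what "leaving partitioning to the middleware" can look like with the Kafka binder: instead of the binder-level partitionKeyExpression/partitionCount pair, populate the Kafka record key and let Kafka's own partitioner spread records across whatever partitions the topic actually has. The binding name (output) and the SpEL expression are assumptions for illustration:

```properties
# Let Kafka partition natively by record key instead of using the
# binder's partitionKeyExpression/partitionCount pair
spring.cloud.stream.kafka.bindings.output.producer.messageKeyExpression=payload.id
```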
Description:
Discussion on the topic here. The high-level issue may be documentation-related, or the deployer.count property may need to do additional work.
Consider a stream that consists of source | process | sink. For scaling out, the documentation recommends stream.properties resembling:
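A minimal sketch of what such scale-out properties could look like, assuming the apps are literally named source, process, and sink and a target of 10 instances (app names and values are illustrative, not the exact properties from the original report):

```properties
# Scale out: run 10 instances of the processor and the sink,
# and partition the data flowing into them
deployer.process.count=10
deployer.sink.count=10
app.source.producer.partitionKeyExpression=payload
app.process.producer.partitionKeyExpression=payload
```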
This will result in Kafka provisioning 10 partitions. Everything is OK. If you scale in from this point as:
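Again only as an illustrative sketch, under the same naming assumptions:

```properties
# Scale in: drop back to 5 instances; the topic keeps its 10 partitions
deployer.process.count=5
deployer.sink.count=5
app.source.producer.partitionKeyExpression=payload
app.process.producer.partitionKeyExpression=payload
```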
This results in Kafka keeping the original number of partitions (since partitions aren't typically removed), and Spring Cloud Data Flow only assigning messages to <count/2> partitions via the default strategy. Sink instances are assigned across all 10 real partitions, so instances may be assigned solely to unused partitions.
For example, going from 10 instances to 5 can result in a partition assignment like the following (note that instances 3 and 5 would be completely idle, and instance 4 partially idle):
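Purely as a hypothetical illustration consistent with that description (not the reporter's actual assignment), with 10 partitions, 5 consumer instances, and messages only being produced to partitions 0-4, the assignment could look like:

```
instance 1 -> partitions 0, 1   (both receive data)
instance 2 -> partitions 2, 3   (both receive data)
instance 3 -> partitions 7, 8   (idle: no data is produced to these)
instance 4 -> partitions 4, 5   (partition 4 receives data, 5 is idle)
instance 5 -> partitions 6, 9   (idle: no data is produced to these)
```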
The only workaround that has produced correct results is to tell the processor how many physical partitions Kafka has (i.e. the old number), which allows messages to be assigned to ALL partitions while still having a lower deployed instance count. Example:
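A sketch of that workaround under the same naming assumptions (the output binding name is assumed): the key point is pinning the producer-side partitionCount to the physical partition count (10) while the instance counts stay at 5.

```properties
# Keep only 5 instances deployed...
deployer.process.count=5
deployer.sink.count=5
# ...but tell the producers that the topics really have 10 partitions,
# so data keeps landing on all of them
app.source.spring.cloud.stream.bindings.output.producer.partitionCount=10
app.process.spring.cloud.stream.bindings.output.producer.partitionCount=10
```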
Feels kind of clunky; I'm open to suggestions. If there was a way to tell the sink app to listen only on the first partitions, this would also work. Otherwise, there should be a setting for the processor application to detect the number of physical partitions, or a bit more documentation on the Scaling In part.
Release versions:
All versions