spring.cloud.stream.instanceCount and Autoscaling #1342

Closed
steve-roy opened this issue Apr 3, 2018 · 14 comments

@steve-roy

Hi,

Are these properties required for proper function?
spring.cloud.stream.instanceCount
spring.cloud.stream.instanceIndex

If so, how are these expected to be set in an autoscaling environment?

  • steve
@olegz
Contributor

olegz commented Apr 3, 2018

@steve-roy can you please be a bit more specific with regard to what you're trying to achieve?

@steve-roy
Author

For example, we will be running our Kafka Consumer / Producer in an Amazon Autoscaling environment. Let's call our Spring Boot / Cloud Stream application CP. When load is light we might be running one instance of CP.

In this case, if we understand the use of these settings correctly, they would be:
spring.cloud.stream.instanceCount=1
spring.cloud.stream.instanceIndex=0

During heavy activity, we might ask Amazon to increase the application instance count of CP based on rules we define. Let's say Amazon increases the instances to 3.

In this case, the settings would be:

CP1
spring.cloud.stream.instanceCount=3
spring.cloud.stream.instanceIndex=0

CP2
spring.cloud.stream.instanceCount=3
spring.cloud.stream.instanceIndex=1

CP3
spring.cloud.stream.instanceCount=3
spring.cloud.stream.instanceIndex=2

Because the code is a pre-built jar that is already deployed, we can't update the properties inside the jar.
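
(We realize Spring Boot supports externalized configuration, so these values would not have to live inside the jar; they could be supplied per instance at launch, for example (the jar name here is just a placeholder):

java -jar cp.jar --spring.cloud.stream.instanceCount=3 --spring.cloud.stream.instanceIndex=0

or as environment variables such as SPRING_CLOUD_STREAM_INSTANCECOUNT / SPRING_CLOUD_STREAM_INSTANCEINDEX. The open question is how the correct index would be assigned per instance when Amazon scales.)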

Questions:

  • Are these settings required?

  • What is their purpose?

  • How does the Spring Cloud Stream team envision these properties being updated dynamically in an autoscaling environment if they are needed?

Does this help?

  • steve

@steve-roy
Author

Also, it looks like this issue was raised and closed before, but the resolution is not clear.

#1151

@garyrussell
Contributor

garyrussell commented Apr 4, 2018

They are used for binder-based partitioning (when using a binder which, unlike Kafka, doesn't support partitioning natively).

With Kafka, the instanceCount is also used (along with concurrency) to check whether the topic has enough partitions (instanceCount * concurrency) and to adjust the partition count upward, if so configured (autoAddPartitions).
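
For illustration (the binding name input and the values here are just examples), the relevant settings are along these lines:

spring.cloud.stream.instanceCount=3
spring.cloud.stream.bindings.input.consumer.concurrency=2
# allow the binder to raise the topic's partition count to cover instanceCount * concurrency
spring.cloud.stream.kafka.binder.autoAddPartitions=true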

Similarly, the instance index is used for partitioned consumers; again, not necessary with Kafka as long as Kafka group management is used, so that Kafka allocates partitions to instances.

Some environments (PCF, k8s - with the SCDF deployer) allocate an instance index automatically.

@steve-roy
Author

OK - to confirm - if we are using Kafka with consumer groups, these two properties are not required:

spring.cloud.stream.instanceCount=
spring.cloud.stream.instanceIndex=

@garyrussell
Contributor

Correct. But you need to make sure you have sufficient partitions for your topics to support the scale-out you desire, because the binder won't attempt to increase them.
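
For example, partitions can be added up front with the standard Kafka CLI (topic name and count are placeholders; older broker versions use --zookeeper instead of --bootstrap-server):

kafka-topics.sh --alter --topic my-topic --partitions 12 --bootstrap-server localhost:9092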

@steve-roy
Author

understood - thank you.

Are you able to comment on the Kinesis Spring Cloud Stream binder? Are the properties required there as well?

https://github.com/spring-cloud/spring-cloud-stream-binder-aws-kinesis

@artembilan
Contributor

@steve-roy ,

No, those properties are not required for Kinesis either.
The binder has logic like this:

if (properties.getInstanceCount() > 1) {
	shardOffsets = new HashSet<>();
	KinesisConsumerDestination kinesisConsumerDestination = (KinesisConsumerDestination) destination;
	List<Shard> shards = kinesisConsumerDestination.getShards();
	for (int i = 0; i < shards.size(); i++) {
		// divide shards across instances: this instance only takes the shards
		// whose position in the list matches its instanceIndex
		if ((i % properties.getInstanceCount()) == properties.getInstanceIndex()) {
			// kinesisShardOffset is the starting offset derived from the consumer
			// properties earlier in the binder code
			KinesisShardOffset shardOffset = new KinesisShardOffset(kinesisShardOffset);
			shardOffset.setStream(destination.getName());
			shardOffset.setShard(shards.get(i).getShardId());
			shardOffsets.add(shardOffset);
		}
	}
}

So, if you specify them, each instanceIndex will get its own set of shards from the stream.
Otherwise, all instances will consume all the shards, but each record from a shard will be processed by only one instance, thanks to consumer group management via the checkpointer backed by the DynamoDbMetaDataStore.
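
For example, with a 4-shard stream and two instances configured like this (a sketch; the values are illustrative):

# instance 1
spring.cloud.stream.instanceCount=2
spring.cloud.stream.instanceIndex=0

# instance 2
spring.cloud.stream.instanceCount=2
spring.cloud.stream.instanceIndex=1

the i % instanceCount == instanceIndex check above gives instance 0 the shards at positions 0 and 2, and instance 1 the shards at positions 1 and 3.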

Does it make sense to you?

@steve-roy
Author

ok - thank you both for confirming.

@Venkat2811

Venkat2811 commented Oct 25, 2018

Hey @artembilan,

I was looking for exactly this. So, from your explanation, with consumer group management via the checkpointer, these properties do not need to be set for the Kinesis binder:

spring.cloud.stream.instanceCount=
spring.cloud.stream.instanceIndex=

However, I'm still not clear on this:
Say I have a consumer group with 2 instances, listening on a single Kinesis stream with 4 shards.
If I have this configuration on both of my instances:

spring:
  cloud:
    stream:
      bindings:
        input:
          destination: my_stream_with_4_shards
          group: my_consumer_group
          consumer:
            concurrency: 2
            partitioned: true
Will instance1 and instance2 each process 2 shards?

Also, what happens if I change concurrency to 4?

Thanks,
Venkat

@artembilan
Contributor

@Venkat2811 ,

Unfortunately, even distribution still doesn't work for the Kinesis binder: spring-projects/spring-integration-aws#99.
We don't have a rebalancing implementation there, so there is a real chance that one instance of your app will pick up all the shards for consuming.

However, if you use instanceCount and instanceIndex, you'll end up with a static distribution:

if (properties.getInstanceCount() > 1) {
	shardOffsets = new HashSet<>();
	KinesisConsumerDestination kinesisConsumerDestination = (KinesisConsumerDestination) destination;
	List<Shard> shards = kinesisConsumerDestination.getShards();
	for (int i = 0; i < shards.size(); i++) {
		// divide shards across instances
		if ((i % properties.getInstanceCount()) == properties.getInstanceIndex()) {
			KinesisShardOffset shardOffset = new KinesisShardOffset(kinesisShardOffset);
			shardOffset.setStream(destination.getName());
			shardOffset.setShard(shards.get(i).getShardId());
			shardOffsets.add(shardOffset);
		}
	}
}

Please consider asking questions on Stack Overflow in the future rather than commenting on closed issues; otherwise there is a good chance they will get lost.

@mteodori

Are these properties required for the RabbitMQ binder?

@garyrussell
Contributor

Yes; the rabbit binder does not support auto-scaling with partitioned queues.
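
Roughly, a partitioned setup with the Rabbit binder is wired explicitly like this (the binding names, key expression and counts are illustrative, and each consumer instance needs its own index):

# producer side
spring.cloud.stream.bindings.output.producer.partitionKeyExpression=headers['partitionKey']
spring.cloud.stream.bindings.output.producer.partitionCount=3

# each consumer instance
spring.cloud.stream.bindings.input.consumer.partitioned=true
spring.cloud.stream.instanceCount=3
spring.cloud.stream.instanceIndex=0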

@mteodori

Thanks. So I guess that in k8s I should move from a Deployment to a StatefulSet and compute the index and count in an init container before I can try to set up HPA.
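
Something along these lines, assuming the pod ordinal can be parsed from the StatefulSet hostname (cp-0, cp-1, ...); where the total replica count comes from is still open, so REPLICA_COUNT below is just a placeholder:

# entrypoint sketch
export SPRING_CLOUD_STREAM_INSTANCEINDEX=${HOSTNAME##*-}
export SPRING_CLOUD_STREAM_INSTANCECOUNT=${REPLICA_COUNT}
exec java -jar /app.jar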
