
TaskCount == partitions if you don't specify it in your spec #13339

Closed

churromorales wants to merge 3 commits into apache:master from churromorales:taskCounts

Conversation

@churromorales
Contributor

If you don't specify taskCount in your supervisor spec, it now defaults to the number of partitions you have in Kinesis/Kafka. This is useful for customers in a growth phase, who start with some number of partitions but then have to grow by some amount. Previously you would have to update the supervisor spec and add MiddleManagers (if needed). With the MM-less patch in Druid, if a customer increases the number of partitions, all you have to do is restart the supervisor and it will pick up the new partitions. Even if they don't grow, it is best to have one task per partition as the default (unless others oppose this).

Future work: this patch forces a manual restart of the supervisor, but no spec changes are needed, nor any knowledge of how many partitions your message bus has. In the future we could have a thread that detects a change and automatically restarts the supervisor, but that might be overkill.
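A minimal sketch of the defaulting behavior described above (the class and method names here are hypothetical, not the actual Druid code):

```java
public class TaskCountDefault
{
  // Hypothetical helper: if the supervisor spec omits taskCount,
  // fall back to one task per stream partition.
  static int resolveTaskCount(Integer configuredTaskCount, int numPartitions)
  {
    return configuredTaskCount != null ? configuredTaskCount : numPartitions;
  }

  public static void main(String[] args)
  {
    // Spec omits taskCount: default to the partition count.
    System.out.println(resolveTaskCount(null, 8));
    // Spec sets taskCount explicitly: the configured value wins.
    System.out.println(resolveTaskCount(4, 8));
  }
}
```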

@churromorales
Contributor Author

@kfaraz does this look good to you? Thank you

@kfaraz
Contributor

kfaraz commented Nov 16, 2022

@churromorales , it makes sense for the MM-less setup. But for a cluster running on middle managers, if we set the number of tasks equal to the number of partitions, that means each task would be mapped to a single partition. If we don't have enough workers to launch that many tasks, some partitions will never be read from.

So maybe we can use the new default logic only for the MM-less setup for now?

@churromorales
Contributor Author

This only applies if you leave the taskCount parameter out of your spec. You can always specify taskCount=xx in the spec, and it will launch only that many tasks.

@churromorales
Contributor Author

churromorales commented Nov 17, 2022

One other thing, I have run into issues like this when having tasks handle multiple partitions. #8139

I personally think the default should do the most sensible thing instead of defaulting to 1 just to satisfy a small set of users. 1 is the safest value, but if you have more partitions than tasks, you are not necessarily safe either. I haven't checked whether this still happens in the latest Druid version; we stopped having a task handle more than one partition, so I never looked into this issue again after it bit us enough times.

Also, this is part of the supervisor, so I believe the change set is agnostic to the task runner implementation. I don't see an easy way to make this work only for MM-less without a hack.

Anyways, let me know your thoughts. I do agree that in an MM world, having tasks == partitions may not be an ideal default. But in most deployments, a default of 1 doesn't make much sense either. I guess there is no "right answer" here :(.

Thanks

@kfaraz
Contributor

kfaraz commented Nov 18, 2022

Yeah, #8139 needs some investigation. We should be able to support multiple partitions per task.

As you said, 1 is the safest option, but I see your point.
We could set it to the partition count, given we have enough worker slots.
So, the default could be something like Math.min(numPartitions, totalWorkerCapacity/2).
See OverlordResource.getTotalWorkerCapacity() for reference.
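A sketch of such a capacity-aware default (names are hypothetical; the intent is to take the partition count but cap it at half the total worker capacity, never dropping below one task):

```java
public class CapacityAwareDefault
{
  // Hypothetical sketch: cap the default task count at half the
  // cluster's worker capacity so stream ingestion cannot consume
  // every task slot, but never go below one task.
  static int defaultTaskCount(int numPartitions, int totalWorkerCapacity)
  {
    return Math.max(1, Math.min(numPartitions, totalWorkerCapacity / 2));
  }

  public static void main(String[] args)
  {
    System.out.println(defaultTaskCount(8, 20)); // plenty of slots: 8
    System.out.println(defaultTaskCount(8, 6));  // slot-constrained: 3
    System.out.println(defaultTaskCount(8, 1));  // floor of 1
  }
}
```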

Future work: this patch forces a manual restart of the supervisor, but no spec changes are needed, nor any knowledge of how many partitions your message bus has. In the future we could have a thread that detects a change and automatically restarts the supervisor, but that might be overkill.

For this, we already have a LagBasedAutoscaler, we could probably add a LagAndPartitionBasedAutoscaler? I have not looked into it but I think it should be doable.

@churromorales
Contributor Author

Fair enough. Can we find a compromise here?

How about this?

  • If you specify taskCount, that value is respected.
  • If you do not specify taskCount, it defaults to 1.
  • If you specify a taskCount of -1, the number of tasks is aligned to the number of partitions.

I think that way, nothing changes for existing users. But users that want to default to the number of partitions in their queue, and not really worry about things, can pass -1 for taskCount.
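A rough sketch of this three-way behavior (the method name is hypothetical):

```java
public class SentinelTaskCount
{
  // Hypothetical sketch of the proposed compromise:
  //   null -> keep today's default of 1
  //   -1   -> one task per partition
  //   else -> use the configured value
  static int resolveTaskCount(Integer configured, int numPartitions)
  {
    if (configured == null) {
      return 1;
    }
    if (configured == -1) {
      return numPartitions;
    }
    return configured;
  }

  public static void main(String[] args)
  {
    System.out.println(resolveTaskCount(null, 8)); // 1
    System.out.println(resolveTaskCount(-1, 8));   // 8
    System.out.println(resolveTaskCount(3, 8));    // 3
  }
}
```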

Are you okay with this?

@kfaraz
Contributor

kfaraz commented Nov 22, 2022

If you specify a taskCount of -1, the number of tasks is aligned to the number of partitions.

Thanks for your prompt response, @churromorales. I don't really like the idea of attaching a special meaning to a specific value of a config. In particular, passing a negative value to a config that is clearly supposed to be positive is not great if we ever want to validate the parameters.

That said, we could probably have this behaviour on passing task count as 0 (instead of -1). There are some other configs which do that, such as maxSegmentsInNodeLoadingQueue, where 0 means an unlimited load queue. So, here too 0 would mean maximum possible, which is nothing but the number of partitions.

The idea of defaulting task count to number of partitions is a good one. We just need to implement it in the right way. Does the worker capacity based solution not seem viable to you? Even with passing the 0 value above, it would be safer if we just checked the worker capacity before launching a large number of tasks.

@abhishekagarwal87
Contributor

I agree with Kashif about overloading this configuration. I would rather stick with the original proposal made in this PR. Any user should be setting up a reasonable value for the number of tasks in a production environment. And those specs don't get affected by this change.

I have marked it as "Design Review" since it's a change in config behavior.

@churromorales
Contributor Author

Any update on this? I personally think that in a production environment it would not be prudent to run with a default of 1 task. I think having taskCount == partitions would be a more sensible default. But if you have customers that rely on the default of 1, I can also understand that.

@kfaraz
Contributor

kfaraz commented Jan 14, 2023

@churromorales , as @abhishekagarwal87 mentions, in a production environment, we should always be prudent while using the default value of any config, especially one that dictates the usage of resources such as task slots. So I agree that in prod, no one should be using a task count of 1. I have not seen anyone do it in my experience either.

The only concern I have with using a default value of taskCount = numPartitions is that we just might not have enough task slots. But I agree that Druid should use a better and more dynamic default out of the box.

How about we do this:

  • If taskCount is specified, we use that
  • If not specified, we use taskCount = Math.min(numPartitions, totalWorkerCapacity/2).

This way, we ensure that ingestion always runs successfully and has a better default value than 1. I agree that this might lead to multiple partitions being mapped to a single task, but only in the case where your setup doesn't have enough task slots. In a prod environment, we would ideally have enough task slots.
(To make this better, you may even choose a taskCount such that numPartitions is a multiple of it, so that each task gets the same number of partitions.)
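A sketch combining both ideas, including the even-split refinement (names are hypothetical; totalWorkerCapacity is assumed to come from something like OverlordResource.getTotalWorkerCapacity()):

```java
public class PartitionAwareDefault
{
  // Hypothetical sketch of the proposed default: cap the task count
  // at half the worker capacity, then prefer the largest count that
  // divides numPartitions evenly so each task reads the same number
  // of partitions.
  static int defaultTaskCount(int numPartitions, int totalWorkerCapacity)
  {
    int cap = Math.max(1, Math.min(numPartitions, totalWorkerCapacity / 2));
    for (int candidate = cap; candidate >= 1; candidate--) {
      if (numPartitions % candidate == 0) {
        return candidate;
      }
    }
    return 1;
  }

  public static void main(String[] args)
  {
    System.out.println(defaultTaskCount(8, 20)); // 8: enough capacity for 1 partition per task
    System.out.println(defaultTaskCount(8, 12)); // 4: cap is 6, but 4 divides 8 evenly
    System.out.println(defaultTaskCount(8, 2));  // 1: only one slot to spare
  }
}
```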

Let me know what you think.

@github-actions

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 12, 2024
@github-actions

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Feb 10, 2024