Flink: Allow setting slot sharing group for fine-grained resource management in DynamicSink #16065

Merged
pvary merged 7 commits into apache:main from sqd:oss_slot_sharing_group
May 19, 2026

Conversation

@sqd
Contributor

@sqd sqd commented Apr 20, 2026

Currently, all operators created by the dynamic sink belong to the default slot sharing group and thus get an equal share of the resources on the TaskManagers. However, the sink and generator operators are usually far more resource-heavy than the rest of the operators, which makes the default resource allocation inefficient.

Flink already supports a fine-grained resource management mechanism for use cases exactly like this. This change wires the dynamic sink into that system by allowing users to set slot sharing groups for (1) the shuffle writer and (2) the generator plus the forward writer, which need to share the same slot sharing group to enable operator chaining.

@sqd
Contributor Author

sqd commented Apr 20, 2026

@mxm @pvary Would appreciate if you could take a look

@sqd
Contributor Author

sqd commented Apr 20, 2026

We have been running this internally for a while now. This allows Flink pipelines using the Iceberg dynamic sink to very flexibly slice the taskmanager resources like this:
[image]

@pvary
Contributor

pvary commented Apr 21, 2026

CC: @mxm, @Guosmilesmile

Contributor

@mxm mxm left a comment

How important is it for the slot sharing groups to be set explicitly? Could we add an option like disableSlotSharing() to put the components into different slots?

@sqd
Contributor Author

sqd commented Apr 21, 2026

> How important is it for the slot sharing groups to be set explicitly? Could we add an option like disableSlotSharing() to put the components into different slots?

The goal is to allow tailoring the resources allocated to each operator using Flink's fine-grained resource management, so the user needs to pass in an SSG like this:

sinkBuilder
  .generatorSlotSharingGroup(
    SlotSharingGroup.newBuilder("generator-ssg")
      .setCpuCores(1)
      .setTaskHeapMemoryMB(512)
      .build())
  .otherSinkBuilderOptions(...)
...

@sqd
Contributor Author

sqd commented Apr 21, 2026

@mxm I also think it's a bit confusing that Flink uses "slot sharing group" which seems to imply some sort of resource isolation mechanism to manage resources, but here we are. :-)

@mxm
Contributor

mxm commented Apr 23, 2026

I was curious because the common use case is to just disable slot sharing for certain operators. In that case, we don't need to pass an explicit SlotSharingGroup, but we can generate one behind the scenes.

@sqd
Contributor Author

sqd commented Apr 23, 2026

@mxm How would the user specify resource specs for those operators in that case?

@mxm
Contributor

mxm commented Apr 23, 2026

You wouldn't be able to do that. I'm assuming uniform resources across the TaskManagers.

@sqd
Contributor Author

sqd commented Apr 23, 2026

@mxm I see your point. The current topology allows neither the generator/forward-writer nor the shuffle-writer to slot share with other tasks anyway. But I can definitely see the value in explicitly specifying that the tasks be split up. I'll add that disableSlotSharing() API to the builder

@mxm
Contributor

mxm commented Apr 24, 2026

Thanks! Now the question is, whether the option to disable slot sharing would be sufficient. Do you need explicit control over the slot sharing groups?

@sqd
Contributor Author

sqd commented Apr 24, 2026

@mxm Yes, I do need explicit control over the slot sharing groups. My use case is to assign a different resource spec to each of (1) the generator, (2) the sink, and (3) the other operators in my pipeline.

@sqd
Contributor Author

sqd commented May 7, 2026

Hi @mxm ! Just wondering if you got the chance to take another pass at this.

Contributor

@mxm mxm left a comment

LGTM. Thanks @sqd!

Comment on lines +464 to +467
if (generatorSlotSharingGroup != null) {
  forwardWriteResults.slotSharingGroup(generatorSlotSharingGroup);
}

Contributor

Instead of this, could we default to StreamGraphGenerator.DEFAULT_SLOT_SHARING_GROUP?
The code would be a bit nicer.

Contributor Author

Unfortunately no, that is not equivalent. If unset, Flink tries to inherit the slot sharing group from the input operators; if set to DEFAULT_SLOT_SHARING_GROUP, that behavior is bypassed.

https://github.com/apache/flink/blob/release-2.1/flink-runtime/src/main/java/org/apache/flink/streaming/api/graph/StreamGraphGenerator.java#L651-L666
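
The difference between "unset" and "explicitly default" can be illustrated with a small standalone model. This is only a sketch of the rule as described in the comment above, not Flink code: the class name `SlotSharingGroupResolution` and method `resolve` are hypothetical, the constant value `"default"` is assumed to match `DEFAULT_SLOT_SHARING_GROUP`, and the fallback to the default group when inputs disagree reflects my reading of the linked method.

```java
import java.util.List;

// Simplified standalone model of the resolution rule; NOT Flink code.
final class SlotSharingGroupResolution {
  // Assumed to mirror StreamGraphGenerator.DEFAULT_SLOT_SHARING_GROUP.
  static final String DEFAULT_GROUP = "default";

  // specifiedGroup: the group set on the operator, or null if unset.
  // inputGroups: the already-resolved groups of the operator's inputs.
  static String resolve(String specifiedGroup, List<String> inputGroups) {
    if (specifiedGroup != null) {
      // An explicit setting always wins, even if it is the default group.
      return specifiedGroup;
    }
    String inherited = null;
    for (String candidate : inputGroups) {
      if (inherited == null) {
        inherited = candidate;
      } else if (!inherited.equals(candidate)) {
        // Inputs disagree, so fall back to the default group.
        return DEFAULT_GROUP;
      }
    }
    // No inputs at all also ends up in the default group.
    return inherited == null ? DEFAULT_GROUP : inherited;
  }

  public static void main(String[] args) {
    List<String> inputs = List.of("generator-ssg");
    // Unset: the group is inherited from the input.
    System.out.println(resolve(null, inputs));
    // Explicitly set to the default: inheritance is bypassed.
    System.out.println(resolve(DEFAULT_GROUP, inputs));
  }
}
```

Under this model, leaving the group unset on the forward writer keeps it in whatever group its input uses, while forcing the default group would move it out of that group, which is why the null check in the reviewed snippet is not redundant.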

Contributor

Thanks for the info!
This means that we should behave similarly with TableMaintenance.slotSharingGroup.
This is for another PR though.

CC: @Guosmilesmile

Contributor

Thanks for pinging me on this. I'll open a PR to update the behavior in TableMaintenance so that it aligns with the standard Flink behavior.

Comment thread docs/docs/flink-writes.md Outdated
Comment on lines +553 to +554
| `shuffeSinkSlotSharingGroup(SlotSharingGroup ssg)` | Set the [slot sharing group](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/finegrained_resource/) for the shuffle sink. |
| `generatorSlotSharingGroup(SlotSharingGroup ssg)` | Set the [slot sharing group](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/finegrained_resource/) for the generator (and forward sink chained to it). |
Contributor

Would it make sense to add these configs to FlinkDynamicSinkConf and FlinkDynamicSinkOptions too?

Contributor Author

I think it's probably going to be too messy, because we'll have to build a config translation layer to set CPU, heap, off-heap, and managed memory, and even external resources if we want to support everything Flink offers.

https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/finegrained_resource/#:~:text=you%20can%20set%20the%20following%20resource%20components%20for%20the%20slot%20sharing%20group

Contributor

Should we set only the name and rely on env.registerSlotSharingGroup(ssgWithResource) for registering the resource?

Contributor Author

@sqd sqd May 12, 2026

That is cleaner but I am still a bit hesitant. If the ssg names aren't registered, these configs silently take no effect. When we do correctly register the SSGs, this will split the config into two places (sounds like a hidden footgun). I am slightly negative, but if you still think it's valuable I'll make the change.
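
The concern about unregistered names can be sketched with a toy model of name-based lookup. Everything here is hypothetical and only illustrates the failure mode described above: `SlotSharingGroupRegistryModel`, `ResourceSpec`, `register`, and `resourcesFor` are made-up names, and the model assumes that a name with no registered spec simply resolves to "no resources specified" rather than raising an error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of name-based slot sharing group lookup; NOT Flink code.
final class SlotSharingGroupRegistryModel {

  // Hypothetical stand-in for a resource spec (CPU cores and task heap in MB).
  static final class ResourceSpec {
    final double cpuCores;
    final int taskHeapMemoryMb;

    ResourceSpec(double cpuCores, int taskHeapMemoryMb) {
      this.cpuCores = cpuCores;
      this.taskHeapMemoryMb = taskHeapMemoryMb;
    }
  }

  private final Map<String, ResourceSpec> registered = new HashMap<>();

  // Models registering a named group together with its resources.
  void register(String groupName, ResourceSpec spec) {
    registered.put(groupName, spec);
  }

  // Models resolving a group name to resources; empty means "unspecified".
  Optional<ResourceSpec> resourcesFor(String groupName) {
    return Optional.ofNullable(registered.get(groupName));
  }

  public static void main(String[] args) {
    SlotSharingGroupRegistryModel env = new SlotSharingGroupRegistryModel();
    env.register("generator-ssg", new ResourceSpec(1.0, 512));

    // Registered name: the spec is found.
    System.out.println(env.resourcesFor("generator-ssg").isPresent());
    // Misspelled or unregistered name: no error, just no resources.
    System.out.println(env.resourcesFor("generator-sgg").isPresent());
  }
}
```

In this model the sink configuration only carries the name, while the resources live wherever `register` is called, so a typo in either place goes unnoticed until someone inspects the resulting allocation.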

Contributor

I would like to hear what @mxm and @Guosmilesmile think about this.
The TableMaintenance has a precedent to use the String as a config which makes it easier to use in SQL too (if the SSG is already defined in the env), but I fully understand your point too.

Contributor

Here's my immature take - if we go with the SlotSharingGroup class approach, we'd need to split it into a bunch of extra configs for SQL/config integration, while a simple string would be much easier to plug into SQL. For users who need custom slot group resources, they can just define them upfront in the env, which keeps the config simple. If more users ask for it later, we can always add slot group resource configs then. WDYT?

Contributor Author

Hi @Guosmilesmile thanks for looking. Just to confirm I understand your proposal, you are agreeing with what pvary suggests; and then in the future, we can add the full-fledged SSG resource config (like CPU/memory) if there's user interest. Am I understanding you correctly?

Contributor

Configuring SlotSharingGroups is sort of a special case. Most users, especially SQL users, will never do this. I would support going with Java-only config. If it is requested for SQL in the future, we can still come up with a way to serialize the SlotSharingGroup into a string.

Contributor

I don't really like special cases. In the long run it is usually better to handle things consistently.

I would still prefer to have a string setting for the SSG, and we should also add it to FlinkDynamicSinkConf.

Contributor Author

Changed to using a string setting instead.

@pvary pvary merged commit 6d8ebbb into apache:main May 19, 2026
20 checks passed
@pvary pvary changed the title from "Flink: Allow setting slot sharing group for fine-grained resource management" to "Flink: Allow setting slot sharing group for fine-grained resource management in DynamicSink" May 19, 2026
@pvary
Contributor

pvary commented May 19, 2026

Merged to main.
Thanks @sqd for the PR and @mxm and @Guosmilesmile for the reviews!

sqd pushed a commit to sqd/iceberg that referenced this pull request May 19, 2026
sqd pushed a commit to sqd/iceberg that referenced this pull request May 19, 2026
kevinjqliu pushed a commit that referenced this pull request May 20, 2026
Co-authored-by: Han You <han.you@imc.com>

4 participants