@rkhachatryan rkhachatryan commented Aug 9, 2022

Different changelog implementations may have different options, so the
configuration is passed as a (serialized) string map.

To extract only the relevant key-value pairs (rather than pass the whole
configuration), the changelog factory is used on the JM.

On the TM, the local configuration is combined with the one from the job.

In the common case, the configuration travels along this path:
env -> streamGraphGenerator -> streamGraph -> jobGraph -> TM
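
To make the extraction step concrete, here is a minimal sketch, not the code from this PR. It assumes plain `Map<String, String>` objects in place of Flink's `Configuration`, and it assumes a factory can recognize its own options by a key prefix; the class name, the method name, and the `dstl.` prefix are used for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: plain maps stand in for Flink's Configuration. */
final class ChangelogConfigExtraction {

    /** Returns only the entries whose key starts with the given prefix. */
    static Map<String, String> extractRelevant(Map<String, String> full, String prefix) {
        Map<String, String> relevant = new TreeMap<>();
        for (Map.Entry<String, String> entry : full.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                relevant.put(entry.getKey(), entry.getValue());
            }
        }
        return relevant;
    }

    public static void main(String[] args) {
        // The full configuration as seen on the JM side (assumed sample values).
        Map<String, String> full = new TreeMap<>();
        full.put("dstl.dfs.base-path", "file:///tmp/changelog");
        full.put("parallelism.default", "4");

        // Only the changelog-related entry is forwarded to the TM.
        System.out.println(extractRelevant(full, "dstl."));
    }
}
```

Filtering on the JM keeps unrelated options out of what is shipped to the TM, which is the motivation given in the description above.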

@rkhachatryan rkhachatryan changed the title [WIP][FLINK-26372][runtime][state] Allow to configure Changelog Storage pe… [WIP][FLINK-26372][runtime][state] Allow to configure Changelog Storage per program Aug 9, 2022

Comment on lines +646 to +649
if (changelogStateBackendEnabled.getOrDefault(false)) {
    StateChangelogOptionsInternal.putConfiguration(
            jobConfiguration, changelogConfiguration);
}

Here, the changelog configuration is passed from the JM to the TM inside jobGraph.jobConfiguration. On the TM, it is then merged with the TM's "local" configuration.

It's not strongly typed (something like ChangelogConfig) because different implementations might have different parameters. And it's not a serialized factory because passing a string map seems safer and easier.

It's added to jobConfiguration as a serialized value under a single key (rather than merged) because the two are semantically different: jobConfiguration is internal Flink config, while the changelog configuration contains user-facing keys, so the keys might clash.

@zentol WDYT about this approach?
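
The single-key idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: plain maps stand in for Flink's `Configuration`, the key name `CHANGELOG_KEY` is hypothetical, and URL-encoding is just one possible way to serialize a string map into a single value. The helper names only mirror `putConfiguration` from the diff above.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: stores a string map under one key of another map. */
final class SingleKeyConfigSketch {

    // Hypothetical key name, chosen for illustration only.
    static final String CHANGELOG_KEY = "internal.changelog.configuration";

    /** Encodes the map as k1=v1&k2=v2 with URL-encoded keys and values. */
    static String serialize(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Reverses serialize(); an empty or missing value yields an empty map. */
    static Map<String, String> deserialize(String encoded) {
        Map<String, String> map = new TreeMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return map;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            map.put(
                    URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8),
                    URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8));
        }
        return map;
    }

    /** JM side: attach the changelog options without touching other job keys. */
    static void putConfiguration(Map<String, String> jobConfig, Map<String, String> changelog) {
        jobConfig.put(CHANGELOG_KEY, serialize(changelog));
    }

    /** TM side: read the changelog options back out of the job configuration. */
    static Map<String, String> getConfiguration(Map<String, String> jobConfig) {
        return deserialize(jobConfig.get(CHANGELOG_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> jobConfig = new TreeMap<>();
        jobConfig.put("shared.key", "internal-value");

        Map<String, String> changelog = new TreeMap<>();
        changelog.put("shared.key", "user-value"); // same name, different meaning

        putConfiguration(jobConfig, changelog);
        System.out.println(jobConfig.get("shared.key"));
        System.out.println(getConfiguration(jobConfig).get("shared.key"));
    }
}
```

Because the changelog options live under their own key, an internal job key and a user-facing changelog key with the same name do not overwrite each other, which a plain merge could not guarantee.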


@masteryhx masteryhx left a comment

Thanks for the draft!
The passing path makes sense to me.
I have a comment about the override logic, PTAL.
BTW, could we validate the override logic with a unit test?

StateChangelogStorageView<?> createStorageView(Configuration configuration) throws IOException;

/** Extract the relevant to this factory portion of the configuration. */
default Configuration extractConfiguration(ReadableConfig src) {

IIUC, even if users haven't set these parameters at the job level, the current logic will still override them?
Should we override them only when users have set them?
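
To illustrate the concern, here is a small sketch under assumptions. It does not reproduce the PR's actual logic: plain maps stand in for Flink's `Configuration`, and the option names and defaults are hypothetical. It contrasts an extraction that materializes defaults for every known option with one that keeps only the keys the user actually set, and shows how each interacts with a merge on the TM.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: compares two ways of extracting job-level options. */
final class OverrideSketch {

    /** TM values are the base; every key present in jobLevel overrides them. */
    static Map<String, String> merge(Map<String, String> tmLocal, Map<String, String> jobLevel) {
        Map<String, String> merged = new HashMap<>(tmLocal);
        merged.putAll(jobLevel);
        return merged;
    }

    /** Every known option ends up in the result, set by the user or not. */
    static Map<String, String> extractWithDefaults(
            Map<String, String> userSet, Map<String, String> defaults) {
        Map<String, String> out = new HashMap<>(defaults);
        out.putAll(userSet);
        return out;
    }

    /** Only known options that the user set are kept. */
    static Map<String, String> extractExplicitOnly(
            Map<String, String> userSet, Map<String, String> defaults) {
        Map<String, String> out = new HashMap<>();
        for (String key : defaults.keySet()) {
            if (userSet.containsKey(key)) {
                out.put(key, userSet.get(key));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical option names and defaults.
        Map<String, String> defaults = Map.of("changelog.opt.a", "1", "changelog.opt.b", "x");
        Map<String, String> tmLocal = Map.of("changelog.opt.a", "5");
        Map<String, String> userSet = Map.of("changelog.opt.b", "y");

        // With defaults materialized, the TM's value for opt.a is replaced by the default.
        System.out.println(merge(tmLocal, extractWithDefaults(userSet, defaults)));
        // With explicit-only extraction, the TM's value for opt.a survives.
        System.out.println(merge(tmLocal, extractExplicitOnly(userSet, defaults)));
    }
}
```

A unit test along these lines (TM sets one option, the job sets another, neither should clobber the other) would be one way to check the override behavior.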

rkhachatryan commented Jan 25, 2023

Superseded by #21637.
