Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][fn] Allow unknown fields in connectors config #20116

Merged
merged 6 commits into from
Apr 26, 2023

Conversation

nicoloboschi
Copy link
Contributor

Motivation

Most of the java connectors (sinks and sources) map the configuration to a java class. Pulsar exposes a method to convert the configuration (IOConfigUtils#loadWithSecrets) using a jackson mapper. However is up to the connector to decide how to convert the configuration.
Most of the builtin java connectors do not allow unknown fields in their configuration. This is usually a good thing since it alerts the user that something is malconfigured. But this doesn't leave space for connectors' rollbacks.

The scenario which is not possible to cover is:

  • Upgrade a builtin connector which has a new config property
  • Upgrade existing connectors and use this new property
  • For some reasons, the connectors must be rollbacked. In order to let the connector work, it's required to rollback also the existing connectors' configuration removing the new field. This can be very hard to achieve in large clusters.

Modifications

  • Added a new flag in the functions worker ignoreUnknownConfigFields (default false) that strips out fields not supported in the current sink/source config. The sink/source config class is retrieved from the service definition (sinkConfigClass and sourceConfigClass)
  • In case a field is stripped out, a WARN log is printed out:
11:31:37,446 WARN [public/default/elastic-sink-0] [instance: 0] JavaInstanceRunnable - Field 'bro' not defined in the SINK configuration org.apache.pulsar.io.elasticsearch.ElasticSearchConfig, the field will be ignored
  • Added the configClass reference in the service definitions of connectors that were missing it

Verifying this change

  • [z] Make sure that the change passes the CI checks.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 17, 2023
Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

if (configClass != null) {
final Object configInstance = Reflections.createInstance(configClass,
Thread.currentThread().getContextClassLoader());
final List<String> allFields =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this is correct.

ObjectMapper should follow Java Beans conventions and use getter/setters together with public fields.
We should use some ObjectMapper utilities here in order to ensure that we are doing the right thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found a way to use the same deserializer used by jackson mapper, so this should cover all the cases when jackson mapper is used by the sink/source


for (String s : config.keySet()) {
if (!allFields.contains(s)) {
log.warn("Field '{}' not defined in the {} configuration {}, the field will be ignored",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should be logged as "ERROR", WARNINGs tend to be ignored by alert systems

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

+ "Ignoring unknown fields makes possible to keep the new configuration and "
+ "only rollback the connector."
)
private boolean ignoreUnknownConfigFields = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration applies to the instances of the functions/connectors, and not to the function worker (that already ignores unknown fields)

what about 'functionsIgnoreUnknownConfigFields' ? (maybe we can do better, but connectors are actually functions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that and I agree. But there are other properties that are meant to be used by the java runner like
maxPendingAsyncRequests, exposeAdminClientEnabled, additionalJavaRuntimeArguments so I'd prefer to stick with this convention

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

LGTM

@nicoloboschi nicoloboschi merged commit f7c0b3c into apache:master Apr 26, 2023
41 checks passed
nicoloboschi added a commit to datastax/pulsar that referenced this pull request Apr 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/function doc-not-needed Your PR changes do not impact docs ready-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants