Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-7509: Reduce unnecessary and misleading “configuration supplied but not known” warning messages in Connect #5802

Closed
wants to merge 3 commits into from

Conversation

rhauch
Copy link
Contributor

@rhauch rhauch commented Oct 15, 2018

When running Connect, the logs contain quite a few warnings about:

The configuration '{}' was supplied but isn't a known config.

This occurs when Connect creates producers, consumers, and admin clients, because the AbstractConfig is logging unused configuration properties upon construction. It's complicated by the fact that the Producer, Consumer, and AdminClient all create in their constructors private instances of ProducerConfig, ConsumerConfig, and AdminClientConfig, respectively, and the unused properties are logged as warnings there. Because the AbstractConfig instances are created by the client components, Connect is not able to call the ignore(String key) method to suppress those log warnings.

Unfortunately, there are no arguments in the Producer, Consumer, or AdminClient public constructors to control whether the configs log these warning in their constructors. While that might be a possible change we want to make in the future, a simpler workaround is for Connect to be more hygienic and pass to the Producer, Consumer, and AdminClient constructors only those configuration properties that the ProducerConfig, ConsumerConfig, and AdminClientConfig configdefs know about.

This PR makes this change for Connect.

If this approach is approved, we should consider how far back we want to port this change.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@rhauch rhauch changed the title Reduce unnecessary and misleading “unused configuration” warning messages in Connect KAFKA-7509: Reduce unnecessary and misleading “unused configuration” warning messages in Connect Oct 15, 2018
…warning messages in Connect

Change how configurations are passed to Kafka producer/consumer/adminClient to eliminate the `The configuration '{}' was supplied but isn't a known config` warning log messages for producer, consumer, and admin client.
@rhauch rhauch changed the title KAFKA-7509: Reduce unnecessary and misleading “unused configuration” warning messages in Connect KAFKA-7509: Reduce unnecessary and misleading “configuration supplied but not known” warning messages in Connect Oct 15, 2018
@cotedm
Copy link
Contributor

cotedm commented Oct 16, 2018

Not sure how much of a concern it should be that we are using these *Config.configNames() methods to filter the configs. Can we be sure that all relevant configurations will make it through using these methods? I assume it requires a KIP to add/remove/modify a config name and that the review process for such a thing would require a review of the configNames() method but if something were to slip through it might lead to an unfortunate regression for Connect. Probably be good to get some feedback on that angle to double check. FWIW I would love to see this get in since debugging config issues with Connect is made difficult by the lack of filtering.

@rhauch
Copy link
Contributor Author

rhauch commented Oct 19, 2018

@hachikuji, here is the PR that attempts to pass only the producer, consumer, and admin client configs to the producer, consumer, and admin client (respectively). As @cotedm mentioned above, this may be risky if any of these components might use configurations that are not in the ConfigDef.

Do you have any thoughts about this approach?

Copy link
Contributor

@mageshn mageshn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhauch Overall, I like the approach. AFAIK, all new configs get added to one of these ConfigDef's. With that assumption, I just had one comment about logging the configs that were not retained at debug level. I think it might be useful to have that in some rare cases. Thoughts?

@rhauch
Copy link
Contributor Author

rhauch commented Oct 22, 2018

@mageshn addressed your suggestion of a debug log message.

Copy link
Contributor

@mageshn mageshn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @rhauch. LGTM.

@rhauch
Copy link
Contributor Author

rhauch commented Oct 22, 2018

@ewencp or @hachikuji, any thoughts on this approach?

@soenkeliebau
Copy link
Contributor

Should we maybe consider adding this method directly to the AbstractConfig class? This could potentially be useful from other modules as well would logically fit there.

Copy link
Contributor

@ewencp ewencp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhauch It seems like this shouldn't be necessary if ConfigDef for nested configs is used properly. It's likely somewhere we either aren't using it properly (passing along RecordingMaps from the AbstraactConfig) or we are doing logUnused() before we do nested configuration.

@rhauch
Copy link
Contributor Author

rhauch commented Oct 25, 2018

@ewencp that's a good point, and I probably have been overzealous in this fix. The worker does indeed only use select top-level properties when configuring the producer and then combines those with the prefixed / nested producer configs (see here).

I guess the biggest culprit here are the three log-based stores, which pull in everything from the worker config and the properties with the producer. or consumer. prefix and then use these for the producer, consumer, and admin clients:

So, these should probably all do something similar to what the worker does for its producer. This would work well for the producer and consumer, but it won't be enough for the admin client, which needs a subset of the producer and consumer configurations.

But perhaps I don't understand what you mean by "doing logUnused() before we do nested configuration". The AdminClient.create(Properties p) method constructs the AdminClientConfig instance with the true parameter so that the AdminClientConfig constructor will always call logUnused().

We could add a AdminClient.create(AdminClientConfig) method that allows us to construct the AdminClientConfig instance ourselves, which would allow us to create it without having to log unused properties. Normally we log the unused properties for safety reasons, but perhaps we don't need that in this particular usage of the AdminClient.

EDIT: actually, the KafkaAdminClient constructor is directly calling config.logUnused().

Let me rework this PR to more correctly handle how the stores are collecting their producer and consumer properties, add the AdminClient method as mentioned above, and remove all of the other changes. (I may create a new PR since it's pretty much a completely different approach.)

@mageshn
Copy link
Contributor

mageshn commented Oct 25, 2018

@rhauch For the internal topics, I don't think we can do something similar to what the worker does for its producer. AFAIK, the producer. and consumer. are reserved for the topics used by the Connectors. For the internal topics, you could specify the producer/consumer configs without any prefix. Making any changes to this will be breaking compatibility.

@rhauch
Copy link
Contributor Author

rhauch commented Oct 25, 2018

@mageshn, ok, good point. The worker's producer properties are passed to the source tasks, so that will not work. But I think @ewencp was right in that this isn't necessary for the producer.* and consumer.* properties used for the source and sink connectors and tasks.

And, @soenkeliebau was correct in pointing out that the current approach is losing all extensions for metric reporters and interceptors, sot that will require some changes.

@rhauch
Copy link
Contributor Author

rhauch commented Oct 25, 2018

Okay, I am now to the point where I think this "hygiene" approach is completely flawed, and that instead the problem is that we're logging as warnings all "unused" properties in the producer, consumer, and the KafkaAdminClient.

Let me back up. Based upon the discussion above, I modified my approach to attempt to retain all of the configuration properties that are known to the ProducerConfig, ConsumerConfig, and AdminClientConfig, where "known" properties include

  • all of those whose name is in the set returned by each of the config's configNames(), or
  • any property that can be passed to an interceptor, key or value (de)serializer, metrics reporter, or partitioner that is instantiated by the client.

That last bit is the problem: the properties to the clients' interceptors, serdes, metrics reporter, and partitions are all unprefixed, so it is impossible to know which properties are needed by any of the specified implementations.

IOW, the properties passed to a producer, consumer, or admin client must be able to include any property that is needed by any of these custom components. And, because the getConfiguredComponent method used by the clients passes the Map<String, Object> to the component's configure method, the AbstractConfig doesn't know whether those properties are even used by the component. So, if the AbstractConfig doesn't really even know whether it a property is really used or unused, why are the Producer, Consumer, and KafkaAdminClient even bothering to log "unused" properties?

Bottom line: I now posit that the only way to accurately eliminate these warnings is to remove the config.logUnused() call from the Producer, Consumer, and KafkaAdminClient, or to change AbstractConfig.logUnused() to log these at the INFO (or DEBUG) level.

@ewencp, @hachikuji, @cmccabe: thoughts?

@rhauch
Copy link
Contributor Author

rhauch commented Nov 1, 2018

Closing due to conclusion above. Will open a new PR with the proposed changes.

@rhauch rhauch closed this Nov 1, 2018
@rhauch
Copy link
Contributor Author

rhauch commented Nov 1, 2018

Superseded by #5867

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants