KAFKA-13328, KAFKA-13329 (2): Add custom preflight validation support for connector header, key, and value converters #14309

Open: wants to merge 15 commits into trunk

C0urante
Contributor

Jira 1, Jira 2

Depends on #14304

Although the Converter and HeaderConverter interfaces both include methods to provide a ConfigDef object, we don't make use of them during preflight configuration validation. This makes Kafka Connect harder to use with, among other things, programmatic UIs that perform automatic validation as soon as users enter new key/value pairs. Additionally, it causes errors in configuration to surface at runtime (after rebalances have taken place and while tasks are being instantiated), which is less convenient for all users.

This PR uses these ConfigDef objects for preflight validation if the user specifies a custom key, value, and/or header converter class in their connector config. If a key, value, or header converter returns a null ConfigDef, we simply log a warning. Although this is technically disallowed by the Converter and HeaderConverter API, we have not been enforcing that requirement up until now, so we permit that case in order to not break existing setups.
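
As a rough illustration of the behavior described above, the sketch below validates a converter's properties against the definition it reports and falls back to a logged warning when that definition is null. It has no Kafka dependency: `ConfigSpec` and `SketchConverter` are hypothetical stand-ins for `ConfigDef` and `Converter`, and this `validateConverterConfig` is not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical stand-in for ConfigDef: just a set of required property names.
final class ConfigSpec {
    private final Set<String> required;

    ConfigSpec(Set<String> required) {
        this.required = required;
    }

    // Returns one error message per required property that is missing.
    List<String> validate(Map<String, String> props) {
        List<String> errors = new ArrayList<>();
        for (String name : required) {
            if (!props.containsKey(name)) {
                errors.add("Missing required configuration: " + name);
            }
        }
        return errors;
    }
}

// Hypothetical stand-in for Converter; config() may return null in legacy plugins.
interface SketchConverter {
    ConfigSpec config();
}

final class PreflightSketch {
    private static final Logger LOG = Logger.getLogger(PreflightSketch.class.getName());

    // Validates props against the converter's definition, tolerating a null definition.
    static List<String> validateConverterConfig(SketchConverter converter, Map<String, String> props) {
        ConfigSpec spec = converter.config();
        if (spec == null) {
            // Disallowed by the API contract, but tolerated so existing setups keep working.
            LOG.warning("Converter returned a null config definition; skipping preflight validation");
            return List.of();
        }
        return spec.validate(props);
    }
}
```

With this shape, a converter that returns null produces a warning and no validation errors, while a converter with a real definition reports missing properties before any task is started.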

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@C0urante changed the title from "KAFKA-13328, KAFKA-13329: Add custom preflight validation support for connector header, key, and value converters" to "KAFKA-13328, KAFKA-13329 (2): Add custom preflight validation support for connector header, key, and value converters" on Aug 29, 2023
@C0urante force-pushed the kafka-13328-13329-configdef-validation branch from 7eea31d to 452d38b on August 31, 2023 21:29
Contributor

@gharris1727 gharris1727 left a comment

When looking through the validateConverterConfig method, I realized that I've seen the same implementation beats before: in the EnrichablePlugin class. It uses a different style of abstraction (subclasses + abstract methods instead of lambdas) but I think it solves a similar problem.

This makes the key/value/header converters follow more of the producer/consumer/admin config validation style, rather than the transformation/predicate style. There is an extra layer of indirection (the use of aliases and prefixes) but there is also some overlap.

I wonder if we can unify these approaches, and maybe even use the "enrich" pattern for producer/consumer/admin instead of the "merge" style.

return null;
}

try (Utils.UncheckedCloseable close = () -> Utils.maybeCloseQuietly(converterInstance, converterName + " " + converterClass);) {
Contributor

🤨

Is there some reason why this can't be put in a finally block? The unchecked closeable should be for exception-throwing operations that should have their exceptions suppressed, but maybeCloseQuietly should never throw an exception.

Contributor Author

It felt slightly more readable to use a try-with-resources block than a finally block (especially since we don't have any catch blocks). You're correct that Utils::maybeCloseQuietly doesn't throw any checked exceptions, but I had to provide a left-hand type (that extended the AutoCloseable interface) to prove that to the compiler, which ruled out AutoCloseable and Closeable since both of those interfaces' close methods throw checked exceptions.
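
To make the typing constraint concrete: a lambda assigned to `AutoCloseable` forces the enclosing method to handle `Exception`, whereas an interface that narrows `close()` to declare no checked exceptions does not. The sketch below declares its own `UncheckedCloseable` for illustration (assumed to have the same shape as Kafka's `Utils.UncheckedCloseable`); `TryWithResourcesSketch` and `useAndClose` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in: overrides close() without a throws clause.
@FunctionalInterface
interface UncheckedCloseable extends AutoCloseable {
    @Override
    void close();
}

final class TryWithResourcesSketch {
    // Records events so the order of "use" and "close" is observable.
    static List<String> useAndClose() {
        List<String> events = new ArrayList<>();
        // No "throws Exception" is needed on this method, because
        // UncheckedCloseable.close() declares no checked exceptions.
        // Declaring the resource as AutoCloseable instead would not compile
        // unless this method caught or declared Exception.
        try (UncheckedCloseable close = () -> events.add("closed")) {
            events.add("used");
        }
        return events;
    }
}
```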

Also, regarding this comment:

The unchecked closeable should be for exception-throwing operations that should have their exceptions suppressed, but maybeCloseQuietly should never throw an exception.

Is this correct? The interface was introduced in #8618, where it was used on the left-hand side of a try-with-resources assignment that captured a method reference which did not throw a checked exception.

Contributor

It felt slightly more readable to use a try-with-resources block than a finally block (especially since we don't have any catch blocks).

I suppose there's some subjectivity involved here, since I found the UncheckedCloseable and explicit lambda to be a bit hard to read initially, but understood after some inspection. Using try-finally without any catch clauses is a pretty normal arrangement, and I think more developers would be used to it as compared to using a lambda along with our special UncheckedCloseable.

AFAIU the try-with-resources construct was added to help with cleaning up AutoCloseable resources which can throw exceptions both during opening and closing, where it becomes tedious to set up the finally to perform the cleanup correctly. In this specific situation, the newInstance (open) errors are handled by a separate try, and the close errors are handled by closeQuietly, so none of the value-add of the try-with-resources is apparent.

I see what you mean though, as we do have exceptions from open and close, and we have somewhat tedious error handling surrounding them. But since the objects we're instantiating are "sometimes AutoCloseable", the try-with-resources type checking is going to get in the way.

Using try-with-resources to handle open and close together, you could have a wrapper class MaybeCloseable<T> implements UncheckedCloseable, Supplier<T> along with a method static <T> MaybeCloseable<T> quiet(T, String) that you would call like this:

try (MaybeCloseable<Converter> wrapper = MaybeCloseable.quiet(Utils.newInstance(...), "converter (for validation)")) {
    Converter converter = wrapper.get();
    // do stuff
} catch (RuntimeException e) {
    // exceptions from newInstance and do stuff
}
// exceptions from close are logged instead of propagated/suppressed

I think that would type-check but I haven't tried it out myself. Everything is just a rough suggestion, so please iterate on the names or ergonomics if you like the idea.

Without the closeQuietly semantics, it would look like this:

try (MaybeCloseable<Converter> wrapper = MaybeCloseable.propagate(Utils.newInstance(...))) {
    Converter converter = wrapper.get();
    // do stuff
} catch (RuntimeException e) {
    // exceptions from newInstance and do stuff
}
// exceptions from close are not checked, but propagated or suppressed.
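
For reference, here is one way the wrapper idea could be fleshed out. This is a sketch, not code from the PR: the `quiet` and `propagate` factories follow the names suggested above, but their bodies, the `System.err` logging, and the choice to implement `AutoCloseable` directly (narrowing `close()` so it declares no checked exceptions) are assumptions.

```java
import java.util.function.Supplier;

// Wraps an object that may or may not be AutoCloseable so that it can be
// managed by try-with-resources either way.
final class MaybeCloseable<T> implements AutoCloseable, Supplier<T> {
    private final T value;
    private final String name;
    private final boolean swallowCloseErrors;

    private MaybeCloseable(T value, String name, boolean swallowCloseErrors) {
        this.value = value;
        this.name = name;
        this.swallowCloseErrors = swallowCloseErrors;
    }

    // Close failures are logged under the given name and swallowed.
    static <T> MaybeCloseable<T> quiet(T value, String name) {
        return new MaybeCloseable<>(value, name, true);
    }

    // Close failures are rethrown unchecked, so try-with-resources either
    // propagates them or attaches them as suppressed exceptions.
    static <T> MaybeCloseable<T> propagate(T value) {
        return new MaybeCloseable<>(value, "resource", false);
    }

    @Override
    public T get() {
        return value;
    }

    @Override
    public void close() {
        if (!(value instanceof AutoCloseable)) {
            return; // nothing to clean up
        }
        try {
            ((AutoCloseable) value).close();
        } catch (Exception e) {
            if (!swallowCloseErrors) {
                throw e instanceof RuntimeException ? (RuntimeException) e : new RuntimeException(e);
            }
            System.err.println("Failed to close " + name + ": " + e);
        }
    }
}
```

Because `close()` declares no checked exceptions, callers can use the wrapper in try-with-resources without a catch clause, regardless of whether the wrapped object is closeable.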

Is this correct? The interface was introduced in #8618, where it was used on the left-hand side of a try-with-resources assignment that captured a method reference which did not throw a checked exception.

When I said "exception throwing operation" I didn't mean "method that throws a checked exception", because I was thinking about how methods can throw RuntimeExceptions whether or not they have checked exceptions in the signature. I probably should have said "method that throws unchecked exceptions" to be unambiguous. Yes, this PR and the linked PR both did not have checked exceptions, but they differ because one throws RuntimeExceptions and one does not.

In that PR I used try-with-resources because I wanted the propagate-or-suppress logic built into the try-with-resources for free, instead of the shadowing behavior that a normal finally clause has. If I just wanted to log exceptions in the finally operation, I think I would have used a Utils.closeQuietly call in the finally instead. Recently in #14277 I did a bare log in the finally instead of try-with-resources, as I didn't need the propagate-or-suppress logic. Maybe I should have used closeQuietly, but I didn't think of that at the time for whatever reason.
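
The difference between the two behaviors mentioned here can be shown with plain Java, independent of Kafka. In the sketch below (all names hypothetical), both methods run a body that throws and a cleanup step that also throws; try-with-resources keeps the body's exception and attaches the cleanup failure as suppressed, while a throwing `finally` block replaces the body's exception.

```java
final class SuppressVersusShadow {
    // Minimal closeable whose close() declares no checked exceptions.
    interface Resource extends AutoCloseable {
        @Override
        void close();
    }

    // Cleanup that always fails with an unchecked exception.
    static void failingCleanup() {
        throw new IllegalStateException("cleanup failed");
    }

    // try-with-resources: the body's exception propagates and the
    // cleanup failure is recorded via Throwable.addSuppressed.
    static RuntimeException withResources() {
        RuntimeException caught = null;
        try (Resource r = SuppressVersusShadow::failingCleanup) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            caught = e;
        }
        return caught;
    }

    // try-finally: the exception thrown from finally shadows the body's.
    static RuntimeException withFinally() {
        RuntimeException caught = null;
        try {
            try {
                throw new IllegalArgumentException("body failed");
            } finally {
                failingCleanup();
            }
        } catch (RuntimeException e) {
            caught = e;
        }
        return caught;
    }
}
```

Calling `withResources()` yields the `IllegalArgumentException` with the cleanup failure available from `getSuppressed()`, whereas `withFinally()` yields only the `IllegalStateException` and the body's exception is lost.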

Most of my friction with the as-implemented version is that you're using the UncheckedCloseable and all of the readability costs, without getting the benefit of the propagate-or-suppress logic.

Contributor Author

@C0urante C0urante Oct 31, 2023

Thanks for an exhaustive analysis, I now see the error of my ways 🙏

I've pushed a new commit migrating the Utils::closeQuietly call to a finally block. The MaybeCloseable idea is fascinating but I don't think the additional cognitive burden is worth the small bump in ergonomics for this change. I do think it might be worth it to apply it across the whole code base (sort of like how #13185 forces the whole code base to be aware of plugin-thrown exceptions, at least for Connector instances).
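
The resulting shape is roughly the following. This is a self-contained sketch rather than the committed code: `closeQuietly` here is a local helper written for the example (it only imitates the log-and-swallow behavior described for `Utils::closeQuietly`), and `useThenClose` and its parameters are hypothetical.

```java
import java.util.function.Function;

final class FinallyCleanupSketch {
    // Local helper for this example: closes the object if it is
    // AutoCloseable and logs any exception instead of throwing it.
    static void closeQuietly(Object maybeCloseable, String name) {
        if (maybeCloseable instanceof AutoCloseable) {
            try {
                ((AutoCloseable) maybeCloseable).close();
            } catch (Exception e) {
                System.err.println("Failed to close " + name + ": " + e);
            }
        }
    }

    // Uses the plugin, then cleans it up in finally. Exceptions from close()
    // are logged, so they cannot shadow an exception thrown by the body.
    static <P, R> R useThenClose(P plugin, String name, Function<P, R> body) {
        try {
            return body.apply(plugin);
        } finally {
            closeQuietly(plugin, name);
        }
    }
}
```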

@C0urante
Contributor Author

C0urante commented Sep 5, 2023

I wonder if we can unify these approaches, and maybe even use the "enrich" pattern for producer/consumer/admin instead of the "merge" style.

I honestly find the EnrichablePlugin class pretty hard to read, and prefer the merge style when it can be used. The extra logic involved in supporting aliases and overriding the base ConfigDef is already fairly complex, and we'd also have to expand the class to allow plugins to optionally return null ConfigDef objects.

If the stylistic suggestions are not blockers for review, but blockers for merging, do you think we could establish the ideal user-facing behavior here and then use a separate PR for a refactoring? I can target this branch with that PR (which would allow review to take place on it without having to merge this one to trunk), or target trunk (if we feel comfortable merging this without blocking on a refactor).

And of course, if the stylistic suggestions are blockers for review, let me know and I can take a stab at that without doing anything fancy 😄

@C0urante force-pushed the kafka-13328-13329-configdef-validation branch from 452d38b to 6781085 on September 5, 2023 20:27
@gharris1727
Contributor

I honestly find the EnrichablePlugin class pretty hard to read, and prefer the merge style when it can be used.

Sure, I'll buy that. I'm fine with migrating away from EnrichablePlugin to something else as long as it is a common abstraction. My concern here was just that we were adding a distinct third style of validating configurations when there appeared to be a lot of common functionality that could be shared.

If the stylistic suggestions are not blockers for review, but blockers for merging, do you think we could establish the ideal user-facing behavior here and then use a separate PR for a refactoring? I can target this branch with that PR (which would allow review to take place on it without having to merge this one to trunk), or target trunk (if we feel comfortable merging this without blocking on a refactor).

I'm fine with reviewing this as-is and merging to trunk, and then refactoring the other two strategies in a follow-up. I think using lambdas is more appropriate than anonymous classes which are constructed for just one method call and then discarded.

Function<T, ConfigDef> configDefAccessor,
String converterName,
String converterProperty,
ConverterType converterType
Contributor

I believe this is the only thing which makes this a converter-specific function. All of the other logic appears generic across all plugin types. It could be replaced with a Consumer<Map<String, String>> function which mutates the config before it is passed to the plugin, or a Function<Map<String, String>, Map<String, String>> if you wanted to be pure.

Then all of the variable names can just become plugin instead of converter.
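
The two signatures suggested here differ mainly in whether the hook edits a map in place or returns a new one. A minimal sketch follows, with hypothetical names throughout (`converter.type` is used only as an example property, and nothing here is taken from the PR):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

final class ConfigHookSketch {
    // Mutating style: the hook edits a copy that this method owns,
    // so the caller's map is left untouched.
    static Map<String, String> prepareMutating(Map<String, String> userConfig,
                                               Consumer<Map<String, String>> hook) {
        Map<String, String> copy = new HashMap<>(userConfig);
        hook.accept(copy);
        return copy;
    }

    // Pure style: the hook returns a new map and is expected not to
    // modify its input.
    static Map<String, String> preparePure(Map<String, String> userConfig,
                                           Function<Map<String, String>, Map<String, String>> hook) {
        return hook.apply(userConfig);
    }
}
```

The methods are given distinct names rather than overloaded because a lambda such as `c -> c.put(k, v)` is compatible with both `Consumer` and `Function`, which would make an overloaded call ambiguous.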

Contributor Author

I like leaving room for a potential future refactor that uses this method for more than just converters, but there is one more part here that's converter-specific: we permit converters to return null ConfigDef objects from their config methods. We don't allow, e.g., SMTs or predicates to do the same.

To save time on potential future refactoring, I've renamed every parameter and variable in this method to use "plugin" instead of "converter", but I've kept the method name itself (validateConverterConfig) the same, and have retained some converter-specific language in log messages. I've also used a Map<String, String> defaultProperties instead of the functional programming approach in order to be a little more concise with existing caller code; hope this isn't too controversial. Let me know if this strikes the right balance!
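
One plausible reading of the `defaultProperties` approach is a simple overlay, sketched below. The thread does not say whether defaults or user-supplied values take precedence, so the ordering here (explicit user values win) is an assumption, as are the class and method names. The prefix handling follows the usual Connect convention of nesting converter properties under the converter's own property name (for example `key.converter.schemas.enable`).

```java
import java.util.HashMap;
import java.util.Map;

final class DefaultPropertiesSketch {
    // Builds the config handed to a plugin: start from the defaults, then
    // overlay every connector property that starts with "<pluginProperty>."
    // (with that prefix stripped), so explicit user values win.
    static Map<String, String> pluginConfig(Map<String, String> connectorConfig,
                                            String pluginProperty,
                                            Map<String, String> defaultProperties) {
        String prefix = pluginProperty + ".";
        Map<String, String> result = new HashMap<>(defaultProperties);
        for (Map.Entry<String, String> entry : connectorConfig.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                result.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }
}
```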

I've also added a Javadoc to this method with examples for most parameters, since the parameter names only helped so much and even I was having trouble recalling how this method worked after a few months without reading it.

@C0urante force-pushed the kafka-13328-13329-configdef-validation branch from 6781085 to 756729e on October 31, 2023 19:16
@C0urante force-pushed the kafka-13328-13329-configdef-validation branch from 756729e to 5b33be2 on October 31, 2023 19:16
@C0urante
Contributor Author

@gharris1727 similar to #14304 - apologies for the delay, I've addressed all outstanding concerns, ready for another round when you have time 👍

This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or appropriate release branch).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions bot added the stale (Stale PRs) label on Jan 30, 2024
@gharris1727
Contributor

Hi @C0urante could you resolve the merge conflicts now that #14304 is merged?

@github-actions bot removed the stale (Stale PRs) label on Feb 6, 2024
@C0urante
Contributor Author

Hi @gharris1727 sorry for the delay. I've resolved the merge conflicts; let me know what you think if you have a moment.

@C0urante
Contributor Author

@gharris1727 I've resolved the merge conflicts again; can you please take a look when you get a chance?
