New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-3209: KIP-66: more single message transforms #2374
Conversation
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
@@ -37,7 +37,7 @@ | |||
|
|||
public abstract class InsertField<R extends ConnectRecord<R>> implements Transformation<R> { | |||
|
|||
public interface Keys { | |||
public interface ConfigName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this can be private instead? Ditto for other classes. I thought it was possibly for tests but it doesn't seem to be used anywhere anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
/** | ||
* Set the schema name, version or both on the record's key schema. | ||
*/ | ||
public static class Key<R extends ConnectRecord<R>> extends SetSchemaMetadata<R> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lack of multiple inheritance is really annoying in cases like this....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, how would multiple inheritance help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these classes use the same pattern are defining the same overrides for operatingSchema
and updatedRecord
. If you put that into a single abstract class, then the definition of the Key
and Value
types would just be one line -- the class declaration that inherits from both SetSchemaMetadata
and this new abstract class that provides the overrides.
It's obviously not a big deal here since the implementations are trivial, just kind of annoying code bloat.
public R apply(R record) { | ||
final Schema valueSchema = record.valueSchema(); | ||
if (valueSchema == null) { | ||
if (!(record.value() instanceof Map)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We end up doing this same set of checks in a bunch of places. Any way we could reduce the duplicated code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll factor out a couple of helper functions to call into requireMap
and requireStructSchema
|
||
Schema keySchema = valueToKeySchemaCache.get(valueSchema); | ||
if (keySchema == null) { | ||
final SchemaBuilder keySchemaBuilder = SchemaBuilder.struct(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably skip it for now, but a cache for this conversion might be a nice improvement eventually. I'd guess that with schemas you're probably mostly seeing the same conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a cache here though, valueToKeySchemaCache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearly I was reading this without my 👓
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
@ewencp should be ready for review.
|
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
final String topic = matcher.replaceFirst(replacement); | ||
return record.newRecord(topic, record.kafkaPartition(), record.keySchema(), record.key(), record.valueSchema(), record.value(), record.timestamp()); | ||
} | ||
return record; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it may be useful here to have an option to drop records (return null
here) that don't match the regex against the record.topic()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, not sure. I suspect I'd actually prefer an error in that case.
@ewencp addressed comments |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
@ewencp sneaked in |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shikhar This LGTM. I have one comment about whether we're inadvertently advertising internal APIs as external, but this isn't really a blocker. I'm going to commit as is and we can always follow up if needed.
|
||
import java.util.Map; | ||
|
||
public class SchemaUtil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are transforms ending up in the javadocs? Just curious if we're making stuff like this appear as public APIs when we really just want this to be an internal utility.
Renames `HoistToStruct` SMT to `HoistField`. Adds the following SMTs: `ExtractField` `MaskField` `RegexRouter` `ReplaceField` `SetSchemaMetadata` `ValueToKey` Adds HTML doc generation and updates to `connect.html`. Author: Shikhar Bhushan <shikhar@confluent.io> Reviewers: Ewen Cheslack-Postava <ewen@confluent.io> Closes #2374 from shikhar/more-smt
Renames `HoistToStruct` SMT to `HoistField`. Adds the following SMTs: `ExtractField` `MaskField` `RegexRouter` `ReplaceField` `SetSchemaMetadata` `ValueToKey` Adds HTML doc generation and updates to `connect.html`. Author: Shikhar Bhushan <shikhar@confluent.io> Reviewers: Ewen Cheslack-Postava <ewen@confluent.io> Closes apache#2374 from shikhar/more-smt
Renames
HoistToStruct
SMT toHoistField
.Adds the following SMTs:
ExtractField
MaskField
RegexRouter
ReplaceField
SetSchemaMetadata
ValueToKey
Adds HTML doc generation and updates to
connect.html
.