Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-3515: migrate json serde from connect to common #1191

Closed
wants to merge 1 commit into from

Conversation

guozhangwang
Copy link
Contributor

No description provided.

@guozhangwang
Copy link
Contributor Author

Ping @ewencp for reviews.

@@ -562,6 +562,7 @@ project(':clients') {
compile libs.lz4
compile libs.snappy
compile libs.slf4jApi
compile libs.jacksonDatabind
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot this dependency change would happen. You almost definitely should ping the mailing list with this -- dependency changes on the clients lib are pretty significant, especially a widely used one like Jackson that can easily conflict (although I think they are very good about compatibility).

Another option would be to use a separate jar -- still move out of connect, but put it in something like a kafka-serde-json jar. Streams code could easily depend on that without depending on all of connect, but it wouldn't require everyone using Kafka clients to pull in Jackson. (In general I think Kafka should be broken into finer-grained jars).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should eventually break the kafka-clients jar, but I'm just not sure whether we should create a separate kafka-serde-json jar, or if we should just have a kafka-common.jar or finner grained kafka-common-serialization.jar. Since this discussion requires further discussions and go way beyond this JIRA, I would prefer to go with the first option here. cc @ewencp @ijuma

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't necessarily suggesting a specific grouping yet, just identifying the problem and a solution. I think we'd need to propose a way to break them down to the mailing list. Anything we add to kafka-clients is basically stuck there permanently (or moved to a new dependency of kafka-clients).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, so for this JIRA shall I create a KIP for dragging the dependency into kafka-clients?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, no, I feel like we're not on the same page. I don't think we want to add the dependency to clients. That would mean we'd eventually have to move it to a separate jar, which would make it source incompatible between Kafka releases. I was suggesting we should just figure out a reasonable organization of jars or use some fine-grained jars because at least those you can collapse into a single jar and maintain compatibility for awhile using empty jars that just depend on the collapsed one. I don't know how you make compatibility and migration work if you dump everything into kafka-clients.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems to me like introducing a kafka-serialization-json.jar is the simplest option at this stage. And yes, it seems like a simple KIP may be the way forward.

@ewencp
Copy link
Contributor

ewencp commented Apr 6, 2016

@guozhangwang Besides comment left inline, I think you can remove the dependency on connect:json in build.gradle:712 since that was only there for serializers as far as I can tell.

Rest of the code looks good to me. You could also avoid the code duplication between clients and connect by just moving all the code and leaving the connect version inheriting from the clients class (but still marked deprecated), but this code is small and simple enough that I am not particularly concerned about that.

* structured data without having associated Java classes. This deserializer also supports Connect schemas.
*/
public class JsonDeserializer implements Deserializer<JsonNode> {
private ObjectMapper objectMapper = new ObjectMapper();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be final.

Copy link
Contributor

@stepio stepio Apr 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it can be static, as it's thread-safe.
Or an alternative option. In terms of flexibility, it's wise to move initialization to configure() method. This way you'll be able to retrieve some jackson-specific options (if necessary) from the "props" Map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@guozhangwang
Copy link
Contributor Author

Discussed with @ewencp offline, and we decided to postpone this movement until we break kafka-clients into separate jars.

And I'm assuming @ijuma will drive the split 💯

@ijuma
Copy link
Contributor

ijuma commented Apr 6, 2016

@guozhangwang :)

JsonNode data;
try {
data = objectMapper.readTree(bytes);
} catch (Exception e) {
Copy link
Contributor

@stepio stepio Apr 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to add special handling for JsonParseException - just log it instead of rethrowing.

If such an exception is not handled properly, consuming may be blocked with any non-json message - just text, for example. I got this while playing with Kafka locally: just one simple "dummy" message from console client brought tons of exceptions to my log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@ijuma
Copy link
Contributor

ijuma commented Apr 26, 2016

@guozhangwang, shall we close this for now?

@guozhangwang guozhangwang deleted the K3515 branch May 11, 2016 23:58
@guozhangwang guozhangwang restored the K3515 branch May 11, 2016 23:58
@guozhangwang guozhangwang deleted the K3515 branch October 7, 2016 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants