Skip to content

[BEAM-351] Add DisplayData to KafkaIO#2111

Closed
aviemzur wants to merge 2 commits into
apache:masterfrom
aviemzur:kafkaio-display-data
Closed

[BEAM-351] Add DisplayData to KafkaIO#2111
aviemzur wants to merge 2 commits into
apache:masterfrom
aviemzur:kafkaio-display-data

Conversation

@aviemzur
Copy link
Copy Markdown
Member

@aviemzur aviemzur commented Feb 26, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@aviemzur
Copy link
Copy Markdown
Member Author

R: @dhalperi / @bjchambers

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.007%) to 69.316% when pulling 4962424 on aviemzur:kafkaio-display-data into 4bba380 on apache:master.

@asfbot
Copy link
Copy Markdown

asfbot commented Feb 26, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7867/
--none--

@dhalperi
Copy link
Copy Markdown
Contributor

R: @rangadi @amitsela can you take a look?

Amit, would you mind handling merging of this one? I'm swamped and you use Kafka more than I do ;)

@Override
public void populateDisplayData(DisplayData.Builder builder) {
super.populateDisplayData(builder);
builder.add(DisplayData.item("topics", Joiner.on(",").join(getTopics())));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make it read 'topic' in common case of single topic.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to handle the case if the partitions are set manually (Either 'topics' or 'partitions' is set).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I missed the partitions or topics part, will do.
Are we sure we want to change the name of the key depending on the number of topics? Shouldn't display data keys be consistent?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

topic/s ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't the read the contract on the key. Don't know if it is supposed to stay static or not. Will let you guys decide.

@Override
public void populateDisplayData(DisplayData.Builder builder) {
super.populateDisplayData(builder);
builder.add(DisplayData.item("topics", Joiner.on(",").join(getTopics())));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to handle the case if the partitions are set manually (Either 'topics' or 'partitions' is set).

public void populateDisplayData(DisplayData.Builder builder) {
super.populateDisplayData(builder);
builder.add(DisplayData.item("topics", Joiner.on(",").join(getTopics())));
for (Map.Entry<String, Object> conf : getConsumerConfig().entrySet()) {
Copy link
Copy Markdown
Contributor

@rangadi rangadi Feb 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing entire config might not be that useful. You could use ConsumerConfig.originals(). Among those, the most important one could be at the top:
BOOTSTRAP_SERVERS_CONFIG "bootstrap.servers".

RECEIVE_BUFFER_CONFIG and ENABLE_AUTO_COMMIT_CONFIG could be skipped unless they differ from what we set them to above. (optional)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started by picking and choosing configs but figured it could be useful for the user to see which configurations they made. Is this not useful?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @rangadi, this is not LOG.DEBUG, I'd go through Kafka's documentation and add here the "important configurations".
Maybe pick the once marked as "high importance" in Kafka documentation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the consumer properties map is what the user passed to the Read transform builder.
It is then passed to Kafka consumer which joins these properties with Kafka defaults, but this is in a different scope.
So this map won't contain the defaults for each configuration Kafka has but rather just the ones the user specified in KafkaIO.Read#updateConsumerProperties

Copy link
Copy Markdown
Contributor

@rangadi rangadi Feb 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aviemzur, I misread the code. getConsumerConfig() is our own map. Then we just need to exclude the internal configs (not a must, but better I think). I thought this was the map from 'ConsumerConfig()'.

public void populateDisplayData(DisplayData.Builder builder) {
super.populateDisplayData(builder);
builder.addIfNotNull(DisplayData.item("topic", getTopic()));
for (Map.Entry<String, Object> conf : getProducerConfig().entrySet()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as consumer.

@Override
public void populateDisplayData(DisplayData.Builder builder) {
super.populateDisplayData(builder);
read.populateDisplayData(builder);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could indicate that metadata is trimmed. May be..

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought of that too. Is that useful?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters.
IMHO with/without metadata is more for the pipeline author and what he does with the read PCollection. DisplayData for the KafkaIO should be about T which can be KV<K, V> or KafkaRecord<KV<K, V>>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

assertThat(displayData, hasDisplayItem("bootstrap.servers", "myServer1:9092,myServer2:9092"));
assertThat(displayData, hasDisplayItem("value.deserializer",
org.apache.kafka.common.serialization.ByteArrayDeserializer.class.getName()));
assertThat(displayData, hasDisplayItem("key.deserializer",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, yeah, we should drop these two serializers too. They are are implementation details.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny, they are actually marked as "high importance" in Kafka docs..

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are, but we don't allow users to set them. Users only provide coders.

@amitsela
Copy link
Copy Markdown
Member

@dhalperi no problem!

@aviemzur
Copy link
Copy Markdown
Member Author

aviemzur commented Mar 1, 2017

@rangadi @amitsela addressed your comments in my latest commit.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.1%) to 69.189% when pulling 8829d01 on aviemzur:kafkaio-display-data into 4bba380 on apache:master.

@asfbot
Copy link
Copy Markdown

asfbot commented Mar 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8007/
--none--

@amitsela
Copy link
Copy Markdown
Member

amitsela commented Mar 1, 2017

LGTM, pending Travis.

@amitsela
Copy link
Copy Markdown
Member

amitsela commented Mar 1, 2017

@aviemzur can you please rebase and push so Travis would do us the courtesy of running ? thanks!

@aviemzur aviemzur force-pushed the kafkaio-display-data branch from 8829d01 to 7fd48d4 Compare March 1, 2017 11:27
@aviemzur
Copy link
Copy Markdown
Member Author

aviemzur commented Mar 1, 2017

Rebased on master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.003%) to 69.193% when pulling 7fd48d4 on aviemzur:kafkaio-display-data into d66029c on apache:master.

@asfbot
Copy link
Copy Markdown

asfbot commented Mar 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/8010/
--none--

@amitsela
Copy link
Copy Markdown
Member

amitsela commented Mar 1, 2017

Merging.

@asfgit asfgit closed this in 3b3d6b8 Mar 1, 2017
@aviemzur aviemzur deleted the kafkaio-display-data branch March 11, 2017 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants