Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STORM-2658: Extract storm-kafka-client examples to storm-kafka-client-examples #2243

Merged
merged 1 commit into from
Aug 3, 2017

Conversation

srdo
Copy link
Contributor

@srdo srdo commented Jul 25, 2017

See https://issues.apache.org/jira/browse/STORM-2658.

The changes to the Trident examples mainly have to do with the command line arguments not being consistently passed to the topologies, e.g. the broker url was passed to producing topologies but not the consuming topology. The extra parameters have been removed, I think the example code is fine with hard coded topic names.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good. Left comments on guiding how users provide the dependencies manually.

```
will submit the topologies set up by KafkaSpoutTopologyMainNamedTopics to Storm.

Note that this example produces a jar containing all dependencies for ease of use. In a production environment you may want to reduce the jar size by extracting some dependencies (e.g. org.apache.kafka:kafka-clients) from the jar. You can do this by setting the dependencies you don't want to include in the jars to `provided` scope, and then manually copying the dependencies to your Storm extlib directory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying dependencies to the extlib, you can achieve the same thing (or more) via using --artifacts to add dependencies for specific topology while submitting. I think this is simpler and topology-wide, so would love to guide both, or only --artifacts. (We already replaced the guide for how to add dependencies from Storm SQL.)

Please refer https://github.com/apache/storm/blob/master/docs/Command-line-client.md#jar for details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I didn't know about this flag. It's much better, will replace references to extlib.

@@ -42,7 +42,9 @@
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
<version>${project.version}</version>
<!-- You can reduce jar size by uncommenting this and putting dependencies in $STORM-HOME/extlib instead of including them in the jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above. Maybe we can let users choose how they provide dependencies, change the sentence to ...uncommenting this and providing dependencies manually. or so. My intention is that we don't recommend putting dependencies to extlib directory, unless they know what they're doing (affecting whole topologies' dependencies)

@@ -73,19 +75,27 @@
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka-client</artifactId>
<version>${project.version}</version>
<!-- You can reduce jar size by uncommenting this and putting dependencies in $STORM-HOME/extlib instead of including them in the jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>${storm.kafka.artifact.id}</artifactId>
<version>${storm.kafka.client.version}</version>
<scope>compile</scope>
<!-- You can reduce jar size by uncommenting this and putting dependencies in $STORM-HOME/extlib instead of including them in the jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>${storm.kafka.client.version}</version>
<scope>compile</scope>
<!-- You can reduce jar size by uncommenting this and putting dependencies in $STORM-HOME/extlib instead of including them in the jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@HeartSaVioR
Copy link
Contributor

+1

Copy link
Contributor

@hmcl hmcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of the changes proposed in this JIRA. However, my understanding is that this patch is removing the ability to specify the topic names from the CLI, as well as running the topology in LocalCluster. I think these are valid options that we should keep.

Since we are already refactoring, I am also suggesting a few small name changes.

Once we have all the +1s and are in agreement, let's squash all the commits into one.

## Usage
This module contains example topologies demonstrating storm-kafka-client spout and Trident usage.

The module is built by `mvn clean package`. This will generate the `target/storm-kafka-client-examples-VERSION.jar` file. The jar contains all dependencies and can be submitted to Storm via the Storm CLI. For example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... built running ... the Storm CLI, e.g.:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

@@ -42,7 +42,9 @@
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
<version>${project.version}</version>
<!-- You can reduce jar size by uncommenting this and providing the dependencies manually. See the README for details.
<scope>${provided.scope}</scope>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the goal of ${provided.scope} to handle the proper scope according to the profile, e.g. just like it is done with the Intellij profile. I am not quite following why the user has to comment/uncomment the scope configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I somehow missed the dollar. I'll try reverting this and update the readme to set the scope to compile

@@ -73,19 +75,27 @@
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka-client</artifactId>
<version>${project.version}</version>
<!-- You can reduce jar size by uncommenting this and providing the dependencies manually. See the README for details.
<scope>${provided.scope}</scope>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the goal of ${provided.scope} to handle the proper scope according to the profile, e.g. just like it is done with the Intellij profile. I am not quite following why the user has to comment/uncomment the scope configuration.

Thread.sleep(2000);
DrpcResultsPrinter.remoteClient().printResults(60, 1, TimeUnit.SECONDS);
}
protected void run(String[] args) throws AlreadyAliveException, InvalidTopologyException,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove the ability to specify the topic name from the command line ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The topic name wasn't being passed to the consumer before, only the producers as far as I could tell, so if you used the parameters the example didn't work. Fixing it caused a conflict with the wildcard example, because I'd have to change the newKafkaSpoutConfig signature to take a list of topics. That won't work with the Pattern required by the wildcard example. It seemed easier to just remove the option.

@@ -21,15 +21,10 @@
import static org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.EARLIEST;

import org.apache.kafka.clients.consumer.ConsumerConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call package where this class lives config.builder instead of builders, which is a bit misleading since this is really a configuration class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also would call the two getXyz methods in this class createXyz, as they are static factory methods. I know that the name was already like that, but since we are changing it, we should just make it more conventional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will rename

new KafkaSpoutTopologyMainNamedTopics().runMain(args);
}

protected void runMain(String[] args) throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this change removing the ability to run this code in LocalCluster mode? I think it is very useful. For example, I use it all the time to run these simple test examples from IntelliJ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'll restore that bit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remembered why I removed it. LocalCluster is in the storm-server jar, which isn't included by the example projects. I think including it would cause conflict when the jar is deployed to a real cluster. How about I move the ability to run this from a local cluster to a test class? That should still leave people able to run on a local cluster from an IDE, but doesn't interfere with the generated jar.

</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>${storm.kafka.client.version}</version>
<scope>compile</scope>
<!-- You can reduce jar size by uncommenting this and providing the dependencies manually. See the README for details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the goal of ${provided.scope} to handle the proper scope according to the profile, e.g. just like it is done with the Intellij profile. I am not quite following why the user has to comment/uncomment the scope configuration.

</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>${storm.kafka.artifact.id}</artifactId>
<version>${storm.kafka.client.version}</version>
<scope>compile</scope>
<!-- You can reduce jar size by uncommenting this and providing the dependencies manually. See the README for details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the goal of ${provided.scope} to handle the proper scope according to the profile, e.g. just like it is done with the Intellij profile. I am not quite following why the user has to comment/uncomment the scope configuration.

@srdo
Copy link
Contributor Author

srdo commented Jul 31, 2017

@hmcl I think I addressed everything. Please look again. Thanks.

@hmcl
Copy link
Contributor

hmcl commented Jul 31, 2017

+1. Thanks @srdo

@srdo srdo changed the title STORM-2658: Extract storm-kafka-client examples to storm-kafka-client… STORM-2658: Extract storm-kafka-client examples to storm-kafka-client-examples Aug 1, 2017
@HeartSaVioR
Copy link
Contributor

@srdo
I have no idea which branches I need to apply this patch, so please go on merging yourself. Thanks for the patch. :)

@srdo
Copy link
Contributor Author

srdo commented Aug 2, 2017

Will do, thanks @HeartSaVioR :)

@asfgit asfgit merged commit 9b84248 into apache:master Aug 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants