-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default #11908
KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default #11908
Conversation
da7568e
to
380c91f
Compare
Here's a run of all the Connect system tests with the proposed changes in the PR. @rhauch the 9 tests that have failed have done so after loading the classes successfully. The failures are not relevant to the changes here and the issues with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kkonstantine. I've not taken a look at the Python changes yet, but I left a couple of questions.
docs/connect.html
Outdated
@@ -74,6 +74,7 @@ <h4><a id="connect_running" href="#connect_running">Running Kafka Connect</a></h | |||
<li><code>config.storage.topic</code> (default <code>connect-configs</code>) - topic to use for storing connector and task configurations; note that this should be a single partition, highly replicated, compacted topic. You may need to manually create the topic to ensure the correct configuration as auto created topics may have multiple partitions or be automatically configured for deletion rather than compaction</li> | |||
<li><code>offset.storage.topic</code> (default <code>connect-offsets</code>) - topic to use for storing offsets; this topic should have many partitions, be replicated, and be configured for compaction</li> | |||
<li><code>status.storage.topic</code> (default <code>connect-status</code>) - topic to use for storing statuses; this topic can have multiple partitions, and should be replicated and configured for compaction</li> | |||
<li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain plugins (connectors, converters, transformations). For the purpose of quick starts users will have to add the path that contains the FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add the same to the Standalone section above?
Also we should update quickstart.html
too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment @mimaison
I've moved this bullet further up where we include properties for both standalone and distributed.
Regarding the quickstart, the reason I didn't change this file is that it's not shown anymore. Which might be something to fix, but possibly not in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kkonstantine.
Regarding the quickstart, it's currently not shown on the website, but this commit added it back. So once we sync the docs for 3.2, it should be back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that it'd be good for this PR to add this information to the Connect section of the quickstart, since that's been added back for 3.2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link @mimaison. In that case, it's worth updating the quick start here too. Let me do that
9fbefd3
to
aa443b4
Compare
I rebased to get the changes from #11933 and get a green run of system tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @kkonstantine. Looks pretty good, but I do have a few comments/suggestions below.
Also, it might be good for this PR to update the Connect section of the quickstart, since that's been added back for 3.2 (see thread above).
docs/connect.html
Outdated
<li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain | ||
plugins (connectors, converters, transformations). For the purpose of quick starts users will have to add the path that contains the FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker.</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous list item paragraphs are on a single line, and it's probably useful to keep that formatting style. Also, a few nits:
<li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain | |
plugins (connectors, converters, transformations). For the purpose of quick starts users will have to add the path that contains the FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker.</li> | |
<li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain Connect plugins (connectors, converters, transformations). Before running quick starts, users must add the relative or absolute path that contains the example FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker.</li> |
I also have a few questions.
- Should the path include the JAR file or the directory that contains the JAR file?
- Would it be useful if we supplied an example
plugin.path
line to add?
I've built the distribution with this PR branch, and verified that the quickstart running the standalone worker fails if the file connectors are not in the plugin.path
, but adding plugin.path=libs/connect-file-<version>.jar
does work as long as the quickstart is run from the installation directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment @rhauch
One note that I have is that I wouldn't recommend adding the relative path even if it works, because it possibly doesn't lead to robust deployments. plugin.path
works when following symlinks so that can help.
plugin.path
works with uber-jars but I wouldn't necessarily consider connect-file-<version>.jar
an uber jar. I'd recommend the mainstream way which is to add the parent directory of any directories containing connector jars. Let me try to see how an example looks like here, because that's already just a bullet point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @kkonstantine. A long comment/suggestion below about how to reduce the barrier of running the quickstart.
docs/quickstart.html
Outdated
First, make sure to add <code class="language-bash">connect-file-{{fullDotVersion}}.jar</code> to the <code>plugin.path</code> property in the Connect worker's configuration (see <a href="#connectconfigs_plugin.path">plugin.path</a> for examples). | ||
</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will users that are trying this for the first time understand what this means, or will this be enough of a barrier to cause them to abandon the quickstart? I'm not sure, but it might be worth to be more explicit here.
First, since this is a quickstart I think it's okay to duplicate some information here just so that users don't have to go elsewhere to learn what they need to do to run the quickstart.
Second, the referenced plugin.path configuration section says (emphasis mine):
The list should consist of top level directories that include any combination of ...
And, to run the quickstarts we want them to add the path to that JAR file to the plugin.path
configuration property, not add the path to the lib/
directory that contains the connect-file-<version>.JAR
file, as the referenced documentation suggests to do.
So, I suggest we be much more explicit here, and say something like:
First, make sure to add <code class="language-bash">connect-file-{{fullDotVersion}}.jar</code> to the <code>plugin.path</code> property in the Connect worker's configuration (see <a href="#connectconfigs_plugin.path">plugin.path</a> for examples). | |
</p> | |
First, instruct Connect to use the example file system connectors by editing the configuration for the Connect standalone worker to changing the following line from: | |
</p> | |
<pre class="brush: bash;"> | |
#plugin.path=</pre> | |
to: | |
</p> | |
<pre class="brush: bash;"> | |
plugin.path=lib/connect-file-{{fullDotVersion}}.jar</pre> |
We could make this a little easier if we changed the connect-standalone.properties
and connect-distributed.properties
files to add a few lines to the existing commented out section, so that all a user has to do is uncomment one line. For example:
...
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
#
# To run the quickstart that just uses the example file system connectors, uncomment this line:
# plugin.path=lib/connect-file-<version>.jar,
#
#plugin.path=
These are comments, and I think we can improve the comments without a KIP, especially since with this PR we're trying to limit the impact on the quickstarts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should add that I don't think adding the comments to the config properties is really all that necessary -- it might simplify things a bit, but at the same time we don't want users to do that in production.
Not including it in the configs just means the quickstart needs a bit more content to explain how the user should modify the config. Plus, maybe a note that says not to use these in production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Better to add this example to the quick start itself rather than the config.
Again, my dilemma was regarding whether we should suggest a relative path + uber jar value which is not typical for production even on the quick start or direct users to the documentation of the property. I'm also a fan of quick starts that demonstrate a working example step-by-step. So, after your comment I've changed the quick start doc to contain an example. Let me know how it looks after that change when you get a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks, @kkonstantine!
…PATH and plugin.path by default
fdbe364
to
d1f8a10
Compare
Thanks @rhauch. Just rebased to get the latest changes and squash the last commit with a line edit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @kkonstantine. Still LGMT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
Thanks both. Merged to |
…PATH and plugin.path by default (apache#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…cs-11-may-2022 * apache-kafka/3.1: (51 commits) MINOR: reload4j build dependency fixes (apache#12144) KAFKA-13255: Use config.properties.exclude when mirroring topics (apache#11401) KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionsManager (round 3) (apache#12096) KAFKA-13794: Follow up to fix producer batch comparator (apache#12006) fix: make sliding window works without grace period (#kafka-13739) (apache#11980) 3.1.1 release notes (apache#12001) KAFKA-13794; Fix comparator of `inflightBatchesBySequence` in `TransactionManager` (apache#11991) KAFKA-13782; Ensure correct partition added to txn after abort on full batch (apache#11995) KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default (apache#11908) KAFKA-13775: CVE-2020-36518 - Upgrade jackson-databind to 2.12.6.1 (apache#11962) ...
…cs-12-may-2022 * apache-kafka/3.0: (14 commits) fix: make sliding window works without grace period (#kafka-13739) (apache#11980) KAFKA-13794: Follow up to fix producer batch comparator (apache#12006) KAFKA-13794; Fix comparator of `inflightBatchesBySequence` in `TransactionManager` (apache#11991) KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default (apache#11908) KAFKA-13418: Support key updates with TLS 1.3 (apache#11966) KAFKA-13770: Restore compatibility with KafkaBasedLog using older Kafka brokers (apache#11946) KAFKA-13761: KafkaLog4jAppender deadlocks when idempotence is enabled (apache#11939) KAFKA-13759: Disable idempotence by default in producers instantiated by Connect (apache#11933) MINOR: Fix `ConsumerConfig.ISOLATION_LEVEL_DOC` (apache#11915) KAFKA-13750; Client Compatability KafkaTest uses invalid idempotency configs (apache#11909) ...
The purpose of this patch is to stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector still be shipped with the Apache Kafka distribution and will be available for explicit inclusion.
The changes have been tested through the system tests and the existing unit and integration tests
Committer Checklist (excluded from commit message)