[SPARK-13843][Streaming]Move streaming-flume, streaming-mqtt, streaming-zeromq, streaming-akka, streaming-twitter to Spark packages #11672
FWIW I think that's a grand idea. These are fairly ancillary packages, and the overhead of maintaining them, running tests, and patching failures probably doesn't justify keeping them in the core project. There might even be more packages for which this is true.
I would also like to see the Kafka modules removed in a similar way. We have had trouble balancing Spark's compatibility requirements against Kafka's ever-breaking client APIs. A clean separation would make it easier for user applications to simply declare which Kafka client version they want to use.
cc @rxin @tdas @JoshRosen
@ksakellis Most Streaming users use Kafka, so this may affect a lot of people. I think it's better to discuss how to support Kafka 0.9 (SPARK-13252) before deciding whether to move Kafka out.
I agree with @ksakellis on this one. It would be great if we could pull Kafka out as well. I understand that a lot of users might find it difficult, but if you think about it, most people use the plugins via mvn anyway (since we don't actually package them in our assembly). I am not sure what the policy is if we pull it into a different repo, or whether we can keep the same groupId and artifactId, but that could be an alternative and most likely would not break too many users.
The merge conflict is probably my fault (I modified a file that this PR deleted).
@JoshRosen could you take a look at this PR?
Hi @zsxwing, I think the discussion on supporting Kafka 0.9 should happen if we decide to keep Kafka in Spark itself. I really think we should discuss moving Kafka out and come to an agreement on that as well.
Can we have a JIRA ticket to discuss that? I am not sure this GitHub PR is the place to discuss moving Kafka out.
Thanks @rxin. Opened SPARK-13877 to discuss this.
OK, apparently the recent MiMa failures on the master branch are caused by the mqtt dependency not being in Maven. We can either fix that with some build hack or just merge this PR. Since these are the least contentious choices, I'm going to merge this pull request. We should discuss Kafka/Kinesis separately.
Merging in master.
FYI this has been merged but GitHub mirror is lagging; see https://git-wip-us.apache.org/repos/asf?p=spark.git. Until the GitHub merge goes through, I think that PR builds are going to continue to fail since the merge commits that we test are automatically generated by GitHub.
This didn't cut the relevant doc/streaming-*.md files BTW
Thanks for the reminder. I will submit a PR to remove them after I copy all of them to the new projects.
…aming-zeromq, streaming-akka, streaming-twitter to Spark packages

## What changes were proposed in this pull request?

Currently there are a few sub-projects, each for integrating with different external sources for Streaming. Now that we have better ability to include external libraries (spark packages) and with Spark 2.0 coming up, we can move the following projects out of Spark to https://github.com/spark-packages

- streaming-flume
- streaming-akka
- streaming-mqtt
- streaming-zeromq
- streaming-twitter

They are just some ancillary packages and considering the overhead of maintenance, running tests and PR failures, it's better to maintain them out of Spark. In addition, these projects can have their different release cycles and we can release them faster.

I have already copied these projects to https://github.com/spark-packages

## How was this patch tested?

Jenkins tests

Author: Shixiong Zhu <shixiong@databricks.com>

Closes apache#11672 from zsxwing/remove-external-pkg.