Remove Kafka 0.8 integration
srowen committed Oct 11, 2018
1 parent a001814 commit 4f0bab8
Showing 46 changed files with 34 additions and 5,905 deletions.
6 changes: 3 additions & 3 deletions core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -991,9 +991,9 @@ private[spark] object SparkSubmitUtils {

// Exposed for testing.
// These components are used to make the default exclusion rules for Spark dependencies.
-// We need to specify each component explicitly, otherwise we miss spark-streaming-kafka-0-8 and
-// other spark-streaming utility components. Underscore is there to differentiate between
-// spark-streaming_2.1x and spark-streaming-kafka-0-8-assembly_2.1x
+// We need to specify each component explicitly, otherwise we miss
+// spark-streaming utility components. Underscore is there to differentiate between
+// spark-streaming_2.1x and spark-streaming-kafka-0-10-assembly_2.1x
val IVY_DEFAULT_EXCLUDES = Seq("catalyst_", "core_", "graphx_", "kvstore_", "launcher_", "mllib_",
"mllib-local_", "network-common_", "network-shuffle_", "repl_", "sketch_", "sql_", "streaming_",
"tags_", "unsafe_")
4 changes: 3 additions & 1 deletion dev/create-release/release-build.sh
@@ -115,7 +115,9 @@ SCALA_2_10_PROFILES="-Pscala-2.10"
SCALA_2_11_PROFILES=
if [[ $SPARK_VERSION > "2.3" ]]; then
BASE_PROFILES="$BASE_PROFILES -Pkubernetes"
-SCALA_2_11_PROFILES="-Pkafka-0-8"
+if [[ $SPARK_VERSION < "3.0." ]]; then
+SCALA_2_11_PROFILES="-Pkafka-0-8"
+fi
else
PUBLISH_SCALA_2_10=1
fi
2 changes: 1 addition & 1 deletion dev/mima
@@ -24,7 +24,7 @@ set -e
FWDIR="$(cd "`dirname "$0"`"/..; pwd)"
cd "$FWDIR"

-SPARK_PROFILES="-Pmesos -Pkafka-0-8 -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"
+SPARK_PROFILES="-Pmesos -Pkubernetes -Pyarn -Pspark-ganglia-lgpl -Pkinesis-asl -Phive-thriftserver -Phive"
TOOLS_CLASSPATH="$(build/sbt -DcopyDependencies=false "export tools/fullClasspath" | tail -n1)"
OLD_DEPS_CLASSPATH="$(build/sbt -DcopyDependencies=false $SPARK_PROFILES "export oldDeps/fullClasspath" | tail -n1)"

1 change: 0 additions & 1 deletion dev/run-tests.py
@@ -332,7 +332,6 @@ def build_spark_sbt(hadoop_version):
# Enable all of the profiles for the build:
build_profiles = get_hadoop_profiles(hadoop_version) + modules.root.build_profile_flags
sbt_goals = ["test:package", # Build test jars as some tests depend on them
"streaming-kafka-0-8-assembly/assembly",
"streaming-kinesis-asl-assembly/assembly"]
profiles_and_goals = build_profiles + sbt_goals

1 change: 0 additions & 1 deletion dev/sbt-checkstyle
@@ -23,7 +23,6 @@ ERRORS=$(echo -e "q\n" \
| build/sbt \
-Pkinesis-asl \
-Pmesos \
--Pkafka-0-8 \
-Pkubernetes \
-Pyarn \
-Phive \
1 change: 0 additions & 1 deletion dev/scalastyle
@@ -23,7 +23,6 @@ ERRORS=$(echo -e "q\n" \
| build/sbt \
-Pkinesis-asl \
-Pmesos \
--Pkafka-0-8 \
-Pkubernetes \
-Pyarn \
-Phive \
21 changes: 0 additions & 21 deletions dev/sparktestsupport/modules.py
@@ -252,24 +252,6 @@ def __hash__(self):
)


-streaming_kafka = Module(
-name="streaming-kafka-0-8",
-dependencies=[streaming],
-source_file_regexes=[
-"external/kafka-0-8",
-"external/kafka-0-8-assembly",
-],
-build_profile_flags=[
-"-Pkafka-0-8",
-],
-environ={
-"ENABLE_KAFKA_0_8_TESTS": "1"
-},
-sbt_test_goals=[
-"streaming-kafka-0-8/test",
-]
-)

streaming_kafka_0_10 = Module(
name="streaming-kafka-0-10",
dependencies=[streaming],
@@ -380,9 +362,6 @@ def __hash__(self):
source_file_regexes=[
"python/pyspark/streaming"
],
-environ={
-"ENABLE_KAFKA_0_8_TESTS": "1"
-},
python_test_goals=[
"pyspark.streaming.util",
"pyspark.streaming.tests",
2 changes: 1 addition & 1 deletion dev/test-dependencies.sh
@@ -29,7 +29,7 @@ export LC_ALL=C
# TODO: This would be much nicer to do in SBT, once SBT supports Maven-style resolution.

# NOTE: These should match those in the release publishing script
-HADOOP2_MODULE_PROFILES="-Phive-thriftserver -Pmesos -Pkafka-0-8 -Pkubernetes -Pyarn -Phive"
+HADOOP2_MODULE_PROFILES="-Phive-thriftserver -Pmesos -Pkubernetes -Pyarn -Phive"
MVN="build/mvn"
HADOOP_PROFILES=(
hadoop-2.7
9 changes: 0 additions & 9 deletions docs/building-spark.md
@@ -89,15 +89,6 @@ like ZooKeeper and Hadoop itself.
## Building with Kubernetes support

./build/mvn -Pkubernetes -DskipTests clean package

-## Building with Kafka 0.8 support
-
-Kafka 0.8 support must be explicitly enabled with the `kafka-0-8` profile.
-Note: Kafka 0.8 support is deprecated as of Spark 2.3.0.
-
-./build/mvn -Pkafka-0-8 -DskipTests clean package
-
-Kafka 0.10 support is still automatically built.

## Building submodules individually

2 changes: 1 addition & 1 deletion docs/configuration.md
@@ -2017,7 +2017,7 @@ showDF(properties, numRows = 200, truncate = FALSE)
<td>
Maximum rate (number of records per second) at which data will be read from each Kafka
partition when using the new Kafka direct stream API. See the
-<a href="streaming-kafka-integration.html">Kafka Integration guide</a>
+<a href="streaming-kafka-0-10-integration.html">Kafka Integration guide</a>
for more details.
</td>
</tr>
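
For context, a hedged sketch of how the rate limit described in this row is typically applied. The property name below, `spark.streaming.kafka.maxRatePerPartition`, is presumed from the description (the row's key sits just outside the lines shown here), and the master, app name, and batch interval are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Cap each Kafka partition of the direct stream at roughly 1,000 records/sec.
val conf = new SparkConf()
  .setMaster("local[2]")             // placeholder master for a local sketch
  .setAppName("RateLimitedKafkaJob") // placeholder app name
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
val ssc = new StreamingContext(conf, Seconds(5))
```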
6 changes: 5 additions & 1 deletion docs/streaming-kafka-0-10-integration.md
@@ -3,7 +3,11 @@ layout: global
title: Spark Streaming + Kafka Integration Guide (Kafka broker version 0.10.0 or higher)
---

-The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 [Direct Stream approach](streaming-kafka-0-8-integration.html#approach-2-direct-approach-no-receivers). It provides simple parallelism, 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the [new Kafka consumer API](http://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API, there are notable differences in usage. This version of the integration is marked as experimental, so the API is potentially subject to change.
+The Spark Streaming integration for Kafka 0.10 provides simple parallelism, 1:1 correspondence between Kafka
+partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses
+the [new Kafka consumer API](https://kafka.apache.org/documentation.html#newconsumerapi) instead of the simple API,
+there are notable differences in usage. This version of the integration is marked as experimental, so the API is
+potentially subject to change.

### Linking
For Scala/Java applications using SBT/Maven project definitions, link your streaming application with the following artifact (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).
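
For orientation, a minimal sketch of the direct-stream pattern the rest of this guide describes, assuming the `spark-streaming-kafka-0-10` artifact above is on the classpath; the master, broker address, group id, and topic name are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.{HasOffsetRanges, KafkaUtils}

val conf = new SparkConf().setMaster("local[2]").setAppName("Kafka010Sketch")
val ssc = new StreamingContext(conf, Seconds(5))

// Placeholder consumer settings; adjust bootstrap servers, group id, etc.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Seq("example-topic"), kafkaParams)
)

// Per-partition offsets and metadata are available on each batch's RDD.
stream.foreachRDD { rdd =>
  val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  ranges.foreach(r => println(s"${r.topic} ${r.partition}: ${r.fromOffset} -> ${r.untilOffset}"))
}

ssc.start()
ssc.awaitTermination()
```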
