feat(spark): upgrade to spark 3.0.0 #376
735b878 to c01a54d
build.sbt
Outdated
libraryDependencies ++= {
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, scalaMajor)) if scalaMajor == 11 => Seq("com.databricks" %% "spark-redshift" % "3.0.0-preview1" excludeAll excludeAvro)
remove, move to latest spark-redshift-community
build.sbt
Outdated
"org.apache.commons" % "commons-text" % "1.8",
"org.influxdb" % "influxdb-java" % "2.19",
// Wait for https://github.com/spark-redshift-community/spark-redshift/pull/72
// "io.github.spark-redshift-community" %% "spark-redshift" % "4.0.1",
fix
build.sbt
Outdated
"org.apache.hive" % "hive-jdbc" % "2.3.3" % "provided" excludeAll(excludeNetty, excludeNettyAll, excludeLog4j, excludeParquet),
"org.apache.hadoop" % "hadoop-aws" % "2.7.3" % "provided",
"com.amazon.deequ" % "deequ" % "1.0.4" excludeAll(excludeSpark)
"com.amazon.deequ" % "deequ" % "1.0.4" excludeAll(excludeSpark, excludeScalanlp),
upgrade
docker/spark/k8s/Dockerfile
Outdated
RUN wget -q https://repo1.maven.org/maven2/org/apache/spark/spark-avro_${SCALA_MAJOR_VERSION}/${SPARK_VERSION}/spark-avro_${SCALA_MAJOR_VERSION}-${SPARK_VERSION}.jar -P $SPARK_HOME/jars/
RUN wget -q https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.6.2/commons-pool2-2.6.2.jar -P $SPARK_HOME/jars/

RUN rm -f $SPARK_HOME/jars/httpclient-*.jar && wget -q https://repo1.maven.org/maven2/org/apache/httpcomponents/httpclient/${HTTPCLIENT_VERSION}/httpclient-${HTTPCLIENT_VERSION}.jar -P $SPARK_HOME/jars
remove
@@ -13,5 +11,4 @@ spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2
spark.port.maxRetries=0
spark.rdd.compress=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.hive.convertMetastoreParquet=false
spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation=true
revert
spark.sql.hive.convertMetastoreParquet=false
try and remove
eba3d5a to 0c4c970
No description provided.