Upgrade to spark 2.4 #40

Merged
merged 6 commits into from Nov 15, 2018
9 changes: 7 additions & 2 deletions README.md
@@ -54,9 +54,14 @@ output.show(truncate = false)
+----------------------------------------------+------------------------------------------------------+--------------------------------------------------+---------+
~~~

+### Databricks
+
+If you are a Databricks user, please follow the instructions in this
+[example notebook](https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1233855/1962483213436895/588180/latest.html).
+
### Dependencies

-Because CoreNLP depends on `protobuf-java` 3.x but Spark 2.3 depends on `protobuf-java` 2.x,
+Because CoreNLP depends on `protobuf-java` 3.x but Spark 2.4 depends on `protobuf-java` 2.x,
we release `spark-corenlp` as an assembly jar that includes CoreNLP as well as its transitive dependencies,
except `protobuf-java` being shaded.
This might cause issues if you have CoreNLP or its dependencies on the classpath.
@@ -67,7 +72,7 @@ To use `spark-corenlp`, you need one of the CoreNLP language models:
# Download one of the language models.
wget http://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.9.1/stanford-corenlp-3.9.1-models.jar
# Run spark-shell
-spark-shell --packages databricks/spark-corenlp:0.3.1-s_2.11 --jars stanford-corenlp-3.9.1-models.jar
+spark-shell --packages databricks/spark-corenlp:0.4.0-spark_2.4-scala_2.11 --jars stanford-corenlp-3.9.1-models.jar
~~~

### Acknowledgements
9 changes: 7 additions & 2 deletions build.sbt
@@ -1,13 +1,18 @@
import ReleaseTransformations._

+def majorVersion(version: String) = version.split('.').slice(0, 2).mkString(".")
+
lazy val commonSettings = Seq(
organization := "databricks",
name := "spark-corenlp",
spName := "databricks/spark-corenlp",
licenses := Seq("GPL-3.0" -> url("http://opensource.org/licenses/GPL-3.0")),
// dependency settings //
scalaVersion := "2.11.8",
-sparkVersion := "2.3.1",
+sparkVersion := "2.4.0",
+version := (version in ThisBuild).value +
+s"-spark_${majorVersion(sparkVersion.value)}" +
+s"-scala_${majorVersion(scalaVersion.value)}",
initialize := {
val _ = initialize.value
// require Java 8+
@@ -21,7 +26,7 @@ lazy val commonSettings = Seq(
fork in Test := true,
javaOptions in Test ++= Seq("-Xmx6g"),
// release settings //
-spAppendScalaVersion := true,
+spAppendScalaVersion := false,
// We only use sbt-release to update version numbers for now.
releaseProcess := Seq[ReleaseStep](
inquireVersions,
2 changes: 1 addition & 1 deletion version.sbt
@@ -1 +1 @@
-version in ThisBuild := "0.3.2-SNAPSHOT"
+version in ThisBuild := "0.4.0-SNAPSHOT"
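
To make the new version scheme easier to review, here is a minimal standalone sketch of how the published version string appears to be assembled from the three inputs in this change. `majorVersion` is copied from the `build.sbt` diff; `artifactVersion` is a hypothetical helper introduced only for illustration and is not part of the build. The snippet is plain Scala meant for the REPL or a script, not an sbt setting.

```scala
// Copied from build.sbt in this PR: keep only the "major.minor" part.
def majorVersion(version: String): String =
  version.split('.').slice(0, 2).mkString(".")

// Hypothetical helper (not in the build): mirrors the `version := ...`
// expression by appending Spark and Scala major versions to the base version.
def artifactVersion(base: String, spark: String, scala: String): String =
  s"$base-spark_${majorVersion(spark)}-scala_${majorVersion(scala)}"

// Inputs taken from this PR: release 0.4.0, Spark 2.4.0, Scala 2.11.8.
val released = artifactVersion("0.4.0", "2.4.0", "2.11.8")
println(released)
```

With the release inputs above, the result matches the coordinate used in the updated README command (`0.4.0-spark_2.4-scala_2.11`). Because the Scala suffix is now embedded in `version` itself, setting `spAppendScalaVersion := false` presumably avoids appending a second Scala suffix to the package name.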