From 134584cf60dd884fb25d08e04be713e842b93cea Mon Sep 17 00:00:00 2001
From: Jia Yu
Date: Tue, 7 May 2024 22:32:58 -0700
Subject: [PATCH 1/2] Update release notes

---
 docs/setup/databricks.md    | 34 +++++++++++++++++++++++++++++++++
 docs/setup/emr.md           | 38 +++++++++++++++++++++++++++++++++++++
 docs/setup/fabric.md        |  5 +++++
 docs/setup/release-notes.md |  4 +++-
 docs/setup/wherobots.md     |  6 +++---
 5 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/docs/setup/databricks.md b/docs/setup/databricks.md
index 011c0392e5..6116f6331b 100644
--- a/docs/setup/databricks.md
+++ b/docs/setup/databricks.md
@@ -1,3 +1,37 @@
+
+## JDK 11+ requirement
+
+Sedona 1.6.0+ requires JDK 11+ to run. Databricks Runtime by default uses JDK 8. You can set up JDK 17 by following the instructions in the [Databricks documentation](https://docs.databricks.com/en/dev-tools/sdk-java.html#create-a-cluster-that-uses-jdk-17).
+
+### on Databricks Runtime versions 13.1 and above
+
+When you create a cluster, specify that the cluster uses JDK 17 for both the driver and executor by adding the following environment variable to `Advanced Options > Spark > Environment Variables`:
+
+```
+JNAME=zulu17-ca-amd64
+```
+
+If you are using ARM-based clusters (for example, AWS Graviton instances), use the following environment variable instead.
+
+```
+JNAME=zulu17-ca-arm64
+```
+
+### on Databricks Runtime versions 11.2 - 13.0
+
+
+When you create a cluster, you can specify that the cluster uses JDK 11 (for both the driver and executor). To do this, add the following environment variable to `Advanced Options > Spark > Environment Variables`:
+
+```
+JNAME=zulu11-ca-amd64
+```
+
+If you are using ARM-based clusters (for example, AWS Graviton instances), use the following environment variable instead.
+
+```
+JNAME=zulu11-ca-arm64
+```
+
 ## Community edition (free-tier)
 
 You just need to install the Sedona jars and Sedona Python on Databricks using Databricks default web UI. Then everything will work.
diff --git a/docs/setup/emr.md b/docs/setup/emr.md
index 6d687f35e8..10d04c4df5 100644
--- a/docs/setup/emr.md
+++ b/docs/setup/emr.md
@@ -5,6 +5,44 @@ This tutorial is tested on EMR on EC2 with EMR Studio (notebooks). EMR on EC2 us
 !!!note
     If you are using Spark 3.4+ and Scala 2.12, please use `sedona-spark-shaded-3.4_2.12`. Please pay attention to the Spark version postfix and Scala version postfix.
 
+
+## JDK 11+ requirement
+
+Sedona 1.6.0+ requires JDK 11+ to run. For Amazon EMR 7.x, the default JVM is Java 17. For Amazon EMR 5.x and 6.x, the default JVM is Java 8 but you can configure the cluster to use Java 11 or Java 17. For more information, see [EMR JVM versions](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/configuring-java8.html#configuring-java8-override-spark).
+
+When you use Spark with Amazon EMR releases 6.12 and higher, if you write a driver for submission in cluster mode, the driver uses Java 8, but you can set the environment so that the executors use Java 11 or 17. To override the JVM for Spark, AWS EMR recommends that you set both the Hadoop and Spark classifications.
+
+However, it is unclear whether the following will work on EMR releases below 6.12.
+
+```
+{
+  "Classification": "hadoop-env",
+  "Configurations": [
+    {
+      "Classification": "export",
+      "Configurations": [],
+      "Properties": {
+        "JAVA_HOME": "/usr/lib/jvm/java-1.11.0"
+      }
+    }
+  ],
+  "Properties": {}
+},
+{
+  "Classification": "spark-env",
+  "Configurations": [
+    {
+      "Classification": "export",
+      "Configurations": [],
+      "Properties": {
+        "JAVA_HOME": "/usr/lib/jvm/java-1.11.0"
+      }
+    }
+  ],
+  "Properties": {}
+}
+```
+
 ## Prepare initialization script
 
 In your S3 bucket, add a script that has the following content:
diff --git a/docs/setup/fabric.md b/docs/setup/fabric.md
index aa5ca6ee68..85e298da9e 100644
--- a/docs/setup/fabric.md
+++ b/docs/setup/fabric.md
@@ -1,5 +1,10 @@
 This tutorial will guide you through the process of installing Sedona on Microsoft Fabric Synapse Data Engineering's Spark environment.
 
+## JDK 11+ requirement
+
+Sedona 1.6.0+ requires JDK 11+ to run. Microsoft Fabric Synapse Data Engineering 1.2+ uses JDK 11 by default so we recommend using Microsoft Fabric Synapse Data Engineering 1.2+. For more information, see [Apache Spark Runtimes in Fabric](https://learn.microsoft.com/en-us/fabric/data-engineering/runtime).
+
+
 ## Step 1: Open Microsoft Fabric Synapse Data Engineering
 
 Go to the [Microsoft Fabric portal](https://app.fabric.microsoft.com/) and choose the `Data Engineering` option.
diff --git a/docs/setup/release-notes.md b/docs/setup/release-notes.md
index 771335d11b..c8a68cc75e 100644
--- a/docs/setup/release-notes.md
+++ b/docs/setup/release-notes.md
@@ -4,7 +4,7 @@
     If you use Sedona < 1.6.0, please use GeoPandas <= `0.11.1` since GeoPandas > 0.11.1 will automatically install Shapely 2.0. If you use Shapely, please use <= `1.8.5`.
 
 !!! warning
-    Sedona 1.6.0+ requires Java 11+ to compile and run. If you are using Java 8, please use Sedona <= 1.5.2.
+    Sedona 1.6.0+ requires Java 11+ to compile and run. If you are using Java 8, please use Sedona < 1.6.0. To learn how to set up Java 11+ on different platforms, please refer to the Java 11+ requirement in the corresponding platform setup guide.
 
 ## Sedona 1.6.0
 
@@ -48,6 +48,8 @@ df_raster.withColumn("mean", expr("mean_udf(rast)")).show()
  • [SEDONA-543] - RS_Union_aggr gives referenceRaster is null error when run on cluster
  • +
  • [SEDONA-555] - Snowflake Native App should not always create a new role +
• ### New Feature
diff --git a/docs/setup/wherobots.md b/docs/setup/wherobots.md
index 1bff8d3224..e19555f44b 100644
--- a/docs/setup/wherobots.md
+++ b/docs/setup/wherobots.md
@@ -1,7 +1,7 @@
 ## WherobotsDB
 
-Wherobots Cloud offers fully-managed and fully provisioned cloud services for WherobotsDB, a comprehensive spatial analytics database system. You can play with it using Wherobots Jupyter Scala and Python kernel. No installation is needed.
+Wherobots Cloud offers fully-managed and fully provisioned cloud services for WherobotsDB, a comprehensive spatial analytics database system. You can play with it in a cloud-hosted Jupyter notebook with Python, Java or Scala kernels; no installation is needed.
 
-WherobotsDB is 100% compatible with Apache Sedona in terms of public APIs but provides more functionalities and better performance.
+WherobotsDB is 100% compatible with Apache Sedona in terms of public APIs but provides more functionality and better performance.
 
-It is easy to migrate your existing Sedona workflow to Wherobots Cloud. Please sign up at [Wherobots Cloud](https://www.wherobots.services/).
+It is easy to migrate your existing Sedona workflow to [Wherobots Cloud](https://www.wherobots.com). Please sign up [here](https://cloud.wherobots.com/) to create your account.

From 622c23285ca081f2152eea924cff8992bb333b82 Mon Sep 17 00:00:00 2001
From: Jia Yu
Date: Tue, 7 May 2024 22:36:00 -0700
Subject: [PATCH 2/2] Fix linter

---
 docs/setup/databricks.md | 1 -
 docs/setup/emr.md        | 1 -
 docs/setup/fabric.md     | 1 -
 3 files changed, 3 deletions(-)

diff --git a/docs/setup/databricks.md b/docs/setup/databricks.md
index 6116f6331b..1c43ab6438 100644
--- a/docs/setup/databricks.md
+++ b/docs/setup/databricks.md
@@ -19,7 +19,6 @@ JNAME=zulu17-ca-arm64
 
 ### on Databricks Runtime versions 11.2 - 13.0
 
-
 When you create a cluster, you can specify that the cluster uses JDK 11 (for both the driver and executor). To do this, add the following environment variable to `Advanced Options > Spark > Environment Variables`:
 
 ```
diff --git a/docs/setup/emr.md b/docs/setup/emr.md
index 10d04c4df5..9f73b62bab 100644
--- a/docs/setup/emr.md
+++ b/docs/setup/emr.md
@@ -5,7 +5,6 @@ This tutorial is tested on EMR on EC2 with EMR Studio (notebooks). EMR on EC2 us
 !!!note
     If you are using Spark 3.4+ and Scala 2.12, please use `sedona-spark-shaded-3.4_2.12`. Please pay attention to the Spark version postfix and Scala version postfix.
 
-
 ## JDK 11+ requirement
 
 Sedona 1.6.0+ requires JDK 11+ to run. For Amazon EMR 7.x, the default JVM is Java 17. For Amazon EMR 5.x and 6.x, the default JVM is Java 8 but you can configure the cluster to use Java 11 or Java 17. For more information, see [EMR JVM versions](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/configuring-java8.html#configuring-java8-override-spark).
diff --git a/docs/setup/fabric.md b/docs/setup/fabric.md
index 85e298da9e..ff3a10e365 100644
--- a/docs/setup/fabric.md
+++ b/docs/setup/fabric.md
@@ -4,7 +4,6 @@ This tutorial will guide you through the process of installing Sedona on Microso
 
 Sedona 1.6.0+ requires JDK 11+ to run. Microsoft Fabric Synapse Data Engineering 1.2+ uses JDK 11 by default so we recommend using Microsoft Fabric Synapse Data Engineering 1.2+. For more information, see [Apache Spark Runtimes in Fabric](https://learn.microsoft.com/en-us/fabric/data-engineering/runtime).
 
-
 ## Step 1: Open Microsoft Fabric Synapse Data Engineering
 
 Go to the [Microsoft Fabric portal](https://app.fabric.microsoft.com/) and choose the `Data Engineering` option.
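
The JDK settings this series documents for EMR and Databricks can be sketched in Python as follows. This is a minimal illustration, not code from the Sedona repository: the function names `emr_java_configurations` and `databricks_jname` are hypothetical, and wrapping the two EMR classifications in a single list is an assumption about how the fragment in `emr.md` would be supplied as a cluster configuration. The `JAVA_HOME` path and the `JNAME` values are taken from the patch as written.

```python
from __future__ import annotations

import json


def emr_java_configurations(java_home: str) -> list[dict]:
    """Build hadoop-env and spark-env classifications that export JAVA_HOME.

    Mirrors the JSON fragment added to emr.md; returning both objects in one
    list is an assumption about how the fragment is meant to be used.
    """

    def export_java_home(classification: str) -> dict:
        return {
            "Classification": classification,
            "Configurations": [
                {
                    "Classification": "export",
                    "Configurations": [],
                    "Properties": {"JAVA_HOME": java_home},
                }
            ],
            "Properties": {},
        }

    return [export_java_home("hadoop-env"), export_java_home("spark-env")]


def databricks_jname(runtime: tuple[int, int], arm: bool = False) -> str:
    """Pick the JNAME value for a Databricks Runtime (major, minor) version.

    Follows the ranges in databricks.md: 13.1 and above use JDK 17,
    11.2 through 13.0 use JDK 11. Other versions are not covered there.
    """
    arch = "arm64" if arm else "amd64"
    if runtime >= (13, 1):
        return f"zulu17-ca-{arch}"
    if (11, 2) <= runtime <= (13, 0):
        return f"zulu11-ca-{arch}"
    raise ValueError(f"no JNAME documented for runtime {runtime}")


if __name__ == "__main__":
    # Path taken from the emr.md hunk above.
    print(json.dumps(emr_java_configurations("/usr/lib/jvm/java-1.11.0"), indent=2))
    print("JNAME=" + databricks_jname((13, 1)))
```

Generating both classifications from one helper keeps the Hadoop and Spark `JAVA_HOME` values in sync, which is the point of the recommendation quoted in `emr.md` to set both.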