# Mosaic & Sedona

> You can combine [Mosaic](https://databrickslabs.github.io/mosaic/index.html) with other geospatial libraries. This example combines it with [Sedona](https://sedona.apache.org).

## Setup

This notebook will run if you have both Mosaic and Sedona installed on your cluster as described below.

### Install Sedona

To install Sedona, follow the [official Sedona instructions](https://sedona.apache.org/1.5.0/setup/databricks/).

For example, add the following Maven coordinates to a non-Photon cluster [[1](https://docs.databricks.com/en/libraries/package-repositories.html)]. This example uses DBR 12.2 LTS.

```
org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.5.0
org.datasyslab:geotools-wrapper:1.5.0-28.2
```

### Install Mosaic

Download the Mosaic JAR to your local machine (e.g. from [here](https://github.com/databrickslabs/mosaic/releases/download/v_0.3.12/mosaic-0.3.12-jar-with-dependencies.jar) for 0.3.12), then upload it to your cluster [[1](https://docs.databricks.com/en/libraries/cluster-libraries.html#install-a-library-on-a-cluster)].
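
Once both libraries are attached, one way to confirm that the driver can see them is to try loading one class from each JAR. The following is a minimal sketch, not part of either project's official instructions: `isOnClasspath` is a hypothetical helper, and the two class names are the ones imported in the Setup cell of this notebook.

```scala
import scala.util.Try

// Hypothetical helper: true if the named class can be loaded on the driver.
def isOnClasspath(className: String): Boolean =
  Try(Class.forName(className)).isSuccess

// Class names taken from the imports used in the Setup cell of this notebook.
val required = Seq(
  "Sedona" -> "org.apache.sedona.spark.SedonaContext",
  "Mosaic" -> "com.databricks.labs.mosaic.functions.MosaicContext"
)

required.foreach { case (library, className) =>
  val status = if (isOnClasspath(className)) "found" else "MISSING"
  println(s"$library ($className): $status")
}
```

If either library is reported as missing, recheck that it is installed on the cluster and that the cluster was restarted if required.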

### Notes

* This is for [Spark SQL](https://www.databricks.com/glossary/what-is-spark-sql#:~:text=Spark%20SQL%20is%20a%20Spark,on%20existing%20deployments%20and%20data.), which is different from [DBSQL](https://www.databricks.com/product/databricks-sql). __The best way to combine the two libraries is to not register the Mosaic SQL functions, since Sedona is primarily SQL.__
* See the instructions for `SedonaContext.create(spark)` [[1](https://sedona.apache.org/1.5.0/tutorial/sql/?h=sedonacontext#initiate-sedonacontext)].
* Sedona notes that it might have issues if executed on a [Photon](https://www.databricks.com/product/photon) cluster; again, this example uses DBR 12.2 LTS with the Mosaic 0.3 series.

--- 
 __Last Update__ 01 DEC 2023 [Mosaic 0.3.12]

## Prior to Setup

> Notice that even in DBR 12.2 LTS, Databricks initially lists gated `st_*` functions: they are present in the function catalog but will not execute on this runtime. However, we will see that after functions are registered, e.g. from Sedona, they become available in DBR.

In [0]:
%sql 
-- before we do anything
-- have gated product functions
show system functions like 'st_*'

_The following exception will be thrown if you attempt to execute the gated functions:_

```
AnalysisException: [DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve "st_area(POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10)))" due to data type mismatch: parameter 1 requires ("GEOMETRY" or "GEOGRAPHY") type, however, "POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))" is of "STRING" type.; line 1 pos 7;
'Project [unresolvedalias(st_area(POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))), None)]
```

In [0]:
%sql 
-- assumes you are in DBR 12.2 LTS
-- so this will not execute
-- uncomment to verify
-- select st_area('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')
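
The same check can be made from Scala without failing the cell, by wrapping the query in `Try`. This is a minimal sketch that assumes it runs in a Databricks notebook where `spark` is the active session and no spatial SQL functions have been registered yet; on other runtimes the outcome may differ.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.AnalysisException

// Assumes `spark` is the notebook's SparkSession and that neither
// Sedona nor Mosaic SQL functions have been registered yet.
val attempt = Try {
  spark.sql("SELECT st_area('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')").collect()
}

attempt match {
  case Success(rows) =>
    val shown = rows.mkString(", ")
    println(s"st_area executed: $shown")
  case Failure(e: AnalysisException) =>
    // Print only the first line of the (long) analysis error.
    val firstLine = e.getMessage.split("\n").head
    println(s"st_area did not resolve: $firstLine")
  case Failure(other) =>
    throw other
}
```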

In [0]:
%sql 
-- notice, e.g. these are initially gated product functions
describe function extended st_area

## Setup

> We initialize Mosaic in Scala without registering its SQL functions, and initialize Sedona SQL as normal.

In [0]:
%scala

// -- spark functions
import org.apache.spark.sql.functions._

// -- mosaic functions
import com.databricks.labs.mosaic.functions.MosaicContext
import com.databricks.labs.mosaic.H3
import com.databricks.labs.mosaic.JTS

val mosaicContext = MosaicContext.build(H3, JTS)
import mosaicContext.functions._

// ! don't register SQL functions !
// - this allows sedona to be the main spatial SQL provider
//mosaicContext.register()

// -- sedona functions
import org.apache.sedona.spark.SedonaContext
val sedona = SedonaContext.create(spark)

_Now when we list user functions, we see all the Sedona-provided ones._

In [0]:
%sql 
show user functions like 'st_*'

_Notice that the functions previously registered by the system, e.g. `ST_Area`, have been replaced._

In [0]:
%sql 
-- notice, e.g. the Sedona-provided functions are now available
describe function extended st_area
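
The same inspection can be done from Scala through the Spark catalog API. The sketch below assumes `spark` is the notebook session and that `SedonaContext.create(spark)` has already run; which entries appear (and whether the gated system functions are listed at all) depends on the runtime, so treat the output as informational.

```scala
import org.apache.spark.sql.functions.{col, lower}

// Assumes `spark` is the notebook's SparkSession and Sedona is initialized.
val spatialFunctions = spark.catalog
  .listFunctions()
  .filter(lower(col("name")).startsWith("st_"))
  .select("name", "className", "isTemporary")
  .orderBy("name")

display(spatialFunctions)

// True if a function named st_area is visible to this session.
val hasStArea = spark.catalog.functionExists("st_area")
println(s"st_area registered: $hasStArea")
```

The `className` column is a convenient way to tell which library backs a given function name.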

## Queries

> Showing how Sedona (registered Spark SQL) and Mosaic (Scala) can co-exist on the same cluster. Not shown here, but this could also be done with the Mosaic Python bindings.

In [0]:
%sql
CREATE OR REPLACE TEMPORARY VIEW sample AS (
  SELECT 'POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))' AS wkt
);

SELECT * FROM sample

_Here is a Spark SQL call to use the Sedona functions._

In [0]:
%sql
SELECT ST_Area(ST_GeomFromText(wkt)) AS sedona_area FROM sample

_Here is a Scala call to the Mosaic-provided `st_area` function, which computes the same area._

In [0]:
%scala
// verify scala functions registered
display(
  spark
    .table("sample")
    .select(st_area($"wkt").as("mosaic_area"))
)
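
Both calls compute a planar area in the units of the input coordinates, so they should agree with a hand calculation. As a sanity check, the expected value for the sample polygon can be worked out with the shoelace formula in plain Scala, with no Spark, Mosaic, or Sedona involved. `shoelaceArea` is a hypothetical helper written for this illustration.

```scala
// Hypothetical helper: planar area of a closed ring via the shoelace formula.
// The ring must repeat its first vertex as its last, as WKT polygons do.
def shoelaceArea(ring: Seq[(Double, Double)]): Double = {
  val twiceSigned = ring.zip(ring.tail).map {
    case ((x1, y1), (x2, y2)) => x1 * y2 - x2 * y1
  }.sum
  math.abs(twiceSigned) / 2.0
}

// Exterior ring of POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
val ring = Seq((30.0, 10.0), (40.0, 40.0), (20.0, 40.0), (10.0, 20.0), (30.0, 10.0))

println(shoelaceArea(ring)) // prints 550.0
```

The four edge terms are 800, 800, 0 and -500, which sum to 1100; half of that is 550.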

_Mosaic + Sedona_

> Blending Mosaic calls (in Scala) with Sedona calls (Spark SQL) in a single query.

In [0]:
%scala
display(
  spark.table("sample")
    .select(
      st_area($"wkt").as("mosaic_area"),                    // <- mosaic (scala)
      expr("ST_Area(ST_GeomFromText(wkt)) AS sedona_area"), // <- sedona (spark sql)
      $"wkt"
    )
)
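
To check the blended result programmatically instead of by eye, the two columns can be compared directly. This is a minimal sketch that assumes the Setup cell has run (so `st_area` comes from `mosaicContext.functions._` and the Sedona SQL functions are registered); the tolerance is an arbitrary choice for illustration.

```scala
import org.apache.spark.sql.functions.{abs, col, expr}

// Assumes the Setup cell has run: Mosaic's `st_area` is in scope in Scala,
// and Sedona's ST_Area / ST_GeomFromText are registered in Spark SQL.
val areas = spark.table("sample").select(
  st_area(col("wkt")).as("mosaic_area"),
  expr("ST_Area(ST_GeomFromText(wkt))").as("sedona_area")
)

// Count rows where the two libraries disagree beyond a small tolerance.
val tolerance = 1e-9 // arbitrary choice for this sketch
val mismatches = areas
  .filter(abs(col("mosaic_area") - col("sedona_area")) > tolerance)
  .count()

assert(mismatches == 0, s"$mismatches row(s) differ between Mosaic and Sedona")
```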