# Spatial Queries in Sedona


In this tutorial, we experiment with the spatial predicates used in spatial join queries. To make the queries easier to follow, we use this [website](https://www.keene.edu/campus/maps/tool/) to pick coordinates.

For example, the polygon below represents Île-de-France (one `longitude,latitude` pair per line):
1.8814087,49.2265665
1.8099976,48.5884175
2.9347229,48.5820584
3.0528259,49.2068317
1.8814087,49.2265665

This polygon represents CASD:
2.3065817,48.8204849
2.3063672,48.8177934
2.3113775,48.8177369
2.3114955,48.8205838
2.3065817,48.8204849

This polygon represents INSEE:
2.3066783,48.8179488
2.3065925,48.8159283
2.3108518,48.8159566
2.3109269,48.8179559
2.3066783,48.8179488


      
The coordinates of the Eiffel Tower:
2.2949409,48.8579388 
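
The queries later in this tutorial pass these shapes as WKT strings. As a minimal sketch of that conversion, the helpers below turn the `longitude,latitude` lines listed above into WKT text. The names `coords_to_wkt_polygon` and `coords_to_wkt_point` are hypothetical and are not part of Sedona.

```python
def coords_to_wkt_polygon(text: str) -> str:
    """Turn 'lon,lat' lines into a WKT POLYGON string (hypothetical helper)."""
    pairs = []
    for line in text.strip().splitlines():
        lon, lat = (part.strip() for part in line.split(","))
        # WKT separates the two ordinates with a space, not a comma
        pairs.append(f"{lon} {lat}")
    # a WKT ring must be closed: the last vertex repeats the first one
    if pairs[0] != pairs[-1]:
        pairs.append(pairs[0])
    return f"POLYGON(({','.join(pairs)}))"


def coords_to_wkt_point(text: str) -> str:
    """Turn a single 'lon,lat' pair into a WKT POINT string (hypothetical helper)."""
    lon, lat = (part.strip() for part in text.split(","))
    return f"POINT({lon} {lat})"


ile_france_coords = """
1.8814087,49.2265665
1.8099976,48.5884175
2.9347229,48.5820584
3.0528259,49.2068317
1.8814087,49.2265665
"""

print(coords_to_wkt_polygon(ile_france_coords))
print(coords_to_wkt_point("2.2949409,48.8579388"))
```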

- **ST_Contains**(polygondf.polygonshape, pointdf.pointshape): Returns true if A fully contains B (e.g. check whether a polygon contains a point).
- **ST_Crosses**(polygondf.polygonshape, polygondf.polygonshape): Returns true if A crosses B (e.g. check whether one polygon crosses another).
- **ST_Disjoint**(polygondf.polygonshape, polygondf.polygonshape): Returns true if A and B are disjoint (e.g. check whether one polygon is disjoint from another).
- **ST_DWithin**(leftGeometry: Geometry, rightGeometry: Geometry, distance: Double, useSpheroid: Optional(Boolean) = false): Returns true if `leftGeometry` and `rightGeometry` are within the specified `distance` of each other. If `useSpheroid` is true, ST_DWithin uses Sedona's ST_DistanceSpheroid to check the spheroid distance between the centroids of the two geometries. The **unit of the distance in this case is meters**. If `useSpheroid` is false, ST_DWithin uses Euclidean distance, and the unit of the distance is the unit of the geometries' CRS. To obtain a correct result, consider using ST_Transform to put the data in an appropriate CRS.
- **ST_Equals**(A: Geometry, B: Geometry): Returns true if A equals B (e.g. check whether the two line strings LINESTRING(0 0,10 10) and LINESTRING(0 0,5 5,10 10) are equal).
- **ST_Intersects**(polygondf.polygonshape, pointdf.pointshape): Returns true if A intersects B.
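
The unit difference described for ST_DWithin is easy to overlook, so here is a small pure-Python sketch of it. It does not call Sedona. It uses the haversine formula on a sphere as a stand-in for the spheroid distance (an approximation, not what ST_DistanceSpheroid computes exactly), and compares it with the plain Euclidean distance on raw longitude/latitude values. The function names are illustrative only.

```python
import math

# mean Earth radius in meters; this models a sphere, not a spheroid (assumption)
EARTH_RADIUS_M = 6_371_008.8


def euclidean_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance on raw lon/lat values; the result is in degrees."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def haversine_meters(lon1, lat1, lon2, lat2):
    """Great-circle distance on a sphere; the result is in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


eiffel = (2.2949409, 48.8579388)       # Eiffel Tower point from above
casd_vertex = (2.3065817, 48.8204849)  # first vertex of the CASD polygon

dist_deg = euclidean_degrees(*eiffel, *casd_vertex)
dist_m = haversine_meters(*eiffel, *casd_vertex)
print(f"Euclidean: {dist_deg:.5f} degrees, spherical: {dist_m:.0f} meters")
```

The two numbers differ by several orders of magnitude, so a `distance` threshold chosen for one mode is meaningless in the other.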

In [12]:
from pyspark.sql import DataFrame
from sedona.spark import SedonaContext
import geopandas as gpd
from pyspark.sql.functions import trim, col
from pathlib import Path

In [2]:
# get the project root dir
project_root_dir = Path.cwd().parent.parent

In [3]:
# collect the sedona jars (sedona = 1.6.1) from the local jars folder
jar_folder = project_root_dir / "jars" / "sedona-35-213-161"
jar_list = [str(jar) for jar in jar_folder.iterdir() if jar.is_file()]
jar_path = ",".join(jar_list)

# build a sedona session (sedona = 1.6.1) offline
config = SedonaContext.builder() \
    .master("local[*]") \
    .config('spark.jars', jar_path) \
    .getOrCreate()

In [4]:
# create a sedona context
sedona = SedonaContext.create(config)
sc = sedona.sparkContext

In [5]:
# this sets the encoding of shape files
sc.setSystemProperty("sedona.global.charset", "utf8")

In [13]:
def evalSpaceJoinQuery(TargetQuery:str)->DataFrame:
    # append a column alias so the single output column is named "result"
    inQuery = f"{TargetQuery} as result"
    return sedona.sql(inQuery)
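
To make it clear what SQL text `evalSpaceJoinQuery` hands to `sedona.sql`, the sketch below only builds the string and prints it, without needing a Spark session. `build_predicate_sql` is a hypothetical helper introduced for illustration; the geometries are the Île-de-France polygon and Eiffel Tower point from this tutorial.

```python
def build_predicate_sql(predicate: str, wkt_a: str, wkt_b: str) -> str:
    """Build the SQL text for a two-geometry predicate (hypothetical helper)."""
    query = (
        f"SELECT {predicate}(ST_GeomFromWKT('{wkt_a}'), "
        f"ST_GeomFromWKT('{wkt_b}'))"
    )
    # same aliasing step as evalSpaceJoinQuery: name the output column "result"
    return f"{query} as result"


ile_france_wkt = (
    "POLYGON((1.8814087 49.2265665,1.8099976 48.5884175,"
    "2.9347229 48.5820584,3.0528259 49.2068317,1.8814087 49.2265665))"
)
eiffel_wkt = "POINT(2.2949409 48.8579388)"

sql_text = build_predicate_sql("ST_Contains", ile_france_wkt, eiffel_wkt)
print(sql_text)
```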

## 1. ST_Contains

We check whether a polygon contains a point:
- whether the Eiffel Tower (`eiffel_tour`) is in Île-de-France
- whether Bordeaux city hall (`bordeaux`) is in Île-de-France


A point:
1.5655, 47.9733

In [31]:
ile_france = "POLYGON((1.8814087 49.2265665,1.8099976 48.5884175,2.9347229 48.5820584,3.0528259 49.2068317,1.8814087 49.2265665))"

casd = "POLYGON((2.3065817  48.8204849,2.3063672  48.8177934,2.3113775  48.8177369,2.3114955  48.8205838,2.3065817  48.8204849))"

insee = "POLYGON((2.3066783  48.8179488,2.3065925  48.8159283,2.3108518  48.8159566,2.3109269  48.8179559,2.3066783  48.8179488))"
eiffel_tour = "POINT(2.2949409 48.8579388)"
bordeaux = "POINT(-0.574851 44.8453837)"

In [23]:
query1 = f"SELECT ST_Contains(ST_GeomFromWKT('{ile_france}'), ST_GeomFromWKT('{eiffel_tour}'))"

resu1 = evalSpaceJoinQuery(query1)


In [24]:
resu1.show()

+------+
|result|
+------+
|  true|
+------+



In [26]:
query2 = f"SELECT ST_Contains(ST_GeomFromWKT('{ile_france}'), ST_GeomFromWKT('{bordeaux}'))"

resu2 = evalSpaceJoinQuery(query2)

In [27]:
resu2.show()

+------+
|result|
+------+
| false|
+------+
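
The two results above can be cross-checked without Sedona. The sketch below uses the classic ray-casting test for a point in a simple polygon. It is an independent illustration of what ST_Contains decides for these two points, not Sedona's implementation, and it does not treat points lying exactly on the boundary specially.

```python
def point_in_polygon(point, ring):
    """Ray casting: count how often a ray going right from the point crosses the ring."""
    x, y = point
    inside = False
    # ring is closed (first vertex == last vertex), so consecutive pairs cover every edge
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


ile_france_ring = [
    (1.8814087, 49.2265665),
    (1.8099976, 48.5884175),
    (2.9347229, 48.5820584),
    (3.0528259, 49.2068317),
    (1.8814087, 49.2265665),
]
eiffel_point = (2.2949409, 48.8579388)
bordeaux_point = (-0.574851, 44.8453837)

print(point_in_polygon(eiffel_point, ile_france_ring))
print(point_in_polygon(bordeaux_point, ile_france_ring))
```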



In [32]:
query3 = f"SELECT ST_Contains(ST_GeomFromWKT('{ile_france}'), ST_GeomFromWKT('{casd}'))"
resu3 = evalSpaceJoinQuery(query3)
resu3.show()

+------+
|result|
+------+
|  true|
+------+
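
The polygon-in-polygon result can also be sanity-checked in plain Python. The sketch below relies on an assumption that happens to hold for these shapes: the outer polygon is convex, so it contains another polygon's interior whenever every vertex of that polygon lies strictly inside it. Each vertex is tested with cross products (same side of every edge). This is a simplified illustration, not how Sedona evaluates ST_Contains, and all function names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); its sign tells which side of o->a the point b is on."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def same_strict_sign(values):
    return all(v > 0 for v in values) or all(v < 0 for v in values)


def is_convex(ring):
    pts = ring[:-1]  # drop the repeated closing vertex
    n = len(pts)
    return same_strict_sign(
        [cross(pts[i], pts[(i + 1) % n], pts[(i + 2) % n]) for i in range(n)]
    )


def convex_contains_point(ring, p):
    pts = ring[:-1]
    n = len(pts)
    # strictly inside means: on the same side of every edge
    return same_strict_sign([cross(pts[i], pts[(i + 1) % n], p) for i in range(n)])


def convex_contains_polygon(outer, inner):
    if not is_convex(outer):
        raise ValueError("this shortcut is only valid for a convex outer polygon")
    return all(convex_contains_point(outer, p) for p in inner[:-1])


ile_france_ring = [(1.8814087, 49.2265665), (1.8099976, 48.5884175),
                   (2.9347229, 48.5820584), (3.0528259, 49.2068317),
                   (1.8814087, 49.2265665)]
casd_ring = [(2.3065817, 48.8204849), (2.3063672, 48.8177934),
             (2.3113775, 48.8177369), (2.3114955, 48.8205838),
             (2.3065817, 48.8204849)]

print(convex_contains_polygon(ile_france_ring, casd_ring))
print(convex_contains_polygon(casd_ring, ile_france_ring))
```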

