![](https://wherobots.com/wp-content/uploads/2023/12/Inline-Blue_Black_onWhite.png)
# <span style="color: #7b73e2;">WherobotsAI Raster Inference - Object Detection</span>


#### This example demonstrates how to use an object detection model with **Raster Inference** to identify <span style="color: #7b73e2;">**marine infrastructure**</span> (offshore wind farms and platforms) in satellite imagery.

We will use a **machine-learning model** from <span style="color: #7b73e2;">**Satlas**</span>, which was trained using imagery from the **European Space Agency’s Sentinel-2 satellites**.

---
<div style="display: flex; justify-content: center; align-items: center; gap: 20px;">
    <img src="./assets/img/offshore_oil.png" alt="Offshore Oil Platform" width="300">
    <img src="./assets/img/wind_farm.png" alt="Wind Farm Offshore" width="300">
</div>

## <span style="color: #7b73e2;">Set Up The WherobotsDB Context</span>

#### Here we configure WherobotsDB to enable access to the necessary cloud object storage buckets with sample data.


In [None]:
import warnings
warnings.filterwarnings('ignore')

from pyspark.sql.functions import expr, size, col
from sedona.spark import *
import json

config = SedonaContext.builder().appName('object-detection-batch-inference')\
    .getOrCreate()

sedona = SedonaContext.create(config)

## <span style="color: #7b73e2;">Load Satellite Imagery Efficiently</span>

In this step, we load the satellite imagery to run <span style="color: #7b73e2;">**inference**</span> over. 
These GeoTIFF images are ingested as <span style="color: #7b73e2;">**out-of-database or "out-db" rasters**</span> in **WherobotsDB** and stored in the Spatial Catalog for easy access. Using out-db rasters ensures efficient storage and retrieval.

In [None]:
df_raster_input = sedona.table("wherobots_pro_data.satlas.offshore_satlas")
df_raster_input.printSchema()

In [None]:
df_raster_input.count()

## <span style="color: #7b73e2;">Focus on a Coastal Region</span>


With **176,000 images** covering most of Earth's coastlines, let's choose an area to focus on.

You can use the interactive map below to define a Region of Interest (ROI) as a polygon using the "Draw a Polygon" tool in the left-hand sidebar. Or, use the default ROI (in red) that is preloaded in the map below. 

In [None]:
from leafmap import Map

my_map = Map(zoom=7, center = (36.5, 122))
default_roi = {'type': 'Feature',
                       'properties': {},
                       'geometry': {'type': 'Polygon',
                        'coordinates': [[[120.959473, 35.918528],
                                        [120.959473, 36.820829],
                                        [123.046875, 36.820829],
                                        [123.046875, 35.918528],
                                        [120.959473, 35.918528]]]}}
my_map.add_geojson(default_roi, layer_name="Default AOI", style={"color": "#ba2762","fillOpacity": 0.5,"fillColor": "#ba2762","weight": 3,})
my_map

In [None]:
if my_map.user_roi is None:
    my_map.user_roi = default_roi

In [None]:
feature_json = json.dumps(my_map.user_roi) # formats the python dictionary as a string so we can pass it to SQL
df_raster_sub = df_raster_input.where(
    expr(f"""ST_INTERSECTS(footprint, ST_GeomFromGeoJSON('{feature_json}'))""")
)

df_raster_sub.cache()
print(f"IMAGE COUNT: {df_raster_sub.count()}")
df_raster_sub.show(3, truncate=True)
# Register the ROI subset under the view name used by the SQL queries below
df_raster_sub.createOrReplaceTempView("df_raster_input")
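
Before passing an ROI to `ST_GeomFromGeoJSON`, it can help to sanity-check the GeoJSON. The following is a minimal sketch in plain Python, not part of the Wherobots API: `roi_bounds` is a hypothetical helper introduced here for illustration. It assumes the ROI is a single GeoJSON `Feature` with a `Polygon` geometry, as in the default ROI above.

```python
import json

# Same coordinates as the default ROI above (longitude, latitude order).
roi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [120.959473, 35.918528],
            [120.959473, 36.820829],
            [123.046875, 36.820829],
            [123.046875, 35.918528],
            [120.959473, 35.918528],
        ]],
    },
}


def roi_bounds(feature):
    """Hypothetical helper: (min_lon, min_lat, max_lon, max_lat) of the outer ring."""
    ring = feature["geometry"]["coordinates"][0]
    # A GeoJSON polygon ring must be closed: first and last positions match.
    if ring[0] != ring[-1]:
        raise ValueError("polygon ring is not closed")
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return min(lons), min(lats), max(lons), max(lats)


feature_json = json.dumps(roi)
# The SQL query wraps this text in single quotes, so an apostrophe inside
# the JSON (for example in a property value) would break the SQL literal.
assert "'" not in feature_json

bounds = roi_bounds(roi)
print(bounds)
```

If the bounds look wrong (for example, latitude and longitude swapped), fix the ROI before running the intersection filter.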

## <span style="color: #7b73e2;">Viewing Results</span>

With our ROI defined, we can view the footprints of the images in the ROI with the `SedonaKepler.create_map()` integration. We can also view the images themselves using `SedonaUtils.display_image()`.

<span style="color: #7b73e2;"> **Tip:** </span>  Save the map to an HTML file using `kepler_map.save_to_html()`


In [None]:
from sedona.spark import *

kepler_map = SedonaKepler.create_map()

SedonaKepler.add_df(kepler_map, df=df_raster_sub, name="Image Footprints")

kepler_map

In [None]:
htmlDf = sedona.sql("""SELECT RS_AsImage(outdb_raster, 250), name FROM df_raster_input LIMIT 5""")
SedonaUtils.display_image(htmlDf)

## <span style="color: #7b73e2;">Run Predictions and Visualize Results</span>

To run predictions, specify the model to use by its `model_id`. Three models are pre-loaded and made available in **Wherobots Cloud**. You can also load your own models; learn more about that process [here](https://docs.wherobots.com/latest/tutorials/wherobotsai/wherobots-inference/raster-inference-overview/?h=bring#bring-your-own-model-guide).


Inference can be run using **Wherobots' Spatial SQL functions**, in this case: `RS_DETECT_BBOXES()`.
            
Here we generate predictions for all images in the ROI. The predictions output has two labels, `1` for offshore wind turbines and `2` for offshore platforms.

We then filter and print some of the results to compare scenes with no detections against scenes with positive detections.


In [None]:
model_id = 'marine-satlas-sentinel2'

predictions_df = sedona.sql(f"""
SELECT
  outdb_raster,
  name as image_name,
  detect_result.*
FROM (
  SELECT
    outdb_raster,
    name,
    RS_DETECT_BBOXES('{model_id}', outdb_raster) AS detect_result
  FROM
    df_raster_input
) AS detect_fields
""")

predictions_df.cache().count()
predictions_df.filter(size(col("labels")) == 0).show(3)
predictions_df.filter(size(col("labels")) == 1).show(3)
predictions_df.createOrReplaceTempView("predictions")

## <span style="color: #7b73e2;">Filter Predictions by Confidence and Label</span>

Because we ran inference across a long stretch of coastline, many scenes contain no wind farms and therefore have no positive detections. Now that we've generated predictions over our satellite imagery, we can filter the geometries by confidence score with `RS_FILTER_BOX_CONFIDENCE` (here with a threshold of `0.65`) and keep the scenes that contain label `1`, the integer label for offshore wind farms. Label `2` is the integer label for offshore platforms.

In [None]:
filtered_predictions = sedona.sql(f"""
  SELECT
    outdb_raster,
    image_name,
    filtered.*
  FROM (
    SELECT
      outdb_raster,
      image_name,
      RS_FILTER_BOX_CONFIDENCE(bboxes_wkt, confidence_scores, labels, 0.65) AS filtered
    FROM
      predictions
  ) AS temp
    WHERE size(filtered.max_confidence_bboxes) > 0
    AND 
        array_contains(filtered.max_confidence_labels, '1')
""")
filtered_predictions.createOrReplaceTempView("filtered_predictions")
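
The filtering logic in the query above can be sketched in plain Python. This is a conceptual illustration only, not the implementation of `RS_FILTER_BOX_CONFIDENCE`: the helper names, the use of `>=` for the threshold comparison, and the sample values are all assumptions made for this sketch.

```python
def filter_scene(bboxes_wkt, scores, labels, threshold):
    """Hypothetical helper: keep only the boxes scoring at or above `threshold`."""
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return {
        "bboxes": [bboxes_wkt[i] for i in keep],
        "scores": [scores[i] for i in keep],
        "labels": [labels[i] for i in keep],
    }


def scene_has_label(filtered, wanted_label):
    """Mirror of the WHERE clause: a non-empty result containing the wanted label."""
    return len(filtered["bboxes"]) > 0 and wanted_label in filtered["labels"]


# Made-up detections for one scene (not real model output).
boxes = [
    "POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))",
    "POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))",
    "POLYGON ((4 4, 4 5, 5 5, 5 4, 4 4))",
]
scores = [0.91, 0.40, 0.70]
labels = [1, 1, 2]

filtered = filter_scene(boxes, scores, labels, threshold=0.65)
print(filtered["labels"], scene_has_label(filtered, 1))
```

Note that, as in the SQL query, the label check keeps or drops the whole scene. A kept scene can still carry boxes with other labels.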

## <span style="color: #7b73e2;">Prepare Results</span>

Before plotting our predictions, we need to transform our results.

We need to convert our table from a structure where each row represents _all_ of a raster scene's bounding box predictions to a format where each row represents a _single_ predicted bounding box.

To do this, combine the list columns containing our prediction results (`max_confidence_bboxes`, `max_confidence_scores`, and `max_confidence_labels`) with `arrays_zip`. Then use `explode` to convert lists to rows.

To map the results with `SedonaKepler`, convert the `max_confidence_bboxes` column to a `GeometryType` column with `ST_GeomFromWKT`.

In [None]:
exploded_df = sedona.sql("""
SELECT
    outdb_raster,
    image_name,
    exploded.*
FROM (
    SELECT
        outdb_raster,
        image_name,
        explode(arrays_zip(max_confidence_bboxes, max_confidence_scores, max_confidence_labels)) AS exploded
    FROM
        filtered_predictions
) temp
""")
df_exploded = exploded_df.withColumn("geometry", expr("ST_GeomFromWkt(max_confidence_bboxes)")).drop("max_confidence_bboxes")
print(df_exploded.cache().count())
df_exploded.show()
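
To make the reshaping concrete, here is a minimal plain-Python sketch of what `arrays_zip` followed by `explode` does to a single row. The `explode_row` helper and the sample row are made up for illustration. The real work in this notebook is done by Spark SQL.

```python
# One made-up row: a scene with two predicted boxes stored as parallel lists.
row = {
    "image_name": "scene_a.tiff",
    "max_confidence_bboxes": [
        "POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))",
        "POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))",
    ],
    "max_confidence_scores": [0.91, 0.72],
    "max_confidence_labels": [1, 1],
}

list_cols = ["max_confidence_bboxes", "max_confidence_scores", "max_confidence_labels"]


def explode_row(row, list_cols):
    """Hypothetical helper: emit one output row per element of the zipped lists."""
    # Scalar columns are repeated on every output row.
    scalars = {k: v for k, v in row.items() if k not in list_cols}
    # zip plays the role of arrays_zip. The lists here have equal length;
    # Spark pads shorter arrays with nulls, whereas Python's zip truncates.
    zipped = zip(*(row[c] for c in list_cols))
    # Building one dict per tuple plays the role of explode.
    return [{**scalars, **dict(zip(list_cols, values))} for values in zipped]


exploded = explode_row(row, list_cols)
for r in exploded:
    print(r)
```

Each output row now holds a single WKT string, score, and label, which is the shape needed before converting the WKT to a geometry.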

In [None]:
from sedona.spark import *

kepler_map = SedonaKepler.create_map()

SedonaKepler.add_df(kepler_map, df=df_exploded.drop("outdb_raster"), name="Wind Farm Detections")
kepler_map

## <span style="color: #7b73e2;">Select a Footprint and Review the Image</span>


Select one of the detected **footprints** from the map above. Copy the name of a detected bounding box and paste it into the query below to retrieve the corresponding image.


<span style="color: #7b73e2;">**Remember:**</span>  If you changed the ROI at the beginning, the `image_name` below might not be found.

In [None]:
image_name = '2015411785-5-6.tiff'
htmlDf = sedona.sql(f"""SELECT RS_AsImage(outdb_raster, 500), name FROM df_raster_input WHERE name = '{image_name}' """)
SedonaUtils.display_image(htmlDf)

## <span style="color: #7b73e2;">Python API for wherobots.inference </span>

If you prefer Python, `wherobots.inference` offers a module for registering the SQL inference functions as Python functions. Below we run the same inference as before, this time calling the Python equivalent of `RS_DETECT_BBOXES`.

In [None]:
from wherobots.inference.engine.register import create_object_detection_udfs
from pyspark.sql.functions import col
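# Register the SQL inference functions as Python UDFs
rs_detect, rs_threshold_geoms, rs_text_to_bboxes = create_object_detection_udfs(batch_size=10, sedona=sedona)
# Use the ROI subset so this matches the SQL run above; in Python, df_raster_input is the full table
df = df_raster_sub.withColumn("detect_result", rs_detect(model_id, col("outdb_raster"))).select(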
                               "outdb_raster",
                               col("detect_result.bboxes_wkt").alias("bboxes_wkt"),
                               col("detect_result.confidence_scores").alias("confidence_scores"),
                               col("detect_result.labels").alias("labels")
                           )
df.show()