![](https://wherobots.com/wp-content/uploads/2023/12/Inline-Blue_Black_onWhite@3x.png)

## WherobotsAI Raster Inference - Segmentation 

This example demonstrates using a segmentation model with Raster Inference to identify solar farms in satellite imagery. We will use a machine learning model from [Satlas](https://satlas.allen.ai/ai)<sup>1</sup>, which was trained using imagery from the European Space Agency’s Sentinel-2 satellites.

**Note: This notebook requires the Wherobots Inference functionality to be enabled and a GPU runtime selected in Wherobots Cloud. Please [contact us](https://wherobots.com/contact/) to enable these features.**


### Step 1: Set Up The WherobotsDB Context 

In [None]:
import warnings
warnings.filterwarnings('ignore')

from sedona.spark import SedonaContext

config = SedonaContext.builder().appName('segmentation-batch-inference')\
    .config("spark.wherobots.inference.entrance", "/opt/conda/envs/wherobots/lib/python3.11/site-packages/wherobots/inference/main.py")\
    .getOrCreate()

sedona = SedonaContext.create(config)

### Step 2: Load Satellite Imagery

Next, we load the satellite imagery that we will run inference over. These GeoTIFF images are loaded as *out-db* rasters in WherobotsDB, where each row represents a different scene.

In [None]:
tif_folder_path = 's3://wherobots-benchmark-prod/data/ml/satlas'
df_raster_input = sedona.read.format("raster").option("retile", False).load(f"{tif_folder_path}/*.tiff").limit(400)
df_raster_input.show(truncate=False)
df_raster_input.createOrReplaceTempView("df_raster_input")

### Step 3: Run Predictions And Visualize Results

To run predictions we will specify the model we wish to use. Some models are pre-loaded and made available in Wherobots Cloud. We can also load our own models. Predictions can be run with the Raster Inference SQL function [`RS_Segment`](https://docs.wherobots.com/latest/api/wherobots-inference/pythondoc/inference/sql_functions/) or the Python API.

Here we generate raster predictions using `RS_Segment`.

In [None]:
model_id = 'solar-satlas-sentinel2'

predictions_df = sedona.sql(f"""
SELECT
  rast,
  segment_result.*
FROM (
  SELECT
    rast,
    RS_SEGMENT('{model_id}', rast) AS segment_result
  FROM
    df_raster_input
) AS segment_fields
""")

predictions_df.cache().count()
predictions_df.show()
predictions_df.createOrReplaceTempView("predictions")
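
The query above interpolates `model_id` into the SQL text with an f-string. If the model id or view name ever comes from user input or configuration, it may be worth validating them first. The following is a minimal sketch of that idea; `build_segment_query` and the two patterns are hypothetical names introduced here for illustration and are not part of the Wherobots API.

```python
import re

# Hypothetical helper for illustration; it is not part of the Wherobots API.
# Model ids may contain hyphens (e.g. 'solar-satlas-sentinel2'); view and
# column names are restricted to plain SQL identifiers.
_MODEL_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
_IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_segment_query(model_id, view_name, raster_col="rast"):
    """Build the RS_SEGMENT query text, rejecting unexpected characters."""
    if not _MODEL_ID_PATTERN.fullmatch(model_id):
        raise ValueError(f"unexpected characters in model id: {model_id!r}")
    for name in (view_name, raster_col):
        if not _IDENTIFIER_PATTERN.fullmatch(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")
    return (
        f"SELECT {raster_col}, segment_result.* "
        f"FROM (SELECT {raster_col}, "
        f"RS_SEGMENT('{model_id}', {raster_col}) AS segment_result "
        f"FROM {view_name}) AS segment_fields"
    )


query = build_segment_query("solar-satlas-sentinel2", "df_raster_input")
print(query)
```

The resulting string could then be passed to `sedona.sql(query)` in place of the inline f-string.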

Now that we've generated predictions using our model over our satellite imagery, we can use the `RS_Segment_To_Geoms` function to extract the geometries that the model has identified as possible solar farms. We'll specify the following:

* a raster column to use for georeferencing our results
* the prediction result from the previous step
* the category label `1`, which the model uses for solar farms, and the class map used to assign labels to the predictions
* a confidence threshold between 0 and 1.
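
To build intuition for the confidence threshold, the sketch below shows how a threshold turns per-pixel scores into a binary mask. It is a conceptual illustration only and not the `RS_Segment_To_Geoms` implementation: the `threshold_mask` helper and the sample scores are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption.

```python
# Conceptual sketch only: this is NOT how RS_Segment_To_Geoms is implemented.
# It shows how a confidence threshold turns per-pixel scores into a 0/1 mask.

def threshold_mask(confidence, threshold):
    """Mark pixels whose confidence meets the threshold (inclusive, by assumption)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [[1 if score >= threshold else 0 for score in row] for row in confidence]


# Hypothetical 3x4 grid of per-pixel confidence scores for the solar farm class.
solar_confidence = [
    [0.10, 0.20, 0.70, 0.90],
    [0.05, 0.66, 0.80, 0.95],
    [0.00, 0.30, 0.64, 0.65],
]

mask = threshold_mask(solar_confidence, 0.65)
for row in mask:
    print(row)
```

Raising the threshold shrinks the mask to the pixels the model is most sure about; lowering it admits more pixels at the cost of more false positives.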

In [None]:
df_multipolys = sedona.sql("""
    WITH t AS (
        SELECT RS_SEGMENT_TO_GEOMS(rast, confidence_array, array(1), class_map, 0.65) result
        FROM predictions
    )
    SELECT result.* FROM t
""")

df_multipolys.cache().count()
df_multipolys.show()
df_multipolys.createOrReplaceTempView("multipolygon_predictions")

Since we ran inference across the state of Arizona, many scenes don't contain solar farms and don't have positive detections. To plot the results, we first merge each scene's detected geometries into a single geometry, then filter out the scenes without segmentation detections.

In [None]:
df_merged_predictions = sedona.sql("""
    SELECT
        element_at(class_name, 1) AS class_name,
        cast(element_at(average_pixel_confidence_score, 1) AS double) AS average_pixel_confidence_score,
        ST_Collect(geometry) AS merged_geom
    FROM
        multipolygon_predictions
""")

Dropping the scenes whose merged geometry is empty leaves us with a few predicted solar farm polygons from our 400 satellite image samples.

In [None]:
df_filtered_predictions = df_merged_predictions.filter("ST_IsEmpty(merged_geom) = False")
df_filtered_predictions.cache().count()

In [None]:
df_filtered_predictions.show()

We'll plot these with SedonaKepler. Compare the satellite basemap with the predictions and see if there's a match!

In [None]:
from sedona.maps.SedonaKepler import SedonaKepler

# Use a name other than `map` to avoid shadowing the Python built-in
kepler_map = SedonaKepler.create_map()

SedonaKepler.add_df(kepler_map, df=df_filtered_predictions, name="Solar Farm Detections")
kepler_map

### wherobots.inference Python API

If you prefer Python, `wherobots.inference` offers a module for registering the SQL inference functions as Python functions. Below we run the same inference as before with `RS_Segment`.

In [None]:
from pyspark.sql.functions import col
from wherobots.inference.engine.register import create_semantic_segmentation_udfs

# Register the segmentation function as a Python function for this Sedona session
rs_segment = create_semantic_segmentation_udfs(batch_size=9, sedona=sedona)

# Run the model on each raster, then unpack the fields needed downstream
df = df_raster_input.withColumn("segment_result", rs_segment(model_id, col("rast"))).select(
    "rast",
    col("segment_result.confidence_array").alias("confidence_array"),
    col("segment_result.class_map").alias("class_map"),
)
df.show(3)
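
The `batch_size` argument above is not described in this notebook. Assuming it controls how many rasters are grouped into a single model call (an assumption, and one that ignores how Spark partitions the rows), the grouping can be pictured with the sketch below. The `batched` helper and the integer scene ids are hypothetical stand-ins, not part of `wherobots.inference`.

```python
from itertools import islice

# Illustrative only: assumes batch_size means "rasters per model call" and
# ignores Spark partitioning. `batched` is a hypothetical helper.

def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


scene_ids = list(range(400))  # stand-ins for the 400 rasters loaded earlier
batches = list(batched(scene_ids, 9))
print(len(batches), len(batches[-1]))  # 400 = 44 * 9 + 4, so 45 batches, last has 4
```

Under that assumption, larger batches mean fewer model calls but more GPU memory per call.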

### References

1. Bastani, Favyen, Wolters, Piper, Gupta, Ritwik, Ferdinando, Joe, and Kembhavi, Aniruddha. "SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding." *arXiv preprint arXiv:2211.15660* (2023). [https://doi.org/10.48550/arXiv.2211.15660](https://doi.org/10.48550/arXiv.2211.15660)