# Segmenting aerial imagery using a geospatial notebook

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

---

## Introduction

This notebook demonstrates how you can use the geospatial kernel in combination with open-source geospatial libraries to perform prompt-based segmentation on satellite or aerial imagery. This example uses data from the National Agriculture Imagery Program (NAIP), which acquires aerial imagery during the agricultural growing seasons in the continental United States.

## Prerequisites

This notebook runs with the Geospatial 1.0 kernel on an `ml.geospatial.interactive` instance. The following policies need to be attached to the execution role that you use to run this notebook:
- AmazonSageMakerFullAccess
- AmazonSageMakerGeospatialFullAccess

You can see the policies attached to the role in the IAM console under the 'Permissions' tab. If required, add the policies using the 'Add Permissions' button.

In addition to these policies, ensure that the execution role's trust policy allows the SageMaker-GeoSpatial service to assume the role. This can be done by adding the following trust policy using the 'Trust relationships' tab:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "sagemaker.amazonaws.com",
                    "sagemaker-geospatial.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```
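
As a quick sanity check, the following minimal sketch inspects a trust policy document and reports which of the two required service principals are missing. The helper name `missing_trusted_services` is hypothetical and not part of any AWS SDK. It only looks at `Allow` statements that grant `sts:AssumeRole` and ignores conditions, so treat it as a rough check rather than a full policy evaluation.

```python
import json

# Service principals that the trust policy above is expected to allow.
REQUIRED_SERVICES = {"sagemaker.amazonaws.com", "sagemaker-geospatial.amazonaws.com"}


def missing_trusted_services(trust_policy, required=REQUIRED_SERVICES):
    """Return the required service principals that cannot assume the role.

    Hypothetical helper: only considers Allow statements with sts:AssumeRole
    and does not evaluate Condition blocks.
    """
    trusted = set()
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        trusted.update(services)
    return set(required) - trusted


# The trust policy shown above, pasted as a JSON string.
policy = json.loads("""
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "sagemaker.amazonaws.com",
                    "sagemaker-geospatial.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
""")

print(missing_trusted_services(policy))
```

An empty set means both services are trusted. Any service name in the result still needs to be added under the 'Trust relationships' tab.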

## GPU Support

When you run this notebook on an `ml.g5.*` instance, you have access to NVIDIA A10G GPUs. The Geospatial 1.0 kernel comes with PyTorch and TensorFlow preinstalled. On an `ml.g5.*` instance, the CUDA dependencies are also installed, so the frameworks can access the GPUs without further setup.

This notebook is an *adapted version* that runs on the `ml.geospatial.interactive` instance, which *runs only on CPU*. For the GPU-enabled version, see [segment_naip_geospatial_notebook.ipynb](sagemaker-geospatial/segment-aerial-naip/segment_naip_geospatial_notebook.ipynb).

## Set up SageMaker geospatial capabilities and install additional dependencies

In this example, you'll use the [segment-geospatial](https://pypi.org/project/segment-geospatial/) library to perform the segmentation. This library combines Segment Anything with Grounding DINO to detect and segment objects based on text inputs. The following steps install the additional dependencies and set up the necessary imports.

In [None]:
# install additional dependencies
%pip install segment-geospatial groundingdino-py

In [None]:
import boto3
import sagemaker
import os
import torch
import IPython.display
from samgeo import SamGeo
from samgeo.text_sam import LangSAM

session = boto3.Session()
execution_role = sagemaker.get_execution_role()
geospatial_client = session.client(service_name="sagemaker-geospatial")

In [None]:
!mkdir -p data

## Get NAIP example data

To query NAIP aerial imagery, you can use the `search_raster_data_collection` API, which lets you run spatio-temporal queries against various raster data collections.

The following code example uses the ARN associated with the NAIP data collection, `arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/37ndema229vwa987`.

In [None]:
search_raster_config = {
    "Arn": "arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/37ndema229vwa987",  # NAIP, National Agriculture Imagery Program
    "RasterDataCollectionQuery": {
        "AreaOfInterest": {
            "AreaOfInterestGeometry": {
                "PolygonGeometry": {
                    "Coordinates": [
                        [
                            [-116.43212657788257, 43.492823120694055],
                            [-116.43212657788257, 43.459682618058224],
                            [-116.37655884552012, 43.459682618058224],
                            [-116.37655884552012, 43.492823120694055],
                            [-116.43212657788257, 43.492823120694055],
                        ]
                    ]
                }
            }
        },
        "TimeRangeFilter": {
            "StartTime": "2021-01-01T00:00:00Z",
            "EndTime": "2021-12-31T23:59:59Z",
        },
    },
}
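
The polygon in the query above is a single closed ring of `[longitude, latitude]` pairs whose first and last points are identical. If you want to query a different rectangular area, a small helper can build that ring from a bounding box. The function name `bbox_to_polygon` is hypothetical. This is only a sketch that reproduces the vertex order used above.

```python
def bbox_to_polygon(min_lon, min_lat, max_lon, max_lat):
    """Build a closed polygon ring from a lon/lat bounding box.

    Hypothetical helper: returns a list with one ring, using the same
    vertex order as the query above (upper-left, lower-left, lower-right,
    upper-right, back to upper-left).
    """
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bounding box minimums must be smaller than maximums")
    return [
        [
            [min_lon, max_lat],
            [min_lon, min_lat],
            [max_lon, min_lat],
            [max_lon, max_lat],
            [min_lon, max_lat],  # repeat the first point to close the ring
        ]
    ]


coordinates = bbox_to_polygon(
    min_lon=-116.43212657788257,
    min_lat=43.459682618058224,
    max_lon=-116.37655884552012,
    max_lat=43.492823120694055,
)
print(coordinates)
```

The result can be assigned to the `Coordinates` key of `PolygonGeometry` in place of the hard-coded list.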

In [None]:
result = geospatial_client.search_raster_data_collection(**search_raster_config)

After performing the query, we obtained two scenes which matched the search criteria. We can inspect the results in the following cell.

In [None]:
IPython.display.JSON(result["Items"])
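
If you prefer a compact listing over the interactive JSON view, the sketch below collects the download location of one asset per scene. It relies only on the `Items` → `Assets` → `image` → `Href` structure that this notebook uses later. The helper name `image_hrefs` and the `sample_response` values (including the bucket name and scene IDs) are made up for illustration.

```python
def image_hrefs(response, asset_name="image"):
    """Collect the Href of the named asset for each item in a search response.

    Hypothetical helper: items without the requested asset are skipped.
    """
    hrefs = []
    for item in response.get("Items", []):
        asset = item.get("Assets", {}).get(asset_name)
        if asset and "Href" in asset:
            hrefs.append(asset["Href"])
    return hrefs


# Made-up response with the same nesting as the real search result.
sample_response = {
    "Items": [
        {"Id": "scene-a", "Assets": {"image": {"Href": "s3://example-bucket/path/scene-a.tif"}}},
        {"Id": "scene-b", "Assets": {}},
    ]
}

for href in image_hrefs(sample_response):
    print(href)
```

In the notebook, you would pass `result` instead of `sample_response`.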

## Download and visualize example scene

We will download the image asset of the first scene to use it as input for the segmentation model. To prepare the input for the inference, we'll clip the scene to a particular area of interest and visualize the prepared input.

In [None]:
def download_from_s3(s3_obj_url, local_dir):
    os.makedirs(local_dir, exist_ok=True)
    local_file_path = os.path.join(local_dir, s3_obj_url.split("/")[-1])
    target_bucket_name = s3_obj_url.split("/")[2]
    target_bucket_ob_key = "/".join(s3_obj_url.split("/")[3:])

    s3_bucket = session.resource("s3").Bucket(target_bucket_name)
    s3_bucket.download_file(
        target_bucket_ob_key, local_file_path, ExtraArgs={"RequestPayer": "requester"}
    )
    return local_file_path
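
The function above derives the bucket name and object key by splitting the URL on `/`. As an alternative sketch, the same parts can be extracted with the standard library's `urllib.parse`, which also makes it easy to reject URLs that are not `s3://` URLs. The helper name `split_s3_url` and the example URL are hypothetical. Object keys that contain `?` or `#` would need extra handling, because `urlparse` treats those characters as query and fragment separators.

```python
import posixpath
from urllib.parse import urlparse


def split_s3_url(s3_url):
    """Split an s3:// URL into (bucket, key, file name).

    Hypothetical helper: assumes the key contains no '?' or '#' characters.
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {s3_url!r}")
    key = parsed.path.lstrip("/")
    return parsed.netloc, key, posixpath.basename(key)


# Made-up URL for illustration only.
bucket, key, file_name = split_s3_url("s3://example-bucket/some/prefix/scene.tif")
print(bucket, key, file_name)
```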

In [None]:
local_file_path = download_from_s3(result["Items"][0]["Assets"]["image"]["Href"], "data")

In [None]:
import rioxarray

visual = rioxarray.open_rasterio(local_file_path)

In [None]:
# clip to particular AOI
clipped = visual.rio.clip_box(
    minx=552000,
    miny=4813000,
    maxx=553000,
    maxy=4814000,
)
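
The clip box is expressed in the coordinate units of the raster, not in longitude and latitude. To get a feel for how large the clipped input is, the sketch below computes the box extent and an approximate pixel count. It assumes the units are meters and a pixel size of 0.6. Both are assumptions rather than values taken from this scene, so check the actual values with `visual.rio.crs` and `visual.rio.resolution()`. The helper name `clip_box_size` is hypothetical.

```python
def clip_box_size(minx, miny, maxx, maxy, pixel_size):
    """Return (width, height, columns, rows) for a clip box.

    Hypothetical helper: width and height are in raster coordinate units,
    columns and rows are rounded estimates for the given pixel size.
    """
    width = maxx - minx
    height = maxy - miny
    return width, height, round(width / pixel_size), round(height / pixel_size)


# Same bounds as the clip above; pixel_size=0.6 is an assumed value.
width, height, cols, rows = clip_box_size(552000, 4813000, 553000, 4814000, pixel_size=0.6)
print(f"{width} x {height} units, roughly {cols} x {rows} pixels")
```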

In [None]:
import matplotlib.pyplot as plt

# display clipped AOI
plt.figure(figsize=(12, 12))
ax = plt.axes()
clipped.plot.imshow(ax=ax)
plt.show()

In [None]:
# store as input for segmentation model
clipped.rio.to_raster("data/example_input.tif")

## Perform segmentation on spatial data
### Initialize Segmentation model

In [None]:
sam = LangSAM()

### Run inference & visualize segmentation mask

After the segmentation model has been initialized, we'll perform the inference by running the `predict` method.

The model prediction step includes setting appropriate thresholds for object detection and associating text with the identified objects. These threshold values, which range from 0 to 1, are specified when invoking the `predict` method of the LangSAM class.

`box_threshold`: This parameter is used for object detection in the image. A higher box threshold makes the model more selective, so it identifies only object instances with the highest confidence. This selectivity may reduce the total number of detections. A lower box threshold makes the model more permissive, resulting in more detections, including those of potentially lower confidence.

`text_threshold`: A higher text threshold demands a stronger correlation between the object and the text prompt, which can lead to more precise but fewer associations. Conversely, a lower text threshold allows looser associations, possibly increasing the number of associations at the risk of including less accurate matches.
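
To make the trade-off concrete, here is a simplified, hypothetical illustration of how two confidence cutoffs change the number of results. It does not reproduce how LangSAM or Grounding DINO compute or apply scores internally. The `detections` list, its score values, and the `filter_detections` helper are all invented for this sketch.

```python
def filter_detections(detections, box_threshold, text_threshold):
    """Keep detections whose scores exceed both thresholds.

    Hypothetical helper for illustration; not part of LangSAM.
    """
    return [
        d
        for d in detections
        if d["box_score"] > box_threshold and d["text_score"] > text_threshold
    ]


# Invented candidate detections with made-up confidence scores.
detections = [
    {"label": "tree", "box_score": 0.62, "text_score": 0.55},
    {"label": "tree", "box_score": 0.31, "text_score": 0.28},
    {"label": "tree", "box_score": 0.26, "text_score": 0.12},
    {"label": "tree", "box_score": 0.15, "text_score": 0.40},
]

# Compare a permissive, a moderate, and a strict setting.
for box_t, text_t in [(0.10, 0.10), (0.24, 0.24), (0.50, 0.50)]:
    kept = filter_detections(detections, box_t, text_t)
    print(f"box={box_t:.2f} text={text_t:.2f} -> {len(kept)} detections kept")
```

Raising either threshold can only shrink the set of kept detections, which is why it is often practical to start with low values and increase them until spurious matches disappear.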

In [None]:
text_prompt = "tree"
sam.predict("data/example_input.tif", text_prompt, box_threshold=0.24, text_threshold=0.24)

In [None]:
sam.show_anns(
    cmap="Greens",
    box_color="red",
    title="Example Segmentation Result (Trees)",
    blend=True,
)

We can also show the annotations as a binary mask.

In [None]:
sam.show_anns(
    cmap="Greys_r",
    add_boxes=False,
    alpha=1,
    title="Example Segmentation Result (Trees)",
    blend=False,
    output="data/trees.tif",
)

As a final step, we'll convert the binary mask to a vector format, such as a shapefile. You can also use any other vector format supported by geopandas, such as GeoJSON or GeoPackage. In the following cells, we extract the mask as a shapefile and visualize it on an interactive map.

In [None]:
sam.raster_to_vector("data/trees.tif", "data/trees.shp")

In [None]:
import leafmap.foliumap as leafmap

Map = leafmap.Map()
Map.add_basemap("USGS NAIP Imagery")
style = {
    "color": "#3388ff",
    "weight": 2,
    "fillColor": "#7c4185",
    "fillOpacity": 0.5,
}
Map.add_vector("data/trees.shp", layer_name="Vector", style=style)
Map

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-geospatial|segment-aerial-naip|segment_naip_geospatial_notebook-cpu_only.ipynb)