# SCANEO


As a final step in creating our training dataset, we need to prepare the data for AI. Numerous tasks can be performed here, such as:

- **Data cleaning**: removing corrupted images, removing images with excessive cloudiness, etc.
- **Feature engineering**: calculating vegetation indices, calculating statistics, etc.
- **Data analysis**: plotting time series, plotting histograms, etc.
- **Labeling**: creating labels for images, etc.

For each case, you can use your favorite tools. Here we will demonstrate labeling with [SCANEO](https://github.com/earthpulse/scaneo).

SCANEO is a labeling web application that lets you label satellite images quickly and easily, for example to identify the objects present or the terrain types. Labeling is a vital step because it prepares satellite data so that neural networks can process it, and SCANEO also enables active learning.

Before running the web interface, we must ensure that we have the `scaneo` package installed on our machine, and if not, install it.


In [None]:
# !uv add scaneo
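
If you are not sure whether the package is already available in your environment, you can check from Python first. This is a minimal sketch that assumes the importable module is also called `scaneo` (an assumption based on the package name, not something confirmed here).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the module name matches the package name "scaneo".
print("scaneo installed:", is_installed("scaneo"))
```

If this prints `False`, uncomment and run the install cell above.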

You can run `scaneo` with the following options.


In [None]:
!uv run scaneo --help

You can launch `scaneo` by opening a terminal and running:


In [None]:
!uv run scaneo

You can then access the web interface at `http://localhost:8000`.

> You can change the host and port with `scaneo --host 0.0.0.0 --port 8000`.
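
If the page does not load, it helps to check whether anything is answering on that address at all. The sketch below uses only the Python standard library. The helper name `is_reachable` is hypothetical, and the URL assumes the default host and port mentioned above.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


# Assumption: SCANEO is running with its default host and port.
print("SCANEO reachable:", is_reachable("http://localhost:8000"))
```

If you started the server with a different `--host` or `--port`, pass that address instead.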

Let's see what it has to offer!


![scaneo](images/scaneo.png)


To demonstrate SCANEO, we're going to use a different dataset than the one we created, so we can showcase one of SCANEO's best features: automatic labeling!

To do this, we first need to prepare the training and test sets using the EOTDL CLI. The first run may take a while, so be patient!

The [EOTDL (Earth Observation Training Data Lab)](https://www.eotdl.com/) is an open, collaborative platform with a growing cloud-based repository of curated datasets and pre-trained models. Here we use it to quickly download a ready-to-use road segmentation dataset and model. Let's take a quick look at the platform first [here](https://www.eotdl.com/)!


In [None]:
!uv run eotdl models get SCANEO -p .. -a -f   # The path (-p) is the project directory,
                                              # -a so that the STAC files are downloaded
                                              # -f to force the download
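
To confirm that the download worked, you can list what arrived on disk. This is a small sketch that assumes the files landed in `../SCANEO`, the folder used later in this notebook. The helper `summarize_files` is a hypothetical name, not part of the EOTDL library.

```python
from collections import Counter
from pathlib import Path


def summarize_files(root: Path) -> Counter:
    """Count the files under root, grouped by file extension."""
    return Counter(p.suffix or "(none)" for p in root.rglob("*") if p.is_file())


# Assumption: the download landed in ../SCANEO.
download_dir = Path("../SCANEO")
if download_dir.is_dir():
    for suffix, count in summarize_files(download_dir).most_common():
        print(f"{suffix}: {count}")
else:
    print(f"{download_dir} not found; check the eotdl output above")
```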

Now we have our dataset. Let's label it!


## AI-assisted labeling


Once we've played with the labeling tools, we can try SCANEO's best feature: adding and integrating your own models to automatically label the training set. As you can imagine, this significantly speeds up labeling times.

First, we need to deploy the SCANEO inference API. We start by cloning the SCANEO repository:


In [None]:
!git clone https://github.com/earthpulse/scaneo ../scaneo-api

Now we copy the model from `SCANEO/models/` to `scaneo-api/inference/`:


In [None]:
!cp -rf ../SCANEO/models ../scaneo-api/inference

Then we launch the API:


In [None]:
!cd ../scaneo-api && make inference

The inference API documentation is now available at http://0.0.0.0:8001/docs. The road segmentation model for Sentinel-2 imagery is served at http://0.0.0.0:8001/s2-roads.

Now, to activate our model, we must follow these steps:

1. Go to http://localhost:8000/models and create a new model.
2. Give it a name and description (for example, something related to `BiDS 2025`) and set the URL to `http://0.0.0.0:8001/s2-roads`. Set the task to segmentation. No further configuration is required.
3. Go to your campaign, open the settings section, and add the model. For it to work correctly, set the `roads` class to 1 and leave the other classes empty, with no value at all (not even 0). Then save.
4. Return to the campaign, select an image and the `roads` class, and click "Run inference model" in the top-left tab. And voilà, the image is labeled!
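
The class mapping in step 3 can be pictured with a toy example. This only illustrates the idea and is not SCANEO's actual implementation. It assumes the model returns a mask of integers where 1 marks road pixels. The names `class_values` and `pixels_for_class` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def pixels_for_class(
    mask: List[List[int]],
    class_values: Dict[str, Optional[int]],
    name: str,
) -> List[Tuple[int, int]]:
    """Return the (row, col) positions whose mask value is mapped to this class."""
    value = class_values.get(name)
    if value is None:
        # A class left empty receives no pixels from the model.
        return []
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, v in enumerate(row)
        if v == value
    ]


# Toy 3x3 mask: 1 = road, 0 = background (assumed model output format).
mask = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
class_values = {"roads": 1, "buildings": None}

print(pixels_for_class(mask, class_values, "roads"))
print(pixels_for_class(mask, class_values, "buildings"))
```

Under this assumption, a class set to 0 would collect every background pixel, which is one plausible reason for leaving the other classes empty.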

The advantage is clear: you can label a small subset, train a model on it, and use that model to keep labeling the rest of the training set. Repeating this cycle makes labeling much faster.
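
The iterative workflow can be sketched as a simple loop. Everything here is a toy stand-in: `label_manually`, `train`, and `review` are hypothetical callables representing manual work in SCANEO, your own training code, and the annotator checking the model's proposal. None of them is a SCANEO API.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Label = int
Model = Callable[[str], Label]


def assisted_labeling(
    images: List[str],
    batch_size: int,
    label_manually: Callable[[str], Label],
    train: Callable[[Dict[str, Label]], Model],
    review: Callable[[str, Label], Label],
) -> Dict[str, Label]:
    """Label the first batch by hand, then let the latest model propose labels."""
    labeled: Dict[str, Label] = {}
    model: Optional[Model] = None
    for start in range(0, len(images), batch_size):
        for image in images[start:start + batch_size]:
            if model is None:
                labeled[image] = label_manually(image)  # seed batch
            else:
                labeled[image] = review(image, model(image))  # human checks the proposal
        model = train(labeled)  # retrain on everything labeled so far
    return labeled


# Toy data and stand-in functions, for illustration only.
truth = {"a.tif": 1, "b.tif": 1, "c.tif": 0, "d.tif": 1}
manual_calls: List[str] = []


def label_manually(image: str) -> Label:
    manual_calls.append(image)
    return truth[image]


def train(labeled: Dict[str, Label]) -> Model:
    # "Model" that always predicts the most common label seen so far.
    majority = Counter(labeled.values()).most_common(1)[0][0]
    return lambda image: majority


def review(image: str, proposal: Label) -> Label:
    # The annotator accepts or corrects the proposal.
    return truth[image]


result = assisted_labeling(list(truth), 2, label_manually, train, review)
print(result == truth, manual_calls)
```

Only the first batch is labeled from scratch. Every later image starts from a model proposal that the annotator just confirms or corrects.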


## Opportunities for discussion and contribution


Feel free to ask questions now (live or via Discord) and suggest future improvements.

- Do you already use a labeling tool?
- What would be your ideal labeling tool?
