A blog post explaining this project in more detail can be found HERE.
A set of ML algorithms for cloud detection in satellite images, trained on Sentinel-2 imagery.
In this project we download and preprocess image files from the data collection SENTINEL2_L2A using sentinelhub.
Make sure you have the following software installed:
- Python 3.11
To create a virtual environment for this project, open a terminal, go to the root path of the project and run:
> pip install pipenv
> pipenv install --dev
In the root folder, create a .env file with the following information:
# sentinelhub credentials
CLIENT_ID = <your client_id>
CLIENT_SECRET = <your client secret>
# dagshub ML_FLOW credentials
MLFLOW_TRACKING_URI=https://dagshub.com/<your dagshub username>/cloud-I.mlflow
MLFLOW_TRACKING_USERNAME=<your dagshub username>
MLFLOW_TRACKING_PASSWORD=<your mlflow tracking password>
To get the sentinelhub credentials:
- Create an account in the Sentinel Hub dashboard.
- Get an OAuth client. Details here.
To get the dagshub ML_FLOW credentials, follow this documentation.
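Before running any of the scripts, it can help to confirm that all five variables from the .env file are visible to Python. The following is a minimal sketch, not part of the project: the helper name `missing_credentials` is hypothetical, and it assumes the .env file has already been loaded into the environment (pipenv does this automatically for `pipenv run` and `pipenv shell`).

```python
import os

# Variable names taken from the .env template above.
REQUIRED_KEYS = (
    "CLIENT_ID",
    "CLIENT_SECRET",
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_credentials(env=None):
    """Return the required keys that are absent or blank in `env`.

    `env` defaults to the process environment (hypothetical helper).
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials are set.")
```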
To download the images, you can use the script src/import_image.py.
This script downloads the images from the remote repository using the sentinelhub Python package. Make sure you have already obtained your CLIENT_ID and CLIENT_SECRET.
To preprocess the Sentinel-2 images, you can use the script src/preprocess.py.
This script improves the brightness and contrast of the images using two transformations. It also creates cloud masks from the downloaded cloud probability files.
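As an illustration of the cloud mask step, a probability map can be turned into a binary mask by thresholding. This is only a sketch under assumptions: the function name, the 0.5 threshold and the [0, 1] probability scale are hypothetical and may differ from what src/preprocess.py actually does.

```python
import numpy as np


def cloud_mask_from_probability(cloud_prob, threshold=0.5):
    """Return a uint8 mask: 255 where probability >= threshold, else 0.

    Assumes `cloud_prob` holds probabilities scaled to [0, 1].
    """
    cloud_prob = np.asarray(cloud_prob, dtype=np.float32)
    return np.where(cloud_prob >= threshold, 255, 0).astype(np.uint8)


# Tiny 2x2 example: two pixels are at or above the threshold.
probabilities = np.array([[0.1, 0.7], [0.5, 0.2]])
mask = cloud_mask_from_probability(probabilities)
```

If the probability files store values on another scale (for example 0 to 255), the threshold would need to be rescaled accordingly.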
Data augmentation is done by rotating the images by 90, 180 and 270 degrees, and by flipping them vertically and horizontally.
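The augmentation described above can be sketched with NumPy as follows. The function name `augment` is hypothetical and this is not necessarily how the project implements it. The key point is that each transformation is applied to the image and its mask together so that they stay aligned.

```python
import numpy as np


def augment(image, mask):
    """Return five (image, mask) pairs: three rotations and two flips."""
    pairs = []
    for k in (1, 2, 3):  # 90, 180 and 270 degrees (counterclockwise)
        pairs.append((np.rot90(image, k), np.rot90(mask, k)))
    pairs.append((np.flipud(image), np.flipud(mask)))  # vertical flip
    pairs.append((np.fliplr(image), np.fliplr(mask)))  # horizontal flip
    return pairs
```

`np.rot90`, `np.flipud` and `np.fliplr` act on the first two axes, so the same code works for both single-channel masks and height x width x channels images.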
To train the models, you can use the script src/model.py.
Different models are implemented: Random Forest, ANN, FCNN, UNET.
To train a specific model, write in your terminal (inside your environment):
python src/model.py --model_name=<model name>
where:
- --model_name=rf trains a Random Forest.
- --model_name=ann trains an ANN.
- --model_name=unet trains a U-NET model.
- --model_name=segnet trains a SegNet model.
- --model_name=yolo trains YOLO.
To make a prediction with a previously trained and saved model, use:
python src/model.py --model_name=<model name> --train_model=False
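For readers curious how flags like these can be parsed, here is a minimal argparse sketch. It is an assumption about the interface, not the project's actual code: `str_to_bool`, `build_parser` and the default of `True` for `--train_model` are hypothetical. The custom converter matters because `type=bool` in argparse would treat the string "False" as true.

```python
import argparse

# Model names listed in the commands above.
MODEL_NAMES = ("rf", "ann", "unet", "segnet", "yolo")


def str_to_bool(value):
    """Convert 'True'/'False' style strings to a real boolean."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser mirroring the two flags used in this README."""
    parser = argparse.ArgumentParser(
        description="Train or run a cloud detection model."
    )
    parser.add_argument("--model_name", choices=MODEL_NAMES, required=True)
    # Assumed default: train unless told otherwise.
    parser.add_argument("--train_model", type=str_to_bool, default=True)
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--model_name=unet", "--train_model=False"])
```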
You need to create a folder called Dataset/ in the root path of the project. The structure of this folder is as follows:
├── Dataset
│   ├── images
│   │   ├── train
│   │   │   ├── <train image name 1>.png
│   │   │   ├── ...
│   │   ├── val
│   │   │   ├── <validation image name 1>.png
│   │   │   ├── ...
│   ├── masks
│   │   ├── train
│   │   │   ├── <train mask name 1>.png
│   │   │   ├── ...
│   │   ├── val
│   │   │   ├── <validation mask name 1>.png
│   │   │   ├── ...
│   ├── labels
│   │   ├── train
│   │   │   ├── <train image name 1>.txt
│   │   │   ├── ...
│   │   ├── val
│   │   │   ├── <validation image name 1>.txt
│   │   │   ├── ...
│   ├── test
│   │   ├── <test image name>.png
│   │   ├── <test mask name>.png
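The empty folder layout can also be created with a few lines of Python rather than by hand. This is a minimal sketch: the helper `create_dataset_folders` is hypothetical, and it assumes that train/ and val/ sit inside images/, masks/ and labels/, with test/ directly under Dataset/.

```python
from pathlib import Path

# Subfolders of Dataset/, following the layout listed above.
SUBFOLDERS = (
    "images/train",
    "images/val",
    "masks/train",
    "masks/val",
    "labels/train",
    "labels/val",
    "test",
)


def create_dataset_folders(root="."):
    """Create the empty Dataset/ tree under `root` and return its path."""
    dataset = Path(root) / "Dataset"
    for sub in SUBFOLDERS:
        # exist_ok=True makes the call safe to repeat.
        (dataset / sub).mkdir(parents=True, exist_ok=True)
    return dataset
```

Calling `create_dataset_folders()` from the project root creates the folders; the images, masks and label files still have to be copied in afterwards.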
IMPORTANT: The code is designed to test the models on a single image.