In [None]:
!nvidia-smi

Sun Jun 11 20:40:39 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   30C    P0    45W / 400W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               

# Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Get the dataset

In [None]:
%cd /content
!cp "/content/drive/Shareddrives/Nuvens/datasets/clouds1500_with_tree.zip" .
!mkdir clouds1500_with_tree
!unzip "clouds1500_with_tree.zip" -d clouds1500_with_tree

# Installing specific version of PaddlePaddle-GPU Lib that works with the PaddleSeg Lib

In [None]:
# Change the working directory to '/content/'
%cd '/content/'

# Copy the train.yaml configuration file from Google Drive to the current working directory
!cp '/content/drive/Shareddrives/Nuvens/resultados_paddleseg/clouds1500_no_tree/segformer/train.yaml' .

# Install the specified version of paddlepaddle-gpu from the provided URL
!pip install paddlepaddle-gpu==2.4.2.post117 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html

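# Select GPU 0; set this before importing paddle so the setting is in place when paddle initializes
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

# Import paddle before calling into it
import paddle

# Run a check to ensure the paddle installation is successful and working
paddle.utils.run_check()

# Print the paddle version
print(paddle.__version__)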

# Installing PaddleSeg Lib

In [None]:
# Clone the PaddleSeg repository
!git clone -b release/2.7 https://github.com/PaddlePaddle/PaddleSeg.git

# Change the working directory to 'PaddleSeg'
%cd PaddleSeg

# Install the required dependencies for PaddleSeg
!pip install -r requirements.txt

# Run a shell script to check the installation of PaddleSeg
!sh tests/run_check_install.sh

# Install the PaddleSeg package
!python setup.py install


# Training

In [None]:
# Define experiment name and save directory
%cd /content
exp_name = 'hrnet_w18_ssld_with_resize'
sav_dir = f'path_to_save/{exp_name}'

!python /content/PaddleSeg/tools/train.py \
    --config path_to_save/train_with_tree_colab.yaml \
    --do_eval \
    --use_vdl \
    --resume_model $sav_dir'/iter_36500' \
    --precision fp32 \
    --save_interval 500 \
    --save_dir $sav_dir
# Optional extra argument for the command above: --keep_checkpoint_max 100

# Validation

The code snippet performs model validation using the PaddleSeg library:

1. `model_params = f'{sav_dir}/best_model/model.pdparams'`: This line creates a string with the file path of the best model's parameters from the training process. The path is constructed using the `sav_dir` variable, which contains the directory where the model and training results were saved.

2. The `!python /content/PaddleSeg/val.py \` command runs the PaddleSeg validation script (`val.py`). The following arguments are passed to the script:

   - `--config /content/clouds1500_with_tree/train_with_tree_colab.yaml`: This specifies the configuration file (`train_with_tree_colab.yaml`) to use during the validation process. The configuration file contains information about the model architecture, dataset, and other settings.

   - `--model_path $model_params`: This provides the path to the best model's parameters, which were saved during the training process. The script will use these parameters to evaluate the model's performance on the validation dataset.

In summary, this code snippet evaluates the trained model on the validation dataset using the specified configuration file and model parameters. The PaddleSeg validation script calculates various performance metrics, such as accuracy and IoU (Intersection over Union), to assess the model's performance.
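
To make the reported metrics more concrete, here is a minimal pure-Python sketch of how pixel accuracy and per-class IoU can be derived from a confusion matrix. It is not PaddleSeg's implementation; the function names and the toy masks are hypothetical and only illustrate the definitions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_iou(cm):
    """IoU per class: TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, but not c
        fn = sum(cm[c]) - tp                       # truly c, but predicted otherwise
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)
    return ious


# Toy example: flattened ground-truth and predicted masks with 3 classes
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = pixel_accuracy(cm)
ious = per_class_iou(cm)
miou = sum(ious) / len(ious)

print("accuracy:", acc)
print("IoU per class:", ious)
print("mean IoU:", miou)
```

Mean IoU averages the per-class values, so a class that covers few pixels weighs as much as a dominant one; pixel accuracy does not have that property.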

In [None]:
model_params = f'{sav_dir}/best_model/model.pdparams'

!python /content/PaddleSeg/val.py \
       --config /content/clouds1500_with_tree/train_with_tree_colab.yaml \
       --model_path $model_params

# Test

The code snippet performs model validation using the PaddleSeg library with an updated configuration file for validation:

1. `model_params = f'{sav_dir}/best_model/model.pdparams'`: This line creates a string with the file path of the best model's parameters from the training process. The path is constructed using the `sav_dir` variable, which contains the directory where the model and training results were saved.

2. The `!python /content/PaddleSeg/val.py \` command runs the PaddleSeg validation script (`val.py`). The following arguments are passed to the script:

   - `--config /content/clouds1500_with_tree/val_with_tree_colab.yaml`: This specifies an alternative configuration file (`val_with_tree_colab.yaml`) to use during the validation process. The configuration file contains information about the model architecture, dataset, and other settings specifically for validation.

   - `--model_path $model_params`: This provides the path to the best model's parameters, which were saved during the training process. The script will use these parameters to evaluate the model's performance on the validation dataset.

In summary, this code snippet evaluates the trained model on the validation dataset using the specified alternative configuration file and model parameters. The PaddleSeg validation script calculates various performance metrics, such as accuracy and IoU (Intersection over Union), to assess the model's performance.

In [None]:
model_params = f'{sav_dir}/best_model/model.pdparams'

!python /content/PaddleSeg/val.py \
       --config /content/clouds1500_with_tree/val_with_tree_colab.yaml \
       --model_path $model_params

# Inference

The code snippet performs model prediction on a folder of images using the trained model and the PaddleSeg library:

1. `model_params = f'{sav_dir}/best_model/model.pdparams'`: This line creates a string with the file path of the best model's parameters from the training process. The path is constructed using the `sav_dir` variable, which contains the directory where the model and training results were saved.

2. The image folder, destination folder, and the custom color palette are defined:
   - `image_folder`: Folder containing the input images to perform prediction on.
   - `dest_folder`: Folder where the prediction results will be saved.
   - `color_pallet`: A custom color palette for visualizing the predicted segmentation masks, defined as a sequence of RGB values separated by spaces.

3. The `!python /content/PaddleSeg/predict.py \` command runs the PaddleSeg prediction script (`predict.py`). The following arguments are passed to the script:

   - `--config /content/clouds1500_with_tree/train_with_tree_colab.yaml`: This specifies the configuration file (`train_with_tree_colab.yaml`) to use during the prediction process. The configuration file contains information about the model architecture, dataset, and other settings.

   - `--model_path $model_params`: This provides the path to the best model's parameters, which were saved during the training process. The script will use these parameters to perform prediction on the input images.

   - `--image_path $image_folder`: This specifies the folder containing the input images for prediction.

   - `--save_dir  $dest_folder`: This defines the folder where the prediction results (segmentation masks) will be saved.

   - `--custom_color $color_pallet`: This provides the custom color palette for visualizing the predicted segmentation masks.

In summary, this code snippet performs semantic segmentation prediction on a folder of images using the trained model, specified configuration file, and model parameters. The prediction results (segmentation masks) are saved in the specified destination folder with the provided custom color palette for visualization.
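
Because the palette is passed as one flat, space-separated string, it is easy to drop or add a value by accident. The sketch below splits such a string into one RGB triple per class so it can be checked before running the prediction. `parse_palette` is a hypothetical helper written for illustration and is not part of PaddleSeg.

```python
def parse_palette(palette):
    """Split a space-separated palette string into (R, G, B) tuples, one per class."""
    values = [int(v) for v in palette.split()]
    if len(values) % 3 != 0:
        raise ValueError(f"expected a multiple of 3 values, got {len(values)}")
    if any(v < 0 or v > 255 for v in values):
        raise ValueError("each color component must be in the range 0..255")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


# Same string as the one used in the prediction cell
color_pallet = '31 119 180 255 127 15 43 160 43 214 39 39 148 103 189 140 85 76 '

colors = parse_palette(color_pallet)
for class_id, rgb in enumerate(colors):
    print(f"class {class_id}: RGB {rgb}")
```

The number of triples should match `num_classes` in the configuration file.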

In [None]:
# Predict Folder with trained model
# Files created in the training Experiment
model_params = f'{sav_dir}/best_model/model.pdparams'

# Folder to predict
image_folder = "/content/clouds1500_with_tree/test/"

# Folder to save the predictions
dest_folder = f'{sav_dir}/predictions/train'

# Custom color palette: one RGB triple per class, in class order, with all values separated by spaces.
# In the example below, 31 119 180 is the color for class zero, 255 127 15 is the color for class one, and so on.
color_pallet = '31 119 180 255 127 15 43 160 43 214 39 39 148 103 189 140 85 76 '

!python /content/PaddleSeg/predict.py \
       --config /content/clouds1500_with_tree/train_with_tree_colab.yaml \
       --model_path $model_params \
       --image_path $image_folder \
       --save_dir  $dest_folder \
       --custom_color $color_pallet


# YAML Explanation (no need to execute this cell)
Consider the following YAML file.

The YAML (YAML Ain't Markup Language) file contains the configuration settings for training and validating a semantic segmentation model using the PaddleSeg library. It defines the dataset, model architecture, loss function, learning rate scheduler, optimizer, and other settings. The file is used in the previous code snippets to configure the training, validation, and prediction processes.

Here's an explanation of each section in the YAML file:

1. `batch_size`: The number of samples to process in each batch during training.
2. `iters`: The total number of iterations for the training process.

3. `train_dataset`: Configuration settings for the training dataset.
   - `type`: Specifies the type of dataset (in this case, `Dataset`).
   - `separator`: The character used to separate fields in the dataset files.
   - `dataset_root`: The root directory of the dataset.
   - `train_path`: The path to the file containing the training image and label paths.
   - `num_classes`: The number of classes in the dataset.
   - `transforms`: A list of preprocessing and data augmentation transforms applied to each sample (in this case, only `Resize`).
   - `mode`: The mode of the dataset (in this case, `train`).

4. `val_dataset`: Configuration settings for the validation dataset, similar to the `train_dataset` settings.

5. `model`: Configuration settings for the model architecture.
   - `type`: Specifies the type of model (in this case, `OCRNet`).
   - `backbone`: Contains settings for the model's backbone architecture (in this case, `HRNet_W18`).
   - `num_classes`: The number of classes in the dataset.
   - `backbone_indices`: Indices of the backbone output layers to be used.

6. `loss`: Configuration settings for the loss function.
   - `types`: A list of loss functions to be used (in this case, two instances of `CrossEntropyLoss`).
   - `coef`: A list of weights, one per loss function, applied when the losses are summed (in this case, `[1, 0.4]`).

7. `lr_scheduler`: Configuration settings for the learning rate scheduler.
   - `type`: Specifies the type of learning rate scheduler (in this case, `PolynomialDecay`).
   - `learning_rate`: The initial learning rate.
   - `power`: The exponent used in the polynomial decay.

8. `optimizer`: Configuration settings for the optimizer.
   - `type`: Specifies the type of optimizer (in this case, `AdamW`).
   - `beta1` and `beta2`: The exponential decay rates for the first- and second-moment estimates in AdamW.
   - `weight_decay`: The weight decay coefficient used by the optimizer.

In the previous code snippets, the YAML file is used to configure the training, validation, and prediction processes by providing the settings for the dataset, model architecture, loss function, learning rate scheduler, and optimizer. The `--config` argument in the `train.py`, `val.py`, and `predict.py` scripts specifies the path to the YAML file, which is then used by the PaddleSeg library to set up the required configurations.
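
Two of these settings are easier to read as formulas. The sketch below shows the standard polynomial decay schedule and a weighted sum of losses, using the values from the YAML file (`learning_rate: 0.0001`, `power: 0.9`, `iters: 80000`, `coef: [1, 0.4]`). It is an illustration, not PaddleSeg's code: the function names are hypothetical, and decaying over all `iters` down to an end learning rate of 0 is an assumption, since the YAML file sets neither value.

```python
def poly_lr(step, base_lr=0.0001, total_steps=80000, power=0.9, end_lr=0.0):
    """Polynomial decay: the rate falls from base_lr to end_lr over total_steps.

    end_lr=0.0 and total_steps=iters are assumptions made for this sketch.
    """
    step = min(step, total_steps)  # hold the final value after the last step
    return (base_lr - end_lr) * (1 - step / total_steps) ** power + end_lr


def combined_loss(losses, coef=(1, 0.4)):
    """Weighted sum of the individual loss values, one coefficient per loss."""
    if len(losses) != len(coef):
        raise ValueError("need exactly one coefficient per loss")
    return sum(c * l for c, l in zip(coef, losses))


# Learning rate at the start, midpoint, and end of training
for step in (0, 40000, 80000):
    print(step, poly_lr(step))

# Example with two made-up loss values
print(combined_loss([2.0, 1.0]))
```

With `power` below 1, the rate drops slowly at first and faster near the end of training.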

In [None]:
# batch_size: 2
# iters: 80000
# train_dataset:
#   type: Dataset
#   separator: ;
#   dataset_root: /content/drive/Shareddrives/Nuvens/datasets/Albedo(merged classes)_001 - 997 images
#   train_path: /content/drive/Shareddrives/Nuvens/datasets/Albedo(merged classes)_001 - 997 images/train-paddle.txt
#   num_classes: 6
#   transforms:
#     - type: Resize
#       target_size: [1280, 1280]
#   mode: train

# val_dataset:
#   type: Dataset
#   separator: ;
#   dataset_root: /content/drive/Shareddrives/Nuvens/datasets/Albedo(merged classes)_001 - 997 images
#   val_path: /content/drive/Shareddrives/Nuvens/datasets/Albedo(merged classes)_001 - 997 images/val-paddle.txt
#   num_classes: 6
#   transforms:
#     - type: Resize
#       target_size: [1280, 1280]
#   mode: val

# model:
#   type: OCRNet
#   backbone:
#     type: HRNet_W18
#     pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w18_ssld.tar.gz
#   num_classes: 6
#   backbone_indices: [0]

# loss:
#   types:
#     - type: CrossEntropyLoss
#     - type: CrossEntropyLoss
#   coef: [1, 0.4]

# lr_scheduler:
#   type: PolynomialDecay
#   learning_rate: 0.0001
#   power: 0.9

# optimizer:
#   type: AdamW
#   beta1: 0.9
#   beta2: 0.999
#   weight_decay: 0.01