# 02: Model prediction and tiling for validation
Authors: Tobias G. Mueller, Mark A. Buckner
Last modified: 4 Dec 2024
Contact: __________

**Summary**: Here, we predict on an orthomosaic using our pretrained model. 
We then split the predictions and image into smaller tiles and take a random 40% for ground truthing and model validation.


This script outputs:
- predicted nest detections in YOLOv5 format
- a random 40% of tiles, saved to a testset folder
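
As a rough illustration of the exported label format, the sketch below parses one YOLO-style label line (`class x_center y_center width height [confidence]`, with coordinates normalized to the image size) and converts it to a pixel box. The helper names `parse_yolo_line` and `to_pixel_box` are hypothetical and are not part of this repository or of fiftyone.

```python
def parse_yolo_line(line):
    """Parse one YOLO label line; the 6th value (confidence) is optional."""
    parts = line.split()
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}")
    x_center, y_center, width, height = (float(p) for p in parts[1:5])
    return {
        "class_id": int(parts[0]),
        "x_center": x_center,
        "y_center": y_center,
        "width": width,
        "height": height,
        "confidence": float(parts[5]) if len(parts) == 6 else None,
    }


def to_pixel_box(det, img_width, img_height):
    """Convert a normalized detection to (x0, y0, x1, y1) in pixels."""
    x0 = (det["x_center"] - det["width"] / 2) * img_width
    y0 = (det["y_center"] - det["height"] / 2) * img_height
    x1 = (det["x_center"] + det["width"] / 2) * img_width
    y1 = (det["y_center"] + det["height"] / 2) * img_height
    return x0, y0, x1, y1


# example line (made-up values, not real model output)
detection = parse_yolo_line("0 0.5 0.5 0.2 0.1 0.87")
box = to_pixel_box(detection, img_width=1000, img_height=500)
```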

The data used in this script was generated in:
    `AIggregation/notebooks/01_preprocessing.ipynb`

This script is followed by `03_validation_and_optimization.ipynb`. However, the test tiles created in this script must be annotated before proceeding.

In [None]:
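# imports
import os
import torch
import fiftyone as fo
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction
from PIL import Image

# use the GPU if one is available, otherwise fall back to the CPU
# (kept as a string, the form shown in the device comment where the model is loaded)
device = "cuda:0" if torch.cuda.is_available() else "cpu"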

In [None]:
# first check the wd is not notebooks but the main folder
print("cwd is", os.getcwd())

if os.path.basename(os.getcwd()) == "notebooks":
    os.chdir("..")
    print("cwd changed to", os.getcwd())


# Detect nests on orthomosaic using sahi

Using our trained nest detection model, we predict on our stitched orthomosaic image with SAHI (Slicing Aided Hyper Inference): `https://github.com/obss/sahi`
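
To make the slicing idea concrete, here is a minimal sketch of how overlapping windows can be laid out over a large image. It is not SAHI's implementation; the function names `slice_starts` and `slice_boxes` are hypothetical, and the 608 px window and 0.4 overlap simply mirror the values used in the prediction cell.

```python
def slice_starts(length, window, overlap_ratio):
    """Start offsets of overlapping windows that cover `length` pixels."""
    if length <= window:
        return [0]
    overlap = int(window * overlap_ratio)
    step = max(1, window - overlap)
    starts = list(range(0, length - window, step))
    # shift the final window back so it ends exactly at the image edge
    starts.append(length - window)
    return starts


def slice_boxes(img_width, img_height, slice_width, slice_height,
                overlap_width_ratio, overlap_height_ratio):
    """Return (x0, y0, x1, y1) pixel boxes for every slice."""
    boxes = []
    for y0 in slice_starts(img_height, slice_height, overlap_height_ratio):
        for x0 in slice_starts(img_width, slice_width, overlap_width_ratio):
            x1 = min(x0 + slice_width, img_width)
            y1 = min(y0 + slice_height, img_height)
            boxes.append((x0, y0, x1, y1))
    return boxes


# a hypothetical 1000 x 1000 px image sliced with the notebook's settings
boxes = slice_boxes(1000, 1000, 608, 608, 0.4, 0.4)
```

Each slice is predicted on separately, and the overlap means an object cut by one slice border is likely to appear whole in a neighboring slice.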



In [None]:
# set paths
# -------------------------------------------------------------------------------------------------------------------- #
image_directory = "datasets/drone_ortho/ortho_clip_23april.png"     # path to image to be predicted on
model_path = "AIggregation_yolov5m/weights/best.pt"                      # path to image detection model
predictions_directory = "datasets/export_predictions/temp"                    # directory to export model predictions to
# -------------------------------------------------------------------------------------------------------------------- #



# Import ortho image into a fiftyone dataset 
dataset_full = fo.Dataset.from_images(
    [image_directory]
)

# specify AI detection model to use for predictions
detection_model = AutoDetectionModel.from_pretrained(
    model_type='yolov5',
    model_path=model_path, #specify path to trained model
    confidence_threshold=0.25,
    device=device, # "cpu" or "cuda:0" for GPU (if available)
)

# define function for sliced predictions from sahi
def predict_with_slicing(sample, label_field, **kwargs):
    result = get_sliced_prediction(
        sample.filepath, detection_model, verbose=0, **kwargs
    )
    sample[label_field] = fo.Detections(detections=result.to_fiftyone_detections())

# predict on image, slicing at training image size
for sample in dataset_full.iter_samples(progress=True, autosave=True):
    predict_with_slicing(sample,
                         label_field="prediction",
                         slice_height=608, 
                         slice_width=608,
                         overlap_height_ratio = .4, 
                         overlap_width_ratio=.4
    )


#launch fiftyone session to see predictions
session = fo.launch_app(dataset_full)


#export predictions
dataset_full.export(
        export_dir=predictions_directory,
        dataset_type=fo.types.YOLOv5Dataset,
        label_field="prediction",
        include_confidence=True
    )

# Timings for importing the image, predicting, and exporting:
# 7m 53s using CPU (AMD Ryzen 5 5500 3.6 GHz 6-Core Processor)
# 2m 38s using GPU (EVGA SC GAMING GeForce GTX 1060 3GB 3 GB Video Card)
# with 64 GB DDR4-3200 RAM

# For the half-resolution dataset it took 38.1 seconds using GPU


# Create a test set for validation

To assess model performance, we compare the predictions against a labeled random subset of the image.

To do this, we split the predicted image and its detections into many smaller tiles, then randomly select 40% of them to act as our test set.
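
The random selection step can be sketched as follows. This is only an illustration under stated assumptions: `select_test_tiles`, its `seed` argument, and the tile names are hypothetical, and the actual selection logic lives in `scripts/yolo_tile_modified.py`, which may differ.

```python
import random


def select_test_tiles(tile_names, test_ratio=0.4, seed=0):
    """Pick a reproducible random subset of tiles for the test set."""
    rng = random.Random(seed)                 # local RNG, so global state is untouched
    n_test = round(len(tile_names) * test_ratio)
    # sort first so the result does not depend on the input order
    return sorted(rng.sample(sorted(tile_names), n_test))


# ten made-up tile names; 40% of them are drawn for the test set
tiles = [f"tile_{i}.png" for i in range(10)]
test_tiles = select_test_tiles(tiles, test_ratio=0.4, seed=1)
```

Fixing the seed keeps the test set stable between runs, which matters here because the selected tiles are annotated by hand.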


In [None]:

# set parameters for tiling script
# -------------------------------------------------------------------------------------------------------------------- #
source_path = predictions_directory                    # directory where model prediction exported to
target_path = "./datasets/testset/tiled_testset"       # directory to save tiled testset
img_ext = ".png"                                       # type of image predicted on
tile_size = 608                                        # size of tiles in pixels. default 608
all_images = "TRUE"                                    # set to FALSE to only keep images with detections
test_ratio = 0.4                                       # proportion of tiles to keep for testset
overwrite = "TRUE"                                     # set to TRUE to create a new test list; otherwise the existing list one folder up from target is reused
# -------------------------------------------------------------------------------------------------------------------- #


# disable Pillow's maximum image size check;
# otherwise Pillow may flag the large orthomosaic as a decompression bomb (DoS attack)
Image.MAX_IMAGE_PIXELS = None


# run tiling script using the above parameters
%run scripts/yolo_tile_modified.py -source {source_path} -target {target_path} -ext {img_ext}  -size {tile_size} -ratio {test_ratio} -overwrite {overwrite} -all_images {all_images}



# Annotate test set 

Before continuing to `03_validation_and_optimization.ipynb`, the tiled testset must be annotated with ground truth labels.

Import the tiled images in the testset folder into Label Studio and annotate them. These annotations are used in the next notebook to validate the model predictions and optimize the detections.