<a href="https://colab.research.google.com/github/aubreymoore/crb-damage-detector-colab/blob/main/detect_and_annotate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# detect_and_annotate.ipynb

NOTE: The following documentation is already slightly out of date.
Please visit https://github.com/aubreymoore/crb-damage-detector-colab before running this notebook for the first time.

This Colab Jupyter notebook runs a custom YOLOv8 object detector which scans images to find three object classes: live coconut palms, dead coconut palms and v-shaped cuts symptomatic of damage caused by coconut rhinoceros beetle, *Oryctes rhinoceros*.

IMPORTANT: Shortly after the MAIN PROGRAM section begins executing, a BROWSE button will appear below the active cell to allow you to upload single file of input data from your loacal machine to Colab.

**Note that Colab will just sit there and not do anything until you have entered a path to a test file of URLs or a ZIP file of images on your local machine.** [Click here to scroll down to the "Browser" button.](#scrollTo=5zSjfTXvIv2q&line=1&uniqifier=1)

You may choose between 2 options:
* A TEXT file (\*.txt) containing URLs for images to be scanned. One URL per line. (This is the most efficient option.)
* A ZIP file (\*.zip) containing images to be scanned.


Test data are available in a companion GitHub repository at https://github.com/aubreymoore/crb-damage-detector-colab. To use the test data, download it to your local computer it as a [ZIP file](https://github.com/aubreymoore/crb-damage-detector-colab/archive/refs/heads/main.zip) and unzip it. If you have **git** installed, you can clone the repo as an alternative. The TEXT file or ZIP file to be uploaded to Colab will be found in the repository's **data** folder.

To scan images, select **Runtime | Run all** on the main menu.
Results will be in a temporary OUTPUT folder which you can access using the **File browser** in the left Colab panel.

When image scanning is complete, the OUTPUT folder will be compressed into a single ZIP file and automatically downloaded to your computer.

### TODO

- [ finished 2024-10-19] Reduce size of images in the companion GH repo to max dimension of 960px
- [ ] Copy current trained model to companion GH repo
- [ ] Copy this Jupyter notebook to companion GH repo
- [ ] Add confidence values to bounding box labels.
- [ ] Add database to OUTPUT folder
- [ ] Extract GPS coordinates from image files
- [ ] Figure out how to use URLs to access images stored on OneDrive (Sharepoint)

# Load Python packages which are not preinstalled by Colab

In [2]:
%pip install ultralytics -q
%pip install supervision -q
%pip install imutils -q
%pip install icecream -q
%pip install ipython-autotime -q

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m19.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m363.4/363.4 MB[0m [31m4.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m13.8/13.8 MB[0m [31m48.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m24.6/24.6 MB[0m [31m36.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m883.7/883.7 kB[0m [31m37.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m664.8/664.8 MB[0m [31m2.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m211.5/211.5 MB[0m [31m5.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m56.3/56.3 MB[0m [31m11.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

# Import modules

In [3]:
import cv2
import supervision as sv
from ultralytics import YOLO
import imutils
import glob
import os
import shutil
from skimage import io
from icecream import ic
from google.colab import files
# ultralytics.checks()

Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.


# Load cell timer

In [4]:
%load_ext autotime

time: 347 µs (started: 2025-08-09 22:38:38 +00:00)


# Define functions

In [5]:
def upload_model_weights():
  '''
  Upload model weights from GitHub repo to **weights.pt** only if this file does not already exist.
  '''
  !wget -nc https://github.com/aubreymoore/code-for-CRB-damage-ai/raw/refs/heads/main/models/3class/train5/weights/best.pt -O weights.pt

# upload_model_weights()

time: 543 µs (started: 2025-08-09 22:38:45 +00:00)


In [6]:
def load_model_weights():
  model = YOLO('weights.pt')

time: 445 µs (started: 2025-08-09 22:38:52 +00:00)


In [7]:
def create_input_folder():
  if not os.path.exists('INPUT'):
    os.makedirs('INPUT')

# create_input_folder()

time: 524 µs (started: 2025-08-09 22:38:55 +00:00)


In [8]:
def create_output_folder():
  if not os.path.exists('OUTPUT'):
    os.makedirs('OUTPUT')

# create_output_folder()

time: 508 µs (started: 2025-08-09 22:38:59 +00:00)


In [9]:
def run_garbage_disposal():
  '''
  Delete any data files left over from the last run.
  '''
  shutil.rmtree('INPUT', ignore_errors=True)
  shutil.rmtree('OUTPUT', ignore_errors=True)
  shutil.rmtree('sample_data', ignore_errors=True)

  try:
    os.remove('weights.pt')
  except OSError:
    pass

# run_garbage_disposal()

time: 718 µs (started: 2025-08-09 22:39:03 +00:00)


In [10]:
def upload_and_unpack_zip_or_txt():
  '''
  Upload images (*.zip) or list of URLs (*.txt)
  '''
  uploaded = files.upload(target_dir='INPUT')
  filename = list(uploaded.keys())[0]

  urls = None
  image_file_dir = None

  if filename.endswith('.txt'):
    input_mode = 'text'
    with open(filename, 'r') as f:
      urls = f.read().splitlines()
  elif filename.endswith('.zip'):
    input_mode = 'zip'
    !unzip -q $filename -d INPUT
    image_file_dir = f'INPUT/{filename}'.replace('.zip', '')
    ic(image_file_dir)
  else:
    raise ValueError('INPUT file must be *.txt or *.zip.')
  return input_mode, urls, image_file_dir

# input_mode, urls, image_file_dir = upload_and_unpack_zip_or_txt()
# ic(input_mode)
# ic(urls)
# ic(image_file_dir)

time: 994 µs (started: 2025-08-09 22:39:07 +00:00)


In [11]:
def get_input_file_list():
  return glob.glob(f'INPUT/**/*', recursive=True)

# get_input_file_list()

time: 477 µs (started: 2025-08-09 22:39:11 +00:00)


In [12]:
def detect_objects(image, model, box_annotator, label_annotator, csv_sink):
  '''
  detect objects in an image
  returns detections and an annotated image
  '''
  results = model(image)[0]
  detections = sv.Detections.from_ultralytics(results)
  # ic(detections)
  annotated_image = box_annotator.annotate(image, detections=detections)
  labels = [f"{model.model.names[class_id]} {confidence:.2f}" for class_id, confidence in zip(detections.class_id, detections.confidence)]
  annotated_image = label_annotator.annotate(scene=annotated_image, detections=detections, labels=labels)
  return detections, annotated_image

# csv_sink = sv.CSVSink('detections.csv')
# csv_sink.open()

# upload_model_weights()
# model = YOLO('weights.pt')
# box_annotator = sv.BoxAnnotator()
# label_annotator = sv.LabelAnnotator()

# url = 'https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/Vanuatu_July_2022_Sulav/resized-images/IMG_0532.JPG?raw=true'
# image = imutils.url_to_image(url)
# detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
# ic(detections)
# sv.plot_image(annotated_image)

# custom_data = {'url': url}
# csv_sink.append(detections, custom_data)

# csv_sink.close()

time: 1.07 ms (started: 2025-08-09 22:39:16 +00:00)


# MAIN PROGRAM

In [17]:
# Clear data files from previous run
run_garbage_disposal()

create_input_folder()
create_output_folder()

# Upload images or list of URLs
input_mode, urls, image_file_dir = upload_and_unpack_zip_or_txt()

# Upload weights from trained model and load them
upload_model_weights()
model = YOLO('weights.pt')

box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
csv_sink = sv.CSVSink('OUTPUT/detections.csv')
csv_sink.open()

# Scan images
if input_mode == 'text':
  for url in urls:
    try:
      image = imutils.url_to_image(url)
      detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
      csv_sink.append(
          detections,
          custom_data={'image_h': image.shape[0], 'image_w': image.shape[1], 'source': url}
      )

      # Extract filename from URL
      filename = url.split('/')[-1]
      pos = filename.find('?')
      if pos >= 0:
        filename = filename[:pos]
      output_path = f'OUTPUT/{filename}'.replace('.', '_annotated.')
      ic(output_path)
      os.makedirs(os.path.dirname(output_path), exist_ok = True)
      cv2.imwrite(output_path, annotated_image)
    except:
      print(f'Error processing {url}')
    continue

if input_mode == 'zip':
  input_file_list = get_input_file_list()
  ic(input_file_list)
  for image_path in input_file_list:
    ic(image_path)
    try:
      image = cv2.imread(image_path)
      detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
      csv_sink.append(
          detections,
          custom_data={'image_h': image.shape[0], 'image_w': image.shape[1], 'source': image_path}
      )

      filename = os.path.basename(image_path)
      output_path = f'OUTPUT/{filename}'.replace('.', '_annotated.')
      os.makedirs(os.path.dirname(output_path), exist_ok = True)
      result = cv2.imwrite(output_path, annotated_image)
    except:
      print(f'Error processing {image_path}')
    continue

csv_sink.close()

Saving vcuts_maui.zip to INPUT/vcuts_maui.zip


ic| image_file_dir: 'INPUT/INPUT/vcuts_maui'


--2025-08-09 22:55:20--  https://github.com/aubreymoore/code-for-CRB-damage-ai/raw/refs/heads/main/models/3class/train5/weights/best.pt
Resolving github.com (github.com)... 140.82.114.3
Connecting to github.com (github.com)|140.82.114.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/aubreymoore/code-for-CRB-damage-ai/refs/heads/main/models/3class/train5/weights/best.pt [following]
--2025-08-09 22:55:20--  https://raw.githubusercontent.com/aubreymoore/code-for-CRB-damage-ai/refs/heads/main/models/3class/train5/weights/best.pt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6269721 (6.0M) [application/octet-stream]
Saving to: ‘weights.pt’


2025-08-09 22:55:21 (69.4 MB/s) - ‘weights.pt’ saved [6

ic| input_file_list: ['INPUT/vcuts_maui.zip',
                      'INPUT/Ulua Beach.jpg',
                      'INPUT/Kaanapali Golf Course.jpg',
                      'INPUT/Hyatt coconut.jpg']
ic| image_path: 'INPUT/vcuts_maui.zip'



image 1/2 /usr/local/lib/python3.11/dist-packages/ultralytics/assets/bus.jpg: 960x736 (no detections), 453.3ms
image 2/2 /usr/local/lib/python3.11/dist-packages/ultralytics/assets/zidane.jpg: 544x960 (no detections), 299.3ms
Speed: 9.4ms preprocess, 376.3ms inference, 0.9ms postprocess per image at shape (1, 3, 544, 960)


ic| image_path: 'INPUT/Ulua Beach.jpg'


Error processing INPUT/vcuts_maui.zip

0: 736x960 1 live, 1 vcut, 428.8ms
Speed: 18.6ms preprocess, 428.8ms inference, 23.4ms postprocess per image at shape (1, 3, 736, 960)


ic| image_path: 'INPUT/Kaanapali Golf Course.jpg'



0: 960x736 2 vcuts, 452.6ms
Speed: 7.7ms preprocess, 452.6ms inference, 1.7ms postprocess per image at shape (1, 3, 960, 736)


ic| image_path: 'INPUT/Hyatt coconut.jpg'



0: 960x736 1 live, 452.0ms
Speed: 12.6ms preprocess, 452.0ms inference, 1.2ms postprocess per image at shape (1, 3, 960, 736)
time: 5min 10s (started: 2025-08-09 22:50:14 +00:00)


## Please click on the Browse buttom when it appears above this cell.

### Download OUTPUT folder as a ZIP file

In [18]:
!zip -r OUTPUT.zip OUTPUT

updating: OUTPUT/ (stored 0%)
updating: OUTPUT/detections.csv (deflated 43%)
  adding: OUTPUT/Kaanapali Golf Course_annotated.jpg (deflated 2%)
  adding: OUTPUT/Hyatt coconut_annotated.jpg (deflated 2%)
  adding: OUTPUT/Ulua Beach_annotated.jpg (deflated 9%)
time: 308 ms (started: 2025-08-09 23:04:26 +00:00)


In [19]:
from google.colab import files
files.download("OUTPUT.zip")

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

time: 4.84 ms (started: 2025-08-09 23:04:34 +00:00)


# FINISHED
If everything worked as intended, you should find a file named **OUTPUT.zip** in your Downloads folder. Unzip this file to see results.

In [None]:

print('FINISHED')