<a href="https://colab.research.google.com/github/aubreymoore/crb-damage-detector-colab/blob/main/detect_and_annotate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%pip install ipython-autotime -q
%load_ext autotime

time: 411 µs (started: 2025-08-11 08:10:48 +00:00)


# detect_and_annotate.ipynb

NOTE: The following documentation is already slightly out of date.
Please visit https://github.com/aubreymoore/crb-damage-detector-colab before running this notebook for the first time.

This Colab Jupyter notebook runs a custom YOLOv8 object detector which scans images to find three object classes: live coconut palms, dead coconut palms and v-shaped cuts symptomatic of damage caused by coconut rhinoceros beetle, *Oryctes rhinoceros*.

IMPORTANT: Shortly after the MAIN PROGRAM section begins executing, a BROWSE button will appear below the active cell so that you can upload a single file of input data from your local machine to Colab.

**Note that Colab will wait and do nothing until you have selected a text file of URLs or a ZIP file of images on your local machine.** [Click here to scroll down to the "Browse" button.](#scrollTo=5zSjfTXvIv2q&line=1&uniqifier=1)

You may choose between 2 options:
* A TEXT file (\*.txt) containing URLs for images to be scanned. One URL per line. (This is the most efficient option.)
* A ZIP file (\*.zip) containing images to be scanned.
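
As a minimal sketch of the TEXT file format described above (one URL per line), the hypothetical helper below keeps only lines that look like http(s) URLs. The name `parse_url_list` and the example URLs are illustrative assumptions and are not part of this notebook.

```python
def parse_url_list(text):
    """Return the image URLs found in text, one per non-blank line."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not an http(s) URL.
        if line.startswith(('http://', 'https://')):
            urls.append(line)
    return urls


# Made-up sample content, for illustration only.
sample = """
https://example.org/images/a.jpg

not a url
https://example.org/images/b.jpg
"""
print(parse_url_list(sample))
```

Filtering the list up front means a stray blank line or comment in the file does not show up later as a failed download.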


Test data are available in a companion GitHub repository at https://github.com/aubreymoore/crb-damage-detector-colab. To use the test data, download the repository to your local computer as a [ZIP file](https://github.com/aubreymoore/crb-damage-detector-colab/archive/refs/heads/main.zip) and unzip it. If you have **git** installed, you can clone the repo instead. The TEXT file or ZIP file to be uploaded to Colab is in the repository's **data** folder.

To scan images, select **Runtime | Run all** on the main menu.
Results will be in a temporary OUTPUT folder which you can access using the **File browser** in the left Colab panel.

When image scanning is complete, the OUTPUT folder will be compressed into a single ZIP file and automatically downloaded to your computer.

### TODO

- [x] (finished 2024-10-19) Reduce size of images in the companion GH repo to max dimension of 960px
- [ ] Copy current trained model to companion GH repo
- [ ] Copy this Jupyter notebook to companion GH repo
- [ ] Add confidence values to bounding box labels.
- [ ] Add database to OUTPUT folder
- [ ] Extract GPS coordinates from image files
- [ ] Figure out how to use URLs to access images stored on OneDrive (Sharepoint)

The following cell contains a link to a text file which lists the image files to be scanned.

For example,
```https://github.com/aubreymoore/crb-damage-detector-colab/raw/refs/heads/main/data/urls.txt``` links to a file which contains:
```
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_0532.JPG?raw=true
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_0671.JPG?raw=true
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_06XX.JPG?raw=true
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_0695.JPG?raw=true
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_0704.JPG?raw=true
https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_0713.JPG?raw=true
```
Edit the following cell so that the URL points to your own data, then ```Run All```.

In [2]:
IMAGE_LIST_URL = 'https://github.com/aubreymoore/crb-damage-detector-colab/raw/refs/heads/main/data/urls.txt'

time: 641 µs (started: 2025-08-11 08:10:48 +00:00)


# Load Python packages which are not preinstalled by Colab

In [3]:
%pip install ultralytics -q
%pip install supervision -q
%pip install imutils -q
%pip install icecream -q

time: 18.8 s (started: 2025-08-11 08:10:48 +00:00)


# Import modules

In [4]:
import cv2
import supervision as sv
from ultralytics import YOLO
import imutils
import glob
import os
import shutil
from skimage import io
from icecream import ic
from google.colab import files
# ultralytics.checks()

time: 7.34 s (started: 2025-08-11 08:11:07 +00:00)


# Define functions

In [5]:
def get_list_from_url(url):
  """
  Downloads a text file from a URL and returns a list of its lines.
  """
  try:
    # Download the file
    !wget -q -O temp.txt {url}
    # Read the lines into a list
    with open('temp.txt', 'r') as f:
      lines = f.read().splitlines()
    # Clean up the temporary file
    os.remove('temp.txt')
    return lines
  except Exception as e:
    print(f"Error downloading or reading file from {url}: {e}")
    return None

# # Example usage (replace with actual URL)
# url = 'https://github.com/aubreymoore/crb-damage-detector-colab/raw/refs/heads/main/data/urls.txt'
# my_list = get_list_from_url(url)
# if my_list:
#   print("List created from URL:")
#   print(my_list)
# else:
#   print("Failed to create list from URL.")

time: 1.54 ms (started: 2025-08-11 08:11:14 +00:00)
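
`get_list_from_url` relies on the IPython `!wget` shell escape, which does not raise a Python exception when the download fails, so the `except` branch is unlikely to catch a bad URL. A minimal sketch of a pure-Python alternative using the standard library follows. The name `read_lines_from_url` and the `timeout` parameter are assumptions made for illustration.

```python
import urllib.request


def read_lines_from_url(url, timeout=30):
    """Fetch a text file and return its non-empty lines, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            text = response.read().decode('utf-8')
    except (OSError, ValueError) as e:
        # URLError and HTTPError are OSError subclasses; a malformed URL
        # or undecodable content raises ValueError.
        print(f'Error downloading or reading file from {url}: {e}')
        return None
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example usage (requires network access):
# urls = read_lines_from_url(IMAGE_LIST_URL)
```

Because the error is raised inside Python, the caller can tell a failed download (`None`) apart from an empty list.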


In [6]:
def upload_model_weights():
  '''
  Download model weights from the GitHub repo to **weights.pt** only if this file does not already exist.
  '''
  !wget -nc https://github.com/aubreymoore/code-for-CRB-damage-ai/raw/refs/heads/main/models/3class/train5/weights/best.pt -O weights.pt

# upload_model_weights()

time: 815 µs (started: 2025-08-11 08:11:14 +00:00)
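
The "only if this file does not already exist" behaviour can also be written in plain Python, which makes the check explicit. This is a sketch under assumptions: `download_if_missing` is a hypothetical name, and the temporary `.part` file is an illustrative design choice rather than something the notebook does.

```python
import os
import urllib.request


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if dest was already present.
    """
    if os.path.exists(dest):
        return False
    # Download to a temporary name first so that an interrupted transfer
    # does not leave a truncated file at dest.
    tmp = dest + '.part'
    urllib.request.urlretrieve(url, tmp)
    os.replace(tmp, dest)
    return True


# Example usage (requires network access; WEIGHTS_URL is a placeholder):
# download_if_missing(WEIGHTS_URL, 'weights.pt')
```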


In [7]:
def load_model_weights():
  # Load the trained model from weights.pt and return it to the caller.
  return YOLO('weights.pt')

time: 657 µs (started: 2025-08-11 08:11:14 +00:00)


In [8]:
def create_input_folder():
  if not os.path.exists('INPUT'):
    os.makedirs('INPUT')

# create_input_folder()

time: 749 µs (started: 2025-08-11 08:11:14 +00:00)


In [9]:
def create_output_folder():
  if not os.path.exists('OUTPUT'):
    os.makedirs('OUTPUT')

# create_output_folder()

time: 767 µs (started: 2025-08-11 08:11:14 +00:00)


In [10]:
def run_garbage_disposal():
  '''
  Delete any data files left over from the last run.
  '''
  shutil.rmtree('INPUT', ignore_errors=True)
  shutil.rmtree('OUTPUT', ignore_errors=True)
  shutil.rmtree('sample_data', ignore_errors=True)

  try:
    os.remove('weights.pt')
  except OSError:
    pass

# run_garbage_disposal()

time: 1.03 ms (started: 2025-08-11 08:11:14 +00:00)


In [11]:
def upload_and_unpack_zip_or_txt():
  '''
  Upload images (*.zip) or list of URLs (*.txt)
  '''
  uploaded = files.upload(target_dir='INPUT')
  filename = list(uploaded.keys())[0]

  urls = None
  image_file_dir = None

  if filename.endswith('.txt'):
    input_mode = 'text'
    with open(filename, 'r') as f:
      urls = f.read().splitlines()
  elif filename.endswith('.zip'):
    input_mode = 'zip'
    !unzip -q $filename -d INPUT
    image_file_dir = f'INPUT/{filename}'.replace('.zip', '')
    ic(image_file_dir)
  else:
    raise ValueError('INPUT file must be *.txt or *.zip.')
  return input_mode, urls, image_file_dir

# input_mode, urls, image_file_dir = upload_and_unpack_zip_or_txt()
# ic(input_mode)
# ic(urls)
# ic(image_file_dir)

time: 1.65 ms (started: 2025-08-11 08:11:14 +00:00)
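
`upload_and_unpack_zip_or_txt` derives `image_file_dir` from the ZIP file's name, which assumes the archive contains a top-level folder of the same name. A minimal sketch that avoids that assumption by using the standard `zipfile` module and returning the extracted member paths is shown below. `unpack_zip` is a hypothetical helper, not part of this notebook.

```python
import os
import zipfile


def unpack_zip(zip_path, dest_dir='INPUT'):
    """Extract zip_path into dest_dir and return the extracted file paths."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
        # Directory entries end with '/'; keep regular files only.
        names = [n for n in zf.namelist() if not n.endswith('/')]
    return [os.path.join(dest_dir, n) for n in names]


# Example usage, assuming a file named images.zip has been uploaded:
# image_paths = unpack_zip('INPUT/images.zip', 'INPUT')
```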


In [12]:
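def get_input_file_list():
  # Return files only; the recursive pattern also matches directories.
  paths = glob.glob('INPUT/**/*', recursive=True)
  return [p for p in paths if os.path.isfile(p)]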

# get_input_file_list()

time: 633 µs (started: 2025-08-11 08:11:14 +00:00)
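
The `INPUT/**/*` pattern matches every path under INPUT, including non-image files such as the uploaded ZIP itself. A minimal sketch of a stricter listing that keeps only files with an image extension follows. The helper name `list_image_files` and the set of extensions are assumptions and may need adjusting for your data.

```python
from pathlib import Path

# Assumed set of image extensions; extend as needed.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp', '.tif', '.tiff'}


def list_image_files(root='INPUT'):
    """Return sorted paths of image files found anywhere under root."""
    return sorted(
        str(p)
        for p in Path(root).rglob('*')
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Example usage:
# for image_path in list_image_files('INPUT'):
#     print(image_path)
```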


In [13]:
def detect_objects(image, model, box_annotator, label_annotator, csv_sink):
  '''
  detect objects in an image
  returns detections and an annotated image
  '''
  results = model(image)[0]
  detections = sv.Detections.from_ultralytics(results)
  # ic(detections)
  annotated_image = box_annotator.annotate(image, detections=detections)
  labels = [f"{model.model.names[class_id]} {confidence:.2f}" for class_id, confidence in zip(detections.class_id, detections.confidence)]
  annotated_image = label_annotator.annotate(scene=annotated_image, detections=detections, labels=labels)
  return detections, annotated_image

# csv_sink = sv.CSVSink('detections.csv')
# csv_sink.open()

# upload_model_weights()
# model = YOLO('weights.pt')
# box_annotator = sv.BoxAnnotator()
# label_annotator = sv.LabelAnnotator()

# url = 'https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/Vanuatu_July_2022_Sulav/resized-images/IMG_0532.JPG?raw=true'
# image = imutils.url_to_image(url)
# detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
# ic(detections)
# sv.plot_image(annotated_image)

# custom_data = {'url': url}
# csv_sink.append(detections, custom_data)

# csv_sink.close()

time: 4.87 ms (started: 2025-08-11 08:11:14 +00:00)
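
`detect_objects` assumes that `image` is a valid array. `cv2.imread` returns `None` when a file cannot be read, and Ultralytics falls back to its bundled sample images when the source passed to the model is `None`, which is a likely explanation for the `bus.jpg` and `zidane.jpg` lines in the MAIN PROGRAM output. A minimal sketch of a guard that could be called before detection is shown below. `require_image` is a hypothetical helper, not part of this notebook.

```python
def require_image(image, source):
    """Return image unchanged, or raise ValueError if loading failed."""
    # A failed load gives None; an empty array has size 0.
    if image is None or getattr(image, 'size', 0) == 0:
        raise ValueError(f'No image could be loaded from {source}')
    return image


# Example usage inside the scanning loop:
# image = require_image(imutils.url_to_image(url), url)
# detections, annotated_image = detect_objects(
#     image, model, box_annotator, label_annotator, csv_sink)
```

Raising early keeps the model from running on unrelated sample images and makes the reason for the failure explicit.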


# MAIN PROGRAM

In [14]:
# Clear data files from previous run
run_garbage_disposal()

create_input_folder()
create_output_folder()

# Upload images or list of URLs
# input_mode, urls, image_file_dir = upload_and_unpack_zip_or_txt()

urls = get_list_from_url(IMAGE_LIST_URL)

# Upload weights from trained model and load them
upload_model_weights()
model = YOLO('weights.pt')

box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
csv_sink = sv.CSVSink('OUTPUT/detections.csv')
csv_sink.open()

input_mode = 'text' # force user to provide a list of urls

# Scan images
if input_mode == 'text':
  for url in urls:
    try:
      image = imutils.url_to_image(url)
      detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
      csv_sink.append(
          detections,
          custom_data={'image_h': image.shape[0], 'image_w': image.shape[1], 'source': url}
      )

      # Extract filename from URL, dropping any query string
      filename = url.split('/')[-1].split('?')[0]
      # Insert the suffix before the final extension only
      root, ext = os.path.splitext(filename)
      output_path = f'OUTPUT/{root}_annotated{ext}'
      ic(output_path)
      os.makedirs(os.path.dirname(output_path), exist_ok = True)
      cv2.imwrite(output_path, annotated_image)
    except Exception:
      # Report the failure and move on to the next URL
      print(f'Error processing {url}')

if input_mode == 'zip':
  input_file_list = get_input_file_list()
  ic(input_file_list)
  for image_path in input_file_list:
    ic(image_path)
    try:
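      image = cv2.imread(image_path)
      if image is None:  # cv2.imread returns None for unreadable files
        raise ValueError('not a readable image')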
      detections, annotated_image = detect_objects(image, model, box_annotator, label_annotator, csv_sink)
      csv_sink.append(
          detections,
          custom_data={'image_h': image.shape[0], 'image_w': image.shape[1], 'source': image_path}
      )

      filename = os.path.basename(image_path)
      # Insert the suffix before the final extension only
      root, ext = os.path.splitext(filename)
      output_path = f'OUTPUT/{root}_annotated{ext}'
      os.makedirs(os.path.dirname(output_path), exist_ok = True)
      result = cv2.imwrite(output_path, annotated_image)
    except Exception:
      # Report the failure and move on to the next file
      print(f'Error processing {image_path}')

csv_sink.close()

--2025-08-11 08:11:15--  https://github.com/aubreymoore/code-for-CRB-damage-ai/raw/refs/heads/main/models/3class/train5/weights/best.pt
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/aubreymoore/code-for-CRB-damage-ai/refs/heads/main/models/3class/train5/weights/best.pt [following]
--2025-08-11 08:11:16--  https://raw.githubusercontent.com/aubreymoore/code-for-CRB-damage-ai/refs/heads/main/models/3class/train5/weights/best.pt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6269721 (6.0M) [application/octet-stream]
Saving to: ‘weights.pt’


2025-08-11 08:11:16 (136 MB/s) - ‘weights.pt’ saved [62

ic| output_path: 'OUTPUT/IMG_0532_annotated.JPG'



0: 736x960 4 lives, 1 dead, 1664.0ms
Speed: 30.7ms preprocess, 1664.0ms inference, 3.0ms postprocess per image at shape (1, 3, 736, 960)


ic| output_path: 'OUTPUT/IMG_0671_annotated.JPG'



image 1/2 /usr/local/lib/python3.11/dist-packages/ultralytics/assets/bus.jpg: 960x736 (no detections), 1169.6ms
image 2/2 /usr/local/lib/python3.11/dist-packages/ultralytics/assets/zidane.jpg: 544x960 (no detections), 610.7ms
Speed: 26.9ms preprocess, 890.1ms inference, 1.2ms postprocess per image at shape (1, 3, 544, 960)
Error processing https://github.com/aubreymoore/crb-damage-detector-colab/blob/main/data/images/IMG_06XX.JPG?raw=true

0: 960x736 3 lives, 954.2ms
Speed: 13.8ms preprocess, 954.2ms inference, 2.1ms postprocess per image at shape (1, 3, 960, 736)


ic| output_path: 'OUTPUT/IMG_0695_annotated.JPG'



0: 736x960 4 lives, 3 vcuts, 900.1ms
Speed: 6.3ms preprocess, 900.1ms inference, 2.0ms postprocess per image at shape (1, 3, 736, 960)


ic| output_path: 'OUTPUT/IMG_0704_annotated.JPG'



0: 960x736 2 lives, 1 vcut, 404.4ms
Speed: 34.2ms preprocess, 404.4ms inference, 1.4ms postprocess per image at shape (1, 3, 960, 736)


ic| output_path: 'OUTPUT/IMG_0713_annotated.JPG'


time: 16.4 s (started: 2025-08-11 08:11:14 +00:00)
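
The main loop derives each output filename from the last path segment of the image URL, dropping the query string. A minimal sketch of the same step using `urllib.parse` is shown below. It splits on the final extension only, so a name containing several dots keeps them. `annotated_name_from_url` is a hypothetical helper name.

```python
import os
from urllib.parse import urlsplit, unquote


def annotated_name_from_url(url, suffix='_annotated'):
    """Build an output filename such as IMG_0532_annotated.JPG from a URL."""
    # urlsplit separates the path from the query string (?raw=true).
    path = unquote(urlsplit(url).path)
    root, ext = os.path.splitext(os.path.basename(path))
    return f'{root}{suffix}{ext}'


url = ('https://github.com/aubreymoore/crb-damage-detector-colab/'
       'blob/main/data/images/IMG_0532.JPG?raw=true')
print(annotated_name_from_url(url))  # IMG_0532_annotated.JPG
```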


## Please click on the Browse button when it appears above this cell.

### Download OUTPUT folder as a ZIP file

In [15]:
!zip -r OUTPUT.zip OUTPUT

updating: OUTPUT/ (stored 0%)
updating: OUTPUT/IMG_0671_annotated.JPG (deflated 1%)
updating: OUTPUT/IMG_0532_annotated.JPG (deflated 0%)
updating: OUTPUT/detections.csv (deflated 78%)
updating: OUTPUT/IMG_0704_annotated.JPG (deflated 0%)
updating: OUTPUT/IMG_0713_annotated.JPG (deflated 1%)
updating: OUTPUT/IMG_0695_annotated.JPG (deflated 1%)
time: 104 ms (started: 2025-08-11 08:11:31 +00:00)
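
The `updating:` lines above indicate that `zip -r` modified an existing OUTPUT.zip, so entries left over from an earlier run may remain in the archive. A minimal sketch that rebuilds the archive from scratch with `shutil.make_archive` follows. The helper name `zip_folder` is an assumption made for illustration.

```python
import os
import shutil


def zip_folder(folder='OUTPUT', archive_base=None):
    """Create <archive_base>.zip containing folder as a top-level directory.

    Any existing archive with the same name is overwritten.
    Returns the path of the archive that was written.
    """
    folder = os.path.abspath(folder)
    archive_base = os.path.abspath(archive_base or folder)
    return shutil.make_archive(
        archive_base,
        'zip',
        root_dir=os.path.dirname(folder),
        base_dir=os.path.basename(folder),
    )


# Example usage: writes OUTPUT.zip next to the OUTPUT folder.
# zip_folder('OUTPUT')
```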


In [16]:
from google.colab import files
files.download("OUTPUT.zip")

time: 5.29 ms (started: 2025-08-11 08:11:31 +00:00)


# FINISHED
If everything worked as intended, you should find a file named **OUTPUT.zip** in your Downloads folder. Unzip this file to see results.
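
Once OUTPUT.zip is unzipped, `detections.csv` can be summarised with the standard library. The sketch below counts detections per class. It assumes the CSV has a `class_name` column, which should be checked against the header of your own file, and the sample rows are made up for illustration. `count_by_class` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def count_by_class(csv_file, class_column='class_name'):
    """Count rows per class in an open detections CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[class_column] for row in reader)


# Made-up sample rows; the real column layout is an assumption.
sample_csv = io.StringIO(
    'x_min,y_min,x_max,y_max,class_id,confidence,class_name,source\n'
    '10,20,110,220,0,0.91,live,a.jpg\n'
    '30,40,130,240,0,0.85,live,a.jpg\n'
    '50,60,150,260,2,0.77,vcut,b.jpg\n'
)
counts = count_by_class(sample_csv)
print(dict(counts))

# Example usage with a real file:
# with open('OUTPUT/detections.csv', newline='') as f:
#     print(count_by_class(f))
```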

In [17]:

print('FINISHED')

FINISHED
time: 2.73 ms (started: 2025-08-11 08:11:31 +00:00)
