Lab 2: Performance of image classifiers
=====

In this exercise we will take some wildlife images captured by the general public, and evaluate the performance of some image recognition algorithms.

We will use images of parakeets, coming from the same data sample as in the previous exercise, but now we use the images and not just the metadata. They come from [GBIF](https://www.gbif.org/). Download the dataset **parakeetsNL400images** here (112MB): https://surfdrive.surf.nl/files/index.php/s/BzKBh8jDUagEEAV

2a: Online image classifiers
-----

First, process a selection of these images using some image recognition live-demo websites. Choose 4 image files to use. Store the predictions that you obtain (e.g. in a spreadsheet table). Store the FILENAME, PREDICTED LABELS as well as the PROBABILITIES, and if "bounding boxes" are provided make a note if they're good/bad.
1. ImageRecognize: https://imagerecognize.com/
2. Google Vision: https://cloud.google.com/vision/docs/drag-and-drop (don't log in -- just scroll down to the "Upload your image" box)
3. Azure Vision Studio: https://portal.vision.cognitive.azure.com/demo/generic-object-detection
4. *(LLMs such as ChatGPT/Gemini? We will consider those later)*


<!-- 3. /// https://astica.ai/vision/object-detection/ (note: they ask for a login, ugh)
4. /// This one is gone:LIACS Live demo (NB for me, file-upload failed but URLs work fine - the image URLs are in the spreadsheet):  http://destiny.liacs.nl/
-->
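
One way to keep the predictions in a consistent table is to write them to a CSV file that a spreadsheet can open. The following is a minimal sketch, not a required format: the column names (including `classifier` and `bbox_quality`) and the example rows are placeholders invented for illustration, and should be replaced with what each demo site actually returned.

```python
import csv
import io

# Column names follow the fields requested in the exercise; the exact names
# (and the "classifier" and "bbox_quality" columns) are assumptions of this sketch.
FIELDS = ["classifier", "filename", "predicted_label", "probability", "bbox_quality"]


def write_predictions(rows, handle):
    """Write one CSV row per (classifier, image, label) prediction."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Placeholder rows only: these are NOT real predictions.
example_rows = [
    {"classifier": "ExampleClassifier", "filename": "example_01.jpg",
     "predicted_label": "bird", "probability": 0.90, "bbox_quality": "good"},
    {"classifier": "ExampleClassifier", "filename": "example_01.jpg",
     "predicted_label": "tree", "probability": 0.55, "bbox_quality": ""},
]

buffer = io.StringIO()
write_predictions(example_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Storing one row per label (rather than one row per image) makes it easier to count correct, irrelevant and incorrect labels afterwards. To save to disk, pass a file opened with `open("predictions.csv", "w", newline="")` instead of the in-memory buffer.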

From the data you've collated, write notes about the quality, including these categories:

(1) Correct predictions for the birds. How precise are the labels? (e.g. is "bird" OK?) How confident are the predictions?

(2) True but irrelevant labels.

(3) Incorrect predictions for the birds. How confident are they? How much of a problem are they?

(1) Correct predictions:

- ImageRecognize: 9/10
- Google Vision: 9/10
- Azure Vision Studio: 1/10 for "parrot", 8/10 for "bird"
- GPT4o: 9/10

(2) True but irrelevant labels: all of the vision services assigned multiple labels per image, some of them less relevant, such as "bird", "animal" and "tree". GPT4o mainly mentioned "parrot".

(3) Incorrect predictions: on the first image all of the algorithms were wrong; none of them recognized the parrot, or any bird at all. On the other images they all recognized a bird, and all except Azure Vision Studio recognized that it was a parrot.

For the incorrect predictions, try to group the errors according to what you think is the CAUSE. For example: "poor image quality", "target animals too small in image", "definite error (i.e. a human would not make that mistake)", or "highly-similar category (a human might easily get the classes confused)". These are just examples, you can and should add your own categorisation of the errors.

In image 1 the parrot's colour is hard to distinguish from the tree: poor image quality, specifically low contrast between the bird and the background.

For each of the algorithms you have tested, write notes about what could be achieved if you had to use that algorithm to create a map of rose-ringed parakeets in NL.

Based on the accuracies from 10 images, ImageRecognize, Google Vision and GPT4o would detect about 90% of the parrots in observations from the Tilburg area, while Azure Vision Studio would detect only about 10%.
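
The percentages quoted here follow directly from the per-classifier counts recorded earlier in these notes. A small sketch of that arithmetic is given below; the dictionary keys are labels chosen for this example, and with only 10 images the resulting rates are rough estimates rather than reliable performance figures.

```python
# Counts of correct identifications out of 10 images, copied from the notes.
# Azure Vision Studio appears twice because it scored differently for the
# labels "parrot" and "bird".
N_IMAGES = 10
correct_counts = {
    "ImageRecognize (parrot)": 9,
    "Google Vision (parrot)": 9,
    "Azure Vision Studio (parrot)": 1,
    "Azure Vision Studio (bird)": 8,
    "GPT4o (parrot)": 9,
}


def detection_rate(n_correct, n_total):
    """Fraction of images for which the target label was returned."""
    return n_correct / n_total


rates = {name: detection_rate(count, N_IMAGES)
         for name, count in correct_counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```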


-----------

2b: MegaDetector
------

Next we will run MegaDetector on our images. MegaDetector is a general-purpose wildlife detector, based on YOLOv5. You can install MegaDetector using these instructions:
https://github.com/microsoft/CameraTraps/blob/main/megadetector.md

Alternatively, you can do it in Google Colab (BUT! Remember that the images would need to be on your Google Drive then, not stored locally.)
https://colab.research.google.com/github/microsoft/CameraTraps/blob/master/detection/megadetector_colab.ipynb

In [1]:
# Installing PytorchWildlife -- it may take a few minutes
!pip install PytorchWildlife


Collecting PytorchWildlife
  Using cached pytorchwildlife-1.0.2.17.tar.gz (42 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy (from PytorchWildlife)
  Downloading numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting torch==1.10.1 (from PytorchWildlife)
  Downloading torch-1.10.1-cp38-cp38-manylinux1_x86_64.whl.metadata (24 kB)
Collecting torchvision==0.11.2 (from PytorchWildlife)
  Downloading torchvision-0.11.2-cp38-cp38-manylinux1_x86_64.whl.metadata (8.8 kB)
Collecting torchaudio==0.10.1 (from PytorchWildlife)
  Downloading torchaudio-0.10.1-cp38-cp38-manylinux1_x86_64.whl.metadata (1.1 kB)
Collecting tqdm==4.66.1 (from PytorchWildlife)
  Downloading tqdm-4.66.1-py3-none-any.whl.metadata (57 kB)
Collecting Pillow==10.1.0 (from PytorchWildlife)
  Downloading Pillow-10.1.0-cp38-cp38-manylinux_2_28_x86_64.whl.metadata (9.5 kB)
Collecting supervision==0.19.0 (from PytorchWildlife)
  Downloading supervision-0.19.0-py

In [2]:
import os
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.data import transforms as pw_trans
from PytorchWildlife.data import datasets as pw_data 
from PytorchWildlife import utils as pw_utils
from torch.utils.data import DataLoader



Follow the instructions to apply MegaDetector to your parakeet images. You can try the images 1-by-1 but you should also be able to use the batch script to process a whole collection of images.

In [3]:
# This also takes time (it downloads a DL model)

# Initializing the MegaDetectorV5 model for image detection
detection_model = pw_detection.MegaDetectorV5(device='cpu', pretrained=True)

# Initializing the Yolo-specific transform for the image
transform = pw_trans.MegaDetector_v5_Transform(target_size=detection_model.IMAGE_SIZE,
                                               stride=detection_model.STRIDE)


Downloading: "https://zenodo.org/records/13357337/files/md_v5a.0.0.pt?download=1" to /home/danaconda/.cache/torch/hub/checkpoints/md_v5a.0.0.pt
100%|██████████| 268M/268M [01:27<00:00, 3.20MB/s] 
Fusing layers... 
Model summary: 733 layers, 140054656 parameters, 0 gradients, 208.8 GFLOPs


In [20]:
# CHANGE THIS
image_folder = 'datasets/parakeetsNL400images/parakeetsNL_otherspecies_images200'

# Creating a dataset of images with the specified transform
dataset = pw_data.DetectionImageFolder(
    image_folder,
    transform=transform,
    extension="jpg"
)

# Creating a DataLoader for batching and parallel processing of the images
loader = DataLoader(dataset, batch_size=1, shuffle=False, 
                    pin_memory=True, num_workers=1, drop_last=False)

In [21]:
# list files in image_folder
image_files = os.listdir(image_folder)
print(image_files)

['28905968.jpg:Zone.Identifier', '57879220.jpg:Zone.Identifier', '25118336.jpg:Zone.Identifier', '19184465.jpg:Zone.Identifier', '32421684.jpg:Zone.Identifier', '19385799.jpg:Zone.Identifier', '44529266.jpg', '26393802.jpg:Zone.Identifier', '19034275.jpg', '20811425.jpg', '16088253.jpg:Zone.Identifier', '46628314.jpg', '19812127.jpg', 'original.jpeg.10:Zone.Identifier', '34829248.jpg:Zone.Identifier', '24630035.jpg', 'original.jpg.8:Zone.Identifier', '54627232.jpg:Zone.Identifier', '19034275.jpg:Zone.Identifier', '27451399.jpg', '20740340.jpg:Zone.Identifier', '52197030.jpg', '24621078.jpg:Zone.Identifier', '12698535.jpg', '32683167.jpg:Zone.Identifier', '62132520.jpg', '45701293.jpg:Zone.Identifier', '26393813.jpg:Zone.Identifier', '19498436.jpg', '8154192.jpg:Zone.Identifier', '24879537.jpg', '33846140.jpg', '10211236.jpg', '33297653.jpg:Zone.Identifier', '19812130.jpg:Zone.Identifier', '46787659.jpg', '1185166.jpg', '12698536.jpg', '33760371.jpg', 'original.jpg.2', '26203862.jpg:Zon
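
The listing mixes real image files with `:Zone.Identifier` entries, which look like Windows download metadata stored alongside each file rather than images. The sketch below shows one way to keep only names whose final extension is an image suffix. The helper name and the suffix set are assumptions of this example, and it does not describe how `DetectionImageFolder` itself selects files. Note that a name such as `original.jpg.2` is also excluded by this rule, even though it may well be a valid image.

```python
from pathlib import PurePosixPath

# Assumed set of image extensions for this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def keep_image_files(names):
    """Return the sorted names whose final extension is an image suffix."""
    return sorted(
        name for name in names
        if PurePosixPath(name).suffix.lower() in IMAGE_SUFFIXES
    )


# A few names copied from the directory listing.
listing = [
    "28905968.jpg:Zone.Identifier",
    "44529266.jpg",
    "19034275.jpg",
    "19034275.jpg:Zone.Identifier",
    "original.jpeg.10:Zone.Identifier",
    "original.jpg.2",
]

image_names = keep_image_files(listing)
print(image_names)
```

Comparing the length of the filtered list with the number of images the detector actually processed is a quick check that no files were silently skipped.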

In [22]:
# THIS WILL TAKE A LONG TIME. And there's some risk it will crash on some machines.

# Performing batch detection on the images
results = detection_model.batch_image_detection(loader)

100%|██████████| 178/178 [09:49<00:00,  3.31s/it]
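
Because the batch detection takes several minutes, it can be worth saving the results so that the analysis can be repeated without rerunning the model. The sketch below is one possible approach, assuming each result is a dictionary with the `img_id`, `labels` and `normalized_coords` keys printed in this notebook. The `detections` field is left out because it holds NumPy arrays, which the `json` module cannot store directly. The helper name `to_records` is made up for this example.

```python
import json
import os


def to_records(results):
    """Keep only the plain-Python fields of each result so they can be stored as JSON."""
    return [
        {
            "file": os.path.basename(r["img_id"]),
            "labels": list(r["labels"]),
            "normalized_coords": [[float(v) for v in box]
                                  for box in r["normalized_coords"]],
        }
        for r in results
    ]


# Stand-in for the real `results` list, using values printed in this notebook.
sample_results = [
    {
        "img_id": "datasets/parakeetsNL400images/parakeetsNL_otherspecies_images200/10211236.jpg",
        "labels": ["animal 0.97"],
        "normalized_coords": [[0.0825, 0.09666666666666666, 0.91375, 0.9983333333333333]],
    },
]

serialized = json.dumps(to_records(sample_results), indent=2)
restored = json.loads(serialized)
print(restored[0]["file"], restored[0]["labels"])
```

To persist the records, write `serialized` to a file, for example with `open("megadetector_results.json", "w")`, and load it again with `json.load` in a later session.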


In [23]:
import os
for result in results:
    print(os.path.basename(result['img_id']))
    print(f"   {result['labels']}  {result['normalized_coords']} ")

10211236.jpg
   ['animal 0.97']  [[0.0825, 0.09666666666666666, 0.91375, 0.9983333333333333]] 
10262572.jpg
   ['animal 0.98']  [[0.00125, 0.09210526315789473, 0.83625, 0.9981203007518797]] 
10275335.jpg
   ['animal 0.91']  [[0.19, 0.19736842105263158, 0.655, 0.8872180451127819]] 
1030741.jpg
   ['animal 0.88']  [[0.48625, 0.4955555555555556, 0.58625, 0.6844444444444444]] 
1053823.jpg
   ['animal 0.64']  [[0.15, 0.17713004484304934, 0.89625, 0.6838565022421524]] 
1053826.jpg
   ['animal 0.47']  [[0.15375, 0.22935779816513763, 0.8075, 0.7504587155963303]] 
10744757.jpg
   ['animal 0.85']  [[0.335, 0.16891891891891891, 1.0, 0.8597972972972973]] 
10862039.jpg
   ['animal 0.97']  [[0.38125, 0.3333333333333333, 0.67375, 0.7933333333333333]] 
10914697.jpg
   ['animal 0.95']  [[0.505, 0.17529215358931552, 0.70875, 0.656093489148581]] 
112393.jpg
   ['animal 0.98']  [[0.20375, 0.08, 0.64125, 0.9383333333333334]] 
1185166.jpg
   ['animal 0.84']  [[0.4125, 0.3052434456928839, 0.5125, 0.470037453

In [36]:
results[0]

{'img_id': 'datasets/parakeetsNL400images/parakeetsNL_otherspecies_images200/10211236.jpg',
 'detections': Detections(xyxy=array([[         66,          58,         731,         599]], dtype=float32), mask=None, confidence=array([      0.971], dtype=float32), class_id=array([0]), tracker_id=None, data={}),
 'labels': ['animal 0.97'],
 'normalized_coords': [[0.0825,
   0.09666666666666666,
   0.91375,
   0.9983333333333333]]}
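
Each result holds the same box twice: in pixels (`xyxy`) and as fractions of the image size (`normalized_coords`). The sketch below converts between the two and computes how much of the image a box covers, which may help when judging whether the target animal is small in the frame. The 800x600 image size is an assumption inferred by dividing the pixel coordinates by the normalized ones for `10211236.jpg`, and the `MIN_AREA` cut-off is an arbitrary illustrative value, not a recommended threshold.

```python
def box_area_fraction(box):
    """Fraction of the image covered by a normalized [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def to_pixels(box, width, height):
    """Convert a normalized [x1, y1, x2, y2] box to integer pixel coordinates."""
    x1, y1, x2, y2 = box
    return [round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height)]


# Boxes copied from the printed results.
large_box = [0.0825, 0.09666666666666666, 0.91375, 0.9983333333333333]  # 10211236.jpg
small_box = [0.48625, 0.4955555555555556, 0.58625, 0.6844444444444444]  # 1030741.jpg

# Assumption: image size inferred from xyxy / normalized_coords, not read from the file.
pixel_box = to_pixels(large_box, width=800, height=600)

MIN_AREA = 0.05  # arbitrary cut-off for "animal is small in the frame"
large_area = box_area_fraction(large_box)
small_area = box_area_fraction(small_box)
print(pixel_box, round(large_area, 3), round(small_area, 3), small_area < MIN_AREA)
```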

In [56]:
from collections import Counter
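# Class name of the first detection per image ('animal 0.97' -> 'animal');
# images without any detection are skipped
labels = [result['labels'][0].rsplit(' ', 1)[0] for result in results if result['labels']]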
label_counts = Counter(labels)
print(label_counts)

Counter({'animal': 177})
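
Each label string, such as `'animal 0.97'`, combines a class name and a confidence. A small parser that splits on the last space keeps the two parts separate, so that detections can be counted per class and filtered by confidence. This is a sketch only: the function name `parse_label` and the 0.5 threshold are choices made for this example, and the label strings are copied from the output printed earlier.

```python
from collections import Counter


def parse_label(label):
    """Split a string such as 'animal 0.97' into ('animal', 0.97)."""
    name, score = label.rsplit(" ", 1)
    return name, float(score)


# Label strings copied from the first few printed results.
labels_seen = ["animal 0.97", "animal 0.98", "animal 0.91",
               "animal 0.88", "animal 0.64", "animal 0.47"]

parsed = [parse_label(label) for label in labels_seen]
class_counts = Counter(name for name, _ in parsed)

THRESHOLD = 0.5  # arbitrary illustrative confidence threshold
low_confidence = [(name, score) for name, score in parsed if score < THRESHOLD]

print(class_counts)
print(low_confidence)
```

Low-confidence detections are natural candidates for manual inspection when grouping errors by cause.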


In [63]:
# Confidence of the first detection per image, e.g. 'animal 0.97' -> 0.97
labels_float = [float(result['labels'][0].rsplit(' ', 1)[1]) for result in results if result['labels']]




In [64]:
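import matplotlib.pyplot as plt

# Bar chart of the detection confidence for each image, in processing order
plt.figure(figsize=(8, 5))
plt.bar(range(len(labels_float)), labels_float)
plt.xlabel('Image index')
plt.ylabel('Confidence')
plt.title('MegaDetector confidence per image')
plt.show()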



In [65]:
labels_float

[0.97,
 0.98,
 0.91,
 0.88,
 0.64,
 0.47,
 0.85,
 0.97,
 0.95,
 0.98,
 0.84,
 0.96,
 0.94,
 0.92,
 0.83,
 0.96,
 0.87,
 0.77,
 0.91,
 0.87,
 0.91,
 0.92,
 0.85,
 0.84,
 0.81,
 0.96,
 0.95,
 0.96,
 0.74,
 0.95,
 0.77,
 0.9,
 0.54,
 0.89,
 0.96,
 0.89,
 0.96,
 0.71,
 0.93,
 0.91,
 0.96,
 0.93,
 0.89,
 0.91,
 0.66,
 0.81,
 0.97,
 0.79,
 0.73,
 0.8,
 0.73,
 0.55,
 0.72,
 0.96,
 0.93,
 0.95,
 0.96,
 0.97,
 0.98,
 0.87,
 0.86,
 0.97,
 0.96,
 0.93,
 0.97,
 0.97,
 0.98,
 0.96,
 0.97,
 0.93,
 0.93,
 0.89,
 0.79,
 0.77,
 0.97,
 0.92,
 0.98,
 0.94,
 0.94,
 0.89,
 0.85,
 0.92,
 0.96,
 0.79,
 0.89,
 0.91,
 0.91,
 0.47,
 0.94,
 0.92,
 0.9,
 0.91,
 0.85,
 0.98,
 0.92,
 0.92,
 0.84,
 0.95,
 0.97,
 0.89,
 0.86,
 0.95,
 0.96,
 0.95,
 0.98,
 0.94,
 0.96,
 0.66,
 0.92,
 0.94,
 0.85,
 0.97,
 0.98,
 0.62,
 0.9,
 0.86,
 0.85,
 0.9,
 0.58,
 0.81,
 0.77,
 0.98,
 0.95,
 0.92,
 0.92,
 0.83,
 0.63,
 0.84,
 0.96,
 0.95,
 0.65,
 0.96,
 0.98,
 0.98,
 0.91,
 0.94,
 0.93,
 0.96,
 0.84,
 0.95,
 0.69,
 0.96,
 0.89,
 0.9

In [53]:
import matplotlib.pyplot as plt

keys = list(label_counts.keys())
values = list(label_counts.values())

# Bar chart of the number of images per detected class
plt.bar(keys, values)
plt.show()


Using categorical units to plot a list of strings that are all parsable as floats or dates. If these strings should be plotted as numbers, cast to the appropriate data type before plotting.


Analyse the outputs of MegaDetector in the same way as you did the other algorithms.

MegaDetector detected an animal in every image, but it only assigns the generic label "animal", so on its own it cannot tell parakeets apart from other species.

Note that MegaDetector is designed for camera-trap images, whereas the images we evaluated were mostly observations uploaded voluntarily by birdwatchers, not camera-trap captures.