<a href="https://colab.research.google.com/github/aubricot/computer_vision_with_eol_images/blob/master/object_detection_for_image_tagging/plant_pollinator/plant_poll_generate_tags_yolov3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using YOLO v3 pre-trained on Google Open Images to add plant-pollinator co-occurrence tags for ladybugs, beetles, and insects in plant images
---
*Last Updated 3 December 2021*   
This notebook uses a YOLOv3 model (downloaded from [here](https://github.com/AlexeyAB/darknet)) pre-trained on [Google Open Images](https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=detection&c=%2Fm%2F03vt0) for customized, large-scale image processing. EOL Angiosperm images are tagged for plant-pollinator co-occurrence using object detection, and the resulting tags will further extend EOLv3 image search functions.

Notes:   
* Before you start: change the runtime to "GPU" with "High RAM" (if available)
* Follow instructions at form fields to interact with code (change filepaths, adjust parameters, etc.; also noted in code with 'TO DO')
* For each 24 hour period on Google Colab, you have up to 12 hours of free GPU access.

References:   
* Check out [AlexeyAB's darknet repo](https://github.com/AlexeyAB/darknet) for Colab tutorials like [this one](https://colab.research.google.com/drive/12QusaaRj_lUwCGDvQNfICpa7kA7_a2dE).

## Installs & Imports
---

In [None]:
# (Optional): Mount google drive to import/export files
# Note: Only run this cell if you want to save results
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

In [1]:
# For importing/exporting files, working with arrays, etc
import os
import glob
import pathlib
import six.moves.urllib as urllib
import sys
import tarfile
import zipfile
import numpy as np 
import csv
import matplotlib.pyplot as plt
import time
import pandas as pd

# For downloading images
!apt-get install aria2

# For drawing onto and plotting images (matplotlib.pyplot is already imported above)
from PIL import Image
from PIL import ImageColor
from PIL import ImageDraw
from PIL import ImageFont
from PIL import ImageOps
import cv2
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libc-ares2
The following NEW packages will be installed:
  aria2 libc-ares2
0 upgraded, 2 newly installed, 0 to remove and 37 not upgraded.
Need to get 1,274 kB of archives.
After this operation, 4,912 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libc-ares2 amd64 1.14.0-1ubuntu0.1 [37.5 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 aria2 amd64 1.33.1-1 [1,236 kB]
Fetched 1,274 kB in 0s (12.2 MB/s)
Selecting previously unselected package libc-ares2:amd64.
(Reading database ... 155222 files and directories currently installed.)
Preparing to unpack .../libc-ares2_1.14.0-1ubuntu0.1_amd64.deb ...
Unpacking libc-ares2:amd64 (1.14.0-1ubuntu0.1) ...
Selecting previously unselected package aria2.
Preparing to unpack .../aria2_1.33.1-1_amd64.deb ...
Unpacking aria2 (1

## Model preparation
---

In [2]:
# Install darknet
# Note: Ignore warnings and output text, even most recent build shows them

# Only if connecting to Google Drive
# TO DO: Type in the path to your working directory in form field to right
wd = "/content/drive/MyDrive/train/darknet" #@param {type:"string"}
cwd = 'darknet'
%cd $wd

# Download darknet (the native implementation of YOLO) if it is not already present
if os.path.exists(cwd):
    %cd $cwd
else:
    !git clone https://github.com/AlexeyAB/darknet
    %cd $cwd
    # Make folders for detection datafiles
    os.makedirs('data/imgs', exist_ok=True)
    os.makedirs('data/img_info', exist_ok=True)
    os.makedirs('data/results', exist_ok=True)

# Change makefile to have GPU and OPENCV enabled
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
# Download pretrained YOLOv3 weights for Open Images
!wget https://pjreddie.com/media/files/yolov3-openimages.weights

# Verify CUDA version (for using GPU)
!/usr/local/cuda/bin/nvcc --version

# Make darknet
!make

[Errno 2] No such file or directory: '/content/drive/MyDrive/train/darknet'
/content
Cloning into 'darknet'...
remote: Enumerating objects: 15376, done.
remote: Total 15376 (delta 0), reused 0 (delta 0), pack-reused 15376
Receiving objects: 100% (15376/15376), 14.01 MiB | 3.48 MiB/s, done.
Resolving deltas: 100% (10340/10340), done.
/content/darknet
--2021-12-03 18:23:13--  https://pjreddie.com/media/files/yolov3-openimages.weights
Resolving pjreddie.com (pjreddie.com)... 128.208.4.108
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 259229388 (247M) [application/octet-stream]
Saving to: ‘yolov3-openimages.weights’


2021-12-03 18:23:26 (19.5 MB/s) - ‘yolov3-openimages.weights’ saved [259229388/259229388]

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190

## Generate tags for images (Run 1x for each batch)
---
Run EOL 20k image bundles through pre-trained object detection models and save results in 4 batches (A-D). 
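
The batch letter in the tags filename decides which 5k slice of the 20k bundle gets processed. A minimal sketch of that mapping, assuming filenames end in `_a` to `_d` before the extension (`batch_bounds` and `BATCH_SIZE` are hypothetical names used only for illustration, not part of this notebook):

```python
import os

BATCH_SIZE = 5000  # assumption: 4 batches of 5k images each

def batch_bounds(fname, batch_size=BATCH_SIZE):
    """Return (start, stop) row indices for a tags file named like *_a.txt."""
    stem = os.path.splitext(os.path.basename(fname))[0]
    letter = stem.rsplit("_", 1)[-1].lower()
    if letter not in ("a", "b", "c", "d"):
        raise ValueError("Expected a filename ending in _a, _b, _c or _d: " + fname)
    start = "abcd".index(letter) * batch_size
    return start, start + batch_size

print(batch_bounds("data/imgs/plant_poll_coocc_tags_b.txt"))
```

Raising an error for an unrecognized suffix makes a mistyped filename fail early instead of silently leaving the slice undefined.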

### Prepare object detection functions and settings

In [3]:
# Define functions

# Display full URLs in outputs so you can click them and inspect images
pd.set_option('display.max_colwidth', None)

# Read in data file
def read_datafile(fpath, sep="\t", header=0, disp_head=True):
    try:
        df = pd.read_csv(fpath, sep=sep, header=header)
        if disp_head:
            print("Data header: \n", df.head())
    except FileNotFoundError as e:
        raise Exception("File not found: Enter the path to your file in form field and re-run").with_traceback(e.__traceback__)
    
    return df

# Read in bundle images
def read_eolbundle(bundle, no_bundles):
    # Get first 20k images for Angiosperm bundles using initial bundle basename
    eol_addr = "https://editors.eol.org/other_files/bundle_images/files/"
    bundle_base = os.path.splitext(os.path.basename(bundle))[0].rsplit('_',1)[0]
    # Add zero-padded, 6-digit suffix to dynamically load in sub-bundles
    # Note: range() excludes no_bundles itself (ex: no_bundles=31 loads 000001 - 000030)
    zero_suffices = [str(num).zfill(6) + ".txt" for num in range(1, no_bundles)]
    all_filenames = [eol_addr + bundle_base + "_" + zs for zs in zero_suffices]
    bundles = pd.concat([pd.read_csv(f, sep='\t', header=None) for f in all_filenames], ignore_index=True)
    print("EOL image bundle with {} images: \n{}".format(len(bundles), bundles.head()))
    
    return bundles

# Define start and stop indices in EOL bundle for running inference   
def set_start_stop(df):
    # To test with a tiny subset, use 5 random bundle images
    if test_with_tiny_subset:
        N = len(df)
        start=np.random.choice(a=N, size=1)[0]
        stop=start+5
    # To run inference on 4 batches of 5k images each
    elif "_a." in outfpath: # batch a is from 0-5000
        start=0
        stop=5000
    elif "_b." in outfpath: # batch b is from 5000-10000
        start=5000
        stop=10000
    elif "_c." in outfpath: # batch c is from 10000-15000
        start=10000
        stop=15000
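    elif "_d." in outfpath: # batch d is from 15000-20000
        start=15000
        stop=20000
    else: # no batch letter found, so start and stop would be undefined
        raise ValueError("Filename must end in _a, _b, _c or _d: {}".format(outfpath))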
    
    return start, stop

# To display results
def imShow(path):
    image = cv2.imread(path)
    height, width = image.shape[:2]
    resized_image = cv2.resize(image,(3*width, 3*height), interpolation = cv2.INTER_CUBIC)
    fig = plt.gcf()
    fig.set_size_inches(9, 9)
    plt.axis("off")
    plt.imshow(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))
    plt.show()

# For uploading an image from url
# Modified from https://www.pyimagesearch.com/2015/03/02/convert-url-to-image-with-python-and-opencv/
def url_to_image(url):
    resp = urllib.request.urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    im_h, im_w = image.shape[:2]
 
    return image

### Temporarily download images from EOL bundle to Google Drive (YOLO cannot directly parse URL images)
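
The bundle is split across sub-bundle files that share a basename and differ only in a zero-padded numeric suffix. A minimal sketch of building those URLs, assuming the 6-digit suffix seen in the bundle URL used in this notebook (`sub_bundle_urls` is a hypothetical helper, and the first and last indices are passed explicitly as an illustration):

```python
import os

EOL_ADDR = "https://editors.eol.org/other_files/bundle_images/files/"

def sub_bundle_urls(bundle, first, last):
    """Build URLs for sub-bundles numbered first..last (inclusive)."""
    # Strip the extension and the numeric suffix from any one sub-bundle name
    base = os.path.splitext(os.path.basename(bundle))[0].rsplit("_", 1)[0]
    # Format each index as a 6-digit, zero-padded suffix
    return ["{}{}_{:06d}.txt".format(EOL_ADDR, base, n) for n in range(first, last + 1)]

example = EOL_ADDR + "images_for_Angiosperms_20K_breakdown_download_000030.txt"
for url in sub_bundle_urls(example, 1, 3):
    print(url)
```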

In [4]:
# Download images for 20K bundle of Angiosperm images with 31 sub-bundles
# TO DO: Enter any EOL Angiosperm image bundle URL
bundle = "https://editors.eol.org/other_files/bundle_images/files/images_for_Angiosperms_20K_breakdown_download_000030.txt" #@param {type:"string"}
df = read_eolbundle(bundle, 31)

# Test with a smaller subset than 5k images?
# TO DO: If yes, check test_with_tiny_subset box
test_with_tiny_subset = True #@param {type: "boolean"}

# Take 5k subset of bundle for running inference
# TO DO: Change filename for each batch
tags_file = "plant_poll_coocc_tags_a" #@param ["plant_poll_coocc_tags_a", "plant_poll_coocc_tags_b", "plant_poll_coocc_tags_c", "plant_poll_coocc_tags_d"] {allow-input: true}
tags_file = tags_file + ".txt"
imgs_dir = "data/imgs/"
outfpath = imgs_dir + tags_file

# Save 5k subset to tags file
start, stop = set_start_stop(df)
df = df.iloc[start:stop]
df.to_csv(outfpath, sep='\n', index=False, header=False)

# Download images 
# Note: Takes 7-10 min per 5k imgs; -x 16 allows up to 16 connections per server
%cd $imgs_dir
!aria2c -x 16 -s 1 -i $tags_file

# Check how many images downloaded
print("Number of files downloaded to Google Drive: ")
len([1 for x in list(os.scandir('.')) if x.is_file()])-1 # -1 because .txt file contains image filenames

EOL image bundle with 600000 images: 
                                                                  0
0  https://content.eol.org/data/media/9d/c5/89/851.103050-1_jpg.jpg
1  https://content.eol.org/data/media/9d/c8/c4/851.113130-2_jpg.jpg
2  https://content.eol.org/data/media/9d/cc/ac/851.117570-5_jpg.jpg
3  https://content.eol.org/data/media/9d/d0/94/851.120830-2_jpg.jpg
4  https://content.eol.org/data/media/9d/d4/7c/851.124630-1_jpg.jpg
/content/darknet/data/imgs

12/03 18:25:07 [NOTICE] Downloading 5 item(s)

12/03 18:25:08 [NOTICE] Download complete: /content/darknet/data/imgs/509.2897632.jpg

12/03 18:25:08 [NOTICE] Download complete: /content/darknet/data/imgs/509.27685264.jpg

12/03 18:25:09 [NOTICE] Download complete: /content/darknet/data/imgs/509.29318491.jpg

12/03 18:25:09 [NOTICE] Download complete: /content/darknet/data/imgs/509.29318510.jpg

12/03 18:25:09 [NOTICE] Download complete: /content/da

5

In [None]:
# Move tags file used for downloading images to data/img_info/
%cd ../
!mv imgs/*.txt img_info/
%cd ../

# Make a new list of successfully downloaded image files for running inference
inf_imgs = os.path.join(imgs_dir, tags_file)
with open(inf_imgs, 'w', encoding='utf-8') as f:
    # Walk through data/imgs/ to list files
    for _, _, files in os.walk(imgs_dir):
        for fn in files:
            # Skip text files (including this list itself)
            if not fn.endswith('.txt'):
                f.write("data/imgs/" + fn + '\n')

# Inspect textfile of images for inference
df = read_datafile(inf_imgs, header=None, sep='\n', disp_head=True)
print("\nNumber of images ready for inference in {}: {}".format(inf_imgs, len(df)))

### Run images through trained model
---

#### Test: Run individual image through by filename and display results

In [None]:
# Run inference on a single image by filename and show results
# TO DO: First, run with sample EOL 'butterfly bush' image
# To test with your own image, upload file to data/imgs and update fn formfield

# First, download sample EOL 'butterfly bush' image 
%cd data/imgs
!gdown --id 1A7dHlbnAlRS-pHwS2QHg--HzTQc1CLx3
%cd ../..

# TO DO: Put image in data/imgs & enter filename in formfield
fn = "542.8209936861.jpg" #@param {type:"string"}
img_fpath = 'data/imgs/' + fn

# Run darknet and show bounding box coordinates
!./darknet detector test cfg/openimages.data cfg/yolov3-openimages.cfg yolov3-openimages.weights {img_fpath}

# Display detection results
imShow('predictions.jpg')

### Generate tags: Run inference on EOL images & save results for tagging
Use 20K EOL Angiosperm image bundles to get bounding boxes of detected pollinators. Results are saved to [tags_file].tsv.   
Run in 4 batches of 5K images to back up regularly in case of Colab timeouts.
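
Each detection is later read back as a row of `class_id x y w h`. A minimal sketch of turning one such row into pixel coordinates, assuming the values follow the common YOLO text format (box center and size, normalized to 0-1 by image width and height). That format is an assumption here, and `yolo_to_pixel_box` is a hypothetical helper:

```python
def yolo_to_pixel_box(line, im_w, im_h):
    """Convert 'class_id x_center y_center w h' (normalized) to pixel corners."""
    class_id, xc, yc, w, h = line.split()[:5]
    # Scale normalized values to pixels
    xc, w = float(xc) * im_w, float(w) * im_w
    yc, h = float(yc) * im_h, float(h) * im_h
    # Convert center/size to corner coordinates, clamped to the image bounds
    xmin = max(0, round(xc - w / 2))
    ymin = max(0, round(yc - h / 2))
    xmax = min(im_w, round(xc + w / 2))
    ymax = min(im_h, round(yc + h / 2))
    return int(class_id), xmin, ymin, xmax, ymax

print(yolo_to_pixel_box("3 0.5 0.5 0.2 0.4", im_w=100, im_h=200))
```

For tagging, only the class id is needed, but the pixel box is handy when checking detections by eye.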

In [None]:
# Run inference on 5k image subset using darknet

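# Run darknet without displaying images (-dont_show) and save detections for each image as a text file (-save_labels)
# Note: inf_imgs is the list of downloaded image files written in the previous step
!./darknet detector test cfg/openimages.data cfg/yolov3-openimages.cfg yolov3-openimages.weights -dont_show -save_labels < {inf_imgs}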

## Post-process detection results
--- 
Combine predictions for each image (YOLO saves them as individual text files) to all_predictions.txt. Then, delete images and prediction files. Next, convert detection boxes into pollinator tags. 
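
The combining step can be sketched in plain Python as follows. This is an illustration under assumptions, not the notebook's own function: it assumes one `.txt` file per image with five space-separated values per line, and it uses made-up files in a temporary folder (`combine_label_files` is a hypothetical name):

```python
import os
import tempfile

def combine_label_files(label_dir):
    """Collect rows of (class_id, x, y, w, h, img_id) from per-image label files."""
    rows = []
    for fn in sorted(os.listdir(label_dir)):
        if not fn.endswith(".txt"):
            continue
        img_id = os.path.splitext(fn)[0]  # filename without extension
        with open(os.path.join(label_dir, fn)) as infile:
            for line in infile:
                parts = line.split()
                if len(parts) != 5:
                    continue  # skip blank or malformed lines
                class_id = int(parts[0])
                x, y, w, h = (float(p) for p in parts[1:])
                rows.append((class_id, x, y, w, h, img_id))
    return rows

# Demo with made-up label files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "img_001.txt"), "w") as f:
        f.write("3 0.5 0.5 0.2 0.4\n7 0.1 0.2 0.05 0.05\n")
    open(os.path.join(tmp, "img_002.txt"), "w").close()  # image with no detections
    demo_rows = combine_label_files(tmp)

for row in demo_rows:
    print(row)
```

Images without detections produce empty files and therefore contribute no rows.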

In [None]:
# Define functions

# Combine individual prediction files for each image to all_predictions.txt
def combine_predictions(imgs_dir):
    # Delete inference images file list (so it is not read as a prediction file)
    !rm $inf_imgs
    # Combine inference text files for each image and save to all_predictions.txt
    fns = os.listdir(imgs_dir)
    with open('data/results/all_predictions.txt', 'w') as outfile:
        header = "class_id x y w h img_id"
        outfile.write(header + "\n")
        for fn in fns:
            if '.txt' in fn:
                with open('data/imgs/'+fn) as infile:
                    lines = infile.readlines()
                    newlines = [''.join([x.strip(), ' ' + os.path.splitext(fn)[0] + '\n']) for x in lines]
                    outfile.writelines(newlines)
    # Load all_predictions.txt (space-separated; keep img_id as text)
    df = pd.read_csv('data/results/all_predictions.txt', sep=' ', dtype={'img_id': str})
    print("Model predictions: \n", df.head())

    return df

# Combine tagging files for batches A-D
def combine_tag_files(tags_fpath):
    # Combine tag files for batches A-D
    fpath =  os.path.splitext(tags_fpath)[0]
    base = fpath.rsplit('_',1)[0] + '_'
    exts = ['a.tsv', 'b.tsv', 'c.tsv', 'd.tsv'] 
    all_filenames = [base + e for e in exts]
    print(all_filenames)
    df = pd.concat([pd.read_csv(f, sep='\t', header=0, na_filter = False) for f in all_filenames], ignore_index=True)
    # Choose desired columns for tagging
    df = df[['url', 'img_id', 'class_id']]
    df.rename(columns={'url': 'eolMediaURL', 'img_id': 'identifier', 'class_id': 'tag'}, inplace=True)
    print("\nNew concatenated dataframe with all 4 batches: \n", df[['eolMediaURL', 'tag']].head())

    return df

def add_class_names(all_predictions):
    # Model predictions with number-coded classes
    # Read img_id as text so ids like "509.2897632" are not parsed as floats
    numbered_tags = pd.read_csv(all_predictions, header=0, sep=" ", dtype={'img_id': str})
    print("\nModel predictions by class id: \n", numbered_tags)

    # Add class names to model predictions
    # Note: the names file has no header row, so read it with header=None;
    # row i then lines up with darknet's 0-based class id i and no offset is needed
    classes = pd.read_table('data/openimages.names', header=None, names=['name'])
    classes_dict = pd.Series(classes.name.values, index=classes.index).to_dict()
    tags = numbered_tags.copy()
    tags.replace({"class_id":classes_dict}, inplace=True)
    tags['class_id'] = tags['class_id'].astype(str)
    print("\nModel prediction classes translated from class id's: \n", tags)

    return tags

# Add EOL media URL's to named image tags
def add_eolMediaURLs(tags, bundle):
    # Read in EOL 20k image url bundle
    bundle = read_eolbundle(bundle, 31)
    bundle.columns = ['url']
    print("EOL media URL's corresponding to inference images: \n", bundle)
    
    # Map eolMediaURLs to tags using image filenames
    img_fns = bundle['url'].apply(lambda x: os.path.splitext((os.path.basename(x)))[0])
    bundle['img_id'] = img_fns
    tags.set_index('img_id', inplace=True, drop=True)
    bundle.set_index('img_id', inplace=True, drop=True)
    final_tags = tags.merge(bundle, left_index=True, right_index=True)
    final_tags.reset_index(drop=False, inplace=True)
    final_tags.drop_duplicates(inplace=True, ignore_index=True)
    print("\nModel predictions with EOL media URL's: \n", final_tags.head())

    return final_tags

# Set filename for saving classification results
def set_outpath(tags_file):
    tags_file = os.path.splitext(tags_file)[0]
    outpath = wd + '/' + cwd + '/data/results/' + tags_file + '.tsv'
    print("Saving results to: \n", outpath)

    return outpath

### Combine predictions for each image

In [None]:
# Combine individual prediction files for each image to all_predictions.txt
df = combine_predictions(imgs_dir)

# Delete inference text files and images (only needed them for inference)
!rm -r data/imgs/*

### Convert detection boxes to pollinator tags
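
Tags are matched back to EOL media URLs through the image filename without its extension. A minimal sketch of that join in plain Python, using two URLs from the bundle preview above and made-up tags (`img_id_from_url` is a hypothetical helper):

```python
import os

def img_id_from_url(url):
    """Use the filename without its extension as the image id."""
    return os.path.splitext(os.path.basename(url))[0]

urls = [
    "https://content.eol.org/data/media/9d/c5/89/851.103050-1_jpg.jpg",
    "https://content.eol.org/data/media/9d/c8/c4/851.113130-2_jpg.jpg",
]
url_by_id = {img_id_from_url(u): u for u in urls}

# Made-up tags for illustration: (img_id, class name)
tags = [("851.103050-1_jpg", "Bee"), ("not_in_bundle", "Insect")]

# Inner join: keep only tags whose image id appears in the bundle
joined = [(img_id, tag, url_by_id[img_id]) for img_id, tag in tags if img_id in url_by_id]
print(joined)
```

Keeping image ids as strings on both sides matters, because ids that look like decimal numbers would otherwise fail to match.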

In [None]:
# Create final predictions dataframe with class names (instead of numbers) and image urls

# Add class names to numeric image tags
tags = add_class_names('data/results/all_predictions.txt')

# Add EOL media URL's from bundle to image tags df
final_tags = add_eolMediaURLs(tags, bundle)

# Save final tags to file
outpath = set_outpath(tags_file)
final_tags.to_csv(outpath, sep="\t", index=False)

## Combine tags for 5k image batches A-D
---
After running steps above for each image batch, combine tag files to one 20k tag dataset.
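
When collapsing class names into a single 'Pollinator' tag, two details are easy to get wrong: class names such as 'Bat (Animal)' contain regular-expression metacharacters, and every row should be classified against its original class name before any tag is overwritten. A minimal sketch in plain Python, using a shortened, illustrative class list (`to_pollinator_tag` is a hypothetical helper):

```python
import re

# Illustrative subset of classes to treat as pollinators
POLLINATOR_CLASSES = ["Butterfly", "Insect", "Beetle", "Ant", "Bat (Animal)", "Bird", "Bee"]

# Escape each name so parentheses are matched literally, then join as alternatives
pattern = re.compile("|".join(re.escape(c) for c in POLLINATOR_CLASSES))

def to_pollinator_tag(class_name):
    """Map an original class name to 'Pollinator' or 'None'."""
    return "Pollinator" if pattern.search(class_name) else "None"

tags = ["Bee", "Flower", "Bat (Animal)", "Tree"]
new_tags = [to_pollinator_tag(t) for t in tags]
print(new_tags)
```

Because each name is classified once from its original value, relabelled rows are never re-tested against the pattern.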

In [None]:
# Set filepath of the tagging files to combine
# TO DO: Enter any filename from 4 batches of tagging files
tags_file = "plant_poll_coocc_tags_d" #@param {type:"string"}
tags_fpath = "data/results/" + tags_file + ".tsv"

# Combine exported model predictions and confidence values for all batches
df = combine_tag_files(tags_fpath)

# Filter for desired classes
# TO DO: Enter classes to filter by
import re
filter_classes = ['Butterfly', 'Insect', 'Beetle', 'Ant', 'Bat (Animal)', 'Bird', 'Bee', 'Invertebrate', 'Animal'] #@param
# Escape special characters (ex: parentheses in 'Bat (Animal)') before building regex pattern
pattern = '|'.join(re.escape(c) for c in filter_classes)

# Find detections for filtered classes once, before any tags are overwritten
is_pollinator = df.tag.str.contains(pattern)
print("\nNo. tags matching filtered class(es) {}: {}\n".format(filter_classes, is_pollinator.sum()))
print("\nTags matching filtered class(es): \n", df[is_pollinator])
print("\nNo. tags not matching filtered classes: \n", (~is_pollinator).sum())
print("\nTags not matching filtered classes: \n", df[~is_pollinator])

# Set all detections for filtered classes to 'Pollinator'
df.loc[is_pollinator, 'tag'] = 'Pollinator'

# Set all detections that aren't for filtered classes to 'None'
df.loc[~is_pollinator, 'tag'] = 'None'

# Write results to tsv
base = os.path.splitext(tags_fpath)[0].rsplit('_', 1)[0] + '_'
outpath = base + 'finaltags.tsv'
df.to_csv(outpath, sep='\t', index=False)
print("\n\nFinal tagging file {}: \n{}".format(outpath, df.head()))

## Display tagging results on images
---

In [None]:
# Set number of seconds to timeout if image url taking too long to open
import socket
socket.setdefaulttimeout(10)

# TO DO: Adjust line below to see up to 50 images displayed at a time
start = 0 #@param {type:"slider", min:0, max:5000, step:50}
stop = start+50

# Loop through images
for n, (i, row) in enumerate(df.iloc[start:stop].iterrows(), start=1):
    # Fetch image url and tag
    url = row['eolMediaURL']
    tag = row['tag']
    try:
        # Read in image 
        img = url_to_image(url)

        # Display progress message after each image is loaded
        print('Successfully loaded {} of {} images'.format(n, (stop-start)))

        # Plot image with its tag as the title
        _, ax = plt.subplots(figsize=(10, 10))
        ax.imshow(img)
        plt.axis('off')
        plt.title('{}) Tag: {}'.format(i+1, tag))

    except Exception:
        print('Check if URL from {} is valid'.format(url))