# Fridge Ingredient Detection

## Install Dependencies

Intended for use on Google Colab.

Choose the GPU runtime type.

In [None]:
#we need imgaug 0.4.0
!pip install imgaug --upgrade

Requirement already up-to-date: imgaug in /usr/local/lib/python3.7/dist-packages (0.4.0)


In [None]:
#image manipulations
import imageio
import imgaug as ia
from imgaug import augmenters as iaa
from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage

#filesystem 
import os
import shutil

#datastructures
import pandas as pd

#general modules
#import torch
import uuid

# Allows to display images directly in the Jupyter notebook
%pylab inline


Populating the interactive namespace from numpy and matplotlib


In [None]:
import pkg_resources
# list the third-party packages to be checked
# (standard-library modules such as os, shutil and uuid have no package version of their own)
root_packages = ['imageio', 'imgaug', 'pandas']
# print the version of every installed package that is in the list
for m in pkg_resources.working_set:
    if m.project_name.lower() in root_packages:
        print(f"{m.project_name}=={m.version}")

pandas==1.1.5
imgaug==0.4.0
imageio==2.4.1
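
The same check can be sketched with `importlib.metadata` from the standard library, assuming Python 3.8 or newer (the runtime shown above is Python 3.7, where this module is not available). The helper name `installed_versions` is hypothetical and only used for illustration.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Return {package name: version string, or None when it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the third-party packages used in this notebook.
for name, version in installed_versions(["imageio", "imgaug", "pandas"]).items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```

Looking packages up by name also reports a missing package explicitly instead of silently skipping it.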


In [None]:
#set some general variables
ia.seed(2) # seed imgaug's random number generator so the augmentations are reproducible across runs
base_path = '/content' #colab
#base_path = '/kaggle/working' #kaggle

#cleanup previous run
#!rm -rf '/content/*' 
#!rm -rf '/kaggle/working/*'


## Download and install the YOLOv5 model
We clone the repository into our setup for training. For inference later on, we load the model through PyTorch Hub.

In [None]:
# clone YOLOv5 repository
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
#!git reset --hard 886f1c03d839575afecb059accf74296fad395b6

fatal: destination path 'yolov5' already exists and is not an empty directory.
/content/yolov5


In [None]:
# install YOLO dependencies as necessary
!pip install -qr requirements.txt  # install dependencies (ignore errors)

#
#print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

## Get the image data

In [None]:
%cd /content
#The dataset is pulled from roboflow
#This dataset contains the base images with YOLOv5 pytorch bounding boxes; 
#It has been split into 3 parts train, validation, test (70%, 20%, 10%)
# public versions are available on: https://public.roboflow.com/object-detection/aicook

!curl -L "https://app.roboflow.com/ds/MUdazcHbSx?key=tjMnjehvq3" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

/content
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   897  100   897    0     0   1738      0 --:--:-- --:--:-- --:--:--  1741
100 76.7M  100 76.7M    0     0  81.7M      0 --:--:-- --:--:-- --:--:--  186M
Archive:  roboflow.zip
replace README.roboflow.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
 extracting: README.roboflow.txt     
replace data.yaml? [y]es, [n]o, [A]ll, [N]one, [r]ename: a
error:  invalid response [a]
replace data.yaml? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
 extracting: data.yaml               
 extracting: test/images/DSC_5941_JPG.rf.fa9dd334f857aedabb157c9c9252d465.jpg  
 extracting: test/images/DSC_5945_JPG.rf.4e9b3344c879abfb9970c96c5f9b33c2.jpg  
 extracting: test/images/DSC_5959_JPG.rf.9287aa37ac554558ff1332057ce27fc3.jpg  
 extracting: test/images/DSC_5968_JPG.rf.fd9cecb827c078ae20a4587da333d4e8.jpg  
 extracting: test/images/DSC_5994_JPG.rf

In [None]:
# the YAML file indicates the classes and the location of train/val data
%cat data.yaml

train: ../train/images
val: ../valid/images

nc: 30
names: ['apple', 'banana', 'beef', 'blueberries', 'bread', 'butter', 'carrot', 'cheese', 'chicken', 'chicken_breast', 'chocolate', 'corn', 'eggs', 'flour', 'goat_cheese', 'green_beans', 'ground_beef', 'ham', 'heavy_cream', 'lime', 'milk', 'mushrooms', 'onion', 'potato', 'shrimp', 'spinach', 'strawberries', 'sugar', 'sweet_potato', 'tomato']
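
Each line in a YOLOv5 label file starts with a class index that refers to a position in the `names` list above, followed by the normalized box values. A minimal sketch of decoding such a line; the helper `decode_label_line` and the sample line are made up for illustration, only the class names come from `data.yaml`.

```python
# Class names copied from the data.yaml shown above (nc: 30).
NAMES = ['apple', 'banana', 'beef', 'blueberries', 'bread', 'butter', 'carrot',
         'cheese', 'chicken', 'chicken_breast', 'chocolate', 'corn', 'eggs',
         'flour', 'goat_cheese', 'green_beans', 'ground_beef', 'ham',
         'heavy_cream', 'lime', 'milk', 'mushrooms', 'onion', 'potato',
         'shrimp', 'spinach', 'strawberries', 'sugar', 'sweet_potato', 'tomato']

# The list length must match the nc value in data.yaml.
assert len(NAMES) == 30


def decode_label_line(line, names=NAMES):
    """Split one YOLO label line into (class_name, x_center, y_center, width, height)."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(p) for p in parts[1:5])
    return names[class_id], x_center, y_center, width, height


# Made-up label line for illustration: class 12 with a box centered in the image.
print(decode_label_line("12 0.5 0.5 0.25 0.4"))  # ('eggs', 0.5, 0.5, 0.25, 0.4)
```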

## Image augmentations
We only have 517 base images in the dataset, and model training benefits from more. In addition, the augmentations make the model more robust to varying conditions and help reduce overfitting.


In [None]:
# the annotations used by the YOLOv5 model are in the YOLO (pytorch) format ('x_center','y_center', 'width', 'height')
# while the imgaug package needs them in PascalVOC format ('x_min', 'y_min', 'x_max', 'y_max')
# for bounding box scaling/transformations
# YOLOv5 uses values normalized to the image size; PascalVOC uses absolute pixel values

def YOLO2VOCbbs(lblpath, img_shape):
    #input are YOLOv5 labels, but the imgaug needs VOC format for BBOX augmentations
    #each label file can contain multiple bounding boxes
    labels = pd.read_csv(lblpath, names=('class','x_center','y_center', 'width', 'height'), sep=' ' )
    img_height,img_width, _ = img_shape

    #YOLO2VOC conversion    
    labels['x_min'] = (labels['x_center'] * img_width) -  ((labels['width']  * img_width) /2) 
    labels['y_min'] = (labels['y_center'] * img_height) -  ((labels['height'] * img_height) /2) 
    labels['x_max'] = (labels['x_center'] * img_width) +  ((labels['width'] * img_width) /2) 
    labels['y_max'] = (labels['y_center'] * img_height) +  ((labels['height'] * img_height) /2) 

    #make imgaug bounding boxes
    labels['bbox'] =  labels.apply(lambda row: BoundingBox(x1=row['x_min'], y1=row['y_min'], 
                                                           x2=row['x_max'], y2=row['y_max']), axis=1)
        
    #pass a plain list of BoundingBox objects rather than a pandas Series
    bbs = BoundingBoxesOnImage(labels['bbox'].tolist(), shape=img_shape)
    
    #return the bounding boxes but also the classes 
    return bbs, labels['class']


def VOCbbs2YOLO(bbs_object, img_shape, classes):    
    #get arrays from the bbox and convert them into pandas dataframe
    bbs_array = bbs_object.to_xyxy_array()
    df_bbs = pd.DataFrame(bbs_array, columns=['x_min', 'y_min', 'x_max', 'y_max'])
    
    img_height,img_width, _ = img_shape
    
    # create an extra dataframe to hold the classes and the YOLOv5 bboxes 
    
    df_bbsyolo = pd.DataFrame()
    df_bbsyolo['class'] = classes
    #convert VOC to YOLO
    df_bbsyolo['x_center'] = ((df_bbs['x_max'] + df_bbs['x_min']) / 2 )  / img_width 
    df_bbsyolo['y_center'] = ((df_bbs['y_max'] + df_bbs['y_min']) / 2 ) / img_height
    df_bbsyolo['width'] = (df_bbs['x_max'] - df_bbs['x_min']) / img_width
    df_bbsyolo['height'] = (df_bbs['y_max'] - df_bbs['y_min']) / img_height
     
    return df_bbsyolo


def resize(img, bbs):
    #resize the image to a square + also scale the bounding boxes
    #we assume that the image is in portrait orientation (height >= width); the left and right edges are padded with black pixels

    SIZE= 640    
    prepro = iaa.Sequential([
        iaa.Resize({"height": SIZE, "width": "keep-aspect-ratio"}),
        iaa.CenterPadToFixedSize(height=SIZE, width=SIZE)    

    ], random_order=False)

    return prepro(image=img, bounding_boxes = bbs)


def augment(img, bbs, fileamount):
    #augment the original images with a random set of selected augmentations
    #the fileamount indicates how many augmented versions we want per image


    auglevel = 0.7 # probability that each individual augmentation is applied
    aug = iaa.Sequential([
        #Rotation: Between -3° and +3° (affects boundingboxes)        
        iaa.Sometimes(auglevel, iaa.Affine(rotate=(-3, 3)) ),
        #Noise: Up to 5% of pixels (a fixed value of 0.5 would replace 50% of the pixels)
        iaa.Sometimes(auglevel, iaa.SaltAndPepper((0, 0.05)) ),
        #Brightness: add a value between -20 and +20 to the brightness channel
        iaa.Sometimes(auglevel, iaa.AddToBrightness((-20, 20))  ),
        #Blur: Gaussian blur with sigma between 0 and 3
        iaa.Sometimes(auglevel, iaa.GaussianBlur(sigma=(0,3)) ),  
        #Cutout: 12 boxes with 10% size each
        iaa.Sometimes(auglevel, iaa.Cutout(nb_iterations=12, 
                                           position='uniform', 
                                           size=0.1, 
                                           fill_mode ='constant',
                                           cval=0 ) )  
    ], random_order=True)

    # a list of augmented images and boundingboxes is returned    
    return  [aug(image=img, bounding_boxes=bbs) for _ in range(fileamount)]
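
The two conversion formulas used above can be checked on a single box without pandas or imgaug. This is a minimal sketch; the function names `yolo_to_voc` and `voc_to_yolo` and the sample box are hypothetical, while the arithmetic mirrors `YOLO2VOCbbs` and `VOCbbs2YOLO`.

```python
def yolo_to_voc(x_center, y_center, width, height, img_width, img_height):
    """Normalized YOLO box -> absolute pixel box (x_min, y_min, x_max, y_max)."""
    box_w = width * img_width
    box_h = height * img_height
    x_min = x_center * img_width - box_w / 2
    y_min = y_center * img_height - box_h / 2
    return x_min, y_min, x_min + box_w, y_min + box_h


def voc_to_yolo(x_min, y_min, x_max, y_max, img_width, img_height):
    """Absolute pixel box -> normalized YOLO box (x_center, y_center, width, height)."""
    return (
        (x_min + x_max) / 2 / img_width,
        (y_min + y_max) / 2 / img_height,
        (x_max - x_min) / img_width,
        (y_max - y_min) / img_height,
    )


# Hypothetical box centered in a 640x480 image, covering 25% of the width and 50% of the height.
voc = yolo_to_voc(0.5, 0.5, 0.25, 0.5, img_width=640, img_height=480)
print(voc)                                               # (240.0, 120.0, 400.0, 360.0)
print(voc_to_yolo(*voc, img_width=640, img_height=480))  # (0.5, 0.5, 0.25, 0.5)
```

Converting there and back returns the original values, which is a quick sanity check before writing label files.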


In [None]:
def createImages(path, augmentAmount=0):
    '''
    expected file structure in path
    path/images/imagename.JPG
    path/labels/imagename.txt

    target structure:
    original structure is kept
    +
    path/orig/images/imagename.JPG
    path/orig/labels/imagename.txt
    path/scaled/images/imagename.JPG
    path/scaled/labels/imagename.txt
    path/augmented/images/imagename_UUID.JPG
    path/augmented/labels/imagename_UUID.txt
    
    If augmentAmount is 0, the scaled folder data replaces the root folder data (for the validation/test set)
    If augmentAmount > 0, the augmented folder data replaces the root folder data (for the train set)
    '''    

    extensions = [".jpg", ".JPG"] 
    
    #cleanup previous run
    for folder in ('/orig', '/scaled', '/augmented'):
        #ignore_errors: a missing folder must not stop the cleanup of the others
        shutil.rmtree(path + folder, ignore_errors=True)
    
    #  backup original files
    shutil.copytree(path + '/labels', path + '/orig/labels',copy_function = shutil.copy)            
    shutil.copytree(path + '/images', path + '/orig/images',copy_function = shutil.copy)      
    
    #setup working folders
    os.mkdir(path + '/scaled')
    os.mkdir(path + '/scaled/labels')
    os.mkdir(path + '/scaled/images')    
    if augmentAmount > 0:        
        os.mkdir(path + '/augmented')
        os.mkdir(path + '/augmented/labels')
        os.mkdir(path + '/augmented/images')    
              
    #loop all files in /images directory
    for fname in os.listdir(path + '/images'):
        if os.path.splitext(fname)[1] in extensions:        

            baseFileName = os.path.splitext(fname)[0]  #split off the file extension
            image = imageio.imread(path + '/images/' + fname)
            
            #convert the original YOLO bbs to VOC bbs and keep the labels 
            bbs, sr_classes = YOLO2VOCbbs(path + '/labels/' + baseFileName + '.txt', image.shape)
                        
            # get resized image and bbs(VOC)
            res_img, res_bbs = resize(image, bbs)
            
            #visualize
            #image_before = bbs.draw_on_image(image, size=2)            
            #image_after = res_bbs.draw_on_image(res_img, size=2)
            #fig = plt.figure(figsize=(30,15))
            #ax1 = fig.add_subplot(1,3,1)
            #ax1.imshow(image_before)
            #ax2 = fig.add_subplot(1,3,2)
            #ax2.imshow(image_after)
            #plt.show()            
            
            #store resized image and YOLO boundingboxes
            imageio.imwrite(path + '/scaled/images/' + fname, res_img)            
            df_yoloBBS = VOCbbs2YOLO(res_bbs, res_img.shape,sr_classes)
            df_yoloBBS.to_csv(path + '/scaled/labels/' +  baseFileName + '.txt', sep = ' ', index= False, header=False)                       
            
            
            if augmentAmount > 0:
                aug_imgsAND_bbses = augment(res_img, res_bbs, augmentAmount)
                #for idx, img_aug in enumerate(imgs_aug):
                for img_aug, bbs_aug in aug_imgsAND_bbses:
                    #generate unique filenames
                    fuuid = str(uuid.uuid4())
                    baseFileNameUUID = os.path.splitext(fname)[0] + '_' + fuuid
                    
                    #store image and YOLO bbs
                    imageio.imwrite(path + '/augmented/images/' + baseFileNameUUID + os.path.splitext(fname)[1], img_aug)      
                    df_yoloBBS = VOCbbs2YOLO(bbs_aug, img_aug.shape, sr_classes)
                    df_yoloBBS.to_csv(path + '/augmented/labels/' +  baseFileNameUUID + '.txt', sep = ' ', index= False, header=False)                       
                    
                    
                    #image_after = bbs_aug.draw_on_image(img_aug, size=2)                        
                    #fig = plt.figure(figsize=(30,15))
                    #ax2 = fig.add_subplot(1,3,2)
                    #ax2.imshow(image_after)
                    #plt.show()
                    
            
    #move new files to train location
    #cleanup source (backup under /orig)
    shutil.rmtree(path + '/images')
    shutil.rmtree(path + '/labels')                  

    if augmentAmount == 0:
        shutil.copytree(path + '/scaled/labels', path + '/labels', copy_function = shutil.copy)            
        shutil.copytree(path + '/scaled/images', path + '/images', copy_function = shutil.copy)      
    else:
        shutil.copytree(path + '/augmented/labels', path + '/labels', copy_function = shutil.copy)            
        shutil.copytree(path + '/augmented/images', path + '/images', copy_function = shutil.copy)      
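
The effect of the `resize` step on a bounding box can be sketched with plain arithmetic: scale everything so the height becomes 640, then shift the x coordinates by the left padding. The helper `resize_and_pad_box` and the sample numbers are hypothetical, and imgaug's own rounding of the padding may differ by a pixel.

```python
def resize_and_pad_box(box, img_width, img_height, size=640):
    """Map an (x_min, y_min, x_max, y_max) pixel box from a portrait image onto a
    size x size canvas: scale so the height equals `size`, then pad left and right equally."""
    scale = size / img_height
    new_width = round(img_width * scale)
    if new_width > size:
        # A landscape image would end up wider than the canvas; padding cannot fix that.
        raise ValueError("image must be in portrait orientation (height >= width)")
    pad_left = (size - new_width) / 2  # horizontal offset added by the center padding
    x_min, y_min, x_max, y_max = box
    return (x_min * scale + pad_left, y_min * scale,
            x_max * scale + pad_left, y_max * scale)


# Hypothetical portrait image of 960x1280 pixels (width x height) with one box.
print(resize_and_pad_box((100, 200, 500, 1000), img_width=960, img_height=1280))
# (130.0, 100.0, 330.0, 500.0)
```

The explicit error makes the portrait assumption visible: a landscape input would otherwise produce a non-square image.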

In [None]:

#scale the validation and test sets; 0 augmented versions = only resize the images
createImages(base_path + '/valid',0) 
createImages(base_path + '/test',0) 

#augment the train set, supply the amount of augmented versions you would like to have (16 here)
#this might take a few minutes
createImages(base_path + '/train',16) 
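
After the folders have been rewritten, it can be useful to confirm that every image still has a label file. A minimal sketch, assuming the `images/` and `labels/` layout described in the `createImages` docstring; the helper name `unpaired_images` is hypothetical.

```python
from pathlib import Path


def unpaired_images(split_dir, extensions=(".jpg", ".JPG")):
    """Return the image file names in split_dir/images that have no
    matching split_dir/labels/<name>.txt file."""
    split = Path(split_dir)
    missing = []
    for image_path in sorted((split / "images").iterdir()):
        if image_path.suffix in extensions:
            label_path = split / "labels" / (image_path.stem + ".txt")
            if not label_path.exists():
                missing.append(image_path.name)
    return missing


# Check the three splits when they exist (the paths assume the Colab layout used above).
for split_name in ("train", "valid", "test"):
    split_dir = Path("/content") / split_name
    if (split_dir / "images").is_dir():
        print(split_name, "images without labels:", unpaired_images(split_dir))
```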


## Install and log in to wandb

In [None]:
!pip install wandb

Collecting wandb
  Downloading https://files.pythonhosted.org/packages/e0/b4/9d92953d8cddc8450c859be12e3dbdd4c7754fb8def94c28b3b351c6ee4e/wandb-0.10.32-py2.py3-none-any.whl (1.8MB)

In [None]:
import wandb
wandb.login()

wandb: You can find your API key in your browser here: https://wandb.ai/authorize


## Train Custom YOLOv5 Detector

Here, we are able to pass a number of arguments:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs (note: 3000+ epochs are common here; we keep it modest at 100 or 200)
- **data:** set the path to our yaml file
- **cfg:** specify our model configuration; we stick to the supplied ones, which are downloaded automatically
- **weights:** specify a custom path to weights
- **cache:** cache images for faster training
- **adam:** use the Adam optimizer instead of SGD
- **project:** the wandb project
- **name:** result/run name (also for wandb)
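
The arguments listed above can be collected in one place and turned into the command line, which makes it easier to rerun training with different values. A minimal sketch: the helper `build_train_command` is hypothetical, the flag names and values are the ones used in this notebook, and newer YOLOv5 releases may have renamed some of the flags.

```python
import shlex


def build_train_command(options, flags=()):
    """Turn {option: value} pairs and value-less flags into an argument list for train.py."""
    argv = ["python", "train.py"]
    for key, value in options.items():
        argv += [f"--{key}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


options = {
    "img": 640,
    "batch": 16,
    "epochs": 200,
    "data": "../data.yaml",
    "cfg": "./models/hub/yolov5s6.yaml",
    "weights": "./weights/yolov5s6.pt",
    "name": "30l_y5_s6_200_640cust_aug16_adam_",
    "project": "aicook_yolo5",
}
command = build_train_command(options, flags=("cache", "adam"))

# shlex.join (Python 3.8+) quotes the arguments so the line can be pasted into a shell.
print(shlex.join(command))
```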


### Not used
- **evolve:** evolve the 'internal' hyperparameters of the model.
See yolov5/train.py at about line 550 for the available parameters and suggested intervals.

The default parameters can be found in yolov5/data/hyp.scratch.yaml.
You can set up your own parameter values in a custom yaml file.

The building blocks of the model architecture can be found in yolov5/models/common.py.
You could build your own structure from these parts (described in a yaml file).

In [None]:
# train yolov5s6
#this code has been run several times with various parameters
%cd /content/yolov5/
!python train.py --img 640 --batch 16 --epochs 200 --data '../data.yaml' --cfg ./models/hub/yolov5s6.yaml --weights ./weights/yolov5s6.pt --name 30l_y5_s6_200_640cust_aug16_adam_  --cache --project aicook_yolo5 --adam

Next up: go to wandb and compare the results with other runs.
The results can be found at:
https://wandb.ai/cornelka/aicook_yolo5


Download the weights from the artifacts section for use in the application at inference time.
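
Loading the downloaded weights through PyTorch Hub could look roughly like the sketch below. It assumes the `custom` entry point of the `ultralytics/yolov5` hub repository with a `path` argument, which may differ between YOLOv5 versions; `load_detector`, `best.pt` and `fridge.jpg` are placeholder names.

```python
def load_detector(weights_path):
    """Load custom YOLOv5 weights through PyTorch Hub (needs torch and network access)."""
    import torch  # imported lazily so the sketch can be read without torch installed
    return torch.hub.load("ultralytics/yolov5", "custom", path=weights_path)


# Usage (not executed here); the file names are placeholders:
# model = load_detector("best.pt")
# results = model("fridge.jpg")
# print(results.pandas().xyxy[0])  # one row per detected ingredient
```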