
Object Detection


Object detection using YOLOv2

Object detection is a common task in computer vision. It is used to locate and identify objects in images, such as a cat or a dog. While object detection algorithms were developed for applications such as self-driving cars, they can also be used to control microscopes (see e.g. the paper by Waithe et al.). Object detection networks are typically trained on images annotated with bounding boxes: a box is drawn around each object and labeled with a specific class. This is also the output of the trained network when applied to new data. In our work, we employ the well-known YOLOv2 network by Joseph Redmon and Ali Farhadi (see disclaimer below).

Please also have a look at the corresponding ZeroCostDL4Mic wiki page.

Important disclaimer

Our YOLOv2 notebook is based on the following paper:

Redmon, J. & Farhadi, A., "YOLO9000: Better, Faster, Stronger", CVPR 2017.

Please also cite this original paper when using or developing our notebook.

General tips for data annotation

There are many tools available to annotate data for object detection. It is important to choose a tool that can export the annotations in the format required by the DL network you want to use. The YOLOv2 implementation in the ZeroCostDL4Mic platform requires the PASCAL VOC format (VOC XML).
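
For reference, a VOC XML file contains one <object> entry per bounding box, holding the class label (<name>) and the pixel coordinates of the box (<bndbox>). The following minimal Python sketch reads such a file using only the standard library; the file name and the printed example are placeholders, not part of the actual datasets.

```python
# Minimal sketch: read bounding boxes from a PASCAL VOC (VOC XML) annotation
# file using only the Python standard library. File name is a placeholder.
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        name = obj.find("name").text                       # class label
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (int(float(bb.find(t).text))
                                  for t in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, xmin, ymin, xmax, ymax))
    return boxes

for box in read_voc_boxes("image_001.xml"):
    print(box)  # e.g. ('dividing', 120, 48, 165, 90)
```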

In our work, we used two different tools:

  • makesense.ai (online annotation tool)
  • LabelImg

While the first option does not require any installation, LabelImg has to be installed via a Python distribution (e.g. Anaconda). The advantage of the latter is that annotations can be saved locally, so you don't have to annotate all images in a single session or split the dataset into smaller pieces.

As with every DL method, accurate annotations are key to a well-performing model. Thus, draw the bounding boxes as accurately as possible. Annotating the entire image (every cell or object) is also important for robust model evaluation.

Datasets

We employ object detection for two purposes:

  • Detect and classify cells according to their growth stage (non-dividing, dividing and microcolonies)
  • Antibiotic phenotyping (classifying cells treated with specific antibiotics)

Growth stage classification

Sample preparation
E. coli MG1655 cultures were grown in LB Miller at 37°C and 220 rpm overnight. Working cultures were inoculated 1:200 and grown at 23°C and 220 rpm to OD600 ~ 0.5–0.8. For time-lapse imaging, cells were immobilized under agarose pads prepared using microarray slides (VWR, catalogue number 732-4826) as described in de Jong et al., 2011. Brightfield time series (1 frame/min, 80 min total length) of 10 regions of interest were recorded with an Andor iXon Ultra 897 EMCCD camera (Oxford Instruments) attached to a Nikon Eclipse Ti inverted microscope (Nikon Instruments) bearing a motorized XY-stage (Märzhäuser) and an APO TIRF 1.49 NA 100x oil objective (Nikon Instruments).

Annotation and network training
To train and evaluate the YOLOv2 model, we manually annotated brightfield images using LabelImg. The network was then trained using the ZeroCostDL4Mic YOLOv2 implementation.
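
Since the notebook pairs each training image with its VOC XML annotation, it can be useful to verify the pairing before training. Below is a minimal sketch; the folder paths, the .tif extension and the matching-by-base-name convention are assumptions for illustration.

```python
# Sanity check: every image should have a VOC XML annotation with the same
# base name, and vice versa. Paths and file extensions are placeholders.
from pathlib import Path

image_stems = {p.stem for p in Path("train/images").glob("*.tif")}
annot_stems = {p.stem for p in Path("train/annotations").glob("*.xml")}

print("Images without annotation:", sorted(image_stems - annot_stems))
print("Annotations without image:", sorted(annot_stems - image_stems))
```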


Network training parameters

YOLOv2 networks for the detection of E. coli growth stages were trained using the following parameters:

| Task | Images (train/test) | Epochs | Image size | Train cycles | Batch size | LR | % Valid. | Augmentation | Penalties (FNP, FPP, PSP, FCP) | Training time |
|---|---|---|---|---|---|---|---|---|---|---|
| Growth stage (large FoV) | 25/15 | 100 | 512 x 512 px² | 4 | 4 | 0.0003 | 20 | 4x (flip/rot.) | 5, 1, 3, 3 | 1 h 31 min |
| Growth stage (small FoV) | 100/60 | 100 | 256 x 256 px² | 4 | 8 | 0.0003 | 20 | 4x (flip/rot.) | 5, 1, 3, 3 | 34 min |
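
For orientation, the "Growth stage (large FoV)" row might translate into a parameter set like the one below. This is only a hypothetical mirror of the table; the key names are illustrative and are not the notebook's exact form fields.

```python
# Hypothetical parameter set mirroring the "Growth stage (large FoV)" row.
# Key names are illustrative, not the Colab notebook's exact form fields.
growth_stage_large_fov = {
    "images_train_test": (25, 15),
    "number_of_epochs": 100,
    "image_size_px": (512, 512),
    "train_cycles": 4,
    "batch_size": 4,
    "learning_rate": 3e-4,          # 0.0003
    "percentage_validation": 20,
    "augmentation": "4x (flip/rotation)",
    "penalties": {"FNP": 5, "FPP": 1, "PSP": 3, "FCP": 3},
}
```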

Antibiotic phenotyping

Sample preparation
E. coli strain NO34 (kind gift from Zemer Gitai, Ouzounov et al., 2016) was grown in LB at 32°C shaking at 220 rpm overnight. Working cultures were inoculated 1:200 in fresh LB and grown to mid-exponential phase, after which antibiotics were added at different concentrations and for different durations (see manuscript SI). Antibiotic stock solutions were prepared freshly 5-10 min before use. Cells were fixed using a mixture of 2% formaldehyde and 0.1% glutaraldehyde, quenched using 0.1% sodium borohydride (w/v) in PBS for 3 min and immobilized on PLL-coated chamber slides (see Spahn et al., 2018 for details). Nucleoids were stained using 300 nM DAPI for 15 min. After 3 washes with PBS, 100 nM Nile Red in PBS was added to the chambers and confocal images were recorded with a commercial LSM710 microscope (Zeiss, Germany) bearing a Plan-Apo 63x oil objective (1.4 NA) and using 405 nm (DAPI) and 543 nm (Nile Red) laser excitation in sequential mode. Images (800 x 800 px²) were recorded with a pixel size of 84 nm, 16-bit image depth, 16.2 µs pixel dwell time, 2x line averaging and 1 AU pinhole size.

Annotation and network training

Training images were annotated using LabelImg or makesense.ai. In addition to the antibiotic treatments, we added two further classes:

  • Oblique cells (partially attached cells that go out of focus)
  • Membrane vesicles

Synthetic test data was generated by randomly stitching together 200 x 200 px² patches of different drug treatments and the control condition. The small patches were manually cropped from images that the network had not seen during training. In total, 32 test images were generated this way and annotated online using makesense.ai as described above. Additionally, 400 x 400 px² image patches of previously unseen images (drug treatments and control) were annotated using LabelImg.
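
As a rough illustration of this stitching step, here is a minimal Python sketch that assembles one synthetic test image from pre-cropped patches. The folder layout (one subfolder per condition), the .tif format, single-channel images and the 2 x 2 grid (giving 400 x 400 px²) are assumptions for illustration, not the exact procedure used.

```python
# Sketch: build a synthetic test image by randomly stitching pre-cropped
# 200 x 200 px patches from different conditions into a grid.
# Folder layout, file format and grid size are illustrative assumptions.
import random
from pathlib import Path

import numpy as np
import tifffile

patch_dirs = [d for d in Path("patches").iterdir() if d.is_dir()]

def make_test_image(grid=2, patch_px=200):
    canvas = np.zeros((grid * patch_px, grid * patch_px), dtype=np.uint16)
    for row in range(grid):
        for col in range(grid):
            condition = random.choice(patch_dirs)   # e.g. one drug or control
            patch = tifffile.imread(random.choice(sorted(condition.glob("*.tif"))))
            canvas[row * patch_px:(row + 1) * patch_px,
                   col * patch_px:(col + 1) * patch_px] = patch
    return canvas

tifffile.imwrite("synthetic_test_01.tif", make_test_image())
```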



Note that the bounding boxes for the antibiotic-treated cells are not shown in the panels.

Network training parameters

YOLOv2 networks for antibiotic profiling of E. coli cells were trained using the following parameters:

| Task | Images (train/test) | Epochs | Image size | Train cycles | Batch size | LR | % Valid. | Augmentation | Penalties (FNP, FPP, PSP, FCP) | Training time |
|---|---|---|---|---|---|---|---|---|---|---|
| Antibiotic profiling | 153/32 (50) | 100 | 400 x 400 px² | 4 | 16 | 0.001 | 20 | 8x (flip/rot.) | 4, 2, 1, 2 | 2 h 33 min |

Training YOLOv2 in Google Colab

| Network | Link to datasets and pretrained models | Direct link to notebook in Colab |
|---|---|---|
| YOLOv2 | Growth stage prediction and Antibiotic Phenotyping | Open In Colab |