
Object detection utility scripts

This repo contains a few Python scripts that may be useful for those trying to create the prerequisite files needed to train an object detection model, either through the TensorFlow Object Detection API or with YOLOv3.

Take a look inside the examples folder to get an idea of the types of files these scripts expect as input and generate as output.


  • reads the contents of image annotations stored in XML files, created with labelImg, and generates a single CSV file.
  • reads the previously generated CSV file (or any CSV file with a column named "class"), or a text file containing one class name per line and no header, and generates a label map, one of the files needed to train a detection model with TensorFlow's Object Detection API.
  • reads the previously generated CSV and label map files, as well as all the images from a given directory, and generates a TFRecord file, which can then be used to train an object detection model with TensorFlow. The resulting TFRecord file is about the same size as all the original images included in it.
  • reads the CSV file and generates one .txt file for each image mentioned in the CSV file, with the same name as the image file. These .txt files contain the object annotations for that image, in the format which darknet uses to train its models.
  • reads the CSV file and separates it into train and evaluation datasets, which are also CSV files. There are options to stratify by class and to select which fraction of the input CSV will be directed to the train dataset (the rest going to evaluation).
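For reference, the label map used by the Object Detection API is a .pbtxt file with one item per class; the class names below are just an example:

```
item {
  id: 1
  name: 'cat'
}
item {
  id: 2
  name: 'dog'
}
```

Note that ids start at 1, since 0 is reserved for the background class.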
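To make the first step concrete, here is a minimal sketch of turning a labelImg annotation into CSV rows. It assumes labelImg's default Pascal VOC XML layout (filename, size, and one object/bndbox element per box); the function name is illustrative, not the script's actual API.

```python
# Extract one CSV row per bounding box from a labelImg (Pascal VOC) XML file.
import xml.etree.ElementTree as ET

def xml_to_rows(xml_string):
    """Return (filename, width, height, class, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_string)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append((
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ))
    return rows

sample = """<annotation>
  <filename>dog.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

print(xml_to_rows(sample))  # → [('dog.jpg', 640, 480, 'dog', 10, 20, 300, 400)]
```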
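The darknet annotation format mentioned above stores one line per box as "class_id x_center y_center width height", with all four coordinates normalized to [0, 1] by the image dimensions. A small sketch of that conversion (function name is illustrative):

```python
# Convert an absolute (xmin, ymin, xmax, ymax) box to one darknet-style
# annotation line, normalizing by the image's width and height.
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    cx = (xmin + xmax) / 2.0 / img_w   # normalized box center x
    cy = (ymin + ymax) / 2.0 / img_h   # normalized box center y
    w = (xmax - xmin) / img_w          # normalized box width
    h = (ymax - ymin) / img_h          # normalized box height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A 100x200 box with top-left corner at (50, 100) in a 400x400 image:
print(to_yolo_line(0, 50, 100, 150, 300, 400, 400))
# → 0 0.250000 0.500000 0.250000 0.500000
```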
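The stratified split in the last step boils down to grouping rows by class and sending the same fraction of each group to the train set, so class proportions are preserved in both outputs. A self-contained sketch (names are illustrative, not the script's actual API):

```python
# Stratified train/eval split: shuffle each class group separately, then
# take train_frac of every group for the train set and the rest for eval.
import random
from collections import defaultdict

def stratified_split(rows, train_frac=0.8, seed=42):
    by_class = defaultdict(list)
    for row in rows:
        by_class[row["class"]].append(row)
    rng = random.Random(seed)
    train, evaluation = [], []
    for group in by_class.values():
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        evaluation.extend(group[cut:])
    return train, evaluation

# 10 cats and 5 dogs split 80/20 -> 8+4 train rows, 2+1 eval rows.
rows = [{"filename": f"img{i}.jpg", "class": "cat" if i < 10 else "dog"}
        for i in range(15)]
train, evaluation = stratified_split(rows)
print(len(train), len(evaluation))  # → 12 3
```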


The scripts are run from a terminal. To find out which arguments a particular script expects, run it with the -h flag to see a help message, e.g. python <script_name> -h


Licenses are so complicated. This work began as a fork of Dat Tran's raccoon dataset repository, but then it became its own thing. Anyway, the license is unchanged and is in the repo.
