The required .names file and .txt files were created. The training and test is however carried on image size much larger than the respective one in YOLOv3 i.e. 608 x 608 and hence the training and test time is comparatively greater by almost 50 percent, for the same dataset. This too has good visualization with Tensorboard.
├── README.md
├── dataset.py dataset
├── demo.py demo to run pytorch --> tool/darknet2pytorch
├── demo_darknet2onnx.py tool to convert into onnx --> tool/darknet2pytorch
├── demo_pytorch2onnx.py tool to convert into onnx
├── models.py model for pytorch
├── train.py train models.py
├── cfg.py cfg.py for train
├── cfg cfg --> darknet2pytorch
├── data
├── weight --> darknet2pytorch
├── tool
│ ├── camera.py a demo camera
│ ├── coco_annotation.py coco dataset generator
│ ├── config.py
│ ├── darknet2pytorch.py
│ ├── region_loss.py
│ ├── utils.py
│ └── yolo_layer.py
you can use darknet2pytorch to convert it yourself, or download my converted model.
- baidu
- yolov4.pth(https://pan.baidu.com/s/1ZroDvoGScDgtE1ja_QqJVw Extraction code:xrq9)
- yolov4.conv.137.pth(https://pan.baidu.com/s/1ovBie4YyVQQoUrC3AY0joA Extraction code:kcel)
- google
- yolov4.pth(https://drive.google.com/open?id=1wv_LiFeCRYwtpkqREPeI13-gPELBDwuJ)
- yolov4.conv.137.pth(https://drive.google.com/open?id=1fcbR0bWzYfIEdLJPzOsn4R5mlvR6IQyA)
use yolov4 to train your own data
-
Download weight
-
Transform data
For coco dataset,you can use tool/coco_annotation.py.
# train.txt image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ... image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ... ... ...
-
Train
you can set parameters in cfg.py.
python train.py -g [GPU_ID] -dir [Dataset direction] ...
Image input size is NOT restricted in 320 * 320
, 416 * 416
, 512 * 512
and 608 * 608
.
You can adjust your input sizes for a different input ratio, for example: 320 * 608
.
Larger input size could help detect smaller targets, but may be slower and GPU memory exhausting.
height = 320 + 96 * n, n in {0, 1, 2, 3, ...}
width = 320 + 96 * m, m in {0, 1, 2, 3, ...}
-
Load pytorch weights (pth file) to do the inference
python models.py <num_classes> <weightfile> <imgfile> <IN_IMAGE_H> <IN_IMAGE_W> <namefile(optional)>
There are 2 inference outputs.
- One is locations of bounding boxes, its shape is
[batch, num_boxes, 1, 4]
which represents x1, y1, x2, y2 of each bounding box. - The other one is scores of bounding boxes which is of shape
[batch, num_boxes, num_classes]
indicating scores of all classes for each bounding box.
Until now, still a small piece of post-processing including NMS is required. We are trying to minimize time and complexity of post-processing.
Reference:
- https://github.com/eriklindernoren/PyTorch-YOLOv3
- https://github.com/marvis/pytorch-caffe-darknet-convert
- https://github.com/marvis/pytorch-yolo3
@article{yolov4,
title={YOLOv4: YOLOv4: Optimal Speed and Accuracy of Object Detection},
author={Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao},
journal = {arXiv},
year={2020}
}