# How to Train Scaled-YOLOv4 on Custom Objects

Scaled-YOLOv4 uses the same training procedures as YOLOv5.

This tutorial is based on the [YOLOv5 repository](https://github.com/ultralytics/yolov5) by [Ultralytics](https://www.ultralytics.com/). This notebook shows training on **your own custom objects**. Many thanks to Ultralytics for putting this repository together - we hope that in combination with clean data management tools at Roboflow, this technologoy will become easily accessible to any developer wishing to use computer vision in their projects.

### Accompanying Blog Post

A blog on Scaled-YOLOv4 is to come, in the meantime having the blog for [how to train YOLOv5](https://blog.roboflow.ai/how-to-train-yolov5-on-a-custom-dataset/)will be useful.

### Steps Covered in this Tutorial

In this tutorial, we will walk through the steps required to train Scaled-YOLOv4 on your custom objects. We use a [public blood cell detection dataset](https://public.roboflow.ai/object-detection/bccd), which is open source and free to use. You can also use this notebook on your own data.

To train our detector we take the following steps:

* Install Scaled-YOLOv4 dependencies
* Download custom Scaled-YOLOv4 object detection data
* Write our Scaled-YOLOv4 Training configuration
* Run Scaled-YOLOv4 training
* Evaluate Scaled-YOLOv4 performance
* Visualize Scaled-YOLOv4 training data
* Run Scaled-YOLOv4 inference on test images
* Export saved Scaled-YOLOv4 weights for future inference



### **About**

[Roboflow](https://roboflow.com) enables teams to deploy custom computer vision models quickly and accurately. Convert data from to annotation format, assess dataset health, preprocess, augment, and more. It's free for your first 1000 source images.

**Looking for a vision model available via API without hassle? Try Roboflow Train.**

![Roboflow Wordmark](https://i.imgur.com/dcLNMhV.png)



#Install Dependencies

_(Remember to choose GPU in Runtime if not already selected. Runtime --> Change Runtime Type --> Hardware accelerator --> GPU)_

In [8]:
%cd C:/Users/WMNL/scaled-yolov4_E7

# clone Scaled_YOLOv4
!git clone https://github.com/roboflow-ai/ScaledYOLOv4.git  # clone repo
%cd ScaledYOLOv4
#checkout the yolov4-large branch
!git checkout yolov4-large

C:\Users\WMNL\scaled-yolov4_E7
[WinError 2] 系統找不到指定的檔案。: 'ScaledYOLOv4'
C:\Users\WMNL\scaled-yolov4_E7


fatal: Too many arguments.

usage: git clone [<options>] [--] <repo> [<dir>]

    -v, --verbose         be more verbose
    -q, --quiet           be more quiet
    --progress            force progress reporting
    -n, --no-checkout     don't create a checkout
    --bare                create a bare repository
    --mirror              create a mirror repository (implies bare)
    -l, --local           to clone from a local repository
    --no-hardlinks        don't use local hardlinks, always copy
    -s, --shared          setup as shared repository
    --recurse-submodules[=<pathspec>]
                          initialize submodules in the clone
    --recursive ...       alias of --recurse-submodules
    -j, --jobs <n>        number of submodules cloned in parallel
    --template <template-directory>
                          directory from which templates will be used
    --reference <repo>    reference repository
    --reference-if-able <repo>
                          reference re

In [1]:
import torch
print('Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

Using torch 1.7.1 _CudaDeviceProperties(name='NVIDIA GeForce RTX 3080', major=8, minor=6, total_memory=10239MB, multi_processor_count=68)


In [2]:
#install mish activation funciton for cuda
%cd C:/Users/WMNL/scaled-yolov4_E7
!git clone https://github.com/JunnYu/mish-cuda
%cd mish-cuda
!python setup.py build install 

C:\Users\WMNL\scaled-yolov4_E7
C:\Users\WMNL\scaled-yolov4_E7\mish-cuda


fatal: destination path 'mish-cuda' already exists and is not an empty directory.


running build
running build_py
running egg_info
writing src\mish_cuda.egg-info\PKG-INFO
writing dependency_links to src\mish_cuda.egg-info\dependency_links.txt
writing requirements to src\mish_cuda.egg-info\requires.txt
writing top-level names to src\mish_cuda.egg-info\top_level.txt
reading manifest file 'src\mish_cuda.egg-info\SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'src\mish_cuda.egg-info\SOURCES.txt'
running build_ext
building 'mish_cuda._C' extension
Emitting ninja build file C:\Users\WMNL\scaled-yolov4_E7\mish-cuda\build\temp.win-amd64-3.7\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include -IC:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\incl



注意: 包含檔案:        C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\detail/common.h
C:\Users\WMNL\anaconda3\envs\pytorch_E7\include\pyerrors.h(490): note: 請參閱 'HAVE_SNPRINTF' 之前的定義
注意: 包含檔案:         C:\Users\WMNL\anaconda3\envs\pytorch_E7\include\frameobject.h
注意: 包含檔案:        C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\buffer_info.h
注意: 包含檔案:       C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\detail/typeid.h
注意: 包含檔案:       C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\detail/descr.h
注意: 包含檔案:       C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\detail/internals.h
注意: 包含檔案:     C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\options.h
注意: 包含檔案:     C:\Users\WMNL\anaconda3\envs\pytorch_E7\lib\site-packages\torch\include\pybind11\detail/class.h
注意: 包含檔案:     C:\Users\WMNL\anaconda3\env

In [23]:
!pip install -U PyYAML
!pip install tensorboard
!pip install tqdm
!pip install opencv-python
!pip install matplotlib
!pip install scipy

Collecting scipy
  Downloading scipy-1.7.2-cp37-cp37m-win_amd64.whl (34.1 MB)
Installing collected packages: scipy
Successfully installed scipy-1.7.2


In [4]:
%cd C:/Users/WMNL/scaled-yolov4_E7/ScaledYOLOv4/

C:\Users\WMNL\scaled-yolov4_E7\ScaledYOLOv4


# Download Correctly Formatted Custom Dataset 

We'll download our dataset from Roboflow. Use the "**YOLOv5 PyTorch**" export format. Note that the Ultralytics implementation calls for a YAML file defining where your training and test data is. The Roboflow export also writes this format for us.

To get your data into Roboflow, follow the [Getting Started Guide](https://blog.roboflow.ai/getting-started-with-roboflow/).



![YOLOv5 PyTorch export](https://i.imgur.com/5vr9G2u.png)


In [6]:
# #follow the link below to get your download code from from Roboflow
# !pip install -q roboflow
# from roboflow import Roboflow
# rf = Roboflow(model_format="yolov5", notebook="roboflow-scaled-yolov4")

In [7]:
# Export code snippet and paste here
%cd C:/Users/WMNL/scaled-yolov4_E7
#after following the link above, recieve python code with these fields filled in
#from roboflow import Roboflow
#rf = Roboflow(api_key="YOUR API KEY HERE")
#project = rf.workspace().project("YOUR PROJECT")
#dataset = project.version("YOUR VERSION").download("yolov5")

C:\Users\WMNL\scaled-yolov4_E7


In [11]:
# this is the YAML file Roboflow wrote for us that we're loading into this notebook with our data
%cat data.yaml

'cat' 不是內部或外部命令、可執行的程式或批次檔。


#Inspect Model Configuration and Architecture

Let's look at the Scaled-YOLOv4 Configuration architecture

In [11]:
%cat /content/ScaledYOLOv4/models/yolov4-csp.yaml

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov4-csp backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]], # 9-P5/32
   [-1, 4, BottleneckCSP, [1024]],  # 10
  ]

# yolov4-csp head
# na = len(anchors[0])
head:
  [[-1, 1, SPPCSP, [512]], # 11
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [8, 1, Conv, [256, 1, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, Bott

# Train Custom Scaled-YOLOv4 Detector

### Next, we'll fire off training!


Here, we are able to pass a number of arguments:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs. (Note: often, 3000+ are common here!)
- **data:** set the path to our yaml file
- **cfg:** specify our model configuration
- **weights:** specify a custom path to weights.
- **name:** result names
- **nosave:** only save the final checkpoint
- **cache:** cache images for faster training

In [15]:
!pip list

Package                 Version
----------------------- ---------
absl-py                 1.0.0
argcomplete             1.12.3
argon2-cffi             20.1.0
async-generator         1.10
attrs                   21.2.0
backcall                0.2.0
bleach                  4.0.0
cachetools              4.2.4
certifi                 2021.10.8
cffi                    1.15.0
charset-normalizer      2.0.7
colorama                0.4.4
debugpy                 1.5.1
decorator               5.1.0
defusedxml              0.7.1
entrypoints             0.3
google-auth             2.3.3
google-auth-oauthlib    0.4.6
grpcio                  1.42.0
idna                    3.3
importlib-metadata      4.8.1
ipykernel               6.4.1
ipython                 7.29.0
ipython-genutils        0.2.0
ipywidgets              7.6.5
jedi                    0.18.0
Jinja2                  3.0.2
jsonschema              3.2.0
jupyter                 1.0.0
jupyter-client          7.0.6
jupyter-console         6.4.

In [1]:
!nvcc -V
print(torch.cuda.device_count())
print(torch.cuda.is_available())

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_22:08:44_Pacific_Standard_Time_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0


NameError: name 'torch' is not defined

In [13]:
%%time
# train scaled-YOLOv4 on custom data for 100 epochs
# time its performance

# python train.py --img 416 --batch 4 --epochs 3000 --device 0 --data ../data.yaml --cfg ./models/yolov4-csp.yaml --weights '' --name yolov4-csp-416-4-3000-results  --cache

%cd C:/Users/WMNL/scaled-yolov4_E7/ScaledYOLOv4/
!python train.py --img 416 --batch 4 --epochs 1000 --device 0 --data ../data.yaml --cfg ./models/yolov4-csp.yaml --weights '' --name yolov4-csp-416-4-1000-results  --cache

C:\Users\WMNL\scaled-yolov4_E7\ScaledYOLOv4
^C
Wall time: 3min 34s


# Evaluate Custom Scaled-YOLOv4 Detector Performance

Training losses and performance metrics are saved to Tensorboard and also to a logfile defined above with the **--name** flag when we train. In our case, we named this `yolov5s_results`. (If given no name, it defaults to `results.txt`.) The results file is plotted as a png after training completes.

Note from Glenn: Partially completed `results.txt` files can be plotted with `from utils.utils import plot_results; plot_results()`.

In [None]:
# Start tensorboard
# Launch after you have started training
# logs save in the folder "runs"
%load_ext tensorboard
%tensorboard --logdir runs

In [None]:
# we can also output some older school graphs if the tensor board isn't working for whatever reason... 
#from utils.general import plot_results  # plot results.txt as results.png
from IPython.display import Image, display
display(Image('/content/ScaledYOLOv4/runs/exp0_yolov4-csp-results/results.png'))  # view results.png

### Curious? Visualize Our Training Data with Labels

After training starts, view `train*.jpg` images to see training images, labels and augmentation effects.

Note a mosaic dataloader is used for training (shown below), a new dataloading concept developed by Glenn Jocher and first featured in [YOLOv4](https://arxiv.org/abs/2004.10934).

In [None]:
# first, display our ground truth data
print("GROUND TRUTH TRAINING DATA:")
Image(filename='/content/ScaledYOLOv4/runs/exp0_yolov4-csp-results/test_batch0_gt.jpg', width=900)

In [None]:
# print out an augmented training example
print("GROUND TRUTH AUGMENTED TRAINING DATA:")
Image(filename='/content/ScaledYOLOv4/runs/exp0_yolov4-csp-results/train_batch0.jpg', width=900)

#Run Inference  With Trained Weights
Run inference with a pretrained checkpoint on contents of `test/images` folder downloaded from Roboflow.

In [None]:
# trained weights are saved by default in our weights folder
%ls runs/

[0m[01;34mexp0_yolov4-csp-results[0m/  [01;34mexp1_yolov4-csp-results[0m/


In [None]:
%ls ./runs/exp0_yolov4-csp-results/weights

best_yolov4-csp-results.pt        last_003.pt  last_008.pt
best_yolov4-csp-results_strip.pt  last_004.pt  last_009.pt
last_000.pt                       last_005.pt  last_yolov4-csp-results.pt
last_001.pt                       last_006.pt  last_yolov4-csp-results_strip.pt
last_002.pt                       last_007.pt


In [5]:
# when we ran this, we saw .007 second inference time. That is 140 FPS on a TESLA P100!
# use the best weights!
%cd C:/Users/WMNL/KevinGG/scaled-yolov4_E7/ScaledYOLOv4/
# !python detect.py --weights ./runs/exp1_yolov4-csp-1100-4-3000-results/weights/best_yolov4-csp-1100-4-3000-results.pt --img 1100 --conf 0.4 --source ../test/images

%cd C:/Users/WMNL/KevinGG/scaled-yolov4_E7/ScaledYOLOv4-cpu/
!python detect.py --cfg models/yolov4-csp.cfg --weights ./runs/exp0_yolov4-csp-800-2-3000-cpu/weights/best_yolov4-csp-800-2-3000-cpu.pt --img 800 --conf 0.4 --source ../test/images

C:\Users\WMNL\KevinGG\scaled-yolov4_E7\ScaledYOLOv4
C:\Users\WMNL\KevinGG\scaled-yolov4_E7\ScaledYOLOv4-cpu
Namespace(agnostic_nms=False, augment=False, cfg='models/yolov4-csp.cfg', classes=None, conf_thres=0.4, device='', img_size=800, iou_thres=0.5, names='../data.names', output='inference/output', save_txt=False, source='../test/images', update=False, view_img=False, weights=['./runs/exp0_yolov4-csp-800-2-3000-cpu/weights/best_yolov4-csp-800-2-3000-cpu.pt'])
Using CUDA device0 _CudaDeviceProperties(name='NVIDIA GeForce RTX 3080', total_memory=10239MB)

Model Summary: 342 layers, 5.29214e+07 parameters, 5.29214e+07 gradients
image 1/53 C:\Users\WMNL\KevinGG\scaled-yolov4_E7\test\images\screenshot (1).png: 480x800 2 npc_Nissas, 5 Buybutton 1s, 1 Buybutton 2s, 1 update store Checkbutton 1s, 1 sliding areas, Done. (0.036s)
image 2/53 C:\Users\WMNL\KevinGG\scaled-yolov4_E7\test\images\screenshot (10).png: 480x800 1 npc_Nissas, 4 Buybutton 1s, 1 Buybutton 2s, 1 update store Checkbutton 1s

In [None]:
#display inference on ALL test images
#this looks much better with longer training above
import glob
from IPython.display import Image, display

for imageName in glob.glob('./inference/output/*.png'): #assuming JPG
    display(Image(filename=imageName))
    print("\n")

# Export Trained Weights for Future Inference

Now that you have trained your custom detector, you can export the trained weights you have made here for inference on your device elsewhere

In [None]:
from google.colab import files
files.download('./runs/exp0_yolov4-csp-results/weights/best_yolov4-csp-results.pt')

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

## Congrats!

Hope you enjoyed this!

--Team [Roboflow](https://roboflow.ai)