
How long does it take to train YOLOv3 on the COCO dataset? #742

Open
SophieChang66 opened this issue Apr 27, 2018 · 20 comments

Comments

@SophieChang66

Hello, I want to know how long it takes to train YOLOv3 on the COCO dataset, and on which GPU device?

@jjehl

jjehl commented Apr 28, 2018

It really depends on the GPU; anywhere from several hours to several days.

@ahkarami

For example, how long does it take on a single GTX 1080 Ti?

@ahsan856jalal

ahsan856jalal commented Apr 30, 2018 via email

@jjehl

jjehl commented Apr 30, 2018

3 days should be good.

@hengshan123

Did you train YOLOv3 on COCO? How long did it take?

@zhangpzh

Hello! I find the default max_batches (500200) and subdivisions (16) very large. Could you please share your customized yolov3 cfg file for MS-COCO training? @jjehl @ahsan856jalal Thank you!
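For reference, those parameters live in the [net] section of the upstream yolov3.cfg. A minimal sketch of the training-related fields, using the stock defaults (not a tuned recommendation):

[net]
batch=64             # images consumed per iteration
subdivisions=16      # each batch is split into 64/16 = 4 chunks to fit in GPU memory
max_batches=500200   # total training iterations
learning_rate=0.001
policy=steps
steps=400000,450000  # iterations at which the learning rate is decayed
scales=.1,.1         # multiply the learning rate by 0.1 at each step

Lowering subdivisions (if memory allows) speeds up each iteration, and lowering max_batches shortens the schedule; a commonly cited rule of thumb for custom datasets is roughly 2000 iterations per class.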

@jingtianyilong

I've trained a modified network based on yolov3-tiny, about 15 BFLOPs, on a 1070 Ti. 500,000 iterations take about 3 days. But strangely it takes even longer on a 1080 Ti; I'm trying to figure out why.

@mbalty

mbalty commented Sep 13, 2018

Hi. I am training with coco_train2017 (118,287 images) at batch_size 32. I am trying to train a class-agnostic object detector by mapping all COCO classes to a single "object" class. I am using a Titan Xp GPU and the first epoch took around 3.5 hours. Is that normal?
Also, my version of tensorflow-gpu is 1.5.0.

@zhangpzh

It took me 2 weeks in total with the original COCO cfg on four M40 GPUs: batch_size=64, max_iterations=500k.

@jingtianyilong

It took me 2 weeks in total with the original COCO cfg on four M40 GPUs: batch_size=64, max_iterations=500k.

I was wondering if you got results similar to the original weight file that is available on the website.

@mbalty

mbalty commented Sep 20, 2018

Hi. I am training with coco_train2017 (118,287 images) at batch_size 32. I am trying to train a class-agnostic object detector by mapping all COCO classes to a single "object" class. I am using a Titan Xp GPU and the first epoch took around 3.5 hours. Is that normal?
Also, my version of tensorflow-gpu is 1.5.0.

I am now using the yolo-tiny model and the first epoch still takes around 3 hours. Am I doing something wrong, or is that too long? From the logs the GPU seems to be used, but my CPU is also at 100%.

@zhangpzh

Yes, finally after 500k+ iterations I got 33.1 mAP. But it seems the learning rate should decay faster to save time. @jingtianyilong

@techkuji

How much time should it take to train a yolov3 model on custom objects? And I have no GPU.

@jjehl

jjehl commented Jan 23, 2019

Really, without a GPU there is no way, or it is a question of months... You have to understand that training on images needs a lot of computation. The time depends on the compute power you have, the size of the dataset, and the number of classes.
For example, on COCO with a multi-GPU setup of 4x GTX 1080, if you want to train like J. Redmon, i.e. 500,000 iterations, it takes 10 days. But after 1 day the result is not so bad, and the 9 days after that only gain a few percent more.
If you have multiple Titan X cards you can increase the batch size and so reduce the time.
For 1 class, 2,000 images, and a GTX 1060, results can be good after 12 hours.
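As a back-of-envelope check of those numbers (a sketch in Python; the per-iteration time is whatever your own darknet log reports, not a constant):

# jjehl's figure of 500,000 iterations in 10 days on 4x GTX 1080 implies:
implied_s_per_iter = 10 * 24 * 3600 / 500_000   # ~1.73 s per iteration

# To estimate your own run, substitute the seconds-per-iteration
# observed in your training log for this value:
s_per_iter = implied_s_per_iter
max_batches = 500_200                           # default COCO schedule
print(f"estimated wall-clock: {max_batches * s_per_iter / 86400:.1f} days")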

@techkuji

techkuji commented Jan 23, 2019 via email

@abhishek1315

@Arup276
Just clone it using this command:
!git clone https://github.com/pjreddie/darknet
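If it helps, the usual next steps after cloning (assuming the stock Makefile in that repo; CPU-only training is impractically slow, so enable GPU support if you have CUDA):

cd darknet
# edit the Makefile: set GPU=1 (and optionally CUDNN=1), then build
make
# training on COCO then follows the upstream instructions, e.g.:
./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74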

@tloki

tloki commented Aug 25, 2019

I suppose all of you are talking about training the whole network from scratch, without pre-training or freezing (transfer learning)?

If I may ask: I tried running the PyTorch yolov3 implementation (https://github.com/westerndigitalcorporation/YOLOv3-in-PyTorch) and saw no validation-set handling in their training script. My practice until now has been to keep the model from the epoch that performed best on the validation set (so, for example, after 100 epochs start validating after every epoch and save the best models)...

I have an RTX 2080 Ti at my disposal for a few (4) days and would like to think I could achieve reasonable performance training on 8 classes of the Berkeley dataset (70,000 train, 20,000 eval, 10,000 test).

@7thstorm

I am training on 4 classes with 26 million training images and approximately 6 million test images, on an RTX 2080 Ti. How do I calculate the approximate time to complete?
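One rough way to estimate it yourself (a sketch, assuming darknet-style counting where one iteration consumes batch images; the seconds-per-iteration value below is hypothetical and should be read from your own training log):

# With 26M images and batch=64, a single epoch is already enormous:
train_images = 26_000_000
batch = 64
iters_per_epoch = train_images / batch   # ~406k iterations for ONE pass

# That is comparable to the entire default 500k-iteration COCO schedule.
# Measure s/iter on the 2080 Ti over a few hundred iterations, then:
s_per_iter = 2.0   # hypothetical; substitute your measured value
days_per_epoch = iters_per_epoch * s_per_iter / 86400
print(f"~{iters_per_epoch:,.0f} iterations per epoch, ~{days_per_epoch:.0f} days each")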

@jenapss

jenapss commented Jul 28, 2020

wow

@RezaSajadian

Running on an IBM POWER9 server with 4 Tesla V100 GPUs and training on the entire COCO 2014 dataset, it took me a good 5 days to finish all 500,200 max iterations of yolov3.
