
Can't import weights and cfg file from Darknet #325

Open
zenineasa opened this issue Jul 4, 2017 · 25 comments

@zenineasa

zenineasa commented Jul 4, 2017

I trained with Darknet earlier. Trying to use the same .cfg and .weights files for detection in darkflow doesn't work, I guess. I'm getting the following error:
AssertionError: expect 268263452 bytes, found 268263456

Is there anything I might be doing wrong?

@jubjamie

jubjamie commented Jul 4, 2017

Those byte counts are suspiciously close. Nonetheless, can you confirm what happens when you download the cfg and weights from here: https://pjreddie.com/darknet/yolo/

@zenineasa

Actually, I've been using darknet for the past few days, following pjreddie's website. I collected a few images, annotated them, and trained on them with darknet. It was working pretty well.

I just wanted to try darkflow out, so I copied the trained weights and the cfg file I had created for darknet over to darkflow and tried running it; that's when I got this error.

Aren't the cfg and weights formats in darknet and darkflow mutually compatible?

@jubjamie

jubjamie commented Jul 4, 2017

Yeah, I think they are. I'm not sure about newly trained models, but they should be. The fact that your byte count is off by 4 is suspicious, as if something didn't quite save right or something else is slightly wrong.

Is the new model you're trying to load based on a YOLO cfg, or is it a brand new one?

@zenineasa

Based on darknet19_448.conv.23

@jubjamie

jubjamie commented Jul 4, 2017

Sorry, I'm not familiar enough with darknet. I presume you have no issues using one of the YOLO cfgs/weights from the website?

@zenineasa

Well, when I took the yolo.cfg and yolo.weights from darknet and used them in darkflow, it worked fine. But when I renamed yolo.cfg and yolo.weights to yolo1.cfg and yolo1.weights respectively and tried to run those, I got another AssertionError:

AssertionError: labels.txt and cfg/yolo1.cfg indicate inconsistent class numbers.

I know that YOLO has 80 classes, so it requires 80 labels. I added a few dummy entries so that labels.txt has 80 entries, and then it worked fine. Is something hardcoded for yolo.cfg? Where should I look for it?
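
For what it's worth, the assertion quoted above is just a consistency check between the number of names in labels.txt and the classes= value in the cfg. Here is a minimal sketch of that kind of check, not darkflow's actual code, with placeholder paths:

import re

def check_labels_vs_cfg(labels_path='labels.txt', cfg_path='cfg/yolo1.cfg'):
    # Count non-empty label names.
    with open(labels_path) as f:
        n_labels = sum(1 for line in f if line.strip())
    # Pull classes= from the [region] section of the cfg.
    with open(cfg_path) as f:
        n_classes = int(re.search(r'classes\s*=\s*(\d+)', f.read()).group(1))
    if n_labels != n_classes:
        raise AssertionError(
            '%s and %s indicate inconsistent class numbers' % (labels_path, cfg_path))

So when the cfg says classes=80 but labels.txt lists fewer names, you get exactly this error; padding labels.txt to 80 entries silences it, as described above.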

@jubjamie

jubjamie commented Jul 4, 2017

Not really sure. Have you adjusted the configs as suggested here? It requires you to specify the classes, and getting that wrong could cause an issue when training.

@zenineasa

Yes, it's exactly the same as in Darknet.

@Kowasaki

I have the exact same off-by-4-bytes error after using darknet19_448.conv.23 to train in darknet and then porting to darkflow! Did anyone ever figure out what the problem might be?

@Benjamin-Vencill

Benjamin-Vencill commented Jul 14, 2017

I'm also encountering this error! I trained a brand new model in darknet yesterday (starting from the pre-trained darknet19_448.conv.23) and tried to load the resulting .weights file in darkflow, and I'm off by 4 bytes as well! I'm working with a two-class model, so my config looks like:

[convolutional]
filters=35

[region]
classes=2

as per the recommendation. This yields:

AssertionError: expect 202335260 bytes, found 202335264

I've tried several iterations of adjusting the configs (changing the class and filter numbers for the last layer) to no avail. I suspected that the off-by-4-bytes was due to a dimensional mismatch in the last layer, something like darkflow expecting 3 classes but getting only 2 in the output layer. So I tried modifying my .cfg file like so:

[convolutional]
filters=40

[region]
classes=3

and this yields an over-read:
AssertionError: Over-read ../darknet/new_obj.weights

I would love any insight into this problem! Thanks!
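
As a side note, the filters value for the convolutional layer right before [region] in YOLOv2 follows the usual rule filters = num * (classes + coords + 1), with num = 5 anchors and coords = 4 by default, which is where the 35 and 40 above come from. A quick sanity check in plain Python, nothing darkflow-specific:

def region_filters(classes, num=5, coords=4):
    # Each of the `num` anchor boxes predicts `coords` box values,
    # 1 objectness score and `classes` class scores.
    return num * (coords + 1 + classes)

print(region_filters(2))   # 35, matches the 2-class cfg above
print(region_filters(3))   # 40, matches the 3-class experiment
print(region_filters(80))  # 425, the stock COCO yolo.cfg

So the cfg edits above look internally consistent; the 4-byte mismatch comes from somewhere else.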

@minhnhat93

Exactly the same issue as @Benjamin-Vencill, both with the trained yolo.weights and yolo.cfg from the darknet website and after fine-tuning with darknet. Off by 4 bytes. Does anyone have an idea of how the weights are saved/loaded in darknet and darkflow?

@minhnhat93

minhnhat93 commented Jul 16, 2017

Update: I don't know exactly what is happening, but after printing the indices of the layers being loaded, I found that all the weights in the newly trained darknet model were shifted right by 4 bytes compared to the older models in darkflow. Changing this line:

self.offset = 16
to self.offset = 20 solved my problem, and I was able to use my newly trained darknet model with darkflow.
It's pretty weird, though, because in the darknet source they still only reserve 16 bytes of extra data at the beginning: https://github.com/pjreddie/darknet/blob/d8c5cfd6c6c7dca460c64521358a0d772e5e8d52/src/parser.c#L906
Can someone who is an expert on this shed some light on this behaviour?
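
The 4-byte shift is consistent with the weights-file header having grown from 16 to 20 bytes. A minimal sketch of the two layouts, assuming a little-endian file and the fields described in darknet's parser.c (the struct formats here are only for illustration):

import struct

# Older header, which darkflow's offset of 16 assumes:
#   int32 major, int32 minor, int32 revision, int32 seen   -> 16 bytes
OLD_HEADER = struct.Struct('<4i')

# Newer header, where "seen" became a size_t (8 bytes on 64-bit):
#   int32 major, int32 minor, int32 revision, uint64 seen  -> 20 bytes
NEW_HEADER = struct.Struct('<3iQ')

print(OLD_HEADER.size, NEW_HEADER.size)  # 16 20

Everything after the header is the same stream of float32 weights, which is why simply bumping the offset to 20 works for weights written by a recent darknet.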

@zinkcious

zinkcious commented Jul 20, 2017

Exactly the same problem as above; I solved it using @minhnhat93's answer. I still want to know why I have to change the offset to 20 when importing the darknet model I trained on my own dataset, because when I import the official cfg "tiny-yolo-voc.cfg" and official weights "tiny-yolo-voc.weights" into darkflow, offset 16 works fine.

I think it might be a bug in darknet.

Thanks a lot, @minhnhat93!


@zenineasa

Actually, the offset has to match the size of the header at the start of the weights file, which stores the version numbers and a couple of other fields (such as the number of images seen during training). I had read that somewhere, but I can't remember exactly where.

@Benjamin-Vencill

I used the solution @minhnhat93 provided and it works now! Nice work, thanks!

@zinkcious

zinkcious commented Jul 21, 2017

I find that although no error is reported when importing weights from darknet using the method @minhnhat93 supplied (changing the offset to 20), the detection results are a little different from those in darknet, as shown in the pictures below. The left picture is the result from darknet and the right picture is the result from darkflow (both using the same weights and cfg):
https://github.com/zinkcious/machine-learning-Udacity/blob/master/65_cmp.jpg
https://github.com/zinkcious/machine-learning-Udacity/blob/master/01_cmp.png

Does anyone know what the problem with importing weights from darknet is?

@zenineasa

@zinkcious That could be because of the detection threshold, couldn't it?

@zinkcious

I don't think so. Isn't the thresh defined in the second-to-last line of the cfg file? In both cases it says thresh = .6, @zenineasa, as shown below:
https://github.com/zinkcious/machine-learning-Udacity/blob/master/cmp_code.png
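
As far as I can tell, the thresh= line in the [region] section is used by darknet during training, while the confidence threshold applied when drawing boxes is set separately at test time (the -thresh flag in darknet, and darkflow's own threshold option). If the two runs use different test-time thresholds, the drawn boxes can differ even with identical weights. In darkflow the threshold can be set explicitly, for example through its Python API, following the usage pattern shown in the darkflow README (file names here are placeholders):

from darkflow.net.build import TFNet
import cv2

options = {"model": "cfg/yolo1.cfg",
           "load": "yolo1.weights",
           "threshold": 0.6}  # match whatever -thresh you pass to darknet
tfnet = TFNet(options)

result = tfnet.return_predict(cv2.imread("sample_img/sample_dog.jpg"))
print(result)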

@zinkcious

Do you have a similar problem to mine? @zenineasa @minhnhat93 @Benjamin-Vencill

@zenineasa

I've kind of moved on to writing my own implementation using Keras. There are a few repositories on GitHub where developers have tried to do the same.

@zinkcious

I have new findings. I trained tiny-yolo-voc.cfg on the VOC dataset and got "tiny-yolo-voc_100.weights", whose file size is 63471560 bytes. When I look at the file size of the tiny-yolo-voc.weights downloaded from the official website, it is 63471556 bytes, 4 bytes smaller than the weights I trained. I don't understand why that is.
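
For what it's worth, those numbers line up with the header change discussed earlier: the newly trained file is exactly 4 bytes larger, which matches the header growing from 16 to 20 bytes rather than any difference in the weights themselves.

official = 63471556  # tiny-yolo-voc.weights from the website (16-byte header)
trained  = 63471560  # tiny-yolo-voc_100.weights written by a recent darknet (20-byte header)
print(trained - official)  # 4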

@minhnhat93

@zinkcious Yeah, I just checked, and I'm having some kind of problem like that too. The objectness scores of the detections from darkflow and darknet differ after the fix... Even weirder, I can now load a .weights file from darknet but not a .backup file, even though the two file formats are the same...

@zinkcious

Can anyone who is familiar with the darknet and YOLO source code answer this question?...

@imaami

imaami commented Aug 14, 2017

Yeah, I predicted these things would happen when I noticed darknet uses sizeof() to calculate its binary file format layout. The exact change that caused this is here:

pjreddie/darknet@1467621#diff-bfbbcdf73459e9ea8fb4afa8455ce74dL909

There's an issue about this bug, but unfortunately there's no fix yet (I've been meaning to write a patch, but I've been busy with other things):
pjreddie/darknet#78
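
Since the version fields are still at the start of the file, one workaround is to read them and infer the header size instead of hard-coding an offset. A sketch in Python that mirrors the compatibility check later added to darknet's parser.c (treat the exact version condition as an assumption):

import struct

def darknet_header_size(path):
    # Read major, minor, revision (three int32s) and guess whether the
    # "seen" counter that follows is a 4-byte int (old format) or an
    # 8-byte size_t (new format).
    with open(path, 'rb') as f:
        major, minor, revision = struct.unpack('<3i', f.read(12))
    return 20 if major * 10 + minor >= 2 else 16

# e.g. set the loader offset from the file itself:
# self.offset = darknet_header_size(weights_path)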

@saltedfishpan

@minhnhat93
It worked!!! Thank you very much!!!
Genius!!
