Fine tuning error #39

Closed
mzahran001 opened this issue Jan 27, 2017 · 12 comments

Comments

@mzahran001

I am trying to fine-tune darkflow on my data, which contains 2 classes:
1- I created the label.txt file
2- I modified process.py as in #32
3- I downloaded yolo.weights, which has a size of 789.3 MB
4- I copied yolov1.cfg to yolov1-2c.cfg and modified only this parameter: classes=2
But a problem appears when I use this command:
./flow --model cfg/v1.1/yolov1-2c.cfg --load bin/yolo.weights
I get this error:

Parsing ./cfg/yolo.cfg
Parsing cfg/v1.1/yolov1-2c.cfg
Loading bin/yolo.weights ...
Traceback (most recent call last):
  File "./flow", line 42, in <module>
    tfnet = TFNet(FLAGS)
  File "/home/moh/Documents/darkflow/net/build.py", line 34, in __init__
    darknet = Darknet(FLAGS)
  File "/home/moh/Documents/darkflow/dark/darknet.py", line 27, in __init__
    self.load_weights()
  File "/home/moh/Documents/darkflow/dark/darknet.py", line 82, in load_weights
    wgts_loader = loader.create_loader(*args)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 104, in create_loader
    return load_type(path, cfg)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 18, in __init__
    self.load(*args)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 76, in load
    walker.offset, walker.size)
AssertionError: expect 269862452 bytes, found 789312988

What am I missing here?

@thtrieu
Owner

thtrieu commented Jan 28, 2017

  1. The program understands that you want to work with yolov1-2c.
  2. The program sees that you want to load this config from the weight file yolo.weights.
  3. Since the names yolo and yolov1-2c are different, the program assumes that you are doing a partial load from a weight file that has a different config. It assumes the config for yolo.weights is yolo.cfg, and indeed finds one in ./cfg/.
  4. The program uses ./cfg/yolo.cfg to load yolo.weights and fails, since yolo.cfg indicates that it expects 269862452 bytes, but a total of 789312988 bytes was found.

Solution: Change your weight file name to yolov1-2c.weights
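
The name-matching behaviour described in the steps above can be sketched roughly as follows. This is not darkflow's actual code: `plan_load`, its `cfg_dir` parameter, and the "full"/"partial" labels are hypothetical names used only for illustration. The only thing taken from the thread is that the base name of the weight file is compared with the base name of the model config.

```python
import os


def plan_load(model_cfg, weights_path, cfg_dir="./cfg"):
    """Return (mode, cfg_used_to_read_weights).

    Hypothetical helper: mirrors the reasoning in the comment above,
    not the real darkflow implementation.
    """
    model_name = os.path.splitext(os.path.basename(model_cfg))[0]
    weights_name = os.path.splitext(os.path.basename(weights_path))[0]

    if model_name == weights_name:
        # Same base name: the weights are assumed to belong to this config.
        return "full", model_cfg

    # Different base name: assume a partial load and look for a config
    # named after the weight file in cfg_dir.
    return "partial", os.path.join(cfg_dir, weights_name + ".cfg")


# The situation from the first post: the names differ, so yolo.cfg is looked up.
print(plan_load("cfg/v1.1/yolov1-2c.cfg", "bin/yolo.weights"))
```

Under this sketch, a mismatch between the looked-up config and the real layout of the weight file would only surface later, when the byte count is checked.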

@thtrieu
Owner

thtrieu commented Jan 28, 2017

Please pull the latest code, which includes some critical bug fixes, before fine-tuning your yolov1.

@mzahran001
Author

Thank you for your effort. It worked, but I have a new problem.
After running this command:

 ./flow --model cfg/v1.1/yolov1-2c.cfg --load bin/yolov1-2c.weights

it produces this error:

Traceback (most recent call last):
  File "./flow", line 60, in <module>
    tfnet.predict()
  File "/home/moh/Documents/darkflow/net/flow.py", line 103, in predict
    os.path.join(inp_path, all_inp[i]))
  File "/home/moh/Documents/darkflow/net/yolo/test.py", line 71, in postprocess
    cords = cords.reshape([SS, B, 4])
ValueError: total size of new array must be unchanged


@thtrieu
Owner

thtrieu commented Feb 5, 2017

This has to do with the output of the last layer.
Please make sure this is true:

output = side * side * (classes + num * 5)

where output is found in the last [connected] section
and side, classes, and num are found in [detection].
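
The check above can be automated with a small script. This is only a sketch: `parse_sections` and `check_output_size` are hypothetical helpers, not part of darkflow, and the parser assumes a simple `key=value` cfg layout with `#` comments. The sample cfg text is made up for the example; it keeps `output=1470` (7 * 7 * (20 + 2 * 5), the 20-class value) while `classes` and `num` are set to the values used in this thread.

```python
def parse_sections(cfg_text):
    """Parse cfg text into a list of (section_name, options) pairs.

    A list is used because section names such as [connected] can repeat.
    """
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_output_size(cfg_text):
    """Return (declared_output, expected_output) for the cfg text."""
    sections = parse_sections(cfg_text)
    connected = [opts for name, opts in sections if name == "connected"][-1]
    detection = [opts for name, opts in sections if name == "detection"][-1]
    side = int(detection["side"])
    classes = int(detection["classes"])
    num = int(detection["num"])
    expected = side * side * (classes + num * 5)
    return int(connected["output"]), expected


# Made-up cfg fragment: classes was changed, output was not.
SAMPLE_CFG = """
[connected]
output=1470
activation=linear

[detection]
classes=2
side=7
num=3
"""

print(check_output_size(SAMPLE_CFG))  # (1470, 833): output must be changed to 833
```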

@mzahran001
Author

It worked, but it produces a new error.
First, the equation:

output = side * side * (classes + num * 5)
output = 7 * 7 * (2 + 3 * 5) = 833

When I use this command, it works fine:

./flow --model cfg/v1.1/yolov1-2c.cfg

But when I use this command:

 ./flow --model cfg/v1.1/yolov1-2c.cfg --load bin/yolov1-2c.weights

it produces this error:

 ./flow --model cfg/v1.1/yolov1-2c.cfg --load bin/yolov1-2c.weights
/home/moh/Documents/darkflow/dark/darknet.py:54: UserWarning: ./cfg/yolov1-2c.cfg not found, use cfg/v1.1/yolov1-2c.cfg instead
  cfg_path, FLAGS.model))
Parsing cfg/v1.1/yolov1-2c.cfg
Loading bin/yolov1-2c.weights ...
Traceback (most recent call last):
  File "./flow", line 42, in <module>
    tfnet = TFNet(FLAGS)
  File "/home/moh/Documents/darkflow/net/build.py", line 34, in __init__
    darknet = Darknet(FLAGS)
  File "/home/moh/Documents/darkflow/dark/darknet.py", line 27, in __init__
    self.load_weights()
  File "/home/moh/Documents/darkflow/dark/darknet.py", line 82, in load_weights
    wgts_loader = loader.create_loader(*args)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 104, in create_loader
    return load_type(path, cfg)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 18, in __init__
    self.load(*args)
  File "/home/moh/Documents/darkflow/utils/loader.py", line 76, in load
    walker.offset, walker.size)
AssertionError: expect 745054228 bytes, found 789312988

@thtrieu
Owner

thtrieu commented Feb 10, 2017

Where did you get yolov1-2c.weights? Clearly the file's size does not match cfg/v1.1/yolov1-2c.cfg, as indicated by the assertion error (the file is larger than what yolov1-2c.cfg expects). From what I checked, 789312988 bytes is the size of yolov1.weights.
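
To see how far a weight file is from what a config expects, the two numbers in the assertion message can be compared directly. The sketch below is illustrative only: `surplus_bytes` and `file_surplus` are hypothetical helpers, and the expected byte count has to be copied by hand from the error message.

```python
import os


def surplus_bytes(found, expected):
    """Positive result: the file is larger than the config expects."""
    return found - expected


def file_surplus(weights_path, expected):
    """Same comparison, reading the actual size of a file on disk."""
    return surplus_bytes(os.path.getsize(weights_path), expected)


# Numbers taken from the assertion error in this thread.
print(surplus_bytes(found=789312988, expected=745054228))  # 44258760
```

A positive surplus, as here, is consistent with the file having been written for a larger network than the one the config describes.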

@mzahran001
Author

mzahran001 commented Feb 10, 2017

yolov1-2c.weights is the same file as yolov1.weights.
I only changed the name because of the previous problem.
yolov1-2c.weights = 789,312,988 bytes
Can you give me a link to your yolov1.weights?
This is my yolov1-2c.cfg file:
https://drive.google.com/file/d/0B95Sp237mrsTSGp6Q2V4d2pXQkU/view?usp=sharing

@thtrieu
Owner

thtrieu commented Feb 10, 2017

It is not supposed to work like that.

What you have here is yolov1-2c.cfg and yolov1.weights.
So the only possible use case is that you want to partially load yolov1-2c.cfg from yolov1.weights. To do that:

./flow --model cfg/v1.1/yolov1-2c.cfg --load bin/v1.1/yolov1.weights --config cfg/v1.1/

@mzahran001
Author

Great! It worked.
But when I started training, I used this command:

./flow --train --model cfg/v1.1/yolov1-2c.cfg --load bin/yolov1.weights --config cfg/v1.1/ --annotation labels/ --dataset JPEGImages/

It produces this error:

Running entirely on CPU
cfg/v1.1/yolov1-2c.cfg loss hyper-parameters:
	side    = 7
	box     = 3
	classes = 2
	scales  = [1.0, 1.0, 0.5, 5.0]
Building cfg/v1.1/yolov1-2c.cfg loss
Building cfg/v1.1/yolov1-2c.cfg train op
Finished in 286.667635202s

Enter training ...
Traceback (most recent call last):
  File "./flow", line 53, in <module>
    print('Enter training ...'); tfnet.train()
  File "/home/moh/Documents/darkflow/net/flow.py", line 37, in train
    for i, (x_batch, datum) in enumerate(batches):
  File "/home/moh/Documents/darkflow/net/yolo/data.py", line 130, in shuffle
    data = self.parse()
  File "/home/moh/Documents/darkflow/net/yolo/data.py", line 29, in parse
    return pickle.load(f, encoding = 'latin1')[0]
TypeError: load() got an unexpected keyword argument 'encoding'


@thtrieu
Owner

thtrieu commented Feb 10, 2017

Are you using Python3?

@mzahran001
Author

Yes.
I am working on Ubuntu, which has Python 3 pre-installed.

@thtrieu
Owner

thtrieu commented Feb 10, 2017

encoding is a valid argument for load() in Python 3. I suggest running print(sys.version) to make sure the running Python is Python 3.

BTW, please pull the new commit; some bugs are fixed.
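
A quick way to confirm which interpreter is actually running, and to see why the `encoding` keyword matters, is sketched below. `load_pickle` is a hypothetical helper, not part of darkflow; it only passes `encoding` when the interpreter is Python 3, where `pickle.load` accepts that keyword.

```python
import io
import pickle
import sys


def load_pickle(fileobj):
    """Load a pickle, passing encoding only where pickle.load supports it."""
    if sys.version_info[0] >= 3:
        # Python 3: encoding tells pickle how to decode Python 2 byte strings.
        return pickle.load(fileobj, encoding="latin1")
    # Python 2: load() has no encoding keyword and would raise TypeError.
    return pickle.load(fileobj)


# Shows the interpreter that is really executing the script.
print(sys.version)

# Round trip through an in-memory buffer with made-up data.
buf = io.BytesIO(pickle.dumps([["example.jpg", 2]], protocol=2))
print(load_pickle(buf)[0])
```

If the first line printed starts with 2, the script is being run by Python 2 even though Python 3 is installed, which would explain the TypeError in the traceback.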

This issue was closed.