[question] any instruction on train multi custom classes? #13

Closed
eisneim opened this issue Mar 9, 2017 · 10 comments

@eisneim

eisneim commented Mar 9, 2017

Hi, suppose I want to use KittiBox to detect road signs, billboards, monitor screens and so on, 7 types of object in total. It seems I have to modify AnnotationLib.py to make it parse annotation rects with a class number.

thanks!

@MarvinTeichmann
Owner

Hi,

yes, you do. In addition you have to modify the loss. Currently it does this:

true_classes = tf.reshape(tf.cast(tf.greater(classes, 0), 'int64'), [outer_size * hypes['rnn_len']])

This implementation is actually quite ugly and only works for two classes. Make sure that you generate a vector representing the class labels, either as a one-hot vector or as a numerical class vector. You can then use this vector with tf.nn.sparse_softmax_cross_entropy_with_logits (numerical class labels) or tf.nn.softmax_cross_entropy_with_logits (one-hot labels), depending on your vector encoding.
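A minimal sketch of the two label encodings mentioned above, written in plain Python rather than TensorFlow so it runs without dependencies. It is not KittiBox code. The class count (7 object types plus a background class with id 0), the scores and the helper names are all assumptions made for illustration. It shows why the `greater(classes, 0)` trick collapses every object class into one label, and that the numerical and one-hot encodings give the same cross-entropy.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(logits, class_id):
    """Cross-entropy for a numerical label (an int in [0, num_classes))."""
    return -math.log(softmax(logits)[class_id])

def one_hot(class_id, num_classes):
    """Encode a numerical label as a one-hot vector."""
    return [1.0 if i == class_id else 0.0 for i in range(num_classes)]

def dense_cross_entropy(logits, target):
    """Cross-entropy for a one-hot (or soft) target vector."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))

# Assumption: 7 object types plus background (id 0) gives 8 classes.
NUM_CLASSES = 8

# Thresholding at zero keeps only "object vs. background":
# class ids 3 and 7 both become 1, so the class identity is lost.
binary = [int(c > 0) for c in [0, 3, 7]]

# Made-up scores for a single grid cell, and its numerical label.
logits = [0.1, 2.0, -1.0, 0.3, 0.0, 0.5, -0.2, 1.2]
label = 3

loss_sparse = sparse_cross_entropy(logits, label)
loss_dense = dense_cross_entropy(logits, one_hot(label, NUM_CLASSES))
```

Both losses agree, so the choice between the two TensorFlow functions only depends on how the label tensor is stored. Numerical labels are usually the more compact option.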

Btw, I am also going to try to reduce the AnnoLib dependency, so you might want to keep an eye on the next one or two commits.

@eisneim
Author

eisneim commented Mar 9, 2017

@MarvinTeichmann great! I'll give it a try

MarvinTeichmann added a commit that referenced this issue Mar 10, 2017
MarvinTeichmann added a commit that referenced this issue Mar 10, 2017
This commit removes a large chunk of Annolib dependency. The data is now loaded directly from the kitti dataset .txt. In addition the loss was improved to utilize a mask for `don't care` areas.

The modifications in this commit can be very useful for #13.
@MarvinTeichmann
Owner

As mentioned in the commit messages, take a close look at the modifications in fab6bae and bb25fc6. The functions in fab6bae can help you visualize what is going on during data loading. bb25fc6 removes most of the dependency on AnnoLib. Data loading is now done directly from the KITTI files. You might want to do something along the lines of bb25fc6 for your stuff.
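To illustrate what loading labels directly from KITTI-style .txt files could look like for several classes, here is a rough sketch. It is not the code from bb25fc6. The class names, the `CLASS_IDS` mapping and the function names are hypothetical. It relies only on the standard KITTI object label layout, where the first field is the object type and fields 5 to 8 are the 2D box (left, top, right, bottom).

```python
from collections import namedtuple

Box = namedtuple("Box", ["class_id", "x1", "y1", "x2", "y2"])

# Hypothetical mapping from label names to numerical class ids.
# Id 0 is left free for background.
CLASS_IDS = {"RoadSign": 1, "Billboard": 2, "Monitor": 3}

def parse_kitti_line(line, class_ids=CLASS_IDS):
    """Parse one KITTI label line into a Box, or None for unknown types."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 fields, got %d" % len(fields))
    name = fields[0]
    if name not in class_ids:
        # Unknown types and DontCare regions are skipped in this sketch.
        return None
    x1, y1, x2, y2 = (float(v) for v in fields[4:8])
    return Box(class_ids[name], x1, y1, x2, y2)

def parse_kitti_labels(text, class_ids=CLASS_IDS):
    """Parse the content of a label file into a list of Box tuples."""
    boxes = [parse_kitti_line(l, class_ids)
             for l in text.splitlines() if l.strip()]
    return [b for b in boxes if b is not None]

# Made-up label file content in KITTI format.
sample = (
    "RoadSign 0.00 0 -1.58 587.01 173.33 614.12 200.12 "
    "1.65 1.67 3.64 -0.65 1.71 46.70 -1.59\n"
    "DontCare -1 -1 -10 503.89 169.71 590.61 190.13 "
    "-1 -1 -1 -1000 -1000 -1000 -10\n"
)
boxes = parse_kitti_labels(sample)
```

The sketch simply drops `DontCare` lines. A real loader would probably keep them separately so they can be turned into a loss mask, as the commit message above describes.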

@eisneim
Author

eisneim commented Mar 11, 2017

@MarvinTeichmann wow, thank you so much

@eisneim
Author

eisneim commented Mar 11, 2017

@MarvinTeichmann after spending a few hours reading the code, I have a basic understanding of it, but there is an error when initializing the TF computation graph:

File "/Users/eisneim/www/deepLearning/KittiBox/hypes/../decoder/fastBox.py", line 382, in loss
    name='reg_loss')
File "/Library/Python/2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1951, in add_n
    raise ValueError("inputs must be a list of at least one Tensor with the "
ValueError: inputs must be a list of at least one Tensor with the same dtype and shape

Since YOLO v2 is working OK for my project, I guess I'll try to fine-tune the YOLO model first and keep an eye on KittiBox. I'll definitely test out KittiBox later.
Thanks

@MarvinTeichmann
Owner

Did you run git submodule update after your last pull? The error occurs because you are using an old tensorflow-fcn version together with the new KittiBox version (or vice versa).

@eisneim
Author

eisneim commented Mar 11, 2017

oops! that's my mistake, issue fixed

eisneim closed this as completed Mar 11, 2017
@MarvinTeichmann
Owner

@eisneim may I ask which YOLO implementation you are using? I would like to add another baseline to the paper.

@eisneim
Author

eisneim commented Mar 11, 2017

@MarvinTeichmann I use darknet for training (with the YOLOv2 network) and the TensorFlow implementation darkflow for other stuff. With the darknet binary I get about 30-40 fps, but accuracy varies across classes because of the training dataset (which is hand-annotated using my Chrome extension).

My ultimate goal is not only to detect the object, but also to estimate the homography of the object's surface (4 points), so I can render another image on top of the detected surface.

@yangyuekai

Hi, did you change the code to support multiple classes? If you did, could you give me some instructions? I have no idea how to change the code. Thanks a lot if you can answer; any help would be appreciated.
