add classification to location possible? #1

Open
weihangChen opened this issue Jan 2, 2018 · 2 comments

Comments

@weihangChen

I would really appreciate it if you could help me out with this.

Is it possible to combine classification and localization by reusing the same loss function, i.e. by just adding one more parameter to the labels? Related Stack Overflow question:
https://stackoverflow.com/questions/45035831/how-can-i-detect-and-localize-object-using-tensorflow-and-convolutional-neural-n

For example:

y[i] = [center.y, center.x, height, width]

updated to

y[i] = ["cat",center.y, center.x, height, width]
y[i] = ["dog",center.y, center.x, height, width]

The model would then be able to tell whether the image shows a dog or a cat and draw a box around its face, assuming the training dataset now contains images of both cats and dogs.
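A loss function works on numbers, so a string such as "cat" cannot be placed in the label directly. One common way to express the idea above is to encode the class as a one-hot vector and sum a classification loss with a box regression loss. The following is only a minimal framework-free sketch under that assumption; `CLASSES`, `encode_label`, `combined_loss` and `box_weight` are hypothetical names for illustration, not part of this repository.

```python
import math

# Hypothetical class list; the order defines the one-hot positions.
CLASSES = ["cat", "dog"]


def encode_label(name, center_y, center_x, height, width):
    """Split a label like ["dog", cy, cx, h, w] into a one-hot class and a box."""
    one_hot = [1.0 if name == c else 0.0 for c in CLASSES]
    return one_hot, [center_y, center_x, height, width]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(class_logits, box_pred, one_hot, box_true, box_weight=1.0):
    """Cross-entropy on the class plus weighted MSE on the four box values."""
    probs = softmax(class_logits)
    cross_entropy = -sum(t * math.log(p) for t, p in zip(one_hot, probs))
    mse = sum((p - t) ** 2 for p, t in zip(box_pred, box_true)) / len(box_true)
    return cross_entropy + box_weight * mse


one_hot, box = encode_label("dog", 0.5, 0.4, 0.3, 0.2)
# A perfect box prediction leaves only the classification term in the loss.
loss = combined_loss([0.1, 2.0], [0.5, 0.4, 0.3, 0.2], one_hot, box)
```

The `box_weight` factor is there because the two terms usually live on different scales; how to balance them is a tuning decision, not something this thread specifies.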

@aleju
Owner

aleju commented Jan 13, 2018

Theoretically you can add that, but it would require an additional categorical output of the model. That in turn would require a rewrite of the model, of all training and test functions, and of all data loading functions. One would also have to create or find an appropriate dataset.
If you actually want to detect cat and dog bounding boxes, you are better off looking at Faster R-CNN, SSD or something like that. They come out of the box with classification per bounding box and also use better underlying techniques (i.e. are much more accurate).
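To illustrate what "an additional categorical output" could mean structurally, here is a rough sketch of two output heads reading the same feature vector: one produces class probabilities, the other the four box values. It is plain Python with random weights, not the model from this repository; `make_head`, `predict` and the 8-element `features` vector (a stand-in for the output of the convolutional layers) are assumptions for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer without activation: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(n_out, n_in, rng):
    """Create small random weights and zero biases for one output head."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, class_head, box_head):
    """Run both heads on the same shared features."""
    class_probs = softmax(linear(*class_head, features))  # categorical output
    box = linear(*box_head, features)  # [center_y, center_x, height, width]
    return class_probs, box


rng = random.Random(0)
features = [rng.random() for _ in range(8)]  # stand-in for shared CNN features
class_head = make_head(2, 8, rng)  # two classes, e.g. cat and dog
box_head = make_head(4, 8, rng)    # four box coordinates
class_probs, box = predict(features, class_head, box_head)
```

In a real framework each head would get its own loss, which is why the training, test and data loading code would all need to change as described above.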

@weihangChen
Author

weihangChen commented Jan 15, 2018

Thank you for your advice. Actually, it is not cat and dog faces that I am trying to classify and locate, but the SVHN dataset, where the digits are well aligned horizontally and all have the same height. Each digit has its own bounding box, and there is a bigger box containing 1 to 5 digits. Since my trained classifier can achieve about 90% accuracy in classification, my initial thought was that adding the bounding box to the training might give me about 90% in both identification and coordinates as the result of regression.

As someone without much math skill, it definitely took me a while to understand that backpropagation can be triggered in any hidden layer, not just the last one. The dense layer with 4 dimensions (or any number of dimensions) is actually the result of dimension reduction in regression. I just rewrote your implementation from Keras to TensorFlow and got an MSE loss between 0.00018 and 0.00005 after 100,000 iterations; the rectangles drawn around the cat faces seem to be fine.
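A low MSE does not say directly how well a predicted rectangle covers the target, so a quick numeric check is intersection over union (IoU). The sketch below assumes boxes in the `[center.y, center.x, height, width]` layout used earlier in this thread; `to_corners` and `iou` are hypothetical helper names, not functions from the repository.

```python
def to_corners(box):
    """Convert [center_y, center_x, height, width] to (top, left, bottom, right)."""
    cy, cx, h, w = box
    return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2


def iou(box_a, box_b):
    """Intersection over union of two center-format boxes (0.0 if disjoint)."""
    top_a, left_a, bottom_a, right_a = to_corners(box_a)
    top_b, left_b, bottom_b, right_b = to_corners(box_b)
    # Overlap along each axis, clamped at zero when the boxes do not touch.
    inter_h = max(0.0, min(bottom_a, bottom_b) - max(top_a, top_b))
    inter_w = max(0.0, min(right_a, right_b) - max(left_a, left_b))
    inter = inter_h * inter_w
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


truth = [0.5, 0.5, 0.2, 0.2]
shifted = [0.5, 0.6, 0.2, 0.2]  # same size, moved right by half its width
score = iou(truth, shifted)
```

Here `score` comes out at roughly one third: the two squares share half their area, so the intersection is 0.02 and the union is 0.06.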

So this week I will try this approach on the SVHN dataset and see how it goes, but since you have already pointed out that other solutions fit the problem better, it might turn out to be a dead end for me. Again, thanks!
