could you do me a favor? #6

Closed · llf1234 opened this issue Apr 4, 2018 · 2 comments

llf1234 commented Apr 4, 2018

I am a beginner in deep hashing, and there are two questions I cannot figure out:
1. Why not use the classification results directly as the hash code? What are the shortcomings?
2. I find that almost all supervised deep hashing methods use AlexNet as the backbone network. Am I allowed to change this network when writing my paper?
Please favor me with your instructions.
Thank you very much!

caozhangjie (Collaborator) commented Apr 4, 2018 via email

yuewuqing2224 commented
I want to add some of my own thoughts on this.

The overall goal of image hashing is to map each image to a binary hash code so that similar images share similar codes and dissimilar images get distant codes, with distance measured as Hamming distance. Binary codes are usually much shorter than the number of classes. This ensures fast retrieval, since the Hamming distance between binary codes can be computed efficiently with a bitwise XOR followed by a bit count, and shorter hash codes boost the speed further. So there are two things people should care about: (1) image similarity and (2) hash code learning.
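
As a minimal sketch of the speed argument, assuming codes are packed into unsigned integers (the toy codes and variable names here are made up for illustration, not from any specific hashing library):

```python
import numpy as np

def hamming_distance(a, b):
    """Hamming distance between two hash codes packed into unsigned integers."""
    # XOR marks the differing bits; counting the set bits gives the distance
    return bin(int(a) ^ int(b)).count("1")

# Toy 6-bit codes packed into 64-bit integers
query = np.uint64(0b101101)
database = np.array([0b101101, 0b111111, 0b000000], dtype=np.uint64)

distances = [hamming_distance(query, code) for code in database]
print(distances)  # [0, 2, 4]
```

In practice the whole database of packed codes is scanned with vectorized XOR/popcount, which is why even a linear scan over millions of codes is fast.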

Most papers nowadays are more concerned with improving retrieval speed and accuracy on commonly used datasets. This means that (2) is the more relevant part; (1) simply follows common practice in the field. If you look at those datasets, you will notice that they all come with class label information. Even if some papers do not use it directly and instead formulate it as a similarity matrix, you should know that whether two images count as similar is always computed from the class labels. For instance, in ImageNet100 and CIFAR-10 it is simply the class label; in NUS-WIDE and MS COCO it is whether two images share at least one class label.
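
A short sketch of how such a similarity matrix is typically built from labels; the toy label arrays are made-up examples:

```python
import numpy as np

# Single-label case (e.g. CIFAR-10, ImageNet100): one class id per image.
# Two images are similar iff their class ids match.
labels = np.array([0, 0, 3, 7])
S_single = (labels[:, None] == labels[None, :]).astype(np.int8)

# Multi-label case (e.g. NUS-WIDE, MS COCO): one binary indicator vector
# per image; two images are similar iff they share at least one label.
multi = np.array([[1, 0, 1],
                  [0, 1, 1],
                  [0, 1, 0],
                  [1, 0, 0]], dtype=np.int8)
S_multi = (multi @ multi.T > 0).astype(np.int8)

print(S_single)  # 4x4 matrix, 1 where class ids agree
print(S_multi)   # 4x4 matrix, 1 where label sets intersect
```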

And as a side note, the pretrained model used for image hashing is always some ImageNet-pretrained classification model. You never see people report training from scratch, because many models simply do not converge that way. So I would say that nowadays most image hashing models work by projecting a one-hot vector or multi-label vector into a much shorter binary hash code. This is different from classification, because softmax outputs, or the intermediate conv features used in face recognition, are all floating-point values: you usually take the argmax or compute a Euclidean distance over them. Their results cannot be directly mapped to binary codes without fine-tuning with an image hashing algorithm.
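
Here is a minimal PyTorch sketch of that setup, assuming torchvision's ImageNet-pretrained AlexNet as the backbone; the 48-bit code length and the tanh-then-sign binarization follow common practice in deep hashing papers rather than one particular method:

```python
import torch
import torch.nn as nn
from torchvision import models

class HashNet(nn.Module):
    """Pretrained backbone plus a small hash head producing K-bit codes."""
    def __init__(self, code_length=48):
        super().__init__()
        backbone = models.alexnet(pretrained=True)  # ImageNet-pretrained weights
        self.features = backbone.features
        self.avgpool = backbone.avgpool
        # Reuse the classifier up to (but not including) the final 1000-way
        # layer, then project the 4096-d feature to K units
        self.fc = nn.Sequential(*list(backbone.classifier.children())[:-1])
        self.hash_layer = nn.Linear(4096, code_length)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        # tanh keeps training outputs in (-1, 1) so the gap to the
        # sign() binarization used at test time stays small
        return torch.tanh(self.hash_layer(x))

model = HashNet(code_length=48).eval()
with torch.no_grad():
    h = model(torch.randn(2, 3, 224, 224))  # continuous codes in (-1, 1)
    binary_codes = torch.sign(h)            # +/-1 codes used for retrieval
```

During fine-tuning, a pairwise loss drives the continuous codes of similar pairs together and dissimilar pairs apart; only at retrieval time are they snapped to binary with sign().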

With that being said, if you are interested in (1), you can check out papers like Context Embedding Networks or Contextual Visual Similarity. If you don't care about it, then feel free to use any class label information you want, as long as it gives you good results. Many papers use it to boost their performance, and some rely on it exclusively.
