Is hashing still necessary for CNN-extracted features? #15

Closed
ghostviper opened this issue Sep 9, 2019 · 2 comments

Comments

@ghostviper

ghostviper commented Sep 9, 2019

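First, thank you for the CNN case and the various hashing benchmarks! I have a question. I extracted 2048-d features with VGG; is indexing them with hashing a poor choice? I see that you use matrix multiplication followed by sorting. My images total more than 500 GB and the extracted features come to several GB, so loading everything into memory is painful. Thank you also for the experiments on applying PCA to CNN features; that is another option (I am curious how a dense layer on top of the CNN output compares with PCA). In the end, though, is hash-based search necessary, and would it cause a large loss in accuracy? So far I have tried LSH on the 2048-d features and the results were quite poor.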

@willard-yuan
Owner

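Avoid hashing whenever you can. If you are not building a production system and only want to evaluate how good the features are, brute-force search is enough. If memory is short, you also don't have to load everything at once; you can compute in batches. If you are building a production system and need an index, use PQ, OPQ, HNSW, or a similar method. I strongly recommend against hashing.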
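The batch-wise brute-force search mentioned above could look roughly like the sketch below. It is not code from this repository: the function names (`l2_normalize`, `iter_batches`, `batched_topk`) and the batch size are hypothetical, and it assumes the features are L2-normalized so that a dot product equals cosine similarity. Only one batch of database features plus the running top-k needs to be in memory at a time.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def iter_batches(features, batch_size):
    """Yield consecutive row slices; works on in-memory arrays and np.memmap."""
    for start in range(0, len(features), batch_size):
        yield features[start:start + batch_size]


def batched_topk(query, batches, k=5):
    """Exact top-k search over batches of database features.

    query:   (d,) L2-normalized query vector
    batches: iterable of (n_i, d) L2-normalized arrays, in database order
    Returns (ids, scores) sorted by descending similarity.
    """
    best_scores = np.empty(0, dtype=np.float64)
    best_ids = np.empty(0, dtype=np.int64)
    offset = 0
    for batch in batches:
        scores = np.asarray(batch) @ query           # similarities for this batch
        ids = np.arange(offset, offset + len(batch))
        offset += len(batch)
        # Merge this batch with the running candidates, then keep the k best.
        best_scores = np.concatenate([best_scores, scores])
        best_ids = np.concatenate([best_ids, ids])
        if len(best_scores) > k:
            keep = np.argpartition(-best_scores, k - 1)[:k]
            best_scores, best_ids = best_scores[keep], best_ids[keep]
    order = np.argsort(-best_scores)
    return best_ids[order], best_scores[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Small random stand-in for the real 2048-d features.
    feats = l2_normalize(rng.standard_normal((1000, 64)))
    ids, scores = batched_topk(feats[123], iter_batches(feats, 100), k=5)
    print(ids, scores)
```

For features stored on disk, one option is to open the file with `np.load(path, mmap_mode="r")` (assuming it was saved as a `.npy` array) and pass the result to `iter_batches`, so that slices are read only when a batch is scored. The result is exact, unlike hashing or PQ, at the cost of scanning every vector per query.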

@ghostviper
Author

ghostviper commented Sep 9, 2019

Thanks.
