benchmark #23

Closed
YontiLevin opened this Issue Mar 7, 2017 · 11 comments

YontiLevin commented Mar 7, 2017

How do you compare to nmslib?
https://github.com/searchivarius/NMSLIB

Can you please publish some comparison benchmarks?

Thanks!

iNDicat0r commented Mar 7, 2017

It would be nice to see a few benchmarks.

YontiLevin commented Mar 7, 2017

Never mind - found it:
https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors

Maybe it should be in the main readme file.

YontiLevin closed this Mar 7, 2017

jegou (Member) commented Mar 7, 2017

Thank you Yonti. This wiki page is for a small-scale benchmark, so we don't want to put too much emphasis on it because it is not our primary target. Put differently:

  • nmslib is very good on million-scale benchmarks.
  • Faiss is mostly intended to index much larger datasets, for which sub-linear training time and index overhead are key factors in the choice of algorithms.
willard-yuan commented Mar 8, 2017

I used the script bench_gpu_sift1m_falconn.py to compare Faiss with FALCONN:

============ Falconn search
falconn hashing table finished...
nprobe=  16 0.007 s recalls= 0.7648 0.8562 0.8567
nprobe=  32 0.010 s recalls= 0.8274 0.9301 0.9306
nprobe=  64 0.014 s recalls= 0.8607 0.9723 0.9728
nprobe= 128 0.020 s recalls= 0.8757 0.9915 0.9920
nprobe= 256 0.026 s recalls= 0.8794 0.9973 0.9978
nprobe= 512 0.034 s recalls= 0.8808 0.9990 0.9995

============ Faiss search
nprobe=  16 0.050 s recalls= 0.8273 0.8324 0.8324
nprobe=  32 0.069 s recalls= 0.8957 0.9024 0.9024
nprobe=  64 0.103 s recalls= 0.9477 0.9549 0.9549
nprobe= 128 0.170 s recalls= 0.9760 0.9837 0.9837
nprobe= 256 0.293 s recalls= 0.9866 0.9944 0.9944
nprobe= 512 0.519 s recalls= 0.9907 0.9987 0.9987

FALCONN seems very promising, but currently it only supports static datasets.
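[Editor's note] For readers unfamiliar with the nprobe parameter in the tables above: it controls how many inverted lists (or hash probes, in FALCONN's case) are scanned per query, trading speed for recall. The following is a toy NumPy sketch of an IVF-style search, not Faiss's or FALCONN's implementation; all sizes and the helper name `ivf_search` are illustrative.

```python
import numpy as np

def ivf_search(xb, xq, k, nlist=16, nprobe=4, iters=10):
    """Toy IVF search: a k-means coarse quantizer partitions the base
    vectors into nlist inverted lists; only the nprobe lists whose
    centroids are closest to the query are scanned exhaustively.
    Assumes the probed lists together contain at least k points."""
    rng = np.random.default_rng(0)
    cent = xb[rng.choice(len(xb), nlist, replace=False)].copy()
    for _ in range(iters):  # plain Lloyd k-means on the base set
        assign = ((xb[:, None] - cent[None]) ** 2).sum(-1).argmin(1)
        for j in range(nlist):
            m = assign == j
            if m.any():
                cent[j] = xb[m].mean(0)
    # final assignment against the trained centroids
    assign = ((xb[:, None] - cent[None]) ** 2).sum(-1).argmin(1)
    lists = [np.flatnonzero(assign == j) for j in range(nlist)]

    I = np.empty((len(xq), k), dtype=int)
    for qi, q in enumerate(xq):
        probes = ((cent - q) ** 2).sum(1).argsort()[:nprobe]
        cand = np.concatenate([lists[j] for j in probes])
        d = ((xb[cand] - q) ** 2).sum(1)
        I[qi] = cand[d.argsort()[:k]]
    return I
```

With nprobe equal to nlist, every list is scanned and the result matches brute-force search; smaller nprobe values are faster but may miss the true neighbors, which is exactly the recall/time trade-off shown in the tables.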

jegou (Member) commented Mar 8, 2017

@willard-yuan: the link to your script is incorrect (it points to ours).

My understanding is that FALCONN uses the full vectors.

As stated in the previous discussion and in the docs, this 1M small-scale benchmark is not what Faiss was initially built for (see the discussion about nmslib, for which we give numbers). In Faiss, we mostly consider hundreds of millions to billions of vectors, which is not a scale on which FALCONN has been tested or evaluated, to my knowledge. This is why we specifically use compressed-domain representations and index structures with little memory overhead.
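[Editor's note] To make the memory argument concrete, here is a back-of-envelope calculation. The dimensionality and code size are illustrative assumptions (SIFT-like 128-D vectors, a 16-byte product-quantization code per vector), not figures from this thread.

```python
# Rough memory footprint of 1 billion 128-D vectors, stored as raw
# float32 versus as compact 16-byte quantization codes.
n = 1_000_000_000          # database size
d = 128                    # vector dimensionality
raw_bytes = n * d * 4      # float32 = 4 bytes per component
pq_bytes = n * 16          # e.g. a 16-byte code per vector

raw_gib = raw_bytes / 2**30
pq_gib = pq_bytes / 2**30
print(f"raw float32:   {raw_gib:.0f} GiB")   # ~477 GiB
print(f"16-byte codes: {pq_gib:.0f} GiB")    # ~15 GiB
```

At this scale the raw vectors do not fit in the RAM of a single machine, while the compressed codes do, which is why compressed-domain indexes are the design center of Faiss.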

willard-yuan commented Mar 9, 2017

@jegou Sorry for the incorrect link. I did the experiment based on your script. The code I added to bench_gpu_sift1m_falconn.py is as follows:

#################################################################
# FALCONN search experiment
#################################################################
# d, xb, xq, gt, nq come from the surrounding bench script.
import time
import numpy as np
import falconn

print("============ Falconn search")

params_cp = falconn.LSHConstructionParameters()
params_cp.dimension = d
params_cp.lsh_family = 'cross_polytope'
params_cp.distance_function = 'negative_inner_product'
params_cp.storage_hash_table = 'flat_hash_table'
params_cp.k = 3                  # number of hash functions
params_cp.l = 10                 # number of hash tables
params_cp.num_setup_threads = 0  # 0 = use all available threads
params_cp.last_cp_dimension = 16
params_cp.num_rotations = 3
seed = 119417657
params_cp.seed = seed ^ 833840234
cp_table = falconn.LSHIndex(params_cp)
cp_table.setup(xb)
print("falconn hashing table finished...")

for nprobe in [16, 32, 64, 128, 256, 512]:
    I = []
    delta_t = 0.0
    cp_table.set_num_probes(nprobe)
    for i in range(xq.shape[0]):
        t0 = time.time()
        idxs = cp_table.find_k_nearest_neighbors(xq[i, :], 100)
        t1 = time.time()
        delta_t += t1 - t0
        I.append(idxs)
    I = np.array(I)
    print("nprobe=%4d %.3f s recalls=" % (nprobe, delta_t / float(nq)), end=' ')
    for rank in (1, 10, 100):
        n_ok = (I[:, :rank] == gt[:, :1]).sum()  # recall @ 1, 10, 100
        print("%.4f" % (n_ok / float(nq)), end=' ')
    print()

I didn't take much time to tune the number of hash functions k or the number of hash tables l, but the results look OK.

So for hundreds of millions to billions of vectors (very large scale), Faiss is a good option to choose. For small-scale datasets, FALCONN and nmslib are good choices. Did I get the point?

jegou (Member) commented Mar 9, 2017

@willard-yuan: thank you for the details of your script. This sounds OK.

Regarding your comment: yes, nmslib/FALCONN are good candidates for the setup you experimented with (small dataset, no memory constraint).

For small datasets, Faiss remains the choice if

  1. you have memory constraints, or
  2. you want to do exact search on the CPU or GPU (or exact k-means or knn-graph construction), since we believe we have the fastest published implementation.
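[Editor's note] What an exact (flat) search computes can be sketched in plain NumPy. This is a brute-force L2 search, not Faiss's optimized implementation; the names `xb`/`xq` follow the bench script's convention and the sizes are illustrative.

```python
import numpy as np

def exact_knn_l2(xb, xq, k):
    """Brute-force exact k-NN under squared L2 distance.
    Returns (distances, indices), each of shape (nq, k)."""
    # ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2; the ||q||^2 term is
    # constant per query and does not affect the ranking, so drop it.
    dots = xq @ xb.T                              # (nq, nb)
    dist = (xb ** 2).sum(1)[None, :] - 2 * dots   # (nq, nb)
    idx = np.argsort(dist, axis=1)[:, :k]
    d = np.take_along_axis(dist, idx, axis=1)
    return d, idx

# toy usage
rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 32)).astype('float32')
xq = rng.standard_normal((5, 32)).astype('float32')
D, I = exact_knn_l2(xb, xq, 10)
```

The decomposition into one matrix multiplication plus a norm term is the standard way to vectorize this; an optimized library additionally blocks the computation and uses a partial sort instead of a full argsort.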
mrgloom commented Mar 14, 2017

Here is an approximate kNN benchmark; it would be great if you contributed your results:
https://github.com/erikbern/ann-benchmarks

jegou (Member) commented Mar 14, 2017

@mrgloom
We actually included a comparison with the best-performing library (namely nmslib) in the wiki, see
https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors

However, I must say that this benchmark is very limited in my opinion, since it only considers million-sized vector sets. This is in deep contrast with many state-of-the-art approaches in the literature, which have been routinely evaluated on billion-sized benchmarks since 2010.

For instance, the SIFT1M benchmark comes from this page:
http://corpus-texmex.irisa.fr/
but at the same URL there is another dataset, SIFT1B, that would be much more interesting for ANN methods that can actually scale. This is the typical size for which Faiss has been built.

Other things that could be improved in this benchmark:

  • it does not consider GPUs,
  • it does not consider batch queries (quite important in practice),
  • it does not include memory usage. For instance, nmslib takes 2.5x more memory on the SIFT1M benchmark than Faiss, see our wiki.

Clearly such an experimental protocol is not what interests us, and not the setup that should make you adopt Faiss over nmslib (except if the memory requirement of nmslib is considered problematic).
Apart from that, this is a really useful benchmark for ANN methods at that scale. I have linked it from the wiki (in the section "Indexing 1M vectors").

mrgloom commented Mar 15, 2017

By the way, what PCA method/library did you use in your experiment with 4096-D image descriptors?
It seems that is about 1M * 4096 * 4 ≈ 16 GB; did you use some online or out-of-core PCA,
like https://www.reddit.com/r/MachineLearning/comments/2zresl/how_to_pca_large_data_sets_im_running_out_of/ ?
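[Editor's note] One common out-of-core approach, since PCA only needs the d x d covariance matrix, is to accumulate the mean and second-moment statistics over chunks that fit in memory and then eigendecompose the small covariance. The sketch below is a generic illustration in NumPy, not the method used in the experiment; the function name `streaming_pca` and all sizes are illustrative.

```python
import numpy as np

def streaming_pca(chunks, d, n_components):
    """Out-of-core PCA: accumulate the mean and second moments over
    chunks of rows, then eigendecompose the d x d covariance matrix.
    Returns (mean, components) with components of shape (d, n_components)."""
    n = 0
    s = np.zeros(d)        # running sum of vectors
    c = np.zeros((d, d))   # running sum of outer products
    for x in chunks:       # x: (chunk_size, d), one chunk at a time
        n += x.shape[0]
        s += x.sum(axis=0)
        c += x.T @ x
    mean = s / n
    cov = c / n - np.outer(mean, mean)   # biased covariance estimate
    w, v = np.linalg.eigh(cov)           # eigenvalues in ascending order
    return mean, v[:, ::-1][:, :n_components]  # top components first

# toy usage: 10 chunks of 100 vectors in 64-D, projected to 8-D
rng = np.random.default_rng(0)
chunks = [rng.standard_normal((100, 64)) for _ in range(10)]
mean, comps = streaming_pca(iter(chunks), 64, 8)
proj = (np.vstack(chunks) - mean) @ comps   # (1000, 8)
```

This only works when d is small enough that a d x d matrix fits in memory (4096 x 4096 float64 is about 134 MB, so it would apply here); for larger d, randomized or incremental SVD methods are the usual alternatives.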

reshu-b7 commented Jul 27, 2018

Hi @mdouze @YontiLevin @mrgloom

Is there a script for benchmarking Deep1B or SIFT1B in C++ in this repository? It would be great if I could get some reference, as I am having some difficulties refactoring the SIFT1M C++ benchmark for 10M or more vectors.

Thanks.
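[Editor's note] At SIFT1B scale the first hurdle is usually just reading the data without loading it all into RAM. The texmex datasets use simple record formats (.fvecs: an int32 dimension d followed by d float32 components per vector; .bvecs, used by SIFT1B, is the same with uint8 components), so they can be memory-mapped. The readers below are a Python sketch of that idea; they assume well-formed files where every record has the same dimension.

```python
import numpy as np

def mmap_bvecs(fname):
    """Memory-map a .bvecs file (each record: little-endian int32 d,
    then d uint8 components). Returns a (n, d) uint8 view."""
    x = np.memmap(fname, dtype='uint8', mode='r')
    d = x[:4].view('int32')[0]
    return x.reshape(-1, d + 4)[:, 4:]

def mmap_fvecs(fname):
    """Memory-map a .fvecs file (each record: little-endian int32 d,
    then d float32 components). Returns a (n, d) float32 view."""
    x = np.memmap(fname, dtype='int32', mode='r')
    d = x[0]
    return x.view('float32').reshape(-1, d + 1)[:, 1:]
```

Because the result is a memmap view, slices like `xb[i0:i1]` read only the touched pages from disk, which makes it practical to feed a billion-vector file to an index in batches.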
