# IVFADC

IVFADC is ideal when our main priority is to minimize memory usage while maintaining fast search speeds. This comes at the cost of recall that is acceptable but not high.

There are two steps to indexing with IVFADC:
- Vectors are assigned to different lists (or Voronoi cells) in the IVF structure.
- The vectors are compressed using PQ.

After indexing vectors, an Asymmetric Distance Computation (ADC) is performed between query vectors xq and our indexed, quantized vectors.

With symmetric distance computation (SDC) we quantize xq before comparing it to our previously quantized xb vectors (named wb in the code below). ADC skips the quantization of xq and compares it directly to the quantized xb vectors.
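
To make the difference concrete, here is a minimal NumPy sketch of SDC and ADC on a toy product quantizer. It is an illustration under stated assumptions, not how Faiss implements it: the codebooks are random rather than trained with k-means, and the names `codebooks`, `encode`, `decode`, `d_toy` and `m_toy` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_toy, m_toy, ksub = 8, 4, 16      # toy sizes: 8-d vectors, 4 subvectors, 16 centroids each
dsub = d_toy // m_toy

# Illustrative codebooks: random here, whereas a real PQ learns them with k-means
codebooks = rng.normal(size=(m_toy, ksub, dsub)).astype('float32')

def encode(x):
    """Replace each subvector with the ID of its nearest centroid."""
    subs = x.reshape(m_toy, dsub)
    dists = ((codebooks - subs[:, None, :]) ** 2).sum(axis=2)   # (m_toy, ksub)
    return dists.argmin(axis=1)

def decode(codes):
    """Rebuild an approximate vector from centroid IDs."""
    return codebooks[np.arange(m_toy), codes].reshape(-1)

xb_vec = rng.normal(size=d_toy).astype('float32')
xq_vec = rng.normal(size=d_toy).astype('float32')
xb_codes = encode(xb_vec)

# SDC: both sides are quantized before the comparison
sdc = ((decode(encode(xq_vec)) - decode(xb_codes)) ** 2).sum()

# ADC: the query stays in full precision. Build a (m_toy, ksub) lookup table of
# squared distances from each query subvector to every centroid, then sum the
# entries selected by the stored codes.
table = ((codebooks - xq_vec.reshape(m_toy, dsub)[:, None, :]) ** 2).sum(axis=2)
adc = table[np.arange(m_toy), xb_codes].sum()

exact = ((xq_vec - xb_vec) ** 2).sum()
print(f"exact={exact:.3f}  ADC={adc:.3f}  SDC={sdc:.3f}")
```

The ADC value equals the squared distance between the unquantized query and the reconstructed database vector, so only one side of the comparison carries quantization error.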

In [2]:
import numpy as np

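# Read the .fvecs file format used by the Sift1M dataset.
# Each record is an int32 dimension d followed by d float32 values.
def read_fvecs(fp):
    a = np.fromfile(fp, dtype='int32')
    d = a[0]
    # drop the leading dimension column, then reinterpret the bytes as float32
    return a.reshape(-1, d + 1)[:, 1:].copy().view('float32')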

wb = read_fvecs('./sift/sift_base.fvecs')  # 1M samples
# also get some query vectors to search with
xq = read_fvecs('./sift/sift_query.fvecs')
xq = xq[0].reshape(1, xq.shape[1])

wb.shape, xq.shape

((1000000, 128), (1, 128))

In [3]:
wb = wb[:500000]
d = wb.shape[1]
wb.shape, d

((500000, 128), 128)

In [20]:
# build an IVFFlat index to use as the baseline for comparison
import faiss

nlist = 256
quantizer = faiss.IndexFlatL2(d)
index_base = faiss.IndexIVFFlat(quantizer, d, nlist)
index_base.train(wb)
index_base.add(wb)

In [21]:
# Set higher nprobe for baseline to get more comprehensive results
index_base.nprobe = 16
D_base, I_base = index_base.search(xq, k = 100)

In [34]:
%%timeit
index_base.search(xq, k=100)

631 μs ± 52.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [8]:
def compare_results(I1, I2):
    """Print how many IDs in the first row of I1 also appear in the first row of I2."""
    n_correct = 0
    for i in range(I1.shape[1]):
        if I1[0, i] in I2[0, :]:
            n_correct += 1
    print(f'Number of correct matches: {n_correct} out of {I1.shape[1]}')

In [49]:
import faiss
index = faiss.index_factory(d, "IVF256,PQ32x8")
index.train(wb)
index.add(wb)


In [70]:
D, I = index.search(xq, k = 100)

With this, we create an IVFADC index with 256 IVF cells; each vector is compressed with PQ using m and nbits values of 32 and 8, respectively. PQ uses nbits == 8 by default so we can also write "IVF256,PQ32".

m: number of subvectors that original vectors are split into

nbits: number of bits used by each subquantizer. Each subquantizer therefore has 2**nbits centroids.

We can decrease nbits to reduce index memory usage, or increase it to improve recall and search speed. However, the current version of Faiss does restrict nbits to >= 8 for IVF,PQ.
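
As a quick sanity check on what m and nbits mean for storage, the arithmetic for "PQ32x8" on the 128-dimensional vectors used here can be sketched as follows. The variable names are illustrative, and the figures cover only the per-vector code, not the codebooks or the vector IDs.

```python
d_vec = 128        # SIFT dimensionality used in this notebook
pq_m = 32          # number of subvectors
pq_nbits = 8       # bits per subquantizer code

centroids_per_subquantizer = 2 ** pq_nbits   # 256
subvector_dim = d_vec // pq_m                # 4 dimensions per subvector
code_bytes = pq_m * pq_nbits // 8            # bytes per compressed vector
raw_bytes = d_vec * 4                        # bytes per float32 vector

print(f"{centroids_per_subquantizer} centroids per subquantizer")
print(f"{raw_bytes} bytes -> {code_bytes} bytes ({raw_bytes // code_bytes}x smaller)")
```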

In [None]:
compare_results(I_base, I)

Number of correct matches: 21 out of 100


In [77]:
index.nprobe 

1

Let's increase index.nprobe and compare again.

In [78]:


print(f"Current nprobe: {index.nprobe}")
D, I = index.search(xq, k = 100)
compare_results(I_base, I)

# Set nprobe to 16 and search again
index.nprobe = 16
print(f"New nprobe: {index.nprobe}")

D_nprobe16, I_nprobe16 = index.search(xq, k=100)
compare_results(I_base, I_nprobe16)

Current nprobe: 1
Number of correct matches: 21 out of 100
New nprobe: 16
Number of correct matches: 75 out of 100


In [79]:
# Set nprobe to 32 and search again
index.nprobe = 32
print(f"New nprobe: {index.nprobe}")

D_nprobe32, I_nprobe32 = index.search(xq, k=100)
compare_results(I_base, I_nprobe32)

New nprobe: 32
Number of correct matches: 75 out of 100


IVFADC and other indexes using PQ can benefit from Optimized Product Quantization (OPQ).

OPQ works by rotating vectors to flatten the distribution of values across the subvectors used in PQ. This is particularly beneficial for unbalanced vectors with uneven data distributions.

OPQ in Faiss only applies a rotation. It does not compress the vectors, so it must still be followed by PQ.
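
The effect of a rotation can be illustrated with plain NumPy. This sketch uses a random orthogonal matrix as a stand-in (OPQ learns its rotation from the training data instead), and `subvector_variance` is a helper invented for this example. It shows that a rotation changes how variance is shared between subvectors while leaving L2 distances untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data whose variance is concentrated in the first two of eight dimensions
scales = np.array([10, 10, 1, 1, 1, 1, 1, 1], dtype='float64')
X = rng.normal(size=(1000, 8)) * scales

# A random orthogonal matrix from a QR decomposition (assumption: stand-in for
# the rotation that OPQ would learn)
R, _ = np.linalg.qr(rng.normal(size=(8, 8)))
X_rot = X @ R

def subvector_variance(data, m=4):
    """Total variance carried by each of the m equal-width subvectors."""
    return data.var(axis=0).reshape(m, -1).sum(axis=1)

print("before:", subvector_variance(X).round(1))
print("after: ", subvector_variance(X_rot).round(1))

# Rotation leaves pairwise L2 distances unchanged
dist_before = np.linalg.norm(X[0] - X[1])
dist_after = np.linalg.norm(X_rot[0] - X_rot[1])
print(dist_before, dist_after)
```

Before the rotation, nearly all of the variance sits in the first subvector, so one subquantizer does most of the work. A rotation redistributes it, which is the imbalance OPQ is designed to reduce.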



In [83]:
index = faiss.index_factory(d, "OPQ32,IVF256,PQ32x8")
index.train(wb)
index.add(wb)


In [85]:
D_1, I_1 = index.search(xq, k=100)
compare_results(I_base, I_1)

Number of correct matches: 21 out of 100


We can increase our nprobe value to improve recall (at the cost of speed). However, because we added a pre-processing step to our index, we cannot access nprobe directly with index.nprobe as this index no longer refers to the IVF portion of our index.

In [86]:
# we must extract the IVF index before modifying the nprobe value 
ivf = faiss.extract_index_ivf(index)
ivf.nprobe = 16
D_16, I_16 = index.search(xq, k=100)
compare_results(I_base, I_16)

Number of correct matches: 79 out of 100


# MULTI-D-ADC
The multi-D-ADC index is based on the inverted multi-index (IMI), an extension of IVF, combined with PQ, which gives us ADC at search time.

IMI can outperform IVF in both recall and search speed but does increase memory usage.

The IMI index works in a very similar way to IVF, but its Voronoi cells are split across multiple vector subspaces.

Given a query vector xq, we would compare each xq subvector to its respective subspace cells.
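
The subspace assignment can be sketched in NumPy as below. This is a toy illustration, assuming random centroids in place of trained ones; `sub_centroids` and `imi_cell` are hypothetical names, and the sizes mirror the "IMI2x8" factory string (2 subspaces with 2**8 centroids each).

```python
import numpy as np

rng = np.random.default_rng(0)
d_toy, n_sub, nbits_imi = 8, 2, 8     # "IMI2x8": 2 subspaces, 2**8 centroids in each
k_sub = 2 ** nbits_imi
sub_dim = d_toy // n_sub

# Illustrative random centroids per subspace (a real IMI trains these with k-means)
sub_centroids = rng.normal(size=(n_sub, k_sub, sub_dim))

def imi_cell(x):
    """Return the multi-index cell of x: one nearest centroid ID per subspace."""
    subs = x.reshape(n_sub, sub_dim)
    dists = ((sub_centroids - subs[:, None, :]) ** 2).sum(axis=2)   # (n_sub, k_sub)
    return tuple(int(i) for i in dists.argmin(axis=1))

query = rng.normal(size=d_toy)
total_cells = k_sub ** n_sub
print("cell:", imi_cell(query))
print("total cells:", total_cells)
```

Because the cells are the Cartesian product of the per-subspace centroids, two small codebooks of 256 centroids define 65536 cells. With so many small cells, a single probed cell covers very little of the data, which is consistent with the low match counts at nprobe = 1 in the runs that follow.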




In [87]:
index = faiss.index_factory(d, "IMI2x8,PQ32")
index.train(wb)  # index construction time is large for IMI
index.add(wb)

In [88]:
D_1, I_1 = index.search(xq, k=100)
compare_results(I_base, I_1)

Number of correct matches: 1 out of 100


In [89]:
imi = faiss.extract_index_ivf(index)
imi.nprobe = 16
D_16, I_16 = index.search(xq, k=100)
compare_results(I_base, I_16)

Number of correct matches: 25 out of 100


In [90]:
imi.nprobe = 128
D_128, I_128 = index.search(xq, k=100)
compare_results(I_base, I_128)

Number of correct matches: 65 out of 100


In [91]:
imi.nprobe = 200
D_200, I_200 = index.search(xq, k=100)
compare_results(I_base, I_200)

Number of correct matches: 74 out of 100


In [92]:
# Note: search times were not measured for the settings above that reach more than 70 matches
# "OPQ32,IMI2x8,PQ32" is one of our best indexes in terms of recall and speed at low memory.

# HNSW Indexes
This index splits our indexed vectors into cells as per usual with IVF, but this time we will optimize the process using HNSW.

Compared to our previous two indexes, IVF with HNSW produces comparable or better speed and significantly higher recall, at the cost of much higher memory usage.

HNSW graphs break the typical graph containing both long-range and short-range links into multiple layers (hierarchies). During the search, we begin at the highest layer, which consists of long-range links. As we move down through each layer, the links become more granular.

#### But this is HNSW, how do we integrate it with IVF?
Using `IVF`, we take our query vector `xq` and compare it to every cell centroid (an exhaustive search). We then pick the top `nprobe` closest centroids (cells) and compare `xq` against the vectors stored in those `nprobe` cells (or against their compressed codes if we use `PQ`).
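
The two-stage search just described can be sketched with NumPy. This is a simplified illustration rather than the Faiss implementation: the centroids are sampled data points instead of k-means centroids, and `sq_dists`, `assignments` and `ivf_search` are names invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vecs, dim, n_cells = 2000, 16, 32

data = rng.normal(size=(n_vecs, dim)).astype('float32')
# Illustrative centroids: sampled data points rather than trained k-means centroids
centroids = data[rng.choice(n_vecs, size=n_cells, replace=False)]

def sq_dists(a, b):
    """Squared L2 distance from vector a to every row of b."""
    return ((b - a) ** 2).sum(axis=1)

# Indexing: each vector goes into the inverted list of its nearest centroid
assignments = np.array([sq_dists(v, centroids).argmin() for v in data])

def ivf_search(q, k, nprobe):
    # 1. Exhaustive comparison against all centroids
    cells = np.argsort(sq_dists(q, centroids))[:nprobe]
    # 2. Scan only the vectors stored in the nprobe closest cells
    candidates = np.flatnonzero(np.isin(assignments, cells))
    order = np.argsort(sq_dists(q, data[candidates]))[:k]
    return candidates[order]

q = rng.normal(size=dim).astype('float32')
exact_ids = np.argsort(sq_dists(q, data))[:10]
for nprobe in (1, 8, n_cells):
    found = ivf_search(q, k=10, nprobe=nprobe)
    print(f"nprobe={nprobe}: {len(np.intersect1d(exact_ids, found))} of 10 exact neighbours found")
```

Probing every cell reduces to a brute-force search, so the last line of output recovers all exact neighbours. Smaller `nprobe` values scan fewer vectors and may miss some.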

To pair this process with `HNSW`, we build an HNSW graph over the cell centroids rather than the vectors themselves, which turns the exhaustive centroid search into an approximate nearest-neighbour search.

Previously we have been using `IVF256`, which means 256 centroids. An exhaustive search of 256 centroids with `xq` is already fast, so there is no reason to use an approximate search with so few centroids.

Let's say we have 1M (`1000000`) vectors and `256` centroids, so each cell holds `~4000` vectors and the centroid comparison is a cheap exhaustive search. In this case, `IVF+HNSW` on the cell centroids does not help.

With `IVF+HNSW` indexes, we need to swap `few centroids and large cells` for `many centroids and small cells`.
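
The trade-off is easy to see with some rough arithmetic. The sketch below assumes vectors are spread evenly over the cells, which real data only approximates; `avg_cell_size` is an illustrative name.

```python
n_vectors = 1_000_000

# Average cell size for a few-centroid and a many-centroid configuration,
# assuming vectors are spread evenly across cells
avg_cell_size = {n_centroids: n_vectors // n_centroids for n_centroids in (256, 4096)}

for n_centroids, size in avg_cell_size.items():
    print(f"{n_centroids} centroids -> ~{size} vectors per cell, "
          f"{n_centroids} centroid comparisons for an exhaustive coarse search")
```

With more centroids, the coarse search becomes the larger share of the work and each cell scan becomes cheaper, which is where an approximate centroid search starts to pay off.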

So, the standard IVF+HNSW index can be built with "IVF4096_HNSW32,Flat":
- 4096 IVF cells.
- Cell centroids are stored in an HNSW graph. Each centroid is linked to 32 other centroids.
- The vectors themselves have not been changed. They are Flat vectors.

In [23]:
import faiss
index = faiss.index_factory(d, "IVF4096_HNSW32,Flat")
index.train(wb)
index.add(wb)

In [24]:
D_1, I_1 = index.search(xq, k=100)

In [28]:
%%timeit
index.search(xq, k=100)

14.9 μs ± 385 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [29]:
compare_results(I_base, I_1)

Number of correct matches: 26 out of 100


In [30]:
index.nprobe

1

In [31]:
index.nprobe = 16
D_16, I_16 = index.search(xq, k=100)
compare_results(I_base, I_16)

Number of correct matches: 68 out of 100


In [32]:
index.nprobe = 128
D_128, I_128 = index.search(xq, k=100)
compare_results(I_base, I_128)

Number of correct matches: 99 out of 100


In [33]:
%%timeit
index.search(xq, k=100)

298 μs ± 13.8 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


However, the IVF+HNSW index is not without its flaws. Although we have very high recall and fast search speeds, the memory usage of this index is large because it stores the full Flat vectors.

In [35]:
import os
def get_memory_size(index):
    """Approximate the index size by serializing it to disk and reading the file size."""
    faiss.write_index(index, "temp.index")
    file_size = os.path.getsize("temp.index")
    os.remove("temp.index")
    return file_size

index_size, index_base_size = get_memory_size(index), get_memory_size(index_base)
print(f"Index size: {index_size} bytes")   
print(f"Index_base size: {index_base_size} bytes")

Index size: 263241920 bytes
Index_base size: 260133259 bytes


We can reduce this using PQ and OPQ, but this will reduce recall and increase search times.
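
A back-of-the-envelope estimate shows where the bytes go and how much PQ could save. This is a rough sketch: it counts only the stored vectors (or codes) plus one 64-bit ID per vector, and ignores centroids, the HNSW graph, codebooks and file headers. The PQ figure assumes m = 32 and nbits = 8.

```python
n_vectors, dim = 500_000, 128

flat_bytes = n_vectors * dim * 4      # float32 vectors stored as-is
id_bytes = n_vectors * 8              # one 64-bit ID per vector in the inverted lists
pq32_bytes = n_vectors * 32           # 32 one-byte PQ codes per vector (m=32, nbits=8)

flat_total = flat_bytes + id_bytes
pq_total = pq32_bytes + id_bytes

print(f"Flat vectors + IDs: {flat_total:,} bytes")
print(f"PQ32 codes + IDs:   {pq_total:,} bytes")
```

The Flat estimate accounts for most of both sizes measured above, which suggests the stored vectors dominate the footprint and the coarse quantizer adds comparatively little.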

#### Remember nothing comes free, everything comes with a tradeoff