# Lower memory footprint

https://github.com/facebookresearch/faiss/wiki/Lower-memory-footprint

# Using too much memory? Compressing the storage with IndexIVFPQ

```python
import numpy as np

d = 64                           # dimension
nb = 100000                      # database size
nq = 10000                       # nb of queries
np.random.seed(1234)             # make reproducible
xb = np.random.random((nb, d)).astype('float32')
xb[:, 0] += np.arange(nb) / 1000.
xq = np.random.random((nq, d)).astype('float32')
xq[:, 0] += np.arange(nq) / 1000.
print('xb', xb.shape)
# print('xb', xb[:1])
print('xq', xq.shape)
# print('xq', xq[:1])
```

```text
xb (100000, 64)
xq (10000, 64)
```


`IndexFlatL2` and `IndexIVFFlat` both store the full vectors. To scale to very large datasets, Faiss provides variants that store the vectors with a lossy compression based on product quantizers.

The vectors are still stored in `Voronoi cells`, but their size is reduced to a configurable number of bytes `m` (`d` must be a multiple of `m`).

The compression is based on a `Product Quantizer`, which can be seen as an additional level of quantization applied to sub-vectors of the vectors to encode.

Because the vectors are not stored exactly, the distances returned by the search method are also approximations.

> `IVF` refers to the `Voronoi cells` (the inverted file), `PQ` to the compression.
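To make the idea of quantizing sub-vectors concrete, here is a minimal NumPy sketch of a product quantizer. It is an illustration only, not Faiss's implementation: the helper names (`nearest`, `train_pq`, `encode`, `decode`), the plain k-means loop, and the toy data size are all assumptions made for this example, and the sketch ignores the inverted file entirely.

```python
import numpy as np

def nearest(sub, centroids):
    """Index of the closest centroid (squared L2) for every row of `sub`."""
    d2 = ((sub ** 2).sum(axis=1, keepdims=True)
          - 2.0 * sub @ centroids.T
          + (centroids ** 2).sum(axis=1))
    return d2.argmin(axis=1)

def train_pq(x, m, ksub=256, n_iter=5, seed=0):
    """Learn one codebook of `ksub` centroids per sub-space with plain k-means."""
    n, d = x.shape
    assert d % m == 0, "d must be a multiple of m"
    dsub = d // m
    rng = np.random.default_rng(seed)
    codebooks = np.empty((m, ksub, dsub), dtype=np.float32)
    for j in range(m):
        sub = x[:, j * dsub:(j + 1) * dsub]
        centroids = sub[rng.choice(n, size=ksub, replace=False)].copy()
        for _ in range(n_iter):
            assign = nearest(sub, centroids)
            for c in range(ksub):
                members = sub[assign == c]
                if len(members) > 0:      # an empty cluster keeps its old centroid
                    centroids[c] = members.mean(axis=0)
        codebooks[j] = centroids
    return codebooks

def encode(x, codebooks):
    """Replace each sub-vector by the 1-byte index of its nearest centroid."""
    m, _, dsub = codebooks.shape
    codes = np.empty((x.shape[0], m), dtype=np.uint8)
    for j in range(m):
        sub = x[:, j * dsub:(j + 1) * dsub]
        codes[:, j] = nearest(sub, codebooks[j]).astype(np.uint8)
    return codes

def decode(codes, codebooks):
    """Approximate reconstruction: concatenate the selected centroids."""
    m = codebooks.shape[0]
    return np.hstack([codebooks[j][codes[:, j]] for j in range(m)])

rng = np.random.default_rng(1234)
x = rng.random((2000, 64), dtype=np.float32)   # toy data, much smaller than xb
codebooks = train_pq(x, m=8)
codes = encode(x, codebooks)                   # 8 bytes per vector
recon = decode(codes, codebooks)
mse = float(((x - recon) ** 2).sum(axis=1).mean())
print(codes.shape, codes.dtype, x.nbytes // codes.nbytes, mse)
```

Each 64-dimensional vector ends up as 8 one-byte codes, and the reconstruction error is strictly positive, which is why distances computed from the codes can only be approximate.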


```python
import faiss

nlist = 100                       # number of Voronoi cells
m = 8                             # number of subquantizers
k = 4                             # number of nearest neighbors to return

quantizer = faiss.IndexFlatL2(d)  # this remains the same
# The last argument (8) specifies that each sub-vector is encoded as 8 bits.
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)
index.train(xb)
index.add(xb)
D, I = index.search(xb[:5], k)    # sanity check
print(I)
print(D)
```

```text
[[   0   78  608  159]
 [   1 1063  555  380]
 [   2  304  134   46]
 [   3   64  773  265]
 [   4  288  827  531]]
[[1.6157436 6.1152253 6.4348025 6.564184 ]
 [1.389575  5.6771317 5.9956017 6.486294 ]
 [1.7025063 6.121688  6.189084  6.489888 ]
 [1.8057687 6.5440307 6.6684756 6.859398 ]
 [1.4920276 5.79976   6.190908  6.3791513]]
```


```python
index.nprobe = 10              # make comparable with experiment above
D, I = index.search(xq, k)     # search
print(I[:5])
```

```text
[[ 399  210  329 1619]
 [1193   39  911  187]
 [1267  197  527  425]
 [ 184  599  466  359]
 [ 828  377  120  416]]
```


# Results

When searching with vectors from the training set, the results are:
```text
[[   0   78  608  159]
 [   1 1063  555  380]
 [   2  304  134   46]
 [   3   64  773  265]
 [   4  288  827  531]]
```
The nearest neighbor is found correctly: each vector's closest match is itself, as the IDs in the first column show.
```text
[[1.6157436 6.1152253 6.4348025 6.564184 ]
 [1.389575  5.6771317 5.9956017 6.486294 ]
 [1.7025063 6.121688  6.189084  6.489888 ]
 [1.8057687 6.5440307 6.6684756 6.859398 ]
 [1.4920276 5.79976   6.190908  6.3791513]]
```
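However, the distance from a vector to itself is not 0, although it is clearly smaller than the distance to its other neighbors (compare the first column with the remaining ones). This is caused by the lossy compression.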

In this example, 64 32-bit floats (32 / 8 × 64 = 256 bytes) are compressed to 8 bytes per vector, so the compression factor is 32.
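The arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are chosen for this example, and the totals ignore everything an index stores besides the vector payload (such as vector IDs and the codebooks).

```python
d = 64            # vector dimension
m = 8             # number of sub-quantizers
nbits = 8         # bits per sub-vector code
nb = 100000       # database size

raw_bytes = d * 4               # float32 takes 4 bytes per component
code_bytes = m * nbits // 8     # one nbits-wide code per sub-vector
ratio = raw_bytes // code_bytes

print(raw_bytes, code_bytes, ratio)                    # 256 8 32
print(nb * raw_bytes / 1e6, nb * code_bytes / 1e6)     # 25.6 0.8 (megabytes)
```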

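When searching with the query vectors, the results look as follows. Note that these are the neighbors of the last five queries (`I[-5:]`), whereas the cell above prints `I[:5]`: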
```text
[[ 9432  9649  9900 10287]
 [10229 10403  9829  9740]
 [10847 10824  9787 10089]
 [11268 10935 10260 10571]
 [ 9582 10304  9616  9850]]
```
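They can be compared with the `IndexIVFFlat` results from the previous part of the tutorial. In this case most results are wrong, but they are in the correct area of the space, as shown by the IDs around 10000. The situation is better for real data because:

- Uniform data is very difficult to index, because it has no regularity that can be exploited to cluster the vectors or reduce their dimensionality.
- For natural data, the semantic nearest neighbor is often significantly closer than unrelated results.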
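One way to put a number on "most results are wrong" is to measure recall against an exact search. The helper below is a hypothetical sketch (the name `recall_at_1` and the toy ID arrays are made up for illustration): it reports the fraction of queries whose true nearest neighbor appears anywhere in the approximate top-k list.

```python
import numpy as np

def recall_at_1(I_approx, I_exact):
    """Fraction of queries whose true nearest neighbor is in the approximate top-k."""
    true_nn = I_exact[:, :1]                     # shape (nq, 1), broadcast against (nq, k)
    return float((I_approx == true_nn).any(axis=1).mean())

# Toy ID arrays standing in for real search results (hypothetical values).
I_exact = np.array([[10, 11, 12], [20, 21, 22], [30, 31, 32], [40, 41, 42]])
I_approx = np.array([[10, 99, 98], [97, 20, 96], [95, 94, 93], [41, 42, 92]])
print(recall_at_1(I_approx, I_exact))            # 0.5
```

With Faiss, `I_exact` would typically come from an exact `faiss.IndexFlatL2(d)` built over the same `xb` and searched with the same `xq` and `k`, while `I_approx` is the output of the `IndexIVFPQ` search.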

## Simplifying index construction
Because building indexes can become complicated, Faiss provides a factory function that constructs them from a string. The index above can be obtained with the following shorthand:
```python
index = faiss.index_factory(d, "IVF100,PQ8")
```
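Replacing `PQ8` with `Flat` gives an `IndexIVFFlat` instead. The factory is particularly useful when preprocessing (such as PCA) is applied to the input vectors. For example, to reduce the input vectors to 32 dimensions with PCA before indexing them, the factory string is `"PCA32,IVF100,Flat"`.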

# Further reading

Explore the next sections to get more specific information about the types of indexes, GPU faiss, coding structure, etc.