In [9]:
import numpy as np
import faiss

# Step 1: Generate some random data (e.g., 100 vectors of dimension 128)
d = 128  # Dimension of the vectors
nb = 100  # Number of vectors in the database
nq = 10   # Number of query vectors

# Random vectors as database (nb vectors of dimension d)
xb = np.random.random((nb, d)).astype('float32')

# Random vectors as queries (nq vectors of dimension d)
xq = np.random.random((nq, d)).astype('float32')


print(xb.shape)
print(xq.shape)
# Step 2: Create a Faiss index
# IndexFlatL2 is an exact (brute-force) index that ranks neighbors by squared L2 distance
index = faiss.IndexFlatL2(d)

# Step 3: Add the vectors to the index
index.add(xb)  # Add the database vectors to the index

# Step 4: Perform a search for the nearest neighbors
k = 5  # Number of nearest neighbors to find
D, I = index.search(xq, k)  # D holds the squared L2 distances, I holds the matching row indices into xb

# Step 5: Display the results
print("Distances (D):\n", D)
print("Indices (I):\n", I)

(100, 128)
(10, 128)
Distances (D):
 [[16.256353 17.526577 17.641703 18.028427 18.198862]
 [14.221191 15.588301 15.953667 16.61119  16.65957 ]
 [17.010647 17.553207 17.598967 17.89856  17.982792]
 [17.655169 17.753386 17.866066 17.890345 18.051277]
 [15.694202 16.456861 17.053413 17.112684 17.229088]
 [15.802317 16.401707 16.924883 17.364285 17.461557]
 [15.935392 16.116259 17.012123 17.118624 17.210512]
 [15.693578 16.61319  17.206377 18.065845 18.819876]
 [14.957388 15.994317 16.049309 16.131912 16.709272]
 [17.370728 17.972992 18.210238 18.511457 18.730286]]
Indices (I):
 [[76 77 23 53 49]
 [36 43 31 12 64]
 [52 84 39 81 19]
 [78 12 76 57 43]
 [87 96 75  7 51]
 [15 93 29 23 72]
 [39 65 75 10 26]
 [89 35 38 73 20]
 [55 69 37 32 85]
 [57 99 72  6 50]]


In [11]:
xb

array([[-0.69962424,  0.05650522,  0.8492318 , ...,  0.23148045,
         0.18666257,  0.3201271 ],
       [ 0.17743818,  0.3238134 ,  0.26041076, ...,  0.5004343 ,
         0.6291678 ,  0.3896989 ],
       [ 0.3778883 ,  0.55530465,  0.7270944 , ...,  0.05307725,
         0.5341706 ,  0.98758173],
       ...,
       [ 0.53600526,  0.4145273 ,  0.11517183, ...,  0.9445234 ,
         0.50819606,  0.9418767 ],
       [ 0.9208359 ,  0.08851461,  0.8492019 , ...,  0.34470984,
         0.3317671 ,  0.8395345 ],
       [ 0.6159058 ,  0.20452574,  0.84711885, ...,  0.9428098 ,
         0.6144394 ,  0.49276283]], shape=(100, 128), dtype=float32)

These are the results of searching a Faiss index of **100 database vectors** (`xb`) with **10 query vectors** (`xq`), asking Faiss to return the **5 nearest neighbors (k=5)** for each query vector.

---

### Result Breakdown

You got two outputs:

1. **Distances (D)** – these are the squared L2 (Euclidean) distances between each query vector and its nearest neighbors, sorted from smallest to largest within each row.
2. **Indices (I)** – these are the indices (positions) of the nearest neighbors **in your original database (`xb`)**.
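
To make the meaning of the two arrays concrete, here is a minimal NumPy-only sketch of an exhaustive squared-L2 search. It is an illustration of the idea, not Faiss's actual implementation; the helper name `brute_force_l2_search` and the seeded random data are assumptions made for this example.

```python
import numpy as np

def brute_force_l2_search(xb, xq, k):
    """Return (D, I): squared L2 distances and xb row indices of the k nearest rows."""
    # Pairwise differences have shape (nq, nb, d); summing squares over d gives (nq, nb).
    diffs = xq[:, None, :] - xb[None, :, :]
    all_dists = np.sum(diffs ** 2, axis=2)
    # Sort each query's distances ascending and keep the k smallest.
    I = np.argsort(all_dists, axis=1)[:, :k]
    D = np.take_along_axis(all_dists, I, axis=1)
    return D, I

rng = np.random.default_rng(0)                  # seeded only so the sketch is repeatable
xb = rng.random((100, 128), dtype=np.float32)   # database vectors
xq = rng.random((10, 128), dtype=np.float32)    # query vectors

D, I = brute_force_l2_search(xb, xq, k=5)
print(D.shape, I.shape)  # (10, 5) (10, 5)
```

Both arrays have one row per query and one column per neighbor rank, so `D[q, r]` is the distance from `xq[q]` to `xb[I[q, r]]`.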

---

### How to Read It

Each row corresponds to one query vector.

Example:

#### First row:
```text
Distances:
[16.256353 17.526577 17.641703 18.028427 18.198862]
Indices:
[76 77 23 53 49]
```

This means that for query vector `xq[0]`:

- Its closest vector in the database is `xb[76]`, at a squared L2 distance of 16.26
- The second closest is `xb[77]`, at 17.53
- ...and so on, up to the fifth closest, `xb[49]`, at 18.20

Another example:

For query `xq[1]`, the top 5 neighbors are:

- Indices: `[36 43 31 12 64]`
- Distances: `[14.221191 15.588301 15.953667 16.61119 16.65957]`

So `xb[36]` is the closest match to `xq[1]`, and it is a clearly better match (distance 14.22) than the runners-up (15.59 and above).
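
The same row-by-row reading can be done in code. The sketch below hard-codes the first two rows of `D` and `I` from the notebook output above, so it runs without Faiss; `describe_neighbors` is a hypothetical helper written for this illustration.

```python
import numpy as np

# First two rows of D and I, copied from the printed output of the search cell.
D = np.array([[16.256353, 17.526577, 17.641703, 18.028427, 18.198862],
              [14.221191, 15.588301, 15.953667, 16.61119, 16.65957]], dtype=np.float32)
I = np.array([[76, 77, 23, 53, 49],
              [36, 43, 31, 12, 64]])

def describe_neighbors(D, I):
    """Turn the (D, I) arrays into one readable line per (query, rank) pair."""
    lines = []
    for q, (dists, ids) in enumerate(zip(D, I)):
        for rank, (dist, idx) in enumerate(zip(dists, ids), start=1):
            lines.append(f"xq[{q}] rank {rank}: xb[{idx}] (squared L2 = {dist:.2f})")
    return lines

lines = describe_neighbors(D, I)
for line in lines:
    print(line)
```

With the full arrays returned by `index.search`, the same loop would cover all 10 queries.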

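---

### What Does the Distance Mean?

`IndexFlatL2` reports the squared L2 distance, i.e., the sum of squared differences across all 128 dimensions. Take the square root if you need the true Euclidean distance; the ranking of neighbors is the same either way.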

Lower values = more similar

Higher values = less similar
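
As a small worked example of the sum of squared differences, the sketch below computes it by hand for two toy 3-dimensional vectors (made up for illustration) and then takes the square root of the first row of `D` from the output above to get ordinary Euclidean distances.

```python
import numpy as np

# Two toy vectors, chosen only to keep the arithmetic easy to follow.
a = np.array([1.0, 2.0, 3.0], dtype=np.float32)
b = np.array([2.0, 0.0, 3.0], dtype=np.float32)

squared_l2 = float(np.sum((a - b) ** 2))   # (-1)^2 + 2^2 + 0^2 = 5
euclidean = float(np.sqrt(squared_l2))     # square root of the value above

# First row of D from the notebook output (squared L2 distances).
first_row = np.array([16.256353, 17.526577, 17.641703, 18.028427, 18.198862])
first_row_euclidean = np.sqrt(first_row)   # same ordering, smaller numbers

print(squared_l2, euclidean)
print(first_row_euclidean)
```

Because the square root is monotonic, converting does not change which neighbor ranks first.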

In your case, the distances range from ~14.2 to ~18.8, so:

- Anything close to 14 is a comparatively close match
- Anything near 19 is a weaker match (still in the top 5, but not as close)
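
For a sense of scale, it can help to know what distance two unrelated random vectors would have. Assuming every coordinate is drawn independently and uniformly from [0, 1), as `np.random.random` does, the expected squared difference per coordinate is 1/6, so the expected squared L2 distance in 128 dimensions is 128/6, about 21.3. The sketch below checks that figure by simulation; the sample size and seed are arbitrary choices.

```python
import numpy as np

d = 128
n_pairs = 5000                      # arbitrary sample size for the estimate
rng = np.random.default_rng(0)      # seeded only for repeatability

u = rng.random((n_pairs, d))
v = rng.random((n_pairs, d))
sq_dists = np.sum((u - v) ** 2, axis=1)   # squared L2 distance of each random pair

expected = d / 6                    # theoretical mean for independent uniform coordinates
print(f"theoretical mean: {expected:.2f}")
print(f"simulated mean:   {sq_dists.mean():.2f}")
```

Under that assumption, top-5 distances between roughly 14 and 19 sit below the typical pairwise distance, which is what a nearest-neighbor search over random data should produce.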

---

### Summary

- Each row = one query vector's results
- `I[i]` tells you which `xb` vectors are closest to `xq[i]`
- `D[i]` tells you how close they are (lower is better)