To understand the mechanics of LlamaIndex, let's walk through a simplified numerical example. This will illustrate how text is transformed into numbers and how retrieval works.



# Step 1: Document and Chunking (Breaking into Nodes)

In [1]:
document1 = "The capital of Nepal is Kathmandu. Kathmandu is known for its rich history and ancient temples."

LlamaIndex would first chunk this into smaller nodes. For simplicity, let's say we have two nodes:



Node 1 = "The capital of Nepal is Kathmandu."

Node 2 = "Kathmandu is known for its rich history and ancient temples."
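
The split above can be sketched in plain Python. This is a minimal illustration that assumes one node per sentence; `split_into_nodes` is a hypothetical helper written for this example, not a LlamaIndex function, and real node parsers typically chunk by token count with overlap rather than by sentence.

```python
import re


def split_into_nodes(text: str) -> list[str]:
    """Naively split text into one node per sentence (illustrative only)."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s]


document1 = (
    "The capital of Nepal is Kathmandu. "
    "Kathmandu is known for its rich history and ancient temples."
)

nodes = split_into_nodes(document1)
for i, node in enumerate(nodes, start=1):
    print(f"Node {i}: {node}")
```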

# Step 2: Vector Embeddings


Next, an embedding model (such as OpenAI's text-embedding-ada-002) converts these nodes into numerical vectors. The vectors below are simplified to three dimensions for illustration; real embeddings have hundreds or thousands of dimensions (text-embedding-ada-002, for example, outputs 1,536).

Vector for Node 1: [0.8, 0.2, -0.5]

Vector for Node 2: [0.6, 0.7, 0.3]



# Step 3: Indexing

These vectors are stored in a VectorStoreIndex, alongside the text of the node each one came from.
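
Conceptually, storing the vectors means keeping each one next to its node text so both can be looked up together later. The sketch below uses a hypothetical `ToyVectorIndex` class invented for this example; it is not LlamaIndex's `VectorStoreIndex` and only mimics the idea with two parallel lists.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVectorIndex:
    """Hypothetical in-memory stand-in for a vector index (illustrative only)."""

    texts: list[str] = field(default_factory=list)
    vectors: list[list[float]] = field(default_factory=list)

    def add(self, text: str, vector: list[float]) -> None:
        # Text and vector share the same position so they stay paired.
        self.texts.append(text)
        self.vectors.append(vector)


index = ToyVectorIndex()
index.add("The capital of Nepal is Kathmandu.", [0.8, 0.2, -0.5])
index.add(
    "Kathmandu is known for its rich history and ancient temples.",
    [0.6, 0.7, 0.3],
)

print(f"Stored {len(index.vectors)} vectors")
```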



# Step 4: Query Embedding


The same embedding model converts the user's query into a vector:

Query Vector: [0.7, 0.1, -0.4]



# Step 5: Semantic Search (Cosine Similarity)


The query engine now compares the query vector to the node vectors in the index to find the most similar one. A common method for this is cosine similarity, which measures the cosine of the angle between two vectors. A value closer to 1 indicates higher similarity.

The formula for cosine similarity between two vectors A and B is:

$$\text{Cosine Similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|}$$

Similarity between Query and Node 1:

Dot Product: (0.7 * 0.8) + (0.1 * 0.2) + (-0.4 * -0.5) = 0.56 + 0.02 + 0.2 = 0.78

Magnitude of Query Vector: sqrt(0.7^2 + 0.1^2 + (-0.4)^2) = sqrt(0.49 + 0.01 + 0.16) = sqrt(0.66) ≈ 0.81

Magnitude of Node 1 Vector: sqrt(0.8^2 + 0.2^2 + (-0.5)^2) = sqrt(0.64 + 0.04 + 0.25) = sqrt(0.93) ≈ 0.96

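Cosine Similarity: 0.78 / (0.8124 * 0.9644) ≈ 0.996, i.e. roughly 0.99 (the magnitudes are kept to four decimals here, because the two-decimal roundings 0.81 and 0.96 would push the ratio slightly above 1, which a cosine cannot exceed)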

Similarity between Query and Node 2:

Dot Product: (0.7 * 0.6) + (0.1 * 0.7) + (-0.4 * 0.3) = 0.42 + 0.07 - 0.12 = 0.37

Magnitude of Node 2 Vector: sqrt(0.6^2 + 0.7^2 + 0.3^2) = sqrt(0.36 + 0.49 + 0.09) = sqrt(0.94) ≈ 0.97

Cosine Similarity: 0.37 / (0.81 * 0.97) ≈ 0.47
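
The hand calculation above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; `cosine_similarity` is a helper defined here for illustration, and the vectors are the toy values from this example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


query = [0.7, 0.1, -0.4]
node1 = [0.8, 0.2, -0.5]
node2 = [0.6, 0.7, 0.3]

sim1 = cosine_similarity(query, node1)
sim2 = cosine_similarity(query, node2)

print(f"Query vs Node 1: {sim1:.3f}")
print(f"Query vs Node 2: {sim2:.3f}")
```

Because no intermediate values are rounded, the scores come out at roughly 0.996 and 0.470, in line with the manual arithmetic.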



# Step 6: Retrieval and Response Generation
The query engine sees that Node 1 has a much higher similarity score (0.99) than Node 2 (0.47). Therefore, it retrieves Node 1: "The capital of Nepal is Kathmandu."
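
The retrieval step amounts to scoring every stored node against the query and keeping the best match. The sketch below assumes the toy vectors from this example; `retrieve` and its `top_k` parameter are hypothetical names chosen for illustration, not LlamaIndex's retriever API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, nodes, top_k=1):
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in nodes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


nodes = [
    ("The capital of Nepal is Kathmandu.", [0.8, 0.2, -0.5]),
    ("Kathmandu is known for its rich history and ancient temples.", [0.6, 0.7, 0.3]),
]
query_vector = [0.7, 0.1, -0.4]

best_score, best_text = retrieve(query_vector, nodes)[0]
print(f"Retrieved: {best_text} (score {best_score:.3f})")
```

Raising `top_k` returns more than one node, which is useful when a single chunk is unlikely to contain the whole answer.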

