# Components in LlamaIndex

This notebook is part of the [Hugging Face Agents Course](https://www.hf.co/learn/agents-course), a free course that takes you from beginner to expert in building agents.

![Agents course share](https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/communication/share.png)

Alfred is hosting a party and needs to be able to find relevant information on personas that will be attending the party. Therefore, we will use a `QueryEngine` to index and search through a database of personas.

## Let's install the dependencies

We will install the dependencies for this unit.

In [2]:
!pip install llama-index datasets llama-index-callbacks-arize-phoenix arize-phoenix llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface -U -q


Next, let's log in to Hugging Face so we can use the serverless Inference API.

In [3]:
from huggingface_hub import login

login()


## Create a `QueryEngine` for retrieval augmented generation

### Setting up the persona database

We will be using personas from the [dvilasuero/finepersonas-v0.1-tiny dataset](https://huggingface.co/datasets/dvilasuero/finepersonas-v0.1-tiny). This dataset contains 5K personas that will be attending the party!

Let's load the dataset and store each persona as a text file in the `data` directory.

In [4]:
from datasets import load_dataset
from pathlib import Path

dataset = load_dataset(path="dvilasuero/finepersonas-v0.1-tiny", split="train")

Path("data").mkdir(parents=True, exist_ok=True)
for i, persona in enumerate(dataset):
    with open(Path("data") / f"persona_{i}.txt", "w") as f:
        f.write(persona["persona"])

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Awesome! Now that we have a local directory with all the personas that will be attending the party, we can load and index them.

### Loading and embedding persona documents

We will use the `SimpleDirectoryReader` to load the persona descriptions from the `data` directory. This will return a list of `Document` objects.

In [5]:
dataset

Dataset({
    features: ['id', 'persona', 'model_name_embeddings', 'embedding', 'nn_indices', 'nn_scores', 'projection', 'cluster_label', 'summary_label'],
    num_rows: 5000
})

In [4]:
persona

{'id': '<urn:uuid:4b463cdb-d10b-4634-943d-789aa0fa6a61>',
 'persona': 'A medical professional specializing in oncology or a healthcare provider focused on patient education, likely an internist or a gastroenterologist, who prioritizes clear and concise communication of complex health information to patients.',
 'model_name_embeddings': 'Alibaba-NLP/gte-large-en-v1.5',
 'embedding': [-0.03687451407313347,
  -0.008114331401884556,
  -0.006511371117085218,
  0.006908327806740999,
  0.017953531816601753,
  0.011432834900915623,
  0.00039495425880886614,
  0.0074589308351278305,
  -0.0048296633176505566,
  -0.0016417173901572824,
  -0.011590508744120598,
  -0.006339957471936941,
  -0.008978557772934437,
  -0.006717226002365351,
  -0.0022173526231199503,
  -0.021207941696047783,
  0.02809717319905758,
  -0.009009630419313908,
  -0.006346428766846657,
  -0.012153471820056438,
  0.03864119201898575,
  0.03268368914723396,
  0.005661402828991413,
  -0.05846736952662468,
  -0.06825656443834305,


In [5]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_dir="data")
documents = reader.load_data()
len(documents)

5000

In [9]:
documents

[Document(id_='bd39870b-2045-4ee0-bd7c-dc3bd4240c8d', embedding=None, metadata={'file_path': '/content/data/persona_0.txt', 'file_name': 'persona_0.txt', 'file_type': 'text/plain', 'file_size': 132, 'creation_date': '2025-06-24', 'last_modified_date': '2025-06-24'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text_resource=MediaResource(embeddings=None, data=None, text='A local art historian and museum professional interested in 19th-century American art and the local cultural heritage of Cincinnati.', path=None, url=None, mimetype=None), image_resource=None, audio_resource=None, video_resource=None, text_template='{metadata_str}\n\n{content}'),
 Document(id_='1c96a81c-047e-4e4f-9810-7e30ab47d456
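To get a feel for what a directory reader does, here is a minimal sketch using only the standard library. It is not how `SimpleDirectoryReader` is implemented: `ToyDocument` and `load_directory` are hypothetical names, and only two of the metadata fields seen in the output above are reproduced.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document: text plus file metadata."""

    text: str
    metadata: dict = field(default_factory=dict)


def load_directory(input_dir: str) -> list[ToyDocument]:
    """Read every .txt file in input_dir into one ToyDocument each."""
    docs = []
    for path in sorted(Path(input_dir).glob("*.txt")):
        docs.append(
            ToyDocument(
                text=path.read_text(encoding="utf-8"),
                metadata={"file_name": path.name, "file_path": str(path)},
            )
        )
    return docs


# Usage: write two made-up personas to a temporary directory, then load them.
with tempfile.TemporaryDirectory() as tmp:
    for i, persona in enumerate(["A local art historian.", "A travel writer."]):
        (Path(tmp) / f"persona_{i}.txt").write_text(persona, encoding="utf-8")
    toy_docs = load_directory(tmp)

print(len(toy_docs), toy_docs[0].metadata["file_name"])  # 2 persona_0.txt
```

Each file becomes one document, which matches what we saw above: 5000 persona files produced 5000 `Document` objects.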

Now that we have a list of `Document` objects, we can use the `IngestionPipeline` to create nodes from the documents and prepare them for the `QueryEngine`. We will use the `SentenceSplitter` to split the documents into smaller chunks and the `HuggingFaceEmbedding` to embed the chunks.

In [6]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.ingestion import IngestionPipeline

# create the pipeline with transformations
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
    ]
)

# run the pipeline asynchronously on the first 10 documents
# (pipeline.run() is the synchronous equivalent)
nodes = await pipeline.arun(documents=documents[:10])
nodes


[TextNode(id_='edab54ed-ef94-41fb-9add-6cc21d814314', embedding=[-0.0012166907545179129, 0.01964835450053215, 0.04393022134900093, -0.0022480515763163567, -0.01840735413134098, -0.017943933606147766, 0.009850618429481983, 0.03162042796611786, -0.06904760748147964, -0.04868912324309349, 0.0006960140890441835, -0.039564911276102066, -0.04958917200565338, 0.03078899346292019, -0.0216880664229393, -0.0027349276933819056, 0.009126023389399052, 0.08719330281019211, 0.008962404914200306, 0.012264067307114601, 0.014265610836446285, -0.0663817822933197, 0.03564688563346863, -0.019090434536337852, -0.029022280126810074, -0.0002521576243452728, 0.016371168196201324, -0.006577853579074144, -0.01683872938156128, -0.11301668733358383, -0.0613473579287529, 0.009758847765624523, -0.026674138382077217, 0.014211314730346203, 0.06158553063869476, 0.010618181899189949, 0.013383880257606506, 0.0661582350730896, 0.007938399910926819, 0.017128722742199898, 0.02946525812149048, 0.016287578269839287, -0.025776

In [11]:
len(nodes)

10

In [16]:
nodes[0]

TextNode(id_='9ead66a7-ae64-41d9-a675-406774b7ee65', embedding=[-0.0012166907545179129, 0.01964835450053215, 0.04393022134900093, -0.0022480515763163567, -0.01840735413134098, -0.017943933606147766, 0.009850618429481983, 0.03162042796611786, -0.06904760748147964, -0.04868912324309349, 0.0006960140890441835, -0.039564911276102066, -0.04958917200565338, 0.03078899346292019, -0.0216880664229393, -0.0027349276933819056, 0.009126023389399052, 0.08719330281019211, 0.008962404914200306, 0.012264067307114601, 0.014265610836446285, -0.0663817822933197, 0.03564688563346863, -0.019090434536337852, -0.029022280126810074, -0.0002521576243452728, 0.016371168196201324, -0.006577853579074144, -0.01683872938156128, -0.11301668733358383, -0.0613473579287529, 0.009758847765624523, -0.026674138382077217, 0.014211314730346203, 0.06158553063869476, 0.010618181899189949, 0.013383880257606506, 0.0661582350730896, 0.007938399910926819, 0.017128722742199898, 0.02946525812149048, 0.016287578269839287, -0.0257763

As you can see, we have created a list of `Node` objects, which are just chunks of text from the original documents, each with its embedding attached. Let's explore how we can add these nodes to a vector store.
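To make the split-then-embed flow concrete, here is a toy pipeline in plain Python. It is a sketch under loud assumptions: `split_sentences`, `toy_embed`, and `run_pipeline` are hypothetical helpers, and the "embedding" is derived from a hash rather than a model, so it carries no meaning. Unlike this toy, the real `SentenceSplitter` groups sentences into larger chunks, which is why each short persona above produced a single node.

```python
import hashlib
import re


def split_sentences(texts: list[str]) -> list[str]:
    """Toy splitter: break each text at sentence-ending punctuation."""
    chunks = []
    for text in texts:
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                chunks.append(sentence)
    return chunks


def toy_embed(chunks: list[str], dim: int = 4) -> list[dict]:
    """Toy embedder: derive `dim` floats in [0, 1] from a hash of the text."""
    nodes = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk.encode("utf-8")).digest()
        vector = [byte / 255 for byte in digest[:dim]]
        nodes.append({"text": chunk, "embedding": vector})
    return nodes


def run_pipeline(texts, transformations):
    """Feed the output of each transformation into the next one."""
    data = texts
    for transform in transformations:
        data = transform(data)
    return data


toy_nodes = run_pipeline(
    ["A travel writer. She has visited forty countries."],
    [split_sentences, toy_embed],
)
print([node["text"] for node in toy_nodes])
# ['A travel writer.', 'She has visited forty countries.']
```

The key design point is that every transformation takes the previous step's output as input, so steps can be swapped or extended without touching the rest.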

### Storing and indexing documents

Since we are using an ingestion pipeline, we can attach a vector store directly to the pipeline to populate it.
In this case, we will use `Chroma` to store our documents.
Let's run the pipeline again with the vector store attached.
The `IngestionPipeline` caches the results of its transformations, so re-running the same pipeline on the same documents should be fast!

In [7]:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

db = chromadb.PersistentClient(path="./alfred_chroma_db")
chroma_collection = db.get_or_create_collection(name="alfred")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
    ],
    vector_store=vector_store,
)

nodes = await pipeline.arun(documents=documents[:10])
len(nodes)

10

In [21]:
nodes[0]

TextNode(id_='599a8a1c-d7b3-4390-a644-d5bfd5f979c4', embedding=[-0.0012166907545179129, 0.01964835450053215, 0.04393022134900093, -0.0022480515763163567, -0.01840735413134098, -0.017943933606147766, 0.009850618429481983, 0.03162042796611786, -0.06904760748147964, -0.04868912324309349, 0.0006960140890441835, -0.039564911276102066, -0.04958917200565338, 0.03078899346292019, -0.0216880664229393, -0.0027349276933819056, 0.009126023389399052, 0.08719330281019211, 0.008962404914200306, 0.012264067307114601, 0.014265610836446285, -0.0663817822933197, 0.03564688563346863, -0.019090434536337852, -0.029022280126810074, -0.0002521576243452728, 0.016371168196201324, -0.006577853579074144, -0.01683872938156128, -0.11301668733358383, -0.0613473579287529, 0.009758847765624523, -0.026674138382077217, 0.014211314730346203, 0.06158553063869476, 0.010618181899189949, 0.013383880257606506, 0.0661582350730896, 0.007938399910926819, 0.017128722742199898, 0.02946525812149048, 0.016287578269839287, -0.0257763
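The caching idea mentioned above can be pictured with a small sketch: hash the transformation together with its input, and reuse the stored result when the same pair shows up again. This is only an illustration of the general idea. `cached_apply` and `expensive_upper` are hypothetical, and the cache keys and storage that LlamaIndex actually uses may differ.

```python
import hashlib

cache: dict[str, list[str]] = {}
calls = {"count": 0}


def expensive_upper(texts: list[str]) -> list[str]:
    """Stand-in for a costly transformation such as embedding."""
    calls["count"] += 1
    return [t.upper() for t in texts]


def cached_apply(transform, texts: list[str]) -> list[str]:
    """Return the cached result when the same transform saw the same input."""
    key_material = transform.__name__ + "\x00" + "\x00".join(texts)
    key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = transform(texts)
    return cache[key]


first = cached_apply(expensive_upper, ["persona a", "persona b"])
second = cached_apply(expensive_upper, ["persona a", "persona b"])
print(calls["count"])  # 1: the second call was served from the cache
```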

We can create a `VectorStoreIndex` from the vector store and use it to query the documents by passing the vector store and embedding model to the `from_vector_store()` method.

In [8]:
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding


embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)

We don't need to worry about persisting the index to disk ourselves: the embeddings live in the Chroma collection, which the `PersistentClient` saves under the directory path we passed (`./alfred_chroma_db`).
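Conceptually, querying a vector index means embedding the query and returning the stored items whose vectors are closest to it. The sketch below shows this with cosine similarity over hand-written three-dimensional vectors. The vectors, the `top_k` helper, and the choice of cosine similarity are assumptions for illustration; the metric a real vector store uses depends on its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: dict[str, list[float]], k: int = 2):
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store.items()]
    scored.sort(reverse=True)
    return scored[:k]


# Made-up vectors standing in for persona embeddings.
toy_store = {
    "art historian": [1.0, 0.0, 0.0],
    "travel writer": [0.0, 1.0, 0.0],
    "web developer": [0.0, 0.0, 1.0],
}
results = top_k([0.1, 0.9, 0.2], toy_store, k=2)
print([text for _, text in results])  # ['travel writer', 'web developer']
```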

### Querying the index

Now that we have our index, we can use it to query the documents.
Let's create a `QueryEngine` from the index and use it to query the documents using a specific response mode.


In [14]:
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
import nest_asyncio

nest_asyncio.apply()  # Allows nested event loops; needed to run the query engine inside a notebook
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
query_engine = index.as_query_engine(
    llm=llm,
    response_mode="tree_summarize",
)
response = query_engine.query(
    "Respond using a persona that describes author and travel experiences?"
)
response

Response(response='An individual deeply immersed in the study of Cypriot culture, history, and society, having dedicated significant time to research and reside in Cyprus. This person has gained a profound understanding of the local customs and way of life, making them a valuable resource for those interested in the intricacies of Cypriot heritage.', source_nodes=[NodeWithScore(node=TextNode(id_='ee832d0b-4226-4741-a897-fb53958e812f', embedding=None, metadata={'file_path': '/content/data/persona_1.txt', 'file_name': 'persona_1.txt', 'file_type': 'text/plain', 'file_size': 266, 'creation_date': '2025-06-24', 'last_modified_date': '2025-06-24'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='425a01c7-2902-4f

In [10]:
query_engine.query("What is the meaning of life?")

Response(response='The meaning of life is a profound and philosophical question that has puzzled humans for centuries. Different individuals and cultures have their own interpretations and beliefs about what gives life meaning. For the anthropologist or cultural expert, the meaning of life might be found in the rich tapestry of human experiences, traditions, and the interconnectedness of all life. On the other hand, for the web developer or student, the meaning of life could be seen through the lens of creation, problem-solving, and the impact of technology on society. Ultimately, the meaning of life is a deeply personal and subjective concept.', source_nodes=[NodeWithScore(node=TextNode(id_='ee832d0b-4226-4741-a897-fb53958e812f', embedding=None, metadata={'file_path': '/content/data/persona_1.txt', 'file_name': 'persona_1.txt', 'file_type': 'text/plain', 'file_size': 266, 'creation_date': '2025-06-24', 'last_modified_date': '2025-06-24'}, excluded_embed_metadata_keys=['file_name', 'fi

In [12]:
query_engine = index.as_query_engine(
    llm=llm,
    response_mode="compact",
)
query_engine.query("What is the meaning of life?")

Response(response="The meaning of life is a profound and philosophical question that has been explored by many cultures, scholars, and individuals throughout history. For someone deeply immersed in Cypriot culture, the meaning of life might be intertwined with the island's rich history, its people's resilience, and the importance of community and tradition. On the other hand, for a web developer or student, the meaning of life could be seen through the lens of creation, problem-solving, and the impact of technology on society. Both perspectives highlight the importance of understanding one's place in the world and the values that guide one's actions and beliefs.", source_nodes=[NodeWithScore(node=TextNode(id_='ee832d0b-4226-4741-a897-fb53958e812f', embedding=None, metadata={'file_path': '/content/data/persona_1.txt', 'file_name': 'persona_1.txt', 'file_type': 'text/plain', 'file_size': 266, 'creation_date': '2025-06-24', 'last_modified_date': '2025-06-24'}, excluded_embed_metadata_keys
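The two response modes used above differ in how retrieved chunks are turned into one answer. Roughly, `compact` concatenates chunks so that fewer LLM calls are needed, while `tree_summarize` summarizes groups of chunks and then summarizes the summaries until a single answer remains. The sketch below illustrates only the recursive, tree-style idea. `fake_llm_summarize` is a hypothetical stand-in for an LLM call, and the fixed `fan_in` is an assumption (the real mode packs as many chunks as fit into each prompt and includes the query).

```python
def fake_llm_summarize(texts: list[str]) -> str:
    """Hypothetical stand-in for an LLM call: joins inputs into one bracketed string."""
    return "(" + " + ".join(texts) + ")"


def tree_summarize(chunks: list[str], fan_in: int = 2) -> str:
    """Summarize groups of `fan_in` chunks, then summarize the summaries, until one remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    while len(level) > 1:
        level = [
            fake_llm_summarize(level[i : i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


print(tree_summarize(["a", "b", "c", "d", "e"]))
# (((a + b) + (c + d)) + ((e)))
```

The nested brackets in the output show the tree: each pair of brackets corresponds to one simulated LLM call.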

## Evaluation and observability

LlamaIndex provides **built-in evaluation tools to assess response quality.**
These evaluators leverage LLMs to analyze responses across different dimensions.
We can now check whether the response is faithful to the retrieved personas, that is, whether it is supported by its source nodes.

In [15]:
from llama_index.core.evaluation import FaithfulnessEvaluator

# evaluate whether the earlier response is supported by its source nodes
evaluator = FaithfulnessEvaluator(llm=llm)
eval_result = evaluator.evaluate_response(response=response)
eval_result.passing

False
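The general pattern behind an LLM-based evaluator is "LLM as judge": show a model the response together with the retrieved context and ask for a verdict. The sketch below illustrates that pattern only. The prompt wording, `is_faithful`, and `fake_judge` are assumptions made for this example, not the prompt or logic that `FaithfulnessEvaluator` uses, and the fake judge does a plain substring check so the code runs without a model.

```python
from typing import Callable


def build_prompt(response: str, contexts: list[str]) -> str:
    """Assemble a yes/no question for the judge (the wording is an assumption)."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Is the following statement supported by the context? Answer YES or NO.\n"
        f"Statement: {response}\n"
        f"Context:\n{context_block}"
    )


def is_faithful(response: str, contexts: list[str], judge: Callable[[str], str]) -> bool:
    """Ask the judge and treat any answer starting with YES as a pass."""
    verdict = judge(build_prompt(response, contexts))
    return verdict.strip().upper().startswith("YES")


def fake_judge(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: YES only if the statement appears verbatim in the context."""
    statement = prompt.split("Statement: ", 1)[1].split("\nContext:", 1)[0]
    context = prompt.split("\nContext:\n", 1)[1]
    return "YES" if statement in context else "NO"


contexts = ["An anthropologist who studies Cypriot culture."]
print(is_faithful("An anthropologist who studies Cypriot culture.", contexts, fake_judge))  # True
print(is_faithful("A chef who loves Italian food.", contexts, fake_judge))  # False
```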

If an LLM-based evaluator does not give enough context, we can inspect the response using the Arize Phoenix tool, after creating an account at [LlamaTrace](https://llamatrace.com/login) and generating an API key.

In [16]:
import llama_index
import os

PHOENIX_API_KEY = "<PHOENIX_API_KEY>"  # replace with your own key; never commit a real key
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"api_key={PHOENIX_API_KEY}"
llama_index.core.set_global_handler(
    "arize_phoenix", endpoint="https://llamatrace.com/v1/traces"
)
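A safer habit than pasting the key into the notebook is to read it from the environment. The helper below is a small sketch of that: `otlp_headers_from_env` is a hypothetical name, and storing the key in an environment variable called `PHOENIX_API_KEY` is an assumption, not something the tool requires.

```python
import os


def otlp_headers_from_env(env=os.environ, var_name: str = "PHOENIX_API_KEY") -> str:
    """Build the header string from an environment variable instead of a literal."""
    api_key = env.get(var_name)
    if not api_key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return f"api_key={api_key}"


# Demonstration with an explicit mapping so the example runs without a real key.
print(otlp_headers_from_env({"PHOENIX_API_KEY": "example-key"}))  # api_key=example-key
```

The returned string has the same `api_key=...` shape as the value assigned to `OTEL_EXPORTER_OTLP_HEADERS` in the cell above.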


Now, we can query the index and see the response in the Arize Phoenix tool.

In [17]:
response = query_engine.query(
    "What is the name of someone who is interested in AI and technology?"
)
response

Response(response="The information provided does not mention anyone interested in AI and technology. The details given are about an anthropologist or cultural expert focusing on Cypriot culture and a local art historian specializing in 19th-century American art and Cincinnati's cultural heritage.", source_nodes=[NodeWithScore(node=TextNode(id_='ee832d0b-4226-4741-a897-fb53958e812f', embedding=None, metadata={'file_path': '/content/data/persona_1.txt', 'file_name': 'persona_1.txt', 'file_type': 'text/plain', 'file_size': 266, 'creation_date': '2025-06-24', 'last_modified_date': '2025-06-24'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='425a01c7-2902-4f3c-83c3-e080226fa3b7', node_type='4', metadata={'file

We can then go to [LlamaTrace](https://llamatrace.com/login) and explore the trace of the process and the response.

![arize-phoenix](https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/unit2/llama-index/arize.png)    