In [None]:
# !pip install llama_index==0.11.4
# !pip install PyYAML
# !pip install docx2txt==0.8

# Basic Llama Index

This section walks through the architecture of a basic RAG system built with LlamaIndex. After reading it, you should be able to build a basic RAG application.

In the above example, we create a Document object from the input text (a string) and metadata (a dictionary).

## Nodes

Creating Documents is simple, but the data inside a Document is still raw. How can it be converted into a format that an LLM can process and reason over effectively? The answer is Nodes: smaller pieces of content extracted from a Document, which split it into parts that are easier to manage.

Nodes keep us from exceeding the prompt limit of the model. With a 100-page ebook, for example, we cannot put all of the text directly into the prompt, because it would exceed the model's input limit. Even when the text fits, this approach has two drawbacks. Processing costs are high, since we pay more for API usage. The accuracy of the answer is also not guaranteed: with such a long input, the prompt does not focus on any specific information, so the model may miss the relevant passage and fail to answer correctly. In addition, we can establish relationships between nodes.

In LlamaIndex, we can create nodes for text data with the TextNode class:


In [2]:
from llama_index.core import Document
from llama_index.core.schema import TextNode

text = "MENTAL CARE Team"
doc = Document(text=text)

# Split the text by hand: characters 0-5 are "MENTAL", index 6 is the space
node1 = TextNode(text=doc.text[:6])
node2 = TextNode(text=doc.text[7:])

print(node1)
print(node2)

Node ID: 9a3bea69-562a-4e9f-a3d6-f19758afd867
Text: MENTAL
Node ID: 99652ecc-6662-4c4a-8318-7fd334a64d17
Text: CARE Team


We can also create nodes automatically with TokenTextSplitter. The syntax is:

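```Python
from llama_index.core.node_parser import TokenTextSplitter

# Example values only; choose them to suit your data
splitter = TokenTextSplitter(
    chunk_size=512,    # maximum number of tokens per chunk
    chunk_overlap=20,  # tokens shared by consecutive chunks
    separator=" ",     # string on which the text is split
)
```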

Where:

- chunk_size: The maximum size of each text chunk, measured in tokens as counted by the splitter's tokenizer (a token is usually a word or a piece of a word).

- chunk_overlap: The number of tokens shared by consecutive chunks, so the end of one chunk is repeated at the start of the next.

- separator: The character or string on which the text is split before the pieces are grouped into chunks.
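To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal plain-Python sketch of a sliding window. It is an illustration only, not the TokenTextSplitter implementation: the function name `split_with_overlap` is hypothetical, and whitespace-separated words stand in for tokenizer tokens, so the real splitter will place chunk boundaries differently.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Slide a window of chunk_size tokens, advancing by chunk_size - chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the window has reached the end of the token list
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Assumption: words split on the separator stand in for real tokens
words = "It's sunny today, I went to eat ice cream".split(" ")
chunks = split_with_overlap(words, chunk_size=5, chunk_overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

Each chunk starts with the last two words of the previous one, which is the repetition that chunk_overlap describes.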

In [3]:
from llama_index.core import Document
from llama_index.core.node_parser import TokenTextSplitter

text = "It's sunny today, I went to eat ice cream, it's so cold my teeth are freezing!"
doc = Document(text=text)
splitter = TokenTextSplitter(
    chunk_size=18,
    chunk_overlap=5,
    separator= " "
)
nodes = splitter.get_nodes_from_documents([doc])

for node in nodes:
    print(node)

Metadata length (2) is close to chunk size (18). Resulting chunks are less than 50 tokens. Consider increasing the chunk size or decreasing the size of your metadata to avoid this.
Node ID: 4ae68656-8eb3-4144-856a-80da8016e92a
Text: It's sunny today, I went to eat ice cream, it's so cold
Node ID: 92498115-9ae5-4cc3-877a-31b5e03f7f21
Text: it's so cold my teeth are freezing!


In the above example, we import the TokenTextSplitter class from the node_parser module of the llama_index.core package. We create a doc object from the text, then create a splitter object from TokenTextSplitter with the chunk_size, chunk_overlap, and separator parameters. Calling the splitter's get_nodes_from_documents method then creates the nodes automatically. The warning appears because chunk_size is very small and close to the size of the metadata: in that situation the metadata takes up a large share of each node and leaves little room for the actual text. Here the message reports a metadata length of 2 against a chunk size of 18, so the resulting chunks fall below the 50 tokens mentioned in the warning.
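The numbers in the warning can be checked with a little arithmetic. The sketch below assumes that the room left for text is chunk_size minus the metadata length, and that 50 tokens is the warning threshold. Both assumptions are inferred from the wording of the message, not taken from the library's source, and `effective_chunk_size` is a hypothetical helper name.

```python
def effective_chunk_size(chunk_size, metadata_len, warn_below=50):
    """Return the tokens left for text and whether that is small enough to warn about."""
    effective = chunk_size - metadata_len
    if effective <= 0:
        raise ValueError("metadata is longer than the chunk size")
    # Assumption: the 50-token threshold is read off the warning text
    needs_warning = effective < warn_below
    return effective, needs_warning


# Values from the example above: chunk_size=18, metadata length 2
print(effective_chunk_size(18, 2))
# A larger chunk size leaves plenty of room for text
print(effective_chunk_size(1024, 2))
```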

Creating nodes does not stop at splitting the document. We can also establish relationships between nodes, and there are more advanced node transformation methods that we will cover in the next section. When text is divided into small segments (nodes) in the `llama_index` library, there are five possible relationships between nodes:

1. **SOURCE**:
- The related node is the source document: the entire original text from which the other nodes are split.

- For example, if the original text is "It's sunny today, I went to eat ice cream, my teeth are cold!", then the `SOURCE` node contains this entire text.

2. **PREVIOUS**:
- The related node is the previous node in the document: the text immediately before the current node.

- For example, if there are two consecutive nodes, the second node has a `PREVIOUS` link pointing to the first node.

3. **NEXT**:
- The related node is the next node in the document: the text that follows the current node.

- For example, if there are two consecutive nodes, the first node has a `NEXT` link pointing to the second node.

4. **PARENT**:
- The related node is the parent node: a larger text that contains the current node.

- For example, if a large document is divided into sections and each section is divided into paragraphs, then each paragraph node has the section that contains it as its parent.

5. **CHILD**:
- The related node is a child node: a smaller text contained within the current node.

- For example, if a large text is divided into smaller paragraphs, then each paragraph is a child node of the large node.
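The linking pattern behind SOURCE, PREVIOUS, and NEXT can be sketched in plain Python. The `Relationship` enum, the `SimpleNode` class, and the `link_chunks` helper below are hypothetical stand-ins written for illustration; they mirror the relationship names listed above but are not the LlamaIndex classes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relationship(Enum):
    """Hypothetical stand-in mirroring the five relationship names."""
    SOURCE = "source"
    PREVIOUS = "previous"
    NEXT = "next"
    PARENT = "parent"
    CHILD = "child"


@dataclass
class SimpleNode:
    node_id: str
    text: str
    relationships: dict = field(default_factory=dict)


def link_chunks(doc_id, chunk_texts):
    """Create one node per chunk and wire up SOURCE, PREVIOUS, and NEXT."""
    nodes = [
        SimpleNode(node_id=f"{doc_id}-chunk-{i}", text=text)
        for i, text in enumerate(chunk_texts)
    ]
    for i, node in enumerate(nodes):
        node.relationships[Relationship.SOURCE] = doc_id
        if i > 0:
            node.relationships[Relationship.PREVIOUS] = nodes[i - 1].node_id
        if i < len(nodes) - 1:
            node.relationships[Relationship.NEXT] = nodes[i + 1].node_id
    return nodes


demo_nodes = link_chunks("doc-1", ["It's sunny today,", "I went to eat ice cream"])
for node in demo_nodes:
    print(node.node_id, {rel.name: target for rel, target in node.relationships.items()})
```

The first node ends up with SOURCE and NEXT, and the last node with SOURCE and PREVIOUS. PARENT and CHILD are left unset here because a flat split has no hierarchy.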

In [8]:
print(doc)
print(nodes[0])
print(nodes[1])

Doc ID: 0d10e863-0b95-43ce-ae23-186998cad8be
Text: It's sunny today, I went to eat ice cream, it's so cold my teeth
are freezing!
Node ID: 7a4c8d96-3303-46c9-a982-752a7cd675c6
Text: It's sunny today, I went to eat ice cream, it's so cold
Node ID: 2a522af9-bd71-41ce-adf5-c41f603d2314
Text: it's so cold my teeth are freezing!


In [9]:
nodes

[TextNode(id_='7a4c8d96-3303-46c9-a982-752a7cd675c6', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='0d10e863-0b95-43ce-ae23-186998cad8be', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='b1526564d45dbcefc0d21df5115c0273ae46be71605c670ea1594b01221dcb25'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='2a522af9-bd71-41ce-adf5-c41f603d2314', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='ad357d0f99e2c94faad156ee88603b8563023dda17192c3a90dabe953d616d1f')}, text="It's sunny today, I went to eat ice cream, it's so cold", mimetype='text/plain', start_char_idx=0, end_char_idx=55, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'),
 TextNode(id_='2a522af9-bd71-41ce-adf5-c41f603d2314', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelat

In [10]:
print(nodes[0].relationships)
print(nodes[1].relationships)

{<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='0d10e863-0b95-43ce-ae23-186998cad8be', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='b1526564d45dbcefc0d21df5115c0273ae46be71605c670ea1594b01221dcb25'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='2a522af9-bd71-41ce-adf5-c41f603d2314', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='ad357d0f99e2c94faad156ee88603b8563023dda17192c3a90dabe953d616d1f')}
{<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='0d10e863-0b95-43ce-ae23-186998cad8be', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='b1526564d45dbcefc0d21df5115c0273ae46be71605c670ea1594b01221dcb25'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='7a4c8d96-3303-46c9-a982-752a7cd675c6', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='7989d88830163286446b05ad76b47dd2f4d9cf0550b7ce47c8eaaa9d8eb197aa')}


These relationships are usually created automatically, as the example above shows. Node 1 has a SOURCE relationship to doc and a NEXT relationship to node 2; similarly, node 2 has a SOURCE relationship to doc and a PREVIOUS relationship to node 1. In general, these relationships matter because they preserve the structure of the document, so the context around a retrieved node can be recovered, which helps queries return more accurate results.

## Indexing

After you load data into the system, LlamaIndex helps you index it into a structure that is easy to access. This process usually involves creating vector embeddings and storing them in a specialized database called a vector store. Indexing organizes the data so that it can be searched and retrieved later.
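As a rough illustration of what "embed, store, then search" means, here is a toy vector store in plain Python. Everything in it is an assumption made for the sketch: the `embed` function uses bag-of-words counts instead of a learned embedding model, and `ToyVectorStore` is a hypothetical class, not a LlamaIndex component.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text):
        self.entries.append((text, embed(text)))

    def search(self, query, top_k=1):
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


store = ToyVectorStore()
store.add("The fat cat was lying by the window.")
store.add("I wished I could be like that.")
print(store.search("where is the cat?"))
```

The query shares the words "the" and "cat" with the first sentence and none with the second, so the first sentence ranks highest.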

LlamaIndex supports many types of indexes, such as SummaryIndex, VectorStoreIndex, TreeIndex, and KnowledgeGraphIndex; we choose the appropriate type for each use case. In general, all of these index types support creating an index, adding new nodes, and querying.

In [11]:
from llama_index.core import Document, SummaryIndex
from llama_index.core.node_parser import TokenTextSplitter

text = "The fat cat was lying by the window. I wished I could be like that."
doc = Document(
    text=text
    )

splitter = TokenTextSplitter(
    chunk_size=20,
    chunk_overlap=5,
    separator= " "
)
nodes = splitter.get_nodes_from_documents([doc])
index = SummaryIndex(nodes)
index2 = SummaryIndex.from_documents([doc])

print("index 1", index.index_struct)
print("index 2", index2.index_struct)

Metadata length (2) is close to chunk size (20). Resulting chunks are less than 50 tokens. Consider increasing the chunk size or decreasing the size of your metadata to avoid this.
index 1 IndexList(index_id='166953ab-ab4f-412b-b3f5-2cdf40ae2d71', summary=None, nodes=['30aad8eb-e8d1-42bb-a5b4-978daf8ee7df'])
index 2 IndexList(index_id='5798db81-5a25-4508-b206-e215defd2638', summary=None, nodes=['6df79211-d0b9-47e1-8d55-a5783250884f'])



In the above example, we create an index from nodes with SummaryIndex and, for comparison, directly from the document with SummaryIndex.from_documents. The structure of an index can be inspected through index_struct. The index is now ready for retrieval. Everything so far (creating the Document, the Nodes, and the Index) serves one purpose: retrieving information from the data. LlamaIndex supports many retrieval tools, but the simplest is to create a query_engine object with index.as_query_engine(), which builds a question-answering engine on top of the index. Usage is simple: call query_engine.query() with the question as its argument.
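Conceptually, a query engine retrieves chunks from the index, builds a prompt from them, and asks an LLM for the answer. The sketch below shows that flow in plain Python under stated assumptions: `ToyQueryEngine`, `build_prompt`, and `fake_llm` are hypothetical names, the prompt wording is invented, and the list-style retrieval (hand every chunk to the model) is a simplification. No real model is called.

```python
def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n".join(context_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Answer the question using only the context.\n"
        f"Question: {question}\n"
        "Answer:"
    )


def fake_llm(prompt):
    """Stand-in for a real model call: it only reports the prompt size."""
    return f"[an LLM would answer here; prompt has {len(prompt.split())} words]"


class ToyQueryEngine:
    def __init__(self, chunks, llm):
        self.chunks = chunks
        self.llm = llm

    def query(self, question):
        # Assumption: a list-style index passes every chunk to the model
        retrieved = self.chunks
        return self.llm(build_prompt(question, retrieved))


engine = ToyQueryEngine(
    ["The fat cat was lying by the window. I wished I could be like that."],
    llm=fake_llm,
)
print(engine.query("what is the fat cat doing?"))
```

Because the final step is a model call, a real query engine needs a configured LLM before query() can return an answer.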

In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("what is fat cat doing?")
print(response)

The above code raises an error because we have not set the OpenAI API key, so query_engine cannot be used. Let's add the key to fix the error.

In [18]:
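from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
import openai

# Replace the placeholder with your own key; never commit a real key to source control
openai.api_key = "sk-proj-your-openai-api-key"
# Use gpt-4o-mini as the default LLM for LlamaIndex
Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0.2)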


In the code above, we set the API key from OpenAI and configure the LLM as gpt-4o-mini with a temperature of 0.2. Temperature takes a value from 0 to 2: the higher the value, the more creative and random the answers. As the value approaches 0, the output becomes more deterministic, so asking the same question several times gives nearly the same answer. Adjust the value to suit your use case.
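Temperature is commonly described as a divisor applied to the model's raw scores (logits) before they are turned into probabilities. The sketch below illustrates that idea with a softmax; the logits are made-up numbers and `softmax_with_temperature` is a hypothetical helper, so it shows the general mechanism and not how any particular provider implements it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens
logits = [2.0, 1.0, 0.5]
for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At a low temperature almost all probability goes to the highest-scoring token, which is why answers repeat; at a high temperature the distribution flattens and sampling becomes more varied.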

In [19]:
query_engine = index.as_query_engine()
response = query_engine.query("what is fat cat doing?")
print(response)

The fat cat was lying by the window.
