In [1]:
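import os
from dotenv import load_dotenv

# Read variables from a local .env file into the process environment
load_dotenv()
# os.getenv returns None if OPENAI_API_KEY is not defined
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')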
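
Because `os.getenv` silently returns `None` when a variable is missing, a bad or absent `.env` file only shows up later as an authentication error. One way to fail early is a small guard like the sketch below. The helper name `require_env` is hypothetical and is not part of Indox or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell"
        )
    return value


# Usage in the notebook (after load_dotenv() has run):
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```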

## Initial Setup

The imports below bring in the pieces needed for the Indox pipeline: the main retrieval-augmentation class, the OpenAI question-answering and embedding wrappers, and the `ClusteredSplit` data loader and splitter.


In [2]:
from indox import IndoxRetrievalAugmentation
from indox.llms import OpenAiQA
from indox.embeddings import OpenAiEmbedding
from indox.data_loader_splitter import ClusteredSplit

In this step, we initialize the Indox retrieval-augmentation object, the QA model, and the embedding model. The models shown here are only one choice; other QA and embedding models can be substituted depending on your requirements.


In [3]:
Indox = IndoxRetrievalAugmentation()
qa_model = OpenAiQA(api_key=OPENAI_API_KEY, model="gpt-3.5-turbo-0125")
embed = OpenAiEmbedding(api_key=OPENAI_API_KEY, model="text-embedding-3-small")

## Data Loader Setup

We set up the data loader using the `ClusteredSplit` class. This step involves loading documents, configuring embeddings, and setting options for processing the text.


In [4]:
loader_splitter = ClusteredSplit(
    file_path="sample.txt",
    embeddings=embed,
    remove_sword=False,
    re_chunk=False,
    chunk_size=300,
    use_openai_summary=True,
)

In [5]:
docs = loader_splitter.load_and_chunk()

--Generated 1 clusters--
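
To make the `chunk_size=300` setting more concrete, here is a minimal, library-independent sketch of fixed-size chunking. It assumes the size is counted in whitespace-separated words. That is an assumption made for illustration: `ClusteredSplit` may count tokens instead, and it also clusters the chunks, which this sketch does not attempt. The function name `chunk_words` is hypothetical.

```python
def chunk_words(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    # Step through the word list in strides of chunk_size; the last chunk may be shorter.
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


sample = "one two three four five six seven"
chunks = chunk_words(sample, chunk_size=3)
print(chunks)  # ['one two three', 'four five six', 'seven']
```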


## Vector Store Connection and Document Storage

In this step, we connect the Indox application to the vector store and store the processed documents.


In [7]:
from indox.vector_stores import ChromaVectorStore
db = ChromaVectorStore(collection_name="sample", embedding=embed)

In [8]:
Indox.connect_to_vectorstore(db)

<indox.vector_stores.Chroma.ChromaVectorStore at 0x215d56a6c00>

In [9]:
Indox.store_in_vectorstore(docs)

<indox.vector_stores.Chroma.ChromaVectorStore at 0x215d56a6c00>

## Querying and Interpreting the Response

In this step, we build a retriever with `Indox.QuestionAnswer` and query it with a specific question. As the cells below show, `retriever.invoke` returns the answer as a string, and the retrieved chunks that were passed to the QA model as context (selected by similarity to the query) are available afterwards through `retriever.context`.


In [12]:
retriever = Indox.QuestionAnswer(vector_database=db, llm=qa_model, top_k=5)
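
The `top_k=5` argument limits how many chunks are retrieved as context. As a rough picture of similarity-based top-k retrieval, the sketch below ranks chunks by cosine similarity between a query vector and chunk vectors. It uses toy 2-D vectors in place of real embeddings, and the helper names are hypothetical. This is not Indox's or Chroma's actual implementation, and the distance measure a real vector store uses depends on how it is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, top_k=5):
    """Return up to top_k (score, chunk) pairs, highest cosine similarity first."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk)
        for vec, chunk in zip(chunk_vecs, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy 2-D vectors standing in for real embeddings (illustrative values only)
chunks = ["ball", "slipper", "pigeon-house"]
chunk_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query_vec = [1.0, 0.0]

results = top_k_chunks(query_vec, chunk_vecs, chunks, top_k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

With a larger `top_k`, more chunks reach the QA model, which can add useful context but also lengthens the prompt.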

In [13]:
retriever.invoke(query="How cinderella reach happy ending?")

"Cinderella reached her happy ending through her kindness, perseverance, and the magical assistance she received. Despite being mistreated by her stepmother and stepsisters, Cinderella remained kind and pure of heart. With the help of a little bird and her mother's grave, she was able to attend the royal ball where the prince fell in love with her. Even though she had to escape from the prince, he searched for her and eventually found her with the help of the golden slipper she left behind. The prince then declared that he would only marry the woman whose foot fit the golden slipper, leading to Cinderella's ultimate happy ending as she was the only one whose foot fit the slipper."

In [14]:
retriever.context

["They never once thought of cinderella, and believed that she was sitting at home in the dirt, picking lentils out of the ashes   The prince approached her, took her by the hand and danced with her He would dance with no other maiden, and never let loose of her hand, and if any one else came to invite her, he said, this is my partner She danced till it was evening, and then she wanted to go home But the king's son said, I will go with you and bear you company, for he wished to see to whom the beautiful maiden belonged She escaped from him, however, and sprang into the pigeon-house   The king's son waited until her father came, and then he told him that the unknown maiden had leapt into the pigeon-house   The old man thought, can it be cinderella   And they had to bring him an axe and a pickaxe that he might hew the pigeon-house to pieces, but no one was inside it   And when they got home cinderella lay in her dirty clothes among the ashes, and a dim little oil-lamp was burning on the 