# Cluster Split

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
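
If `OPENAI_API_KEY` is missing from both the environment and the `.env` file, `os.getenv` returns `None`, and the problem may only show up once a model is constructed or called. A minimal sketch of an early check follows; `require_env` is a hypothetical helper name used for illustration, not part of Indox or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your environment or .env file")
    return value
```

After `load_dotenv()`, the key could then be read with `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`.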

## Initial Setup

The following imports set up the Indox application: the main retrieval augmentation class, a question-answering model, an embedding model, and the data loader/splitter.

In [2]:
from indox import IndoxRetrievalAugmentation
from indox.llms import OpenAiQA
from indox.embeddings import OpenAiEmbedding
from indox.data_loader_splitter import ClusteredSplit

In this step, we initialize the Indox Retrieval Augmentation, the QA model, and the embedding model. Note that the models used for QA and embedding can vary depending on the specific requirements.


In [3]:
Indox = IndoxRetrievalAugmentation()
qa_model = OpenAiQA(api_key=OPENAI_API_KEY, model="gpt-3.5-turbo-0125")
embed = OpenAiEmbedding(api_key=OPENAI_API_KEY, model="text-embedding-3-small")

In [4]:
file_path = "sample.txt"

## Data Loader Setup

We set up the data loader using the `ClusteredSplit` class. This step involves loading documents, configuring embeddings, and setting options for processing the text.


In [6]:
loader_splitter = ClusteredSplit(
    file_path=file_path,
    embeddings=embed,
    remove_sword=False,
    re_chunk=False,
    chunk_size=300,
    use_openai_summary=True,
)
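
To make the `chunk_size` option more concrete, here is a rough, library-independent sketch of fixed-size chunking. It rests on assumptions: it counts whitespace-separated words, whereas `ClusteredSplit` may count tokens, and it omits the clustering and summarization steps that `ClusteredSplit` appears to perform on top of chunking. `chunk_words` is a hypothetical helper, not an Indox function.

```python
def chunk_words(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    # Step through the word list in strides of chunk_size; the last chunk may be shorter.
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


sample = "one two three four five six seven"
print(chunk_words(sample, chunk_size=3))
# prints ['one two three', 'four five six', 'seven']
```

A larger `chunk_size` gives fewer, longer chunks with more context per chunk; a smaller one gives more, shorter chunks that are easier to match precisely.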

In [7]:
docs = loader_splitter.load_and_chunk()

--Generated 1 clusters--


In [8]:
docs

["The wife of a rich man fell sick, and as she felt that her end was drawing near, she called her only daughter to her bedside and said, dear child, be good and pious, and then the good God will always protect you, and I will look down on you from heaven and be near you   Thereupon she closed her eyes and departed   Every day the maiden went out to her mother's grave, and wept, and she remained pious and good   When winter came the snow spread a white sheet over the grave, and by the time the spring sun had drawn it off again, the man had taken another wife The woman had brought with her into the house two daughters, who were beautiful and fair of face, but vile and black of heart Now began a bad time for the poor step-child   Is the stupid goose to sit in the parlor with us, they said   He who wants to eat bread must earn it   Out with the kitchen-wench   They took her pretty clothes away from her, put an old grey bedgown on her, and gave her wooden shoes   Just look at the proud prin

## Vector Store Connection and Document Storage

In this step, we connect the Indox application to the vector store and store the processed documents.


In [9]:
from indox.vector_stores import ChromaVectorStore
db = ChromaVectorStore(collection_name="sample", embedding=embed)

In [10]:
Indox.connect_to_vectorstore(db)

<indox.vector_stores.Chroma.ChromaVectorStore at 0x25115b2ec90>

In [11]:
Indox.store_in_vectorstore(docs)

<indox.vector_stores.Chroma.ChromaVectorStore at 0x25115b2ec90>

## Querying and Interpreting the Response

In this step, we query the Indox application with a specific question and use the QA model to generate the answer. In the cells below, `retriever.invoke` returns the answer text, and `retriever.context` holds the passages retrieved from the vector store, selected as the `top_k` best matches by cosine score.
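
The text above mentions cosine scores. As a general illustration of how ranking by cosine similarity works, here is a small self-contained sketch. It is not Indox's or Chroma's internal code: `cosine_similarity` and `top_k` are hypothetical helpers, and the 2-D vectors are made-up stand-ins for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D vectors standing in for chunk embeddings (made-up values).
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [1.0, 0.0]
print(top_k(query_vec, doc_vecs, k=2))
```

With these toy vectors, index 0 ranks first (same direction as the query) and index 2 second (45 degrees away); index 1 is orthogonal to the query and is dropped.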

In [12]:
retriever = Indox.QuestionAnswer(vector_database=db, llm=qa_model, top_k=5)

In [13]:
retriever.invoke(query="How cinderella reach happy ending?")

'Cinderella reaches her happy ending through perseverance, kindness, and the help of magical elements. Despite facing mistreatment from her stepmother and stepsisters, Cinderella remains kind and pious. With the assistance of a magical hazel tree and a little white bird, she is able to attend a royal festival where the prince falls in love with her. The prince searches for her true identity after she flees each night, leaving only a golden slipper behind. Ultimately, the prince finds Cinderella and recognizes her as the one whose foot fits the golden slipper. This leads to Cinderella marrying the prince and living happily ever after.'

In [14]:
retriever.context

["This documentation tells the story of Cinderella, a young girl who faces mistreatment from her stepmother and stepsisters after her mother's death. Despite the cruelty she endures, Cinderella remains kind and pious. Through the help of a magical hazel tree and a little white bird, she is able to attend a royal festival where the prince falls in love with her. The prince searches for her true identity after she flees each night, leaving only a golden slipper behind. Ultimately,",
 "They never once thought of cinderella, and believed that she was sitting at home in the dirt, picking lentils out of the ashes   The prince approached her, took her by the hand and danced with her He would dance with no other maiden, and never let loose of her hand, and if any one else came to invite her, he said, this is my partner She danced till it was evening, and then she wanted to go home But the king's son said, I will go with you and bear you company, for he wished to see to whom the beautiful maide