## Retrieval Augmentation Using Ollama
Here, we will explore how to use Indox Retrieval Augmentation with Ollama. Ollama is an open-source project for running LLMs on your local machine. It provides access to a diverse and continuously expanding library of pre-trained LLMs.

### Effortless Installation and Setup
One of Ollama’s standout features is its straightforward installation process. Whether you use Windows, macOS, or Linux, Ollama offers an installation method tailored to your operating system.
 
### How to Download Ollama
To download Ollama, head to the official [Ollama](https://ollama.com/) website and click the download button. Ollama supports three operating systems: Windows, macOS, and Linux.

### How to Run Ollama
You can download and run the intended model using the following command:


In [None]:
ollama run llama3
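
Before moving on, it can help to confirm that the local Ollama server is reachable and that the model has been pulled. The following is a minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434` and exposes the `/api/tags` endpoint for listing local models; the helper name `list_local_models` is hypothetical and not part of Indox or Ollama.

```python
import json
import urllib.error
import urllib.request


def list_local_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the model names the local Ollama server reports, or [] if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, wrong port, or an unexpected response body.
        return []
    if not isinstance(payload, dict):
        return []
    return [model.get("name", "") for model in payload.get("models", [])]


if __name__ == "__main__":
    # An empty list means the server is not reachable or no model is installed yet.
    print(list_local_models())
```

If the list is empty, start Ollama and run the command above again before continuing.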

## Now Let's Run Indox
If you haven't installed the required packages yet, install them with the following commands.

In [None]:
!pip install indox
!pip install mistralai
!pip install ollama
!pip install chromadb

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/osllmai/inDox/blob/master/Demo/ollama.ipynb)

## Setting Up the Python Environment

If you are running this project in your local IDE, please create a Python environment to ensure all dependencies are correctly managed. You can follow the steps below to set up a virtual environment named `indox`:

### Windows

1. **Create the virtual environment:**
```bash
python -m venv indox
```
2. **Activate the virtual environment:**
```bash
indox\Scripts\activate
```

### macOS/Linux

1. **Create the virtual environment:**
```bash
python3 -m venv indox
```

2. **Activate the virtual environment:**
```bash
source indox/bin/activate
```

### Install Dependencies

Once the virtual environment is activated, install the required dependencies by running:

```bash
pip install -r requirements.txt
```
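
To double-check that the notebook or script is really running inside the virtual environment, a quick check can be made from Python. This is a small sketch that relies only on the standard library; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints `False`, activate the `indox` environment and restart the kernel or shell.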


Next, we should set our `MISTRAL_API_KEY` as an environment variable. The cell below reads it from a `.env` file with `load_dotenv`.

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()
MISTRAL_API_KEY = os.getenv('MISTRAL_API_KEY')
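
Note that `os.getenv` returns `None` when the variable is missing, which would only surface later as an authentication error. One possible way to fail early is sketched below; `require_env` is a hypothetical helper, not part of Indox or `python-dotenv`.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

With this helper, the assignment above could be written as `MISTRAL_API_KEY = require_env("MISTRAL_API_KEY")`.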

### Import Essential Libraries
Then, we import the essential libraries for our `Indox` question answering system:
- `IndoxRetrievalAugmentation`: Enhances the retrieval process for better QA performance.
- `Ollama`: Wraps any intended model served by Ollama.
- `MistralEmbedding`: Uses Mistral embeddings for improved semantic understanding.
- `SimpleLoadAndSplit`: A utility for loading documents and splitting them into chunks.
- `Chroma`: The Chroma vector store (`ChromaVectorStore`) used to store our documents.

In [2]:
from indox import IndoxRetrievalAugmentation
from indox.data_loader_splitter import SimpleLoadAndSplit
from indox.llms import Ollama
from indox.vector_stores import Chroma
from indox.embeddings import MistralEmbedding

### Building the Indox System and Initializing Models

Next, we will build our `Indox` system and initialize the Ollama model that we have already downloaded. The embedding model is initialized in the following step. This setup will allow us to leverage the capabilities of Indox for our tasks.


In [3]:
indox = IndoxRetrievalAugmentation()
llm_model = Ollama(model="llama3")

INFO: IndoxRetrievalAugmentation initialized

            ██  ███    ██  ██████   ██████  ██       ██
            ██  ████   ██  ██   ██ ██    ██   ██  ██
            ██  ██ ██  ██  ██   ██ ██    ██     ██
            ██  ██  ██ ██  ██   ██ ██    ██   ██   ██
            ██  ██  █████  ██████   ██████  ██       ██
            


### Connecting Embedding Model to Indox

Next, we set up the embedding model that Indox will use. The embeddings let the system compare a query with the reference data semantically, which improves retrieval performance.

We create a `MistralEmbedding` instance with our `MISTRAL_API_KEY`. It is passed to the vector store later as its embedding function, so that our reference data is embedded and indexed under a collection name for efficient retrieval during the question-answering process.

Let's initialize the embedding model.


In [None]:
embed_mistral = MistralEmbedding(MISTRAL_API_KEY)

### Setting Up Reference Directory and File Path

To demonstrate the capabilities of our Indox question answering system, we will use a sample file as reference data for testing and evaluation.

First, we download the sample file and specify its path. In this case, we are using a file named `sample.txt` located in our working directory. This file will serve as our reference data for the subsequent steps.

Let's define the file path for our reference data.

In [4]:
!wget https://raw.githubusercontent.com/osllmai/inDox/master/Demo/sample.txt
file_path = "sample.txt"
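
On systems where `wget` is not available (for example, a local IDE on Windows), the same file can be fetched from Python. The sketch below uses only the standard library and the URL from the cell above; `download_if_missing` is a hypothetical helper name.

```python
import urllib.request
from pathlib import Path

SAMPLE_URL = "https://raw.githubusercontent.com/osllmai/inDox/master/Demo/sample.txt"


def download_if_missing(url, dest):
    """Download url to dest unless the file already exists; return the path as a string."""
    dest_path = Path(dest)
    if not dest_path.exists():
        # urlretrieve writes the response body straight to the given file name.
        urllib.request.urlretrieve(url, str(dest_path))
    return str(dest_path)
```

A call such as `download_if_missing(SAMPLE_URL, "sample.txt")` returns the local path, which can then be used as the reference file path.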

INFO: UnstructuredLoadAndSplit initialized successfully
INFO: Starting processing
INFO: Created initial document elements
INFO: Completed chunking process
INFO: Successfully obtained all documents


### Chunking Reference Data with SimpleLoadAndSplit

To effectively utilize our reference data, we need to process and chunk it into manageable parts. This ensures that our question answering system can efficiently handle and retrieve relevant information.

We use the `SimpleLoadAndSplit` utility for this task. This tool loads the file and splits it into smaller chunks, which makes the data easier for our retrieval and QA models to process. The `bert-base-uncased` model is used for splitting the data.

In this step, we pass the file path defined above to `SimpleLoadAndSplit` to chunk the data. The walkthrough targets a maximum chunk size of 200, although the cell below passes only the file path and so relies on the loader's defaults. Stop-word removal can be toggled through the `remove_sword` parameter.

Let's proceed with chunking our reference data.

In [5]:
loader_splitter = SimpleLoadAndSplit(file_path=file_path)
docs = loader_splitter.load_and_chunk()
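
To make the idea of chunking concrete, here is a deliberately simplified sketch that splits text into fixed-size groups of words. It is only an illustration: `chunk_words` is a hypothetical function, and it is not how `SimpleLoadAndSplit` works internally, which according to the description above relies on a tokenizer model.

```python
def chunk_words(text, max_words=200):
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words and rejoin each slice.
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


sample = "Once upon a time there was a girl called Cinderella"
for chunk in chunk_words(sample, max_words=4):
    print(chunk)
```

Smaller chunks give more precise retrieval hits, while larger chunks keep more surrounding context in each hit.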

INFO: Initializing Ollama with model: llama3
INFO: Ollama initialized successfully


### Storing Data in the Vector Store

After initializing our embedding model, the next step is to store our chunked reference data in the vector store. This process ensures that our data is indexed and readily available for retrieval during the question-answering process.

We first create a `Chroma` vector store with a collection name and our embedding function, and then call its `add` method to store the processed data. This lets the system quickly retrieve relevant information based on the embeddings.

Let's proceed with storing the data in the vector store.

In [6]:
db = Chroma(collection_name="sample", embedding_function=embed_mistral)

INFO: Initialized Mistral embeddings


In [7]:
db.add(docs=docs)

INFO: Connection to the vector store database established successfully


<indox.vector_stores.Chroma.ChromaVectorStore at 0x2528c616300>
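
Conceptually, a vector store answers a query by comparing the query embedding with the stored chunk embeddings and returning the closest ones. The toy sketch below illustrates that idea with cosine similarity over hand-written three-dimensional vectors. The vectors, the `top_k` helper, and the brute-force search are all assumptions made for illustration and do not reflect how Chroma indexes data internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """stored is a list of (text, vector) pairs; return the k best (text, score) pairs."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for real model output.
store = [
    ("glass slipper", [0.9, 0.1, 0.0]),
    ("pumpkin coach", [0.1, 0.9, 0.0]),
    ("royal ball", [0.6, 0.6, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], store, k=2))
```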

## Query from RAG System with Indox
With our Retrieval-Augmented Generation (RAG) system built using Indox, we are now ready to test it with a sample question. This test will demonstrate how effectively our system can retrieve and generate accurate answers based on the reference data stored in the vector store.

We'll use a sample query to test our system:
- **Query**: "How did Cinderella reach her happy ending?"

This question will be processed by our Indox system to retrieve relevant information and generate an appropriate response.

Let's test our RAG system with the sample question.

In [9]:
retriever = indox.QuestionAnswer(vector_database=db, llm=llm_model)

In [10]:
query = "How did Cinderella reach her happy ending?"

We'll use the `invoke` method to get a response from the system.

The `invoke` method processes the query using the connected QA model and retrieves relevant information from the vector store. In the run shown below, it returns the generated answer as a string.
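
As a rough illustration of the retrieve-then-generate step, the sketch below shows one common way that retrieved passages and a question can be combined into a single prompt for the LLM. The function `build_prompt` and its template are hypothetical; they are not Indox's internal prompt.

```python
def build_prompt(question, contexts):
    """Combine retrieved context passages and a question into one prompt string."""
    # Number the passages so the model (and the reader) can tell them apart.
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(contexts, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


print(build_prompt(
    "How did Cinderella reach her happy ending?",
    ["The prince searched for the owner of the slipper.", "The slipper fitted Cinderella."],
))
```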


We'll pass this query to the `invoke` method and print the response.


In [11]:
answer = retriever.invoke(query)

INFO: Retrieving context and scores from the vector database
INFO: Generating answer without document relevancy filter
INFO: Answering question
INFO: Generating response
INFO: Response generated successfully
INFO: Query answered successfully


In [12]:
answer

"Based on the given context, here are the options:A) Cinderella's fairy godmother helped her get ready for the ball.B) Cinderella's kindness and hard work earned her a magical dress from the bird on the hazel-tree.C) Cinderella's step-sisters and mother helped her get ready for the wedding.D) Cinderella used her magic to transform herself into a beautiful princess.And the best full answer is:B) Cinderella's kindness and hard work earned her a magical dress from the bird on the hazel-tree.This option stands out as the most correct because it highlights Cinderella's humble nature and her good deeds, which ultimately led to her receiving a magical dress that transformed her into a beautiful princess."

## Retrieve Information Using an Agent
Here we use an agent to retrieve the answer.
Note: to become more familiar with AgenticRAG, please read [this page](https://docs.osllm.ai/agenticRag.html).

In [13]:
agent = indox.AgenticRag(vector_database=db,llm=llm_model)
agent.run("where does messi plays right now?")

INFO: Generating response
INFO: Response generated successfully
INFO: Not relevant doc
INFO: Generating response
INFO: Response generated successfully
INFO: Not relevant doc
INFO: Generating response
INFO: Response generated successfully
INFO: Not relevant doc
INFO: Generating response
INFO: Response generated successfully
INFO: Not relevant doc
INFO: Generating response
INFO: Response generated successfully
INFO: Not relevant doc
INFO: No Relevant document found, Start web search
INFO: No Relevant Context Found, Start Searching On Web...
INFO: Answer Base On Web Search
INFO: Answering question
INFO: Generating response
INFO: Response gene

'Based on the given context, it appears that Lionel Messi is currently playing for Inter Miami, as he has announced his decision to join the Major League Soccer (MLS) team.'
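
The log above suggests a simple control flow: each retrieved document is graded for relevance, and when none is relevant the agent falls back to a web search. The sketch below mirrors that flow with plain callables. It is an assumption-based illustration, not the `AgenticRag` implementation, and every function name in it is hypothetical.

```python
def answer_with_fallback(question, retrieved_docs, is_relevant, web_search, generate):
    """Grade retrieved docs; answer from the relevant ones or fall back to web search."""
    relevant = [doc for doc in retrieved_docs if is_relevant(question, doc)]
    if relevant:
        return generate(question, relevant), "vector_store"
    # No stored chunk was judged relevant, so look elsewhere.
    return generate(question, web_search(question)), "web_search"


# Toy stand-ins for the LLM grader, the search tool, and the generator.
def keyword_grader(question, doc):
    return any(word in doc.lower() for word in question.lower().split() if len(word) > 4)


def fake_web_search(question):
    return [f"(web result for: {question})"]


def fake_generate(question, contexts):
    return f"Answer based on {len(contexts)} context(s)."


docs = ["Cinderella lost her glass slipper at the ball."]
print(answer_with_fallback("where does messi play right now?", docs,
                           keyword_grader, fake_web_search, fake_generate))
```

In a real system the grader and generator would be LLM calls, which is why the log shows one "Generating response" entry per graded document.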