# QdrantDB 
In this notebook, we demonstrate how to use Qdrant as the vector store in an indox retrieval pipeline. Qdrant is a vector similarity search engine: here it stores the embeddings of text chunks and returns the chunks most relevant to a query, which makes it a good fit for research and question-answering systems.

To begin, install the Qdrant Python client in your Python environment with `pip install qdrant-client`.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/osllmai/inDox/blob/master/Demo/qdrant.ipynb)

In [None]:
!pip install indox
!pip install qdrant-client
!pip install semantic_text_splitter
!pip install sentence-transformers

## Setting Up the Python Environment

If you are running this project in your local IDE, please create a Python environment to ensure all dependencies are correctly managed. You can follow the steps below to set up a virtual environment named `indox`:

### Windows

1. **Create the virtual environment:**
```bash
python -m venv indox
```
2. **Activate the virtual environment:**
```bash
indox\Scripts\activate
```

### macOS/Linux

1. **Create the virtual environment:**
```bash
python3 -m venv indox
```
2. **Activate the virtual environment:**
```bash
source indox/bin/activate
```

### Install Dependencies

Once the virtual environment is activated, install the required dependencies by running:

```bash
pip install -r requirements.txt
```
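
As a quick sanity check after installation, the sketch below reports which of the notebook's imports cannot be found in the active environment. The module names are the import names we assume correspond to the packages installed above (`qdrant-client` imports as `qdrant_client`, `sentence-transformers` as `sentence_transformers`, and `dotenv` is provided by the `python-dotenv` package). `missing_modules` is a hypothetical helper, not part of indox.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages this notebook relies on.
required = [
    "indox",
    "qdrant_client",
    "semantic_text_splitter",
    "sentence_transformers",
    "dotenv",
]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If anything is listed as missing, install the corresponding package with `pip` inside the activated environment before continuing.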


### Load the Hugging Face and Qdrant API Keys


In [1]:
import os
from dotenv import load_dotenv

load_dotenv('api.env')

HUGGINGFACE_API_KEY = os.environ['HUGGINGFACE_API_KEY']
QDRANT_API_KEY = os.environ['Qdrant_API_KEY']
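
Indexing `os.environ` directly raises a bare `KeyError` when a variable is missing, which can be hard to diagnose. The sketch below wraps the lookup in a hypothetical `require_env` helper (not part of indox or python-dotenv) that fails with a clearer message. Note that environment variable names are case-sensitive on macOS and Linux, so the name passed in must match the entry in `api.env` exactly, including the mixed-case `Qdrant_API_KEY` used above.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a descriptive error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to api.env"
        )
    return value


# Usage, after load_dotenv('api.env') has run:
# HUGGINGFACE_API_KEY = require_env("HUGGINGFACE_API_KEY")
# QDRANT_API_KEY = require_env("Qdrant_API_KEY")
```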

In [2]:
from indox.llms import HuggingFaceModel
from indox.embeddings import HuggingFaceEmbedding
mistral_qa = HuggingFaceModel(api_key=HUGGINGFACE_API_KEY, model="mistralai/Mistral-7B-Instruct-v0.2")
embed = HuggingFaceEmbedding(api_key=HUGGINGFACE_API_KEY, model="multi-qa-mpnet-base-cos-v1")

[32mINFO[0m: [1mInitializing HuggingFaceModel with model: mistralai/Mistral-7B-Instruct-v0.2[0m
[32mINFO[0m: [1mHuggingFaceModel initialized successfully[0m
[32mINFO[0m: [1mInitialized HuggingFaceEmbedding with model: multi-qa-mpnet-base-cos-v1[0m


In [3]:
from indox import IndoxRetrievalAugmentation
indox = IndoxRetrievalAugmentation()

[32mINFO[0m: [1mIndoxRetrievalAugmentation initialized[0m

            ██  ███    ██  ██████   ██████  ██       ██
            ██  ████   ██  ██   ██ ██    ██   ██  ██
            ██  ██ ██  ██  ██   ██ ██    ██     ██
            ██  ██  ██ ██  ██   ██ ██    ██   ██   ██
            ██  ██  █████  ██████   ██████  ██       ██
            


The cells above initialize a language model and an embedding model through the indox library's Hugging Face integrations. `HuggingFaceModel` creates an instance of Mistral-7B-Instruct for tasks such as question answering, and `HuggingFaceEmbedding` loads `multi-qa-mpnet-base-cos-v1` to embed the text chunks.

### Load Sample Data


In [None]:
!wget https://raw.githubusercontent.com/osllmai/inDox/master/Demo/sample.txt

In [4]:
file_path = "sample.txt"
with open(file_path, "r") as file:
    text = file.read()

Use the `RecursiveCharacterTextSplitter` class from the indox library to divide the text into smaller, manageable chunks.


In [5]:
from indox.splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(400, 20)
content_chunks = splitter.split_text(text)
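
To give an intuition for what a recursive character splitter does, here is a minimal pure-Python sketch. It is not the indox implementation: `recursive_split` and its separator list are illustrative assumptions, and the sketch leaves out chunk overlap (which we assume is what the second argument, `20`, controls above). The idea is to split on the coarsest separator first, pack the pieces into chunks up to the size limit, and fall back to finer separators only for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []

    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for part in text.split(sep):
            candidate = part if not current else current + sep + part
            if len(candidate) <= chunk_size:
                current = candidate  # the part still fits in the open chunk
                continue
            if current:
                chunks.append(current)  # close the open chunk
            if len(part) > chunk_size:
                # The part alone is too long: retry with finer separators.
                chunks.extend(recursive_split(part, chunk_size, separators[i + 1:]))
                current = ""
            else:
                current = part
        if current:
            chunks.append(current)
        return chunks

    # No separator applies: fall back to fixed-width slices.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


sample = "alpha beta gamma\n\ndelta epsilon zeta eta theta"
for chunk in recursive_split(sample, 12):
    print(repr(chunk))
```

Splitting on paragraph and line boundaries before falling back to spaces tends to keep related sentences in the same chunk, which usually helps retrieval quality.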

### Set up vector store


In [6]:
from indox.vector_stores import Qdrant

url = "url"  # placeholder: replace with the URL of your Qdrant instance
qdrant = Qdrant(
    collection_name="IndoxTest",
    embedding_function=embed,
    url=url,
    api_key=QDRANT_API_KEY,
)

[32mINFO[0m: [1mEmbedding documents[0m
[32mINFO[0m: [1mStarting to fetch embeddings for texts using model: SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)[0m


Batches: 100%|██████████| 1/1 [00:00<00:00,  4.81it/s]
2024-09-09 15:20:00,259 - httpx - INFO - HTTP Request: PUT https://ffbf001a-09a2-4afc-baa9-6e64584f7d01.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/IndoxTest2 "HTTP/1.1 409 Conflict"


Collection IndoxTest2 already exists.


### Storing Data in the Vector Store

In [7]:
qdrant.add(texts=content_chunks)

[32mINFO[0m: [1mEmbedding documents[0m
[32mINFO[0m: [1mStarting to fetch embeddings for texts using model: SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)[0m


Batches: 100%|██████████| 2/2 [00:00<00:00, 15.16it/s]
2024-09-09 15:20:01,469 - httpx - INFO - HTTP Request: PUT https://ffbf001a-09a2-4afc-baa9-6e64584f7d01.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/IndoxTest2/points?wait=true "HTTP/1.1 200 OK"


['1ea39065-a426-4264-9543-4bf3ca5fe851',
 '1beb709f-0995-47f7-9292-51e81351ab99',
 'fc171713-4323-4e57-9397-99faeb2ed81d',
 '80d33960-3bee-40e7-a6d9-9a4a3bb6a9c2',
 'bb33d4f9-89de-4caa-9c16-95e5400c4bd3',
 '70799ae2-1fbb-4c77-9ef7-938b43a72d3f',
 '62697cec-7847-4ab6-a947-d2d08279e5df',
 'b166a95c-3133-4cef-b349-b6b6033ae0b4',
 'd9fb7b1e-a7b8-41f7-9982-144fab7b558d',
 '77a572c5-5e8a-4492-be63-1973769a724a',
 'c1a9715c-07ab-466e-8cac-0c5e8fb1cabf',
 '26b1f4ae-72b2-4e47-8d78-96e4176a523e',
 '98f9b089-4a88-49c2-a593-8dadb3c684e6',
 'd50874da-1645-49f3-8359-812249b54857',
 'cfe17ac5-8f47-4873-b31c-441b1280aef8',
 '7f018883-577c-410e-99bd-8fcc25a3d1ef',
 'd6c1d9d0-2059-4d06-802f-a115f2edb8b1',
 'f72ff63d-1ca3-434c-a085-d09fc0e36188',
 '161c91b6-32c2-4caf-a769-68c3f4f8f498',
 'caa245ce-799b-42ea-ba38-fa4b90fb686f',
 '5aa14d33-0bdf-4093-8ad1-0d32625acfdb',
 'f6d2ac79-0b30-49e3-b564-c7bcdf3e3363',
 '21db3a5a-3f46-4243-b7ea-f0638afaec0e',
 '1cd009dc-27d2-4d96-851c-c412b2bc09f2',
 '3d7f6877-c0f1-
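
The `add` call returns one identifier per stored chunk, and the values printed above have the shape of version 4 (random) UUIDs. How indox generates them internally is not shown here, so the following is only a sketch under that assumption: `assign_ids` is a hypothetical helper that pairs each chunk with a fresh UUID, which is a common way to give points in a vector store stable, collision-resistant keys.

```python
import uuid


def assign_ids(texts):
    """Pair every text chunk with a freshly generated version 4 UUID string."""
    return [(str(uuid.uuid4()), text) for text in texts]


example_chunks = ["first chunk", "second chunk", "third chunk"]
for point_id, chunk in assign_ids(example_chunks):
    print(point_id, "->", chunk)
```

Keeping the returned IDs alongside the source chunks makes it possible to update or delete individual entries later without re-indexing everything.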

In [8]:
retriever = indox.QuestionAnswer(vector_database=qdrant, llm=mistral_qa, top_k=5)
query = "How does Cinderella reach her happy ending?"
answer = retriever.invoke(query=query)


[32mINFO[0m: [1mRetrieving context and scores from the vector database[0m
[32mINFO[0m: [1mEmbedding documents[0m
[32mINFO[0m: [1mStarting to fetch embeddings for texts using model: SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)[0m


Batches: 100%|██████████| 1/1 [00:00<00:00, 151.49it/s]
2024-09-09 15:20:01,744 - httpx - INFO - HTTP Request: POST https://ffbf001a-09a2-4afc-baa9-6e64584f7d01.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/IndoxTest2/points/search "HTTP/1.1 200 OK"


[32mINFO[0m: [1mGenerating answer without document relevancy filter[0m
[32mINFO[0m: [1mAnswering question[0m
[32mINFO[0m: [1mSending request to Hugging Face API[0m
[32mINFO[0m: [1mReceived successful response from Hugging Face API[0m
[32mINFO[0m: [1mQuery answered successfully[0m


In [9]:
answer

'Cinderella receives a magical gift from a bird, allowing her to attend a royal ball where she catches the eye of the prince. When the prince realizes that his new bride is actually Cinderella, he takes her away on his horse to live happily ever after, leaving her cruel step-mother and step-sisters behind in regret and anger.'
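
Conceptually, the question-answering step embeds the query, retrieves the `top_k` most similar chunks from the vector store, and passes them to the language model as context. The sketch below imitates that flow with toy three-dimensional vectors and plain Python. It is not how indox or Qdrant implement it: `cosine_similarity`, `top_k`, `build_prompt`, the example sentences, and the vectors are all made up for illustration, and a real setup would use the 768-dimensional embeddings reported in the logs above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, contexts):
    """Assemble a simple context-plus-question prompt for a language model."""
    joined = "\n\n".join(contexts)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


# Toy "vector store": (chunk text, embedding) pairs with invented values.
stored = [
    ("The prince searched for the owner of the slipper.", [0.9, 0.1, 0.0]),
    ("The step-sisters mocked Cinderella.", [0.1, 0.9, 0.0]),
    ("A bird threw down a golden dress.", [0.7, 0.3, 0.1]),
]
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded question

hits = top_k(query_vec, stored, k=2)
prompt = build_prompt(
    "How does Cinderella reach her happy ending?",
    [text for _, text in hits],
)
print(prompt)
```

Because the embedding model in this notebook ends with a `Normalize()` layer, its vectors have unit length, so cosine similarity reduces to a plain dot product in that setting.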