<a href="https://colab.research.google.com/github/edquestofficial/RAG-With-LLama-Index/blob/main/QnA_from_multiple_files_Advanced.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**<h1> QnA From Multiple Files Advanced </h1>**

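This project lets users upload files (plain text, PDF, Word documents, and other formats that LlamaIndex's `SimpleDirectoryReader` can parse) and ask questions about their contents. Answers are generated by a Gemini model from the passages retrieved for each question.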

**<h2> Features </h2>**

*   Multi-Format Support: Extract information from text files, PDFs, and other document types.
*   Flexible Querying: Retrieve answers and insights based on user queries.
*   Seamless Integration: Utilize Chroma and LlamaIndex for efficient document indexing and querying.

**<h2>Step 1: Install the necessary dependencies</h2>**

In [None]:
!pip install llama-index-core
!pip install llama-index-llms-gemini
!pip install llama-index-embeddings-gemini
!pip install llama-index-vector-stores-chroma

In [None]:
!pip install docx2txt
!pip install llama-index-readers-file
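
After installation, it can help to confirm that the packages are importable before running the later cells. The following is an optional sketch, not part of the original notebook: the helper name `missing_modules` is hypothetical, and the module names are the ones imported in Step 2.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Module names as imported in Step 2 of this notebook.
required = [
    "llama_index.core",
    "llama_index.llms.gemini",
    "llama_index.embeddings.gemini",
    "llama_index.vector_stores.chroma",
    "chromadb",
]
print("Missing modules:", missing_modules(required) or "none")
```

If the list is not empty, rerunning the install cells (and restarting the runtime if needed) is usually the first thing to try.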

**<h2>Step 2: Integrating LlamaIndex, Gemini, and Chroma</h2>**

In [None]:
import os

from llama_index.core import Settings
from llama_index.core import SimpleDirectoryReader
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.llms.gemini import Gemini
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

**<h2>Step 3: Configuring Gemini Models with API Key</h2>**

In [None]:
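from getpass import getpass

# Prompt for the Gemini API key without echoing it into the notebook output
api_key = getpass("Enter API key: ")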

gemini_embedding_model = GeminiEmbedding(api_key=api_key, model_name="models/embedding-001")

llm = Gemini(api_key=api_key, model_name="models/gemini-pro")
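
Typing the key on every run is tedious, so one option is to read it from an environment variable and prompt only when it is missing. This is a minimal sketch under assumptions: the helper `resolve_api_key` is hypothetical, and the variable name `GOOGLE_API_KEY` is a choice made here for illustration rather than something the notebook defines.

```python
import os
from getpass import getpass


def resolve_api_key(env_var="GOOGLE_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, or prompt for it if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fall back to an interactive prompt that hides the typed value.
        key = prompt_fn("Enter API key: ").strip()
    if not key:
        raise ValueError("An API key is required to call the Gemini models.")
    return key
```

Calling `api_key = resolve_api_key()` could then stand in for the prompt in the Step 3 cell; the rest of the cell stays the same.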

**<h2>Step 4: File Upload and Directory Setup</h2>**

In [None]:
from google.colab import files
uploaded = files.upload()

# Create a directory if not existing
data_dir = 'data'
os.makedirs(data_dir, exist_ok=True)

# Move uploaded files to the data directory
for filename in uploaded.keys():
    os.rename(filename, os.path.join(data_dir, filename))
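
The cell above depends on `google.colab`, which is only available inside Colab. When running elsewhere, files already on disk can be copied into the data directory instead. The sketch below shows one way to do that with the standard library; `stage_files` is a hypothetical helper, and the temporary sample file exists only to make the example runnable.

```python
import shutil
import tempfile
from pathlib import Path


def stage_files(paths, data_dir="data"):
    """Copy each file in `paths` into `data_dir` and return the new paths."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    staged = []
    for path in map(Path, paths):
        if not path.is_file():
            raise FileNotFoundError(f"Not a file: {path}")
        destination = target / path.name
        shutil.copy2(path, destination)  # copy, so the original stays in place
        staged.append(destination)
    return staged


# Demo with a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello")
    staged = stage_files([sample], data_dir=Path(tmp) / "data")
    print([p.name for p in staged])
```

Copying rather than moving keeps the source files untouched, which is usually the safer default outside a disposable Colab session.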

**<h2>Step 5: Index Creation and Storage Setup with Chroma and LlamaIndex</h2>**

In [None]:
documents = SimpleDirectoryReader(data_dir).load_data()

# Create a client and a new collection
client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = client.get_or_create_collection("quickstart")

# Create a vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Set Global settings
Settings.llm = llm
Settings.embed_model = gemini_embedding_model

# Embed the documents and build the index; the vectors are written to
# ./chroma_db through the Chroma collection created above
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
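
Conceptually, a vector index stores one embedding per text chunk and answers a query by finding the stored vectors closest to the query's embedding. The sketch below illustrates that idea with hand-written three-dimensional vectors and cosine similarity. It is a toy model, not how Chroma or LlamaIndex implement search; the function names, chunk ids, and vectors are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the k (chunk_id, score) pairs most similar to the query."""
    scored = [
        (chunk_id, cosine_similarity(query_vector, vector))
        for chunk_id, vector in store.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for real model output.
store = {
    "invoice.pdf#0": [0.9, 0.1, 0.0],
    "notes.txt#0": [0.1, 0.8, 0.1],
    "report.docx#0": [0.7, 0.3, 0.1],
}
print(top_k([1.0, 0.0, 0.0], store))
```

In the real pipeline the embedding model produces the vectors and the vector store handles the nearest-neighbour search, so none of this code is needed in the notebook itself.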

**<h2>Step 6: Querying Index and Retrieving Results</h2>**

In [None]:
load_client = chromadb.PersistentClient(path="./chroma_db")

# Fetch the collection
chroma_collection = load_client.get_collection("quickstart")

# Fetch the vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Rebuild the index from the persisted vector store
# (query embedding uses Settings.embed_model, set in Step 5)
index = VectorStoreIndex.from_vector_store(
    vector_store
)

# Build a query engine on the reloaded index and answer a user question
test_query_engine = index.as_query_engine()
response = test_query_engine.query(input("What is your question? "))
print(response)
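
With several files indexed, it is often useful to see which files an answer was drawn from. The sketch below assumes the response object exposes a `source_nodes` list whose items carry a `score` and a `node` with a `metadata` dictionary containing `file_name`, as LlamaIndex responses built from `SimpleDirectoryReader` documents typically do. The helper `summarize_sources` is hypothetical, and the stand-in response only makes the example runnable without an API key.

```python
from types import SimpleNamespace


def summarize_sources(response):
    """Return (file_name, score) pairs for the chunks behind a response."""
    summary = []
    for item in getattr(response, "source_nodes", None) or []:
        metadata = getattr(item.node, "metadata", None) or {}
        summary.append((metadata.get("file_name", "unknown"), item.score))
    return summary


# Stand-in object shaped like a query response (assumption, see above).
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            node=SimpleNamespace(metadata={"file_name": "report.pdf"}),
            score=0.82,
        ),
    ]
)
for file_name, score in summarize_sources(fake_response):
    print(file_name, score)
```

In the notebook, `summarize_sources(response)` would be called on the object returned by `test_query_engine.query(...)`.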