# Use LlamaIndexQueryEngine to query Markdown files

This notebook demonstrates how to use the `LlamaIndexQueryEngine` for retrieval-augmented question answering over documents. It shows how to set up the engine with Docling-parsed Markdown files and run natural language queries against the indexed data.

The `LlamaIndexQueryEngine` provides an efficient way to query vector databases through any LlamaIndex [vector store](https://docs.llamaindex.ai/en/stable/module_guides/storing/vector_stores/).
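
To make the retrieval step concrete, here is a toy sketch of what retrieval-augmented querying does in principle: embed the question, rank stored chunks by similarity, and pass the best match to the LLM as context. It is not how LlamaIndex is implemented. The bag-of-words `embed` function stands in for a real embedding model, and the two chunks are made-up strings used only for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


# Made-up chunks, for illustration only
chunks = [
    "research and development expenses increased during the fiscal year",
    "the company leases office space in several countries",
]
question = "how much was spent on research and development"
context = retrieve(question, chunks)

# The retrieved context is what would be handed to the LLM alongside the question
prompt = f"Context: {context[0]}\nQuestion: {question}"
print(prompt)
```

A real vector store replaces the in-memory list and the word counts with dense embeddings and an approximate nearest-neighbour index, but the flow is the same.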

In [None]:
%pip install llama-index-vector-stores-chroma==0.4.1
%pip install llama-index==0.12.16
%pip install llama-index-vector-stores-pinecone==0.4.4

In [None]:
import os

import autogen

config_list = autogen.config_list_from_json(env_or_file="../OAI_CONFIG_LIST")

assert len(config_list) > 0
print("models to use: ", [config["model"] for config in config_list])

# Put the OpenAI API key into the environment
os.environ["OPENAI_API_KEY"] = config_list[0]["api_key"]

### In the first example, we build a LlamaIndexQueryEngine instance on top of ChromaDB.
Refer to this [link](https://docs.trychroma.com/production/containers/docker) for running ChromaDB in a Docker container.


In [None]:
from chromadb import HttpClient
from llama_index.vector_stores.chroma import ChromaVectorStore

# We need to set up LlamaIndex's ChromaVectorStore
# Refer to https://docs.llamaindex.ai/en/stable/examples/vector_stores/chroma_metadata_filter/ for more information
chroma_client = HttpClient(
    host="172.17.0.3",  # Address of the Chroma container; update to match your setup
    port=8000,
)
# use get_collection to get an existing collection
chroma_collection = chroma_client.get_collection("default_collection")
# chroma_collection = chroma_client.create_collection("default_collection")
chroma_vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
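
`get_collection` only works when the collection already exists, which is why the `create_collection` line above is kept as a commented-out alternative. A minimal sketch that covers both cases is shown below. The helper name `open_collection` is hypothetical, and the sketch assumes the client exposes `heartbeat()` and `get_or_create_collection()`, as chromadb clients do at the time of writing; check this against the version you have installed.

```python
def open_collection(client, name: str):
    """Return the named collection, creating it first if it does not exist.

    Assumes `client` is a chromadb client (for example the HttpClient above).
    """
    # heartbeat() raises if the server cannot be reached, so connection
    # problems surface here rather than later during indexing
    client.heartbeat()
    return client.get_or_create_collection(name)
```

With this helper, the collection lookup becomes `chroma_collection = open_collection(chroma_client, "default_collection")`.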

Then we can use the `chroma_vector_store` to create our `LlamaIndexQueryEngine` instance.

In [None]:
from llama_index.llms.openai import OpenAI

from autogen.agentchat.contrib.rag import LlamaIndexQueryEngine

chroma_query_engine = LlamaIndexQueryEngine(
    vector_store=chroma_vector_store,
    llm=OpenAI(model="gpt-4o", temperature=0.0),  # Default model for querying, change if needed
)

Initialize the database with input docs and query it with the engine.

In [None]:
input_dir = (
    "/workspaces/ag2/test/agents/experimental/document_agent/pdf_parsed/"  # Update to match your input directory
)
input_docs = [input_dir + "nvidia_10k_2024.md"]  # Update to match your input documents
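
Building paths by string concatenation depends on `input_dir` ending with a slash, and a mistyped file name only surfaces later during indexing. As one possible alternative, the sketch below joins paths with `pathlib` and fails early when a file is missing. The helper name `resolve_docs` is hypothetical and not part of the notebook's API.

```python
from pathlib import Path


def resolve_docs(input_dir: str, filenames: list[str]) -> list[str]:
    """Join filenames onto input_dir and fail early if any file is missing."""
    paths = [Path(input_dir) / name for name in filenames]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing input documents: {missing}")
    # Return plain strings, matching the list of string paths used in this notebook
    return [str(p) for p in paths]
```

For example, `input_docs = resolve_docs(input_dir, ["nvidia_10k_2024.md"])` yields the same list as the cell above when the file exists.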

In [None]:
chroma_query_engine.init_db(new_doc_paths_or_urls=input_docs)

# If you don't want to initialize the database with new docs, you can call connect_db() instead
# chroma_query_engine.connect_db()

In [None]:
question = "How much money did Nvidia spend on research and development?"
answer = chroma_query_engine.query(question)
print(answer)

Great, we got the data we needed. Now, let's add another document.

In [None]:
new_docs = [input_dir + "Toast_financial_report.md"]

In [None]:
chroma_query_engine.add_docs(new_doc_paths_or_urls=new_docs)

Now query the same database again, this time about a different company.

In [None]:
question = "How much money did Toast earn in 2024?"
answer = chroma_query_engine.query(question)
print(answer)

### In the second example, we build a similar LlamaIndexQueryEngine instance, but on top of Pinecone.
Refer to https://docs.llamaindex.ai/en/stable/examples/vector_stores/PineconeIndexDemo/ for more details on how to set up Pinecone and `PineconeVectorStore`.

In [None]:
import os

from pinecone import Pinecone, ServerlessSpec

# Replace the placeholder with your own Pinecone API key; never commit a real key
os.environ["PINECONE_API_KEY"] = "<your-pinecone-api-key>"
api_key = os.environ["PINECONE_API_KEY"]

pc = Pinecone(api_key=api_key)
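
To keep the key out of the notebook entirely, one option is to read it from the environment and prompt for it only when it is not set. The sketch below uses only the standard library. The helper name `get_secret` is hypothetical.

```python
import os
from getpass import getpass


def get_secret(name: str) -> str:
    """Return the secret from the environment, prompting once if it is not set."""
    value = os.environ.get(name)
    if not value:
        # getpass hides the input, so the key is not echoed into the cell output
        value = getpass(f"Enter {name}: ")
        os.environ[name] = value
    return value
```

The client could then be created with `pc = Pinecone(api_key=get_secret("PINECONE_API_KEY"))`.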

In [None]:
# The dimension (1536) matches text-embedding-ada-002, the OpenAI embedding model that LlamaIndex uses by default

pc.create_index(
    name="quickstart",
    dimension=1536,
    metric="euclidean",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
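
Re-running the cell above after the index exists will fail, because the index name is already taken. A small guard can make the step safe to repeat. This is a sketch under assumptions: the helper name `ensure_index` is hypothetical, and it assumes the client exposes `list_indexes().names()`, as recent versions of the Pinecone Python client do; verify this against your installed version.

```python
def ensure_index(pc, name: str, dimension: int, metric: str, spec) -> bool:
    """Create the index only if it is missing; return True when it was created.

    Assumes `pc` is a Pinecone client whose list_indexes() result has a names() method.
    """
    if name in pc.list_indexes().names():
        return False
    pc.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    return True
```

Called as `ensure_index(pc, "quickstart", 1536, "euclidean", ServerlessSpec(cloud="aws", region="us-east-1"))`, it creates the index on the first run and does nothing on later runs.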

In [None]:
from llama_index.vector_stores.pinecone import PineconeVectorStore

pinecone_index = pc.Index("quickstart")
pinecone_vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

In [None]:
pinecone_query_engine = LlamaIndexQueryEngine(
    vector_store=pinecone_vector_store,
    llm=OpenAI(model="gpt-4o", temperature=0.0),  # Default model for querying, change if needed
)

In [None]:
pinecone_query_engine.init_db(new_doc_paths_or_urls=input_docs)

In [None]:
question = "How much money did Nvidia spend on research and development?"
answer = pinecone_query_engine.query(question)
print(answer)

In [None]:
new_docs = [input_dir + "Toast_financial_report.md"]

In [None]:
pinecone_query_engine.add_docs(new_doc_paths_or_urls=new_docs)

In [None]:
question = "How much money did Toast earn in 2024?"
answer = pinecone_query_engine.query(question)
print(answer)