# RAG using Meta AI Llama-3.2

In [1]:
!pip install streamlit ollama llama-index-vector-stores-qdrant llama-index-llms-ollama
!pip install --upgrade llama-index llama-index-embeddings-huggingface



In [2]:
import nest_asyncio
import qdrant_client
from IPython.display import Markdown, display

from llama_index.core import (
    PromptTemplate,
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.qdrant import QdrantVectorStore

In [3]:
# allows nested access to the event loop
nest_asyncio.apply()

# Add your documents to this directory (drag and drop works)
input_dir_path = './docs'

In [4]:
collection_name = "chat_with_docs"

# Assumes a Qdrant server is already running locally on the default port
client = qdrant_client.QdrantClient(
    host="localhost",
    port=6333
)

def create_index(documents):
    """Embed the documents and store their vectors in the Qdrant collection."""
    vector_store = QdrantVectorStore(client=client, collection_name=collection_name)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
    )
    return index
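
To make the role of the vector index more concrete, here is a minimal sketch of nearest-neighbour lookup over stored embeddings. It is a toy stand-in and not how Qdrant is implemented: the helper names `cosine_similarity` and `top_k`, the three-dimensional vectors, and the choice of cosine similarity as the metric are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k):
    """Return the k (chunk_id, score) pairs most similar to query_vec."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings"; real embedding vectors are much longer.
toy_store = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}

nearest = top_k([1.0, 0.1, 0.0], toy_store, k=2)
print(nearest)
```

In the notebook itself this work is delegated to Qdrant; the sketch only illustrates the idea of ranking stored chunks by similarity to a query vector.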

In [5]:
# Set up the LLM, the embedding model, and the reranker
llm = Ollama(model="llama3.2:1b", request_timeout=120.0)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5", trust_remote_code=True)
rerank = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=3
)
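
The reranker configured above keeps only the 3 best candidates (`top_n=3`) out of a larger set fetched by the retriever. The two-stage idea can be sketched as follows. This is a simplified illustration under stated assumptions: `rerank` and `word_overlap` are hypothetical helpers, and the word-overlap score merely stands in for the cross-encoder model, which scores each query and passage pair with a neural network.

```python
def rerank(query, candidates, score_fn, top_n):
    """Re-score every candidate against the query and keep the top_n best."""
    rescored = [(text, score_fn(query, text)) for text in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_n]

def word_overlap(query, text):
    """Hypothetical stand-in scorer: fraction of query words found in text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)

# Pretend these came back from the first-stage vector search.
candidates = [
    "DSPy is a framework for programming language models",
    "Qdrant is a vector database",
    "Signatures describe what a DSPy module should do",
    "Streamlit builds web apps in Python",
]

best = rerank("what is DSPy", candidates, word_overlap, top_n=2)
for text, score in best:
    print(f"{score:.2f}  {text}")
```

The design choice being illustrated is that a cheap first stage casts a wide net, while a more expensive scorer is applied only to the short list of candidates.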

In [6]:
# load data
loader = SimpleDirectoryReader(
    input_dir=input_dir_path,
    required_exts=[".pdf"],
    recursive=True,
)
docs = loader.load_data()

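# Create an index over the loaded data
Settings.embed_model = embed_model
try:
    index = create_index(docs)
    print('Using Qdrant collection')
except Exception as exc:
    # Qdrant is unavailable, so fall back to the default in-memory vector store
    print(f'Qdrant indexing failed ({exc}); falling back to an in-memory index')
    index = VectorStoreIndex.from_documents(docs, show_progress=True)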

# Create the query engine; the cross-encoder reranker defined above is applied to the fetched nodes
Settings.llm = llm
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[rerank]
)

# ====== Customise prompt template ======
qa_prompt_tmpl_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information above, I want you to think step by step to answer the query in a crisp manner. In case you don't know the answer, say 'I don't know!'.\n"
    "Query: {query_str}\n"
    "Answer: "
)
qa_prompt_tmpl = PromptTemplate(qa_prompt_tmpl_str)

query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
)

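
The custom QA template is filled with the retrieved context and the user's query before it is sent to the LLM. A rough sketch of that substitution, using plain `str.format` in place of `PromptTemplate`, might look like this. The helper `build_prompt`, the shortened template, and the choice to join chunks with blank lines are assumptions for illustration, not a description of LlamaIndex internals.

```python
# Shortened version of the notebook's template, with the same placeholders.
template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Query: {query_str}\n"
    "Answer: "
)

def build_prompt(chunks, query):
    """Join the retrieved chunks and substitute both placeholders."""
    context = "\n\n".join(chunks)  # assumed separator between chunks
    return template.format(context_str=context, query_str=query)

prompt = build_prompt(["Chunk one.", "Chunk two."], "What exactly is DSPy?")
print(prompt)
```

Printing the filled prompt like this can help when debugging why the model answers the way it does, since it shows the text the model would actually receive in this simplified setup.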

In [7]:
# Generate the response
response = query_engine.query("What exactly is DSPy?")

display(Markdown(str(response)))

I'd be happy to help you understand what DSPy is. Based on the context information provided, it seems that DSPy (Declarative Streaming Parallel Programming) is a framework for programming language models (LMs), specifically designed to abstract and automate the task of prompting them for specific tasks or behaviors.

In simpler terms, DSPy allows users to describe the desired behavior of an LM in natural language, using a concise notation called signatures. These signatures define the input fields and output fields required by the LM to perform the specified task. This way, users can create programs that don't require manual intervention or customization of the LM.

The key features of DSPy mentioned in the context information are:

1. Natural language signatures: DSPy allows users to describe tasks or behaviors using natural language.
2. Parameterized declarative modules: DSPy enables the creation of parameterized modules that can be used to abstract and automate specific tasks or prompting techniques.
3. Teleprompters: DSPy includes a feature called teleprompters, which allow for general optimization strategies (telepromoters) to optimize arbitrary pipelines of modules.

Overall, DSPy provides a platform for building, automating, and fine-tuning LM-based systems, making it an attractive solution for various applications in natural language processing.

## References
- https://github.com/patchy631/ai-engineering-hub/tree/main/document-chat-rag