In [43]:
model = "llama3.2"

#### Task 1: Simple Chain with Retrieval

**Objective:**

Implement a simple RAG chain with ChatOllama, HuggingFaceEmbeddings and Chroma.

Process:

1. Retrieve documents from the Chroma DB based on a query
2. Invoke the chain with the retrieved documents as input
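
The two steps above can be sketched without any external services. This is only an illustration under loud assumptions: the bag-of-words `embed` function, the `CHUNKS` list, and the helper names are hypothetical stand-ins for the sentence-transformer embeddings and the Chroma collection used in this notebook, and no LLM is called.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for the chunks stored in Chroma.
CHUNKS = [
    "supervised learning uses labeled training data",
    "online learning updates the model incrementally",
    "chroma stores embeddings for similarity search",
]


def embed(text):
    """Toy bag-of-words vector; the notebook uses a sentence-transformer instead."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Step 1: rank chunks by similarity to the query and keep the top k."""
    query_vec = embed(query)
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(docs):
    """Step 2 (first half): place the retrieved text into the summarization prompt."""
    return "Summarize the main themes in these retrieved docs: " + "\n\n".join(docs)


top_docs = retrieve("what is supervised learning", k=1)
summary_prompt = build_prompt(top_docs)
print(summary_prompt)
```

In the real chain the resulting prompt is sent to the model and the reply is parsed to a string; here the sketch stops at the prompt so it stays runnable offline.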

**Task Description:**

- load the LLM via Ollama
- load the Hugging Face embedding model (`model_name="sentence-transformers/all-mpnet-base-v2"`)
- create a Chroma DB client
- create a prompt template for summarization
- create a simple chain with the following steps: retrieved documents, prompt, model, output parser
- create a query and perform a similarity search with it
- invoke the chain, passing it the retrieved documents

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [44]:
from langchain_ollama import ChatOllama

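# ADD HERE YOUR CODE
# Pass the model name as a literal: `model` is rebound to the ChatOllama
# instance here, so reusing the variable would fail if this cell is re-run.
model = ChatOllama(model="llama3.2", base_url="http://localhost:5000")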

In [45]:
from langchain_huggingface import HuggingFaceEmbeddings

# ADD HERE YOUR CODE
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")



In [46]:
from langchain_chroma import Chroma
import chromadb
from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    ssl=False,
    headers=None,
    settings=Settings(allow_reset=True, anonymized_telemetry=False),
    tenant=DEFAULT_TENANT,
    database=DEFAULT_DATABASE,
)

# Create a collection
# ADD HERE YOUR CODE
collection = client.get_or_create_collection("test_collection")

# Create chromadb
# ADD HERE YOUR CODE
vector_db_from_client = Chroma(
    client=client,
    collection_name="test_collection",
    embedding_function=embedding_model
)

In [47]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Convert loaded documents into strings by concatenating their content
# and ignoring metadata
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = {"docs": format_docs} | prompt | model | StrOutputParser()
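
To see what the `format_docs` step hands to the prompt, here is a small standalone sketch. `FakeDocument` is a hypothetical stand-in that only mimics the `page_content` and `metadata` attributes of the retrieved documents; it is not the LangChain class.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a retrieved document (content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    # Same logic as the helper above: join the contents, drop the metadata.
    return "\n\n".join(doc.page_content for doc in docs)


sample_docs = [
    FakeDocument("First chunk.", {"page": 1}),
    FakeDocument("Second chunk.", {"page": 2}),
]
formatted = format_docs(sample_docs)
print(formatted)
```

The metadata (page number, source) never reaches the model in this setup, so the summary cannot cite where a passage came from unless the helper is extended to include it.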

In [48]:
search_query = "Types of Machine Learning Systems"

# ADD HERE YOUR CODE
# Perform vector search
docs = vector_db_from_client.similarity_search(search_query)

print(docs)

[Document(metadata={'page': 33, 'source': './AI_Book.pdf'}, page_content='Types of Machine Learning Systems\nThere are so many different types of Machine Learning systems that it is useful to\nclassify them in broad categories based on:\nWhether or not they are trained with human supervision (supervised, unsuper\nvised, semisupervised, and Reinforcement Learning)\nWhether or not they can learn incrementally on the fly (online versus batch\nlearning)\nWhether they work by simply comparing new data points to known data points,\nor instead detect patterns in the training data and build a predictive model, much\nlike scientists do (instance-based versus model-based learning)\nThese criteria are not exclusive; you can combine them in any way you like. For\nexample, a state-of-the-art spam filter may learn on the fly using a deep neural net\nwork model trained using examples of spam and ham; this makes it an online, model-\nbased, supervised learning system.\nLets look at each of these crite

In [49]:
chain.invoke(docs)

'The main themes in these retrieved documents are:\n\n1. **Classification of Machine Learning Systems**: The documents discuss ways to categorize machine learning systems based on four criteria:\n * Supervised, unsupervised, semisupervised, and reinforcement learning.\n * Online versus batch learning.\n * Instance-based versus model-based learning.\n\n2. **Types of Supervised Learning**: The documents explain that supervised learning involves training the system with labeled data, where the desired solutions are included in the training data.\n\n3. **Reinforcement Learning**: Reinforcement learning is a type of machine learning where the system learns by interacting with an environment and receiving feedback in the form of rewards or penalties.\n\n4. **Batch and Online Learning**: The documents discuss the difference between batch learning, where the system is trained using all available data at once, and online learning, where the system can learn incrementally from a stream of incomi

#### Task 2: Q&A with RAG

**Objective:**

Implement a Q/A retrieval chain with ChatOllama, HuggingFaceEmbeddings and Chroma.

**Task Description:**

- create a RAG Q/A prompt template
- create a retriever from the vector DB client (instead of manually passing in docs, we automatically retrieve them from our vector store based on the user question)
- create a simple chain with the following steps: retriever, formatting retrieved docs, user question, prompt, model, output parser
- create a question for the Q/A retrieval chain
- invoke the chain with the question
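
The first chain step sends the same question down two branches: one retrieves and formats context, the other passes the question through unchanged. A plain-Python sketch of that data flow follows; `fake_retriever`, `parallel_step`, and `render_prompt` are hypothetical names for illustration, not LangChain APIs.

```python
def fake_retriever(question):
    # Hypothetical retriever: a real one would query the vector store.
    return [f"chunk about: {question}"]


def format_docs(docs):
    # The stand-in docs are plain strings, so they are joined directly.
    return "\n\n".join(docs)


def parallel_step(question):
    """Feed the same input to both branches and collect the results in a dict."""
    return {
        "context": format_docs(fake_retriever(question)),  # retriever branch
        "question": question,                              # passthrough branch
    }


def render_prompt(inputs):
    """Fill the two template variables from the dict produced above."""
    template = (
        "<context>\n{context}\n</context>\n\n"
        "Answer the following question:\n\n{question}"
    )
    return template.format(**inputs)


inputs = parallel_step("What is supervised learning?")
rendered = render_prompt(inputs)
print(rendered)
```

Because the user question drives both retrieval and the final prompt, the chain can be invoked with a bare string rather than a pre-fetched list of documents.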

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [51]:
from langchain_core.runnables import RunnablePassthrough

prompt_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

# ADD HERE YOUR CODE
rag_prompt = ChatPromptTemplate.from_template(prompt_template)

# ADD HERE YOUR CODE
retriever = vector_db_from_client.as_retriever()

# ADD HERE YOUR CODE
qa_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

In [52]:
qa_rag_chain

{
  context: VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x7fc5fd213410>)
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="\nYou are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n\n<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"))])
| ChatOllama(model='llama3.2', base_url='http://localhost:5000', _client=<ollama._client.Client object at 0x7fc5fc8f7010>, _async_client=<ollama._client.AsyncClient object at 0x7fc5fc8af710>)
| StrOutputParser()

In [53]:
question = "What is supervised learning?"

# ADD HERE YOUR CODE
qa_rag_chain.invoke(question)

'Supervised learning is a type of machine learning where the system learns from labeled data, where the correct output is already known. In this context, the chapter describes it as a typical task in Machine Learning, with examples such as classification (like spam filtering) and regression (predicting a target numeric value).'