In [23]:
model = "llama3.2:1b"

#### Task 1: Simple Chain with Retrieval

**Objective:**

Implement a simple RAG chain with ChatOllama, HuggingFaceEmbeddings and Chroma.

Process:

1. Retrieve documents from the Chroma DB based on a query
2. Invoke the chain with the retrieved documents as input
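
As a rough illustration of these two steps, here is a minimal standard-library sketch. It assumes toy 3-dimensional vectors in place of real embeddings and a plain list in place of the Chroma collection. The names `cosine_similarity`, `retrieve`, and `store` are hypothetical and are not part of LangChain or Chroma.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, store, k=2):
    """Return the texts of the k entries most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[0]),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]


# Hypothetical hand-made "embeddings"; a real embedding model produces
# vectors with hundreds of dimensions.
store = [
    ([0.9, 0.1, 0.0], "Supervised learning uses labeled data."),
    ([0.8, 0.2, 0.1], "Unsupervised learning finds structure without labels."),
    ([0.0, 0.1, 0.9], "Install the required Python packages."),
]
query_vector = [1.0, 0.0, 0.0]

# Step 1: retrieve the closest documents.
docs = retrieve(query_vector, store, k=2)

# Step 2: hand the retrieved text to the next stage (here just a prompt string).
summary_prompt = (
    "Summarize the main themes in these retrieved docs: " + "\n\n".join(docs)
)
print(summary_prompt)
```

In the actual task, the vector store performs the ranking and the prompt is passed on to the model instead of being printed.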

**Task Description:**

- Load the LLM via Ollama
- Load the Hugging Face embedding model (`model_name="sentence-transformers/all-mpnet-base-v2"`)
- Create the Chroma DB client
- Create a prompt template for summarization
- Create a simple chain with the following steps: retrieved documents, prompt, model, output parser
- Create a query and perform a similarity search with it
- Invoke the chain, passing it the retrieved documents

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [24]:
from langchain_ollama import ChatOllama

# ADD HERE YOUR CODE
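# Note: this rebinds `model` from the model-name string to the ChatOllama
# instance, so re-running this cell alone would pass the instance as the name.
model = ChatOllama(
    model=model,
)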

In [25]:
from langchain_huggingface import HuggingFaceEmbeddings

# ADD HERE YOUR CODE
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")



In [26]:
from langchain_chroma import Chroma
import chromadb
from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    ssl=False,
    headers=None,
    settings=Settings(allow_reset=True, anonymized_telemetry=False),
    tenant=DEFAULT_TENANT,
    database=DEFAULT_DATABASE,
)

# Create a collection
# ADD HERE YOUR CODE
collection_name = "AI_Book"

collection = client.get_or_create_collection(collection_name)


# Create chromadb
# ADD HERE YOUR CODE
vector_db_from_client = Chroma(
    client=client,
    collection_name=collection_name,
    embedding_function=embedding_model
)

In [27]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Convert loaded documents into strings by concatenating their content
# and ignoring metadata
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# The dict runs format_docs on the chain input and stores the result under "docs"
chain = {"docs": format_docs} | prompt | model | StrOutputParser()
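
To make the `|` composition above less opaque, here is a toy sketch of how such piping could work in plain Python. This is not LangChain's implementation. `Step`, `coerce`, `fill_prompt`, and `fake_model` are hypothetical names, the "model" just upper-cases its input, and this `format_docs` joins plain strings rather than `Document` objects.

```python
class Step:
    """Toy composable step; a stand-in, not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # step | other: run this step first, then feed its result to `other`.
        following = coerce(other)
        return Step(lambda value: following.invoke(self.invoke(value)))

    def __ror__(self, other):
        # other | step, where `other` is a plain dict or function.
        first = coerce(other)
        return Step(lambda value: self.invoke(first.invoke(value)))


def coerce(obj):
    """Turn a dict or a plain callable into a Step."""
    if isinstance(obj, Step):
        return obj
    if isinstance(obj, dict):
        branches = {key: coerce(fn) for key, fn in obj.items()}
        # Every branch receives the same input; results are collected by key.
        return Step(
            lambda value: {key: step.invoke(value) for key, step in branches.items()}
        )
    return Step(obj)


def format_docs(docs):
    # Simplified: joins strings instead of reading doc.page_content.
    return "\n\n".join(docs)


fill_prompt = Step(lambda inputs: "Summarize: {docs}".format(**inputs))
fake_model = Step(lambda text: text.upper())  # hypothetical stand-in for an LLM

toy_chain = {"docs": format_docs} | fill_prompt | fake_model
result = toy_chain.invoke(["first chunk", "second chunk"])
print(result)
```

The point of the sketch is the data flow: the dict turns the raw input into `{"docs": ...}`, which is exactly the shape the prompt template expects.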

In [28]:
search_query = "Types of Machine Learning Systems"

# ADD HERE YOUR CODE
# Perform vector search
docs = vector_db_from_client.similarity_search(search_query)

print(docs)

[Document(page_content='The most common learning algorithms: Linear and Polynomial Regression,\nLogistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision\nTrees, Random Forests, and Ensemble methods.\nxiv | Preface'), Document(page_content='The most common learning algorithms: Linear and Polynomial Regression,\nLogistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision\nTrees, Random Forests, and Ensemble methods.\nxiv | Preface'), Document(page_content='Forests and Ensemble methods (discussed in Part I ). Deep Learn\ning is best suited for complex problems such as image recognition,\nspeech recognition, or natural language processing, provided you\nhave enough data, computing power, and patience.\nOther Resources\nMany resources are available to learn about Machine Learning. Andrew Ngs ML\ncourse on Coursera  and Geoffrey Hintons course on neural networks and Deep\nLearning  are amazing, although they both require a significant time investment\n(thin
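
The printed result contains the same chunk twice, which may indicate that the collection was populated more than once (an assumption; the ingestion step is not shown here). A minimal sketch for dropping exact duplicates before passing documents to the chain follows. `Doc` is a hypothetical stand-in that exposes only `page_content`, and `dedupe_docs` is not a LangChain function.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with a page_content field."""

    page_content: str


def dedupe_docs(docs):
    """Keep the first occurrence of each distinct page_content, preserving order."""
    seen = set()
    unique = []
    for doc in docs:
        if doc.page_content not in seen:
            seen.add(doc.page_content)
            unique.append(doc)
    return unique


retrieved = [
    Doc("The most common learning algorithms"),
    Doc("The most common learning algorithms"),
    Doc("Forests and Ensemble methods"),
]
unique_docs = dedupe_docs(retrieved)
print(len(retrieved), "->", len(unique_docs))
```

Because the helper only reads `page_content`, the same function should also work on the objects returned by `similarity_search`.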

In [29]:
chain.invoke(docs)

"There is no provided text to summarize. Please provide the text you'd like me to summarize, and I'll be happy to help."

#### Task 2: Q&A with RAG

**Objective:**

Implement a Q/A retrieval chain with ChatOllama, HuggingFaceEmbeddings and Chroma.

**Task Description:**

- Create a RAG Q/A prompt template
- Create a retriever from the vector DB client (instead of manually passing in docs, we automatically retrieve them from our vector store based on the user question)
- Create a simple chain with the following steps: retriever, formatting of retrieved docs, user question, prompt, model, output parser
- Create a question for the Q/A retrieval chain
- Invoke the chain with the question
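
The steps above hinge on one input (the question) feeding two prompt variables: it is used to retrieve context, and it is also passed through unchanged. A minimal sketch of that data flow in plain Python, assuming a keyword-overlap ranking instead of embeddings; `toy_retriever`, `build_inputs`, `CORPUS`, and `TEMPLATE` are hypothetical names:

```python
import re

# Hypothetical mini corpus standing in for the vector store contents.
CORPUS = [
    "Supervised learning trains a model on labeled examples.",
    "Clustering is an unsupervised technique.",
    "Chroma stores embedding vectors.",
]

TEMPLATE = (
    "<context>\n{context}\n</context>\n\n"
    "Answer the following question:\n\n{question}"
)


def words(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def toy_retriever(question, corpus, k=1):
    """Rank texts by the number of words they share with the question."""
    question_words = words(question)
    ranked = sorted(
        corpus,
        key=lambda text: len(question_words & words(text)),
        reverse=True,
    )
    return ranked[:k]


def build_inputs(question):
    """Fan the question out into the two variables the prompt expects."""
    return {
        "context": "\n\n".join(toy_retriever(question, CORPUS)),
        "question": question,  # passed through unchanged
    }


inputs = build_inputs("What is supervised learning?")
rendered = TEMPLATE.format(**inputs)
print(rendered)
```

In the task itself, the retriever created from the vector store takes the place of `toy_retriever`, and the rendered prompt goes to the model rather than being printed.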

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [30]:
from langchain_core.runnables import RunnablePassthrough

prompt_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

# ADD HERE YOUR CODE
rag_prompt = ChatPromptTemplate.from_template(prompt_template)

# ADD HERE YOUR CODE
retriever = vector_db_from_client.as_retriever()

# ADD HERE YOUR CODE
qa_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

In [31]:
qa_rag_chain

{
  context: VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x7f3e347a8290>)
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="\nYou are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n\n<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"))])
| ChatOllama(model='llama3.2:1b', _client=<ollama._client.Client object at 0x7f3e3423db90>, _async_client=<ollama._client.AsyncClient object at 0x7f3e3423dfd0>)
| StrOutputParser()

In [32]:
question = "What is supervised learning?"

# ADD HERE YOUR CODE
qa_rag_chain.invoke(question)

'Supervised learning is a type of machine learning where an algorithm is trained on labeled data, meaning it is given some input and output values to learn from. The goal is for the algorithm to make predictions or classify new, unseen input data based on its learned patterns and relationships. In contrast, unsupervised learning involves discovering hidden patterns or structures in the data without any prior labels.'