In [1]:
model = "llama3.2:1b"

#### Task 1: Simple Chain with Retrieval

**Objective:**

Implement a simple RAG chain with ChatOllama, HuggingFaceEmbeddings, and Chroma.

**Process:**

1. Retrieve documents from the Chroma database based on the query.
2. Invoke the chain with the retrieved documents as input.

**Task Description:**

- Load the LLM via Ollama.
- Load the Hugging Face embedding model (`model_name="sentence-transformers/all-mpnet-base-v2"`).
- Create the Chroma DB client.
- Create a prompt template for summarization.
- Create a simple chain with the following steps: retrieved documents, prompt, model, output parser.
- Create a query and perform a similarity search with it.
- Invoke the chain, passing the retrieved documents as input.
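
To make the similarity-search step more concrete, here is a minimal sketch of what ranking by vector similarity looks like. It uses hand-written toy vectors and cosine similarity purely for illustration; the names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical, and the metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy 3-dimensional "embeddings" (assumption: real embeddings have hundreds of dimensions)
toy_vectors = {
    "supervised": [0.9, 0.1, 0.0],
    "unsupervised": [0.7, 0.3, 0.1],
    "cooking": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

nearest = top_k(query, toy_vectors, k=2)
print(nearest)
```

In the notebook itself, the embedding model produces the vectors and `similarity_search` performs the ranking inside the vector store.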

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [10]:
from langchain_ollama import ChatOllama

# ADD HERE YOUR CODE
llm = ChatOllama(model=model)

In [4]:
from langchain_huggingface import HuggingFaceEmbeddings

# ADD HERE YOUR CODE
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")


In [None]:
from langchain_chroma import Chroma
import chromadb
from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

client = chromadb.HttpClient(
    host     = "localhost",
    port     = 8000,
    ssl      = False,
    headers  = None,
    settings = Settings(allow_reset=True, anonymized_telemetry=False),
    tenant   = DEFAULT_TENANT,
    database = DEFAULT_DATABASE,
)

# Create a collection
# ADD HERE YOUR CODE
collection_name = "AI_Book"
collection = client.get_or_create_collection(collection_name)

# Create the LangChain Chroma vector store on top of the HTTP client
# ADD HERE YOUR CODE
# First attempt: vector_db_from_client = Chroma(persist_directory="chroma_db")
vector_db_from_client = Chroma(
    client=client,
    collection_name=collection_name,
    embedding_function=embedding_model,
)

In [11]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Convert loaded documents into strings by concatenating their content
# and ignoring metadata
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = {"docs": format_docs} | prompt | llm | StrOutputParser()

In [14]:
search_query = "Types of Machine Learning Systems"

# ADD HERE YOUR CODE
# Perform vector search
docs = vector_db_from_client.similarity_search(search_query)

print(docs)

[]


In [17]:
chain.invoke(docs)

"However, I need to clarify that I don't see any documents retrieved. Could you please provide me with the documents you'd like me to summarize? Additionally, could you please let me know which retrieved docs you're referring to (e.g., a specific text or a set of related texts)?"

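The empty list printed above suggests that the `AI_Book` collection may not contain any embedded documents yet, so the model receives an empty `{docs}` placeholder. One way to surface this earlier is to check the retrieval result before formatting it. The sketch below is an illustration only: `FakeDoc` is a hypothetical stand-in for a LangChain document (it only mimics the `page_content` attribute), and `format_docs_checked` is not part of the original notebook.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in exposing only the page_content attribute."""
    page_content: str


def format_docs(docs):
    # Same logic as in the notebook: join the page contents, ignore metadata
    return "\n\n".join(doc.page_content for doc in docs)


def format_docs_checked(docs):
    """Fail loudly instead of sending an empty context to the model."""
    if not docs:
        raise ValueError("No documents retrieved; is the collection populated?")
    return format_docs(docs)


sample = [
    FakeDoc("Supervised learning uses labels."),
    FakeDoc("Clustering is unsupervised."),
]
formatted = format_docs_checked(sample)
print(formatted)
```

With a check like this in place of `format_docs`, an empty search result would raise an error rather than produce a reply in which the model asks for the missing documents.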
#### Task 2: Q&A with RAG

**Objective:**

Implement a Q/A retrieval chain with ChatOllama, HuggingFaceEmbeddings, and Chroma.

**Task Description:**

- Create a RAG Q/A prompt template.
- Create a retriever from the vector DB client (instead of passing in docs manually, the chain retrieves them automatically from the vector store based on the user question).
- Create a simple chain with the following steps: retriever, formatting of retrieved docs, user question, prompt, model, output parser.
- Create a question for the Q/A retrieval chain.
- Invoke the chain with the question.
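
The data flow in these steps can be sketched in plain Python, without LangChain: the question is sent both to a retriever (whose results are formatted into a context string) and, unchanged, into the prompt. Everything here is a hypothetical illustration; `CORPUS`, `toy_retriever`, `fan_out`, and `build_prompt` are invented names, and the keyword-overlap retriever is only a placeholder for a real vector search.

```python
# Hypothetical stand-in corpus; the real chain queries Chroma instead.
CORPUS = [
    "Supervised learning trains a model on labeled examples.",
    "Unsupervised learning finds structure in unlabeled data.",
]

PROMPT = (
    "<context>\n{context}\n</context>\n\n"
    "Answer the following question:\n\n{question}"
)


def toy_retriever(question):
    """Return corpus entries sharing at least one word with the question."""
    words = set(question.lower().replace("?", "").split())
    return [text for text in CORPUS if words & set(text.lower().rstrip(".").split())]


def format_docs(docs):
    # Here the "documents" are plain strings, so they are joined directly
    return "\n\n".join(docs)


def fan_out(question):
    """Build both prompt inputs from the single question string."""
    return {
        "context": format_docs(toy_retriever(question)),  # retrieval branch
        "question": question,                             # pass-through branch
    }


def build_prompt(question):
    return PROMPT.format(**fan_out(question))


question = "What is supervised learning?"
inputs = fan_out(question)
prompt_text = build_prompt(question)
print(prompt_text)
```

In the actual chain, the resulting prompt would then be passed to the model and the output parser.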

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)


In [19]:
from langchain_core.runnables import RunnablePassthrough

prompt_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

# ADD HERE YOUR CODE
rag_prompt = ChatPromptTemplate.from_template(prompt_template)

# ADD HERE YOUR CODE
retriever = vector_db_from_client.as_retriever()

# ADD HERE YOUR CODE
qa_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

In [20]:
qa_rag_chain

{
  context: VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x7fb6ba313450>)
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="\nYou are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n\n<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"))])
| ChatOllama(model='llama3.2:1b', _client=<ollama._client.Client object at 0x7fb6ba077610>, _async_client=<ollama._client.AsyncClient object at 0x7fb6ba028190>)
| StrOutputParser()

In [21]:
question = "What is supervised learning?"

# ADD HERE YOUR CODE
qa_rag_chain.invoke(question)

'Supervised learning is a type of machine learning where an algorithm is trained on labeled data to make predictions or decisions based on specific examples, with the goal of minimizing the difference between predicted outcomes and actual outcomes. The labeled data provides feedback in the form of correct or incorrect responses, allowing the algorithm to learn patterns and relationships between variables. This process enables supervised learning algorithms to improve their performance over time by adjusting their weights and biases through iterative training.'