In [None]:
model = "llama3.2:1b"

#### Task 1: Simple Chain with Retrieval

**Objective:**

Implement a simple RAG chain with ChatOllama, OllamaEmbeddings, and Chroma.

Process: 

1. Retrieve documents from the Chroma DB based on a query
2. Invoke the chain with the retrieved documents as input
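
As a rough intuition for step 1, the sketch below ranks a handful of texts against a query by cosine similarity and keeps the top `k`. It is a dependency-free toy, not how Chroma or `bge-m3` work internally: the `embed` function (plain word counts), the `corpus`, and all helper names are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query, texts, k=3):
    """Return the k texts most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


# Hypothetical mini corpus, used only for this illustration.
corpus = [
    "types of machine learning systems",
    "how to bake sourdough bread",
    "supervised and unsupervised learning systems",
]
top = similarity_search("machine learning systems", corpus, k=2)
print(top)
```

A real vector store replaces the word counts with dense embeddings and an index, but the idea of "embed the query, rank by similarity, keep the best `k`" is the same.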

**Task Description:**

- load the LLM via Ollama
- load the embedding model via Ollama (run `ollama pull bge-m3` first if not yet done)
- create the Chroma DB client
- create a prompt template for summarization
- create a simple chain with the following steps: retrieved documents, prompt, model, output parser
- create a query and perform a similarity search with it
- invoke the chain and pass the retrieved documents to it
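
The chain in this task is built by piping steps together with `|`. The following is a minimal sketch of that composition idea in plain Python. The `Step` class and the `toy_*` objects are hypothetical stand-ins invented for this example; they are not LangChain classes, and the fake model only reports how many characters it received.

```python
class Step:
    """Wrap a plain function so steps can be chained with the | operator."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Run self first, then feed its result into the next step.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser.
toy_prompt = Step(
    lambda inputs: f"Summarize the main themes in these retrieved docs: {inputs['docs']}"
)
toy_model = Step(lambda text: {"content": f"[{len(text)} prompt characters received]"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_model | toy_parser
result = toy_chain.invoke({"docs": "doc one\n\ndoc two"})
print(result)
```

Each step only needs to accept the previous step's output, which is why the prompt takes a dict, the model takes the rendered prompt, and the parser takes the model's message.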


**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)
- [Streaming in Langchain](https://python.langchain.com/docs/concepts/streaming/)


In [2]:
from langchain_ollama import ChatOllama

# ADD HERE YOUR CODE
model = ChatOllama(model=model)

In [3]:
from langchain_ollama import OllamaEmbeddings

# ADD HERE YOUR CODE
embedding_model = OllamaEmbeddings(model="bge-m3")

In [4]:
from langchain_chroma import Chroma
import chromadb
from chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    ssl=False,
    headers=None,
    settings=Settings(allow_reset=True, anonymized_telemetry=False),
    tenant=DEFAULT_TENANT,
    database=DEFAULT_DATABASE,
)

# Get the existing collection (get_collection fails if it has not been created yet)
# ADD HERE YOUR CODE
collection_name = "ai_model_book"
collection = client.get_collection(name=collection_name)

# Create the LangChain Chroma vector store on top of the client
# ADD HERE YOUR CODE
vector_db_from_client = Chroma(client=client,
    collection_name=collection_name,
    embedding_function=embedding_model)

Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given


In [8]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Convert loaded documents into strings by concatenating their content
# and ignoring metadata
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = prompt | model | StrOutputParser()

In [9]:
search_query = "Types of Machine Learning Systems"

# ADD HERE YOUR CODE
# Perform vector search
docs = vector_db_from_client.similarity_search(search_query, k=3)

print(docs)
formatted_docs_input = {"docs": format_docs(docs)}

[Document(metadata={'page': 33, 'source': './AI_Book.pdf'}, page_content='Types of Machine Learning Systems\nThere are so many different types of Machine Learning systems that it is useful to\nclassify them in broad categories based on:\n Whether or not they are trained with human supervision (supervised, unsuper\nvised, semisupervised, and Reinforcement Learning)\n Whether or not they can learn incrementally on the fly (online versus batch\nlearning)\n Whether they work by simply comparing new data points to known data points,\nor instead detect patterns in the training data and build a predictive model, much\nlike scientists do (instance-based versus model-based learning)\nThese criteria are not exclusive; you can combine them in any way you like. For\nexample, a state-of-the-art spam filter may learn on the fly using a deep neural net\nwork model trained using examples of spam and ham; this makes it an online, model-\nbased, supervised learning system.\nLets look at each of these cr

In [10]:
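# Requires the model to be available locally (e.g. `ollama pull llama3.2:1b`);
# otherwise Ollama responds with a 404 "model not found" error
chain.invoke(formatted_docs_input)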

ResponseError: model 'llama3.2:1b' not found (status code: 404)

In [None]:
# Stream the chain output chunk by chunk
for chunk in chain.stream(formatted_docs_input):
    print(chunk, end="", flush=True)

The main themes retrieved from the documents are:

1. Classification of machine learning systems based on training criteria:
	* Supervised, unsupervised, semisupervised, and Reinforcement Learning (RL) types
	* Online versus batch learning (incremental vs. offline)
2. The importance of supervision during training:
	* Four major categories: supervised, unsupervised, semisupervised, and RL
3. Machine learning systems can be classified based on the type of data they learn from:
	* Supervised learning: includes labels, which are desired solutions
	* Unsupervised learning: involves unlabeled data that is clustered or represented visually
4. The ability to incrementally update models online versus offline:
	* Online learning: updates models as new data becomes available
	* Batch learning: updates models after a batch of new data is processed

These themes provide a foundation for understanding the different types of machine learning systems and their characteristics, which can inform various

In [None]:
# More complex async event streaming
async for event in chain.astream_events(formatted_docs_input, version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)

The main themes retrieved from the docs on "Types of Machine Learning Systems" are:

1. **Classification**: The ability to categorize data into predefined groups or classes, with supervised learning systems relying on labeled training data.

2. **Machine Learning Paradigms**: The distinction between various machine learning paradigms, including:
   - Supervised learning (label-based)
   - Unsupervised learning (data-based)
   - Model-based learning (pattern detection and building predictive models)

3. **Supervision Types**:
   - Supervised learning: Training data with desired solutions (labels) to enable prediction.
   - Unsupervised learning: Training data without labels, used for clustering or pattern discovery.

4. **Learning Methods**: Online vs. batch learning, as well as incremental vs. batch learning, which refer to the speed at which data is processed and learned.

5. **Machine Learning Systems Classification**:
   - Online systems that learn incrementally on the fly (e.g., de

#### Task 2: Q&A with RAG

**Objective:**

Implement a Q/A retrieval chain with ChatOllama, OllamaEmbeddings, and Chroma.

**Task Description:**

- create a RAG Q/A prompt template
- create a retriever from the vector db client (instead of manually passing in docs, we automatically retrieve them from our vector store based on the user question)
- create a simple chain with the following steps: retriever, formatting retrieved docs, user question, prompt, model, output parser
- create a question for the Q/A retrieval chain
- invoke the chain with the question
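
The first chain step fans the user question out into two prompt variables: `context` (retriever, then formatting) and `question` (passed through unchanged). Below is a small sketch of that wiring in plain Python. The `toy_retriever`, its two stored sentences, and the shortened `toy_template` are assumptions for illustration only, and the word-overlap matching stands in for a real vector search.

```python
def toy_retriever(question):
    """Hypothetical retriever: return stored texts that share a word with the question."""
    store = [
        "Supervised learning uses labeled training data.",
        "Online systems update incrementally as new data arrives.",
    ]
    words = set(question.lower().rstrip("?").split())
    return [text for text in store if words & set(text.lower().rstrip(".").split())]


def join_texts(texts):
    """Concatenate retrieved texts, separated by blank lines."""
    return "\n\n".join(texts)


def build_inputs(question):
    """Fan the question out into the two prompt variables."""
    return {
        "context": join_texts(toy_retriever(question)),  # retriever, then formatting
        "question": question,  # passed through unchanged
    }


toy_template = "<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"
inputs = build_inputs("What is supervised learning?")
filled_prompt = toy_template.format(**inputs)
print(filled_prompt)
```

The filled prompt is what would then be handed to the model and the output parser.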

**Useful links:**

- [RAG with Ollama](https://python.langchain.com/v0.2/docs/tutorials/local_rag/)

In [None]:
from langchain_core.runnables import RunnablePassthrough

prompt_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

# ADD HERE YOUR CODE
rag_prompt = ChatPromptTemplate.from_template(prompt_template)

# ADD HERE YOUR CODE
retriever = vector_db_from_client.as_retriever(search_kwargs={"k": 3})

# ADD HERE YOUR CODE
qa_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

In [None]:
qa_rag_chain

{
  context: VectorStoreRetriever(tags=['Chroma', 'OllamaEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x000001AC94A52110>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="\nYou are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n\n<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"))])
| ChatOllama(model='llama3.2:1b', _client=<ollama._client.Client object at 0x000001AC92CD16D0>, _async_client=<ollama._client.AsyncClient object at 0x000001AC94B42F90>)
| StrOutputParser()

In [None]:
question = "What is supervised learning?"

# ADD HERE YOUR CODE
qa_rag_chain.invoke(question)

'Supervised learning is a type of machine learning system where the training data includes the desired solutions, called labels, to be used for performance evaluation and fine-tuning. This approach requires human labeling of the data but provides an automatic way to train models on labeled datasets using supervised learning techniques.'

In [None]:
# More complex async event streaming
async for event in qa_rag_chain.astream_events(question, version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)

Supervised learning is a type of machine learning where the training data includes desired solutions, called labels, that the algorithm attempts to predict or classify based on its input data. It requires human supervision and involves using labeled data to train a model on its own behavior.

#### Alternative: Using the pre-built ConversationalRetrievalChain class
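
Compared with the hand-built chain, this class also keeps the chat history in memory and turns a follow-up question into a standalone one before retrieving. The sketch below illustrates only that idea in plain Python. `ToyMemory` and `condense` are hypothetical helpers, and the string concatenation is a stand-in for the rephrasing that the real chain delegates to the LLM.

```python
class ToyMemory:
    """Hypothetical stand-in for a conversation buffer: a list of (question, answer) turns."""

    def __init__(self):
        self.turns = []

    def save(self, question, answer):
        self.turns.append((question, answer))


def condense(question, memory):
    """Make a follow-up self-contained by attaching the previous question.

    A real chain asks the LLM to rephrase; string concatenation is used here
    only to keep the sketch dependency-free.
    """
    if not memory.turns:
        return question
    previous_question, _ = memory.turns[-1]
    return f"{question} (follow-up to: {previous_question})"


memory_sketch = ToyMemory()
first = condense("What is supervised learning?", memory_sketch)
memory_sketch.save(first, "Learning from labeled data.")
second = condense("Which algorithms can be used there?", memory_sketch)
print(second)
```

Without the stored history, a question such as "Which algorithms can be used there?" gives the retriever nothing to resolve "there" against.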

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

In [None]:
retriever = vector_db_from_client.as_retriever()
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

In [None]:
qa_chain = ConversationalRetrievalChain.from_llm(
    model, retriever=retriever, memory=memory, verbose=False
)

In [None]:
# More complex async event streaming
async for event in qa_chain.astream_events("What is supervised learning?", version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)

Supervised learning is a type of machine learning where an algorithm is trained on labeled data, meaning that the data includes desired outputs or labels, and the goal is to learn a mapping between inputs and outputs.

In supervised learning, the training data consists of pairs of inputs (or features) and corresponding outputs. The algorithm learns to predict the output for a new input by analyzing the patterns in the existing data.

The key characteristics of supervised learning are:

1. **Labeled data**: The training data includes labeled examples, where each example is associated with an output or target variable.
2. **Learning from feedback**: The algorithm receives feedback (labels) on its predictions, allowing it to adjust and improve its performance over time.
3. **Goal-oriented**: The goal of supervised learning is to learn a specific mapping between inputs and outputs, often to perform a particular task or make a prediction.

Supervised learning can be further divided into two

In [None]:
# Follow-up question: "there" is resolved via the chat history kept in memory
async for event in qa_chain.astream_events("Which algorithms can be used there?", version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)

Here is the rephrased follow-up question:

Which types of supervised learning algorithms are commonly applied to real-world problems such as image classification, sentiment analysis, and predictive modeling?Based on the provided text, the four types of supervised learning algorithms that are commonly applied to real-world problems such as image classification, sentiment analysis, and predictive modeling are:

1. Linear Regression
2. Polynomial Regression
3. Logistic Regression

These algorithms are mentioned in the following sentence:

8. What are the two most common supervised tasks?