In [1]:
import nest_asyncio
nest_asyncio.apply()

import ollama

# Alternative embedding model (not used below): sentence-transformers/all-MiniLM-L6-v2

In [2]:
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.settings import Settings

llm = Ollama(model="llama3.2:3b")
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

Settings.llm = llm
Settings.embed_model = embed_model



# 2️⃣ Breaking it down
This notebook uses LlamaIndex Workflows, an event-driven architecture. In that system, functions (called steps) don't call each other directly; they communicate by emitting and receiving events.

### **A. `Event`**

* `Event` is a **base class in LlamaIndex workflows**.
* Workflows in LlamaIndex are **step-based pipelines**, and **events** are the objects that pass between steps.
* Examples of events:

  * `StartEvent` → triggers the workflow
  * `StopEvent` → ends the workflow and carries its final result
* You can define **custom events** to carry specific data between steps.
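
To make the event-passing idea concrete, here is a minimal plain-Python sketch of type-based dispatch. It does not use LlamaIndex: the `Toy*` classes, the `STEPS` table, and `run_toy_workflow` are hypothetical names invented for illustration, and the real `Workflow` class does considerably more (async execution, context, validation).

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; these are NOT the LlamaIndex classes.
@dataclass
class ToyStartEvent:
    query: str


@dataclass
class ToyRetrieverEvent:
    chunks: list[str]


@dataclass
class ToyStopEvent:
    result: str


def retrieve(ev: ToyStartEvent) -> ToyRetrieverEvent:
    # Pretend retrieval: return one canned chunk that mentions the query.
    return ToyRetrieverEvent(chunks=[f"chunk about {ev.query}"])


def synthesize(ev: ToyRetrieverEvent) -> ToyStopEvent:
    # Pretend synthesis: join the chunks into a single answer string.
    return ToyStopEvent(result=" | ".join(ev.chunks))


# Route each event type to the step that accepts it.
STEPS = {ToyStartEvent: retrieve, ToyRetrieverEvent: synthesize}


def run_toy_workflow(query: str) -> str:
    event = ToyStartEvent(query=query)
    # Keep dispatching until some step emits the stop event.
    while not isinstance(event, ToyStopEvent):
        event = STEPS[type(event)](event)
    return event.result


answer = run_toy_workflow("RL training")
print(answer)
```

The steps never call each other; the loop picks the next step purely from the type of the event that was just emitted.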

---

### **B. `NodeWithScore`**

* Each `NodeWithScore` is a **document chunk + similarity score**.
* When you retrieve documents from your vector database, you get:

  * The **chunk of text** (Node)
  * Its **relevance score** (Score)

So `NodeWithScore` represents **a retrieved document and how relevant it is to the query**.
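
As a rough illustration of the "chunk + score" shape, the sketch below uses a plain dataclass rather than the real `NodeWithScore`. The class `ToyNodeWithScore`, the helper `top_k`, and the score values are all made up for this example, and it assumes that a higher score means a more relevant chunk.

```python
from dataclasses import dataclass


# Hypothetical stand-in mirroring the "chunk + score" shape; not the LlamaIndex class.
@dataclass
class ToyNodeWithScore:
    text: str
    score: float


def top_k(nodes: list[ToyNodeWithScore], k: int) -> list[ToyNodeWithScore]:
    # Assumes higher score = more relevant, so sort in descending order.
    return sorted(nodes, key=lambda n: n.score, reverse=True)[:k]


# Illustrative values only; real scores come from the vector store.
candidates = [
    ToyNodeWithScore("DeepSeek-R1 uses multi-stage training.", 0.82),
    ToyNodeWithScore("Unrelated paragraph.", 0.31),
    ToyNodeWithScore("DeepSeek-R1-Zero is trained with RL.", 0.77),
]

best = top_k(candidates, k=2)
for node in best:
    print(f"{node.score:.2f}  {node.text}")
```

This mirrors what `similarity_top_k=2` asks of a retriever: keep only the two highest-scoring chunks.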

---

### **C. `RetrieverEvent`**

* This is a **custom event** that will hold the **results of the retrieval step** in your RAG workflow.
* By defining:

```python
nodes: list[NodeWithScore]
```

You are saying:

> “This event will carry a list of retrieved nodes (documents) with their similarity scores.”

# 4️⃣ How it fits into a RAG workflow

```
StartEvent(query)
       │
       ▼
Retrieve Step
       │
       ▼
RetrieverEvent(nodes=[NodeWithScore, ...])
       │
       ▼
Synthesize Step (uses nodes to generate answer)
```

* `RetrieverEvent` is **just a container**
* Carries **retrieved text chunks + scores** from retrieval → synthesis


In [3]:
from llama_index.core.workflow import Event
from llama_index.core.schema import NodeWithScore


class RetrieverEvent(Event):
    """Result of running retrieval"""

    nodes: list[NodeWithScore]

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.response_synthesizers import CompactAndRefine
from llama_index.core.workflow import (
    Context,
    Workflow,
    StartEvent,
    StopEvent,
    step
)

class RAGWorkflow(Workflow):
    @step
    async def ingest(self, ctx: Context, ev: StartEvent) -> StopEvent | None:
        dirname = ev.get("dirname")
        if not dirname:
            return None
        
        documents = SimpleDirectoryReader(dirname).load_data()
        index = VectorStoreIndex.from_documents(
            documents=documents
        )
        return StopEvent(result=index)
    
    @step
    async def retrieve(self, ctx: Context, ev: StartEvent) -> RetrieverEvent | None:
        """ Retrieve relevant documents from the index based on the query. """
        query = ev.get("query")
        index = ev.get("index")

        if not query:
            return None
        
        print(f"Retrieving documents for query: {query}")

        await ctx.store.set("query", query)

        if index is None:
            print("Index is empty, load some documents before querying!")
            return None

        retriever = index.as_retriever(similarity_top_k=2)
        nodes = await retriever.aretrieve(query)
        print(f"Retrieved {len(nodes)} documents.")
        print("Document contents:")
        for node in nodes:
            print("-----"*10)
            print(node.get_text())
        return RetrieverEvent(nodes=nodes)


    @step
    async def synthesize(self, ctx: Context, ev: RetrieverEvent) -> StopEvent:
        """Return a streaming response synthesized from the retrieved nodes."""
        summarizer = CompactAndRefine(streaming=True, verbose=True)
        query = await ctx.store.get("query", default=None)

        response = await summarizer.asynthesize(query, nodes=ev.nodes)
        return StopEvent(result=response)
        


In [5]:
w = RAGWorkflow()

In [6]:
# Ingest the documents
index = await w.run(dirname="data")

In [9]:
# Run a query
result = await w.run(query="How was DeepSeekR1 trained?", index=index)
print("\nFinal response:")
async for chunk in result.async_response_gen():
    print(chunk, end="", flush=True)

Retrieving documents for query: How was DeepSeekR1 trained?
Retrieved 2 documents.
Document contents:
--------------------------------------------------
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via
Reinforcement Learning
DeepSeek-AI
research@deepseek.com
Abstract
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without super-
vised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities.
Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing
reasoning behaviors. However, it encounters challenges such as poor readability, and language
mixing. To address these issues and further enhance reasoning performance, we introduce
DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-
R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support th

2025-12-07 15:50:53,879 - INFO - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"


DeepSeek-R1 was trained using a multi-stage approach, which includes supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). The model was initially trained with cold-start data before undergoing RL training. This combination of SFT and RL appears to have significantly enhanced the reasoning capabilities of DeepSeek-R1.
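
The query cell above prints the answer piece by piece with `async for`. The sketch below shows the same consumption pattern using only the standard library. `fake_response_gen` and `collect` are hypothetical stand-ins for a streaming response, not part of LlamaIndex, and the yielded pieces are placeholder text.

```python
import asyncio
from typing import AsyncIterator


async def fake_response_gen() -> AsyncIterator[str]:
    # Hypothetical stand-in for a streaming response: yields text pieces one at a time.
    for piece in ["Deep", "Seek", "-R1"]:
        await asyncio.sleep(0)  # hand control back to the event loop between pieces
        yield piece


async def collect() -> str:
    parts = []
    async for chunk in fake_response_gen():
        print(chunk, end="", flush=True)  # show text as soon as it arrives
        parts.append(chunk)
    print()
    return "".join(parts)


# In a notebook cell you could write `await collect()` instead.
full_text = asyncio.run(collect())
```

Collecting the pieces into a list while printing them keeps the full answer available after the stream is exhausted, since an async generator can only be iterated once.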
