# LlamaIndex Learning

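LlamaIndex training and practice: streaming completions from a local Ollama model, then a function-calling agent built as a workflow.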

In [193]:
from IPython.display import display, Markdown


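def display_markdown(content, clear=True):
    """Render text as Markdown, rewriting LaTeX bracket/paren math delimiters to $$ and $."""
    display(
        Markdown(
            # Raw strings avoid invalid-escape warnings for backslash sequences.
            content.replace(r'\[', '$$').replace(r'\]', '$$').replace(
                r'\(', '$').replace(r'\)', '$')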
        ),
        clear=clear
    )
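
The delimiter rewriting done by `display_markdown` can be tried on its own, without IPython. The block below is a minimal sketch; `convert_latex_delimiters` is a hypothetical helper name introduced only for this illustration and is not part of the notebook or of any library.

```python
def convert_latex_delimiters(content: str) -> str:
    """Rewrite \\[...\\] to $$...$$ and \\(...\\) to $...$ (hypothetical helper)."""
    return (
        content.replace(r"\[", "$$")
        .replace(r"\]", "$$")
        .replace(r"\(", "$")
        .replace(r"\)", "$")
    )


sample = r"Inline \(a_{ij}\) and display \[c = a + b\]"
converted = convert_latex_delimiters(sample)
print(converted)  # Inline $a_{ij}$ and display $$c = a + b$$
```

The replacement is purely textual, so a literal `\[` inside a code span in the model output would also be rewritten.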

In [242]:
from llama_index.core.settings import Settings
from llama_index.llms.ollama import Ollama

llm = Settings.llm = Ollama(model="qwen2.5:3b", request_timeout=30.0)

In [207]:
prompt = "Explain matrix multiplication simply."

gen = llm.stream_complete(prompt)

for response in gen:
    display_markdown(response.text)

Sure! Matrix multiplication is a fundamental operation in linear algebra where you take two matrices and produce another matrix as their result. Here’s a simple breakdown of how it works:

### Understanding Matrices

First, let's understand what a matrix is: A matrix is essentially a rectangular array of numbers (or other mathematical objects). The size or dimension of the matrix is defined by its number of rows and columns.

For example:
- A 2x3 matrix has two rows and three columns.
- A 3x2 matrix has three rows and two columns.

### Matrix Multiplication

Matrix multiplication involves taking two matrices and producing another matrix. To multiply two matrices $ A $ and $ B $, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The result will have the same number of rows as the first matrix and the same number of columns as the second.

Let's denote:
- Matrix $ A $ is an $ m \times n $ matrix (with $ m $ rows and $ n $ columns).
- Matrix $ B $ is a $ p \times q $ matrix (with $ p $ rows and $ q $ columns), where $ p = n $.

The resulting matrix $ C $, which will have dimensions $ m \times q $, can be calculated as follows:

1. **Identify the elements**: 
   - The element in the $ i $-th row and $ j $-th column of the resulting matrix $ C $ is found by multiplying each element of the $ i $-th row of matrix $ A $ with the corresponding element from the $ j $-th column of matrix $ B $, and then summing these products.

2. **Calculate elements**:
   - Specifically, if we want to calculate the element at position (i, j) in the resulting matrix $ C $:
     $$
     c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
     $$
   - Here, $ a_{ik} $ is an entry from row $ i $ and column $ k $ of matrix $ A $, and $ b_{kj} $ is the corresponding element in matrix $ B $.

### Example

Let's take two matrices for example:

- Matrix $ A = \begin{bmatrix}
1 & 2 \\
3 & 4
\end{bmatrix} $
- Matrix $ B = \begin{bmatrix}
5 & 6 \\
7 & 8
\end{bmatrix} $

To find the product $ C = A \times B $:

The element at position (1,1) of matrix $ C $ is:
$$ c_{11} = (1*5) + (2*7) = 5 + 14 = 19 $$

The element at position (1,2) of matrix $ C $ is:
$$ c_{12} = (1*6) + (2*8) = 6 + 16 = 22 $$

The element at position (2,1) of matrix $ C $ is:
$$ c_{21} = (3*5) + (4*7) = 15 + 28 = 43 $$

The element at position (2,2) of matrix $ C $ is:
$$ c_{22} = (3*6) + (4*8) = 18 + 32 = 50 $$

Thus, the resulting matrix $ C $ would be:
$$
C = \begin{bmatrix}
19 & 22 \\
43 & 50
\end{bmatrix}
$$

### Summary

In essence, to multiply two matrices $ A \times B $:
- The number of columns in the first matrix $ A $ must match the number of rows in the second matrix $ B $.
- Each element in the resulting matrix is computed by taking a row from the first matrix and multiplying it with a column from the second matrix.
- Sum these products to get each corresponding entry in the resulting matrix.

Matrix multiplication can be used for various applications, including transformations (such as scaling or rotating images), solving systems of linear equations, and many other areas.
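
The worked example in the model's answer can be checked with a few lines of plain Python. This is a minimal sketch that applies the sum-of-products rule directly to lists of rows; `matmul` is a hypothetical helper name chosen for this illustration, not a LlamaIndex function.

```python
def matmul(a, b):
    """Multiply matrices given as lists of rows: c[i][j] = sum_k a[i][k] * b[k][j]."""
    n = len(a[0])  # columns of A
    if n != len(b):  # must equal rows of B
        raise ValueError("columns of A must equal rows of B")
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
print(C)  # [[19, 22], [43, 50]]
```

The printed matrix matches the $ C $ derived in the example.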

In [243]:
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import ToolSelection, ToolOutput
from llama_index.core.workflow import Event, Context

class InputEvent(Event):
    input: list[ChatMessage]


class ToolCallEvent(Event):
    tool_calls: list[ToolSelection]


class FunctionOutputEvent(Event):
    output: ToolOutput


class ProgressEvent(Event):
    msg: str

class ToolCallOutputEvent(Event):
    tool_call: ToolSelection
    output: str

In [256]:
from typing import Any, List

from llama_index.core.llms.function_calling import FunctionCallingLLM
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.tools.types import BaseTool
from llama_index.core.workflow import Workflow, StartEvent, StopEvent, step


class FunctionCallingAgent(Workflow):
    def __init__(
        self,
        *args: Any,
        llm: FunctionCallingLLM | None = None,
        tools: List[BaseTool] | None = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(*args, **kwargs)
        self.tools = tools or []

        self.llm = llm or Ollama(model='qwen2.5:3b')
        assert self.llm.metadata.is_function_calling_model

        # Pass self.llm so the fallback model is used when `llm` is None
        self.memory = ChatMemoryBuffer.from_defaults(llm=self.llm)
        self.sources = []

    @step
    async def prepare_chat_history(self, ctx: Context, ev: StartEvent) -> InputEvent:
        # clear sources
        self.sources = []

        # get user input
        user_input = ev.input
        user_msg = ChatMessage(role="user", content=user_input)
        self.memory.put(user_msg)

        # get chat history
        chat_history = self.memory.get()
        
        next_ev = InputEvent(input=chat_history)
        ctx.write_event_to_stream(next_ev)
        return next_ev

    @step
    async def handle_llm_input(
        self, ctx: Context, ev: InputEvent
    ) -> ToolCallEvent | StopEvent:
        chat_history = ev.input

        response = await self.llm.achat_with_tools(
            self.tools, chat_history=chat_history, allow_parallel_tool_calls=False
        )
        self.memory.put(response.message)

        ctx.write_event_to_stream(ProgressEvent(msg=f'LLM response: {response}'))

        tool_calls = self.llm.get_tool_calls_from_response(
            response, error_on_no_tool_call=False
        )

        if not tool_calls:
            next_ev = StopEvent(
                result={"response": response, "sources": [*self.sources]}
            )
        else:
            next_ev = ToolCallEvent(tool_calls=tool_calls)

        ctx.write_event_to_stream(next_ev)
        return next_ev

    @step
    async def handle_tool_calls(self, ctx: Context, ev: ToolCallEvent) -> InputEvent:
        tool_calls = ev.tool_calls
        tools_by_name = {tool.metadata.get_name(): tool for tool in self.tools}

        tool_msgs = []

        # call tools -- safely!
        for tool_call in tool_calls:
            tool = tools_by_name.get(tool_call.tool_name)
            additional_kwargs = {
                "tool_call_id": tool_call.tool_id,
                # Use the requested name: `tool` is None when the lookup fails
                "name": tool_call.tool_name,
            }
            if not tool:
                tool_msgs.append(
                    ChatMessage(
                        role="tool",
                        content=f"Tool {tool_call.tool_name} does not exist",
                        additional_kwargs=additional_kwargs,
                    )
                )
                continue

            try:
                tool_output = tool(**tool_call.tool_kwargs)
                self.sources.append(tool_output)
                tool_msgs.append(
                    ChatMessage(
                        role="tool",
                        content=tool_output.content,
                        additional_kwargs=additional_kwargs,
                    )
                )
            except Exception as e:
                tool_msgs.append(
                    ChatMessage(
                        role="tool",
                        content=f"Encountered error in tool call: {e}",
                        additional_kwargs=additional_kwargs,
                    )
                )

            ctx.write_event_to_stream(ToolCallOutputEvent(tool_call=tool_call, output=tool_msgs[-1].content))

        for msg in tool_msgs:
            self.memory.put(msg)

        chat_history = self.memory.get()

        next_ev = InputEvent(input=chat_history)
        ctx.write_event_to_stream(next_ev)
        return next_ev
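
The error handling in `handle_tool_calls` can be illustrated without LlamaIndex. The block below is a simplified stand-in, not the workflow step itself: `run_tool_calls`, `demo_tools`, and `demo_calls` are hypothetical names, tools are plain Python callables, and each call is a `(name, kwargs)` pair rather than a `ToolSelection`.

```python
from typing import Any, Callable


def run_tool_calls(
    tools: dict[str, Callable[..., Any]],
    calls: list[tuple[str, dict[str, Any]]],
) -> list[str]:
    """Run each requested call and return one message string per call."""
    messages = []
    for name, kwargs in calls:
        tool = tools.get(name)
        if tool is None:
            # Unknown tool: report it instead of raising
            messages.append(f"Tool {name} does not exist")
            continue
        try:
            messages.append(str(tool(**kwargs)))
        except Exception as e:
            # Tool failures become messages the model can react to
            messages.append(f"Encountered error in tool call: {e}")
    return messages


def demo_add(x: int, y: int) -> int:
    return x + y


demo_tools = {"add": demo_add}
demo_calls = [
    ("add", {"x": 1, "y": 2}),   # succeeds
    ("divide", {"x": 1, "y": 2}),  # no such tool
    ("add", {"x": 1}),           # missing argument, raises TypeError
]
demo_messages = run_tool_calls(demo_tools, demo_calls)
for message in demo_messages:
    print(message)
```

Returning every failure as a message rather than raising keeps the loop going, so the model sees the error text on its next turn and can retry with different arguments.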

In [257]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Useful function to add two numbers."""
    return x + y


def multiply(x: int, y: int) -> int:
    """Useful function to multiply two numbers."""
    return x * y


tools = [
    FunctionTool.from_defaults(add),
    FunctionTool.from_defaults(multiply),
]

agent = FunctionCallingAgent(
    llm=llm,
    tools=tools,
    timeout=120,
    verbose=True,
)

In [283]:
ret = await agent.run(input="What is (29485 + 193482) * 999?")

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ToolCallEvent
Running step handle_tool_calls
Step handle_tool_calls produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event StopEvent


In [284]:
display_markdown(ret['response'].message.content)

The result of (29,485 + 193,482) * 999 is 222,744,033.
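
The agent's answer can be verified by chaining the two tool functions by hand, in the same order the agent is expected to call them. This is a small sketch that redefines `add` and `multiply` so the block is self-contained; `total` and `product` are names introduced here.

```python
def add(x: int, y: int) -> int:
    """Add two numbers."""
    return x + y


def multiply(x: int, y: int) -> int:
    """Multiply two numbers."""
    return x * y


total = add(29485, 193482)      # first tool call
product = multiply(total, 999)  # second tool call uses the first result
print(total, product)  # 222967 222744033
```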

In [None]:
from llama_index.utils.workflow import draw_all_possible_flows

async def main():
    agent = FunctionCallingAgent(
        llm=llm,
        tools=tools,
        timeout=120,
        verbose=False
    )
    handler = agent.run(input="What is (29485 + 193482) * 999?")

    async for ev in handler.stream_events():
        print(ev)

    final_result = await handler
    print("Final result")
    display_markdown(final_result['response'].message.content, clear=False)

    draw_all_possible_flows(agent, filename="streaming_workflow.html")

await main()

input=[ChatMessage(role=<MessageRole.USER: 'user'>, content='What is (29485 + 193482) * 999?', additional_kwargs={})]
msg='LLM response: assistant: '
tool_calls=[ToolSelection(tool_id='add', tool_name='add', tool_kwargs={'x': 29485, 'y': 193482})]
tool_call=ToolSelection(tool_id='add', tool_name='add', tool_kwargs={'x': 29485, 'y': 193482}) output='222967'
input=[ChatMessage(role=<MessageRole.USER: 'user'>, content='What is (29485 + 193482) * 999?', additional_kwargs={}), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='', additional_kwargs={'tool_calls': [{'function': {'name': 'add', 'arguments': {'x': 29485, 'y': 193482}}}]}), ChatMessage(role=<MessageRole.TOOL: 'tool'>, content='222967', additional_kwargs={'tool_call_id': 'add', 'name': 'add'})]
msg='LLM response: assistant: '
tool_calls=[ToolSelection(tool_id='multiply', tool_name='multiply', tool_kwargs={'x': 222967, 'y': 999})]
tool_call=ToolSelection(tool_id='multiply', tool_name='multiply', tool_kwargs={'x': 2229

The result of (29485 + 193482) * 999 is 222744033.