![Slide One](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/1.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Two](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/2.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Three](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/3.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Four](https://d3ddy8balm3goa.cloudfront.net/vector-oss-tools/draft/4.svg)

## Observability: Arize AI

Follow the quickstart guide found [here](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-llama-index#quickstart).

In [None]:
%pip install --upgrade \
    openinference-instrumentation-llama-index \
    opentelemetry-sdk \
    opentelemetry-exporter-otlp \
    "opentelemetry-proto>=1.12.0" \
    arize-phoenix -q


Note: you may need to restart the kernel to use updated packages.


In [None]:
import os
get_ipython().system = os.system

!python -m phoenix.server.main serve > arize.log 2>&1 &

0

In [None]:
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(
    SimpleSpanProcessor(OTLPSpanExporter(endpoint))
)

LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)

Now open a web browser and go to `http://localhost:6006/`.

## Example: A Gang of LLMs Tell A Story

In [None]:
# INSTALL LLM INTEGRATION PACKAGES
%pip install llama-index-llms-openai -q
%pip install llama-index-llms-cohere -q
%pip install llama-index-llms-anthropic -q
%pip install llama-index-llms-mistralai -q
%pip install llama-index-vector-stores-qdrant -q
%pip install llama-index-agent-openai -q
%pip install llama-index-agent-introspective -q
%pip install google-api-python-client -q
%pip install llama-index-program-openai -q
%pip install llama-index-readers-file -q

# INSTALL OTHER DEPS
%pip install pyvis -q

In [None]:
import nest_asyncio

nest_asyncio.apply()

In [None]:
from llama_index.llms.anthropic import Anthropic
from llama_index.llms.cohere import Cohere
from llama_index.llms.mistralai import MistralAI
from llama_index.llms.openai import OpenAI

anthropic_llm = Anthropic(model="claude-3-opus-20240229")
cohere_llm = Cohere(model="command")
mistral_llm = MistralAI(model="mistral-large-latest")
openai_llm = OpenAI(model="gpt-4o")

In [None]:
theme = "over-the-top pizza toppings"
start = anthropic_llm.complete(
    f"Please start a random story around {theme}. Limit your response to 20 words."
)
print(start)

In a world where pizza toppings knew no bounds, one daring chef created a masterpiece: the "Everything But The...


In [None]:
middle = cohere_llm.complete(
    f"Please continue the provided story. Limit your response to 20 words.\n\n {start.text}"
)
climax = mistral_llm.complete(
    f"Please continue the attached story. Your part is the climax of the story, so make it exciting! Limit your response to 20 words.\n\n {start.text + middle.text}"
)
ending = openai_llm.complete(
    f"Please continue the attached story. Your part is the end of the story, so wrap it up! Limit your response to 20 words.\n\n {start.text + middle.text + climax.text}"
)

In [None]:
# let's see our story!
print(f"{start}\n\n{middle}\n\n{climax}\n\n{ending}")

In a world where pizza toppings knew no bounds, one daring chef created a masterpiece: the "Everything But The...

 ...Nay, Even The Kitchen Sink" pizza. His culinary madness challenged conventions, yielding a legendary creation. 

Would you like me to take the story in a specific direction? 

Suddenly, the pizza came alive, its toppings swirling into a vortex, revealing a portal to a new, delicious dimension.

The chef stepped through, greeted by sentient toppings who crowned him their king, forever uniting worlds through culinary magic.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Five](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/5.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Six](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/6.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Seven](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/7.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Eight](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/8.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Nine](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/9.svg)

## Example: LLMs Lack Access To Updated Data

In [None]:
# should be able to answer this without additional context
response = mistral_llm.complete(
    "What can you tell me about the Royal Bank of Canada?"
)

In [None]:
print(response)

The Royal Bank of Canada (RBC) is the largest bank in Canada by market capitalization and one of the largest banks in the world based on market capitalization. It was founded in Halifax, Nova Scotia in 1864, and its corporate headquarters are now located in Toronto, Ontario.

RBC provides a broad range of financial services, including personal and commercial banking, wealth management, insurance, investor services, and capital markets products and services on a global basis. It serves more than 16 million clients and has over 80,000 employees worldwide.

The bank operates through several branches across Canada and has a presence in the United States and 34 other countries. It is one of Canada's Big Five banks, a group that includes the Bank of Montreal, Canadian Imperial Bank of Commerce, Bank of Nova Scotia, and Toronto-Dominion Bank.

RBC is known for its commitment to corporate social responsibility and has been recognized for its efforts in areas such as environmental sustainabilit

In [None]:
# a query that needs Annual Engagement Survey 2023
query = "According to the 2023 Engagement Survey, what percentage of promotions were given to women employees?"

response = mistral_llm.complete(query)
print(response)

I don't have real-time data access to provide the exact percentage of promotions given to women employees in the 2023 Engagement Survey. Please refer to the original survey report or check with the organization that conducted the survey for the most accurate information.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Ten](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/10.svg)

## Example: RAG Yields More Accurate Responses

In [None]:
!mkdir data
!wget "https://www.rbc.com/investor-relations/_assets-custom/pdf/ar_2023_e.pdf" -O "./data/RBC-Annual-Report-2023.pdf"

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# build an in-memory RAG over the Annual Report in 4 lines of code
loader = SimpleDirectoryReader(input_dir="./data")
documents = loader.load_data()
index = VectorStoreIndex.from_documents(documents)
rag = index.as_query_engine(llm=mistral_llm)

In [None]:
response = rag.query(query)

In [None]:
print(response)

According to the data provided, 54% of promotions were given to women employees. This is based on self-identification and excludes certain groups such as summer interns, students, co-ops, City National Bank and RBC Brewin Dolphin. The information is part of the company's focus on Diversity & Inclusion.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Eleven](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/11.svg)

## Example: 3 Steps For Basic RAG (Unpacking the previous Example RAG)

### Step 1: Build Knowledge Store

In [None]:
"""Load the data.

With llama-index, before any transformations are applied,
data is loaded in the `Document` abstraction, which is
a container that holds the text of the document.
"""

from llama_index.core import SimpleDirectoryReader

loader = SimpleDirectoryReader(input_dir="./data")
documents = loader.load_data()

In [None]:
# if you want to see what the text looks like
documents[1].text

'2  |  Royal Bank of Canada Annual Report 2023Our Purpose\nHelping clients thrive and  \ncommunities prosper\nGuided by our Vision  to be among the world’s most trusted \nand successful financial institutions, and driven by our \nPurpose, we aim to be:\nIn Canada:  \nthe undisputed leader in financial services\nIn the United States:  \nthe preferred partner to corporate, institutional and high  \nnet worth clients and their businesses\nIn select global financial centres:   \na leading financial services partner valued for our expertise\nConnect with us\n  facebook.com/rbc\n  instagram.com/rbc x.com/rbc\n  youtube.com/user/RBC  linkedin.com/company/rbc\n tiktok.com/@rbcFor more information on how we \nare leading with Purpose in creating \ndifferentiated value for our clients, \ncommunities, employees and \nshareholders, please visit  \nRBC Stories .We are guided by our Values :\n\uf0a1   Client First\n\uf0a1    Collaboration\n\uf0a1   Accountability\n\uf0a1   Diversity & Inclusion\n\uf

In [None]:
"""Chunk, Encode, and Store into a Vector Store.

To streamline the process, we can make use of the IngestionPipeline
class, which applies your specified transformations to the
Documents.
"""

from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
import qdrant_client

client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        OpenAIEmbedding(),
    ],
    vector_store=vector_store,
)
_nodes = pipeline.run(documents=documents, num_workers=4)
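
To make the "chunk" step a little more concrete, here is a minimal sketch of splitting text into overlapping windows. This is not how `SentenceSplitter` works internally (it is sentence-aware and counts tokens); the `chunk_words` helper, its word-based sizes, and the toy input are all assumptions made for illustration.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into word windows of chunk_size that share `overlap` words.

    Hypothetical helper: real splitters usually work on tokens and try to
    respect sentence boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start : start + chunk_size]))
        # stop once the current window already reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


# toy input: 20 placeholder "words" w0 ... w19
sample = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(sample, chunk_size=8, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap means the last words of one chunk are repeated at the start of the next, so a sentence that straddles a boundary is less likely to be lost at retrieval time.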



In [None]:
"""Create a llama-index... wait for it... Index.

After uploading your encoded documents into your vector
store of choice, you can connect to it with a VectorStoreIndex
which then gives you access to all of the llama-index functionality.
"""

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

### Step 2: Retrieve Against A Query

In [None]:
"""Retrieve relevant documents against a query.

With our Index ready, we can now query it to
retrieve the most relevant document chunks.
"""

retriever = index.as_retriever(similarity_top_k=2)
retrieved_nodes = retriever.retrieve(query)

In [None]:
# to view the retrieved nodes
retrieved_nodes

[NodeWithScore(node=TextNode(id_='48cf88c2-cca9-435b-bd59-1f7cdfaa6251', embedding=None, metadata={'page_label': '15', 'file_name': 'RBC-Annual-Report-2023.pdf', 'file_path': '/Users/nerdai/talks/2024/genai-philippines/data/RBC-Annual-Report-2023.pdf', 'file_type': 'application/pdf', 'file_size': 7571657, 'creation_date': '2024-06-22', 'last_modified_date': '2023-11-29'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='f84f0ceb-9036-4b26-82b9-f2cbcd8b039f', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '15', 'file_name': 'RBC-Annual-Report-2023.pdf', 'file_path': '/Users/nerdai/talks/2024/genai-philippines/data/RBC-Annual-Report-2023.pdf', 'file_type': 'application/pdf', 'file_size': 7571657
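
Conceptually, `similarity_top_k=2` means "score every stored chunk against the query embedding and keep the two best". Below is a small sketch of that idea in plain Python, assuming cosine similarity as the metric (the vector store may be configured differently). The three-dimensional vectors and the chunk ids are made up for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float], chunk_vecs: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k (chunk_id, score) pairs with the highest similarity."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# toy embeddings (hypothetical values, not real model output)
toy_chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.9, 0.1, 0.0],
}
toy_query = [1.0, 0.0, 0.0]

results = top_k(toy_query, toy_chunks, k=2)
for chunk_id, score in results:
    print(chunk_id, round(score, 4))
```

A real vector store avoids scoring every chunk by using an approximate nearest-neighbour index, but the ranking it returns plays the same role.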

### Step 3: Generate Final Response

In [None]:
"""Context-Augmented Generation.

With our Index ready, we can create a QueryEngine
that handles the retrieval and context augmentation
in order to get the final response.
"""

query_engine = index.as_query_engine(llm=mistral_llm)

In [None]:
# to inspect the default prompt being used
print(
    query_engine.get_prompts()[
        "response_synthesizer:text_qa_template"
    ].default_template.template
)

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 
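
To see what the LLM actually receives, we can fill in this template by hand. The sketch below copies the template text printed above and substitutes `{context_str}` and `{query_str}`. The `build_prompt` helper, the blank-line separator between chunks, and the example chunks are assumptions for illustration, not the library's own code.

```python
from typing import List

# Template text copied from the default prompt printed above.
TEXT_QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def build_prompt(chunks: List[str], query_str: str) -> str:
    """Join the retrieved chunks and substitute them into the template."""
    context_str = "\n\n".join(chunks)  # assumed separator between chunks
    return TEXT_QA_TEMPLATE.format(context_str=context_str, query_str=query_str)


# hypothetical retrieved chunks, for illustration only
example_chunks = ["Chunk one text.", "Chunk two text."]
prompt = build_prompt(example_chunks, "What does chunk one say?")
print(prompt)
```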


In [None]:
response = query_engine.query(query)
print(response)

According to the data provided, 54% of promotions were given to women employees. This is part of the organization's ongoing commitment to diversity and inclusion.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Twelve](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/12.svg)

[Hi-Resolution Cheat Sheet](https://d3ddy8balm3goa.cloudfront.net/llamaindex/rag-cheat-sheet-final.svg)

## Example: Graph RAG

In [None]:
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding

index = PropertyGraphIndex.from_documents(
    documents[10:20],
    llm=openai_llm,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    show_progress=True,
)

Parsing nodes: 100%|█████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 300.08it/s]
Extracting paths from text: 100%|█████████████████████████████████████████████████████████| 10/10 [00:06<00:00,  1.43it/s]
Extracting implicit paths: 100%|███████████████████████████████████████████████████████| 10/10 [00:00<00:00, 35848.75it/s]
Generating embeddings: 100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.01it/s]
Generating embeddings: 100%|████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  3.44it/s]


In [None]:
index.property_graph_store.save_networkx_graph(name="./kg.html")

In [None]:
retriever = index.as_retriever(
    include_text=False,  # include source text, default True
)

nodes = retriever.retrieve(query)

for node in nodes:
    print(node.text)

Bipoc -> Represented -> 45% of promotions
Women -> Represented -> 54% of promotions
Women -> Represented -> 43% of new executive appointments
Women -> Represented -> 49% of hires
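
The retrieved nodes are plain `subject -> relation -> object` strings, which makes them easy to post-process. Here is a small sketch that parses the four lines printed above into triples and filters them. The `Triple` type and `parse_triple` helper are hypothetical names introduced for this example, not part of llama-index.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def parse_triple(line: str) -> Triple:
    """Parse a 'subject -> relation -> object' string into a Triple."""
    subject, relation, obj = (part.strip() for part in line.split("->"))
    return Triple(subject, relation, obj)


# the four lines printed by the retriever above
lines = [
    "Bipoc -> Represented -> 45% of promotions",
    "Women -> Represented -> 54% of promotions",
    "Women -> Represented -> 43% of new executive appointments",
    "Women -> Represented -> 49% of hires",
]

triples: List[Triple] = [parse_triple(line) for line in lines]

# keep only the facts about women that mention promotions
women_promotions = [
    t.obj for t in triples if t.subject == "Women" and "promotions" in t.obj
]
print(women_promotions)
```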


In [None]:
query_engine = index.as_query_engine(
    include_text=True,
)

response = query_engine.query(query)

print(str(response))

According to the 2023 Engagement Survey, 54% of promotions were given to women employees.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Thirteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/13.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Fourteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/14.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Fifteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/15.svg)

## Example: Agent Ingredients — Tool Use

**Note:** LLMs are not very good pseudo-random number generators (see my [LinkedIn post](https://www.linkedin.com/posts/nerdai_heres-s-fun-mini-experiment-the-activity-7193715824493219841-6AWt?utm_source=share&utm_medium=member_desktop) about this).

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.agent.openai import OpenAIAgent
from numpy import random
from typing import List

In [None]:
def uniform_random_sample(n: int) -> List[float]:
    """Generate a list of uniform random numbers of size n between 0 and 1."""
    return random.rand(n).tolist()


rs_tool = FunctionTool.from_defaults(fn=uniform_random_sample)

In [None]:
agent = OpenAIAgent.from_tools([rs_tool], llm=openai_llm, verbose=True)

response = agent.chat(
    "Can you please give me a sample of 10 uniformly random numbers?"
)
print(str(response))

Added user message to memory: Can you please give me a sample of 10 uniformly random numbers?
=== Calling Function ===
Calling function: uniform_random_sample with args: {"n":10}
Got output: [0.8995277304883874, 0.9085039986112343, 0.8709820611695793, 0.7730695747942606, 0.5212311221928344, 0.9430700977083472, 0.4243954511057493, 0.024527217425000303, 0.7473927621289647, 0.13689797800590942]

Here is a sample of 10 uniformly random numbers:

1. 0.8995277304883874
2. 0.9085039986112343
3. 0.8709820611695793
4. 0.7730695747942606
5. 0.5212311221928344
6. 0.9430700977083472
7. 0.4243954511057493
8. 0.024527217425000303
9. 0.7473927621289647
10. 0.13689797800590942
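
The verbose log above shows the two halves of tool use: the LLM picks a function name and arguments, and the framework runs the matching Python function. Here is a rough sketch of both halves using only the standard library. `describe_tool`, `dispatch`, and `constant_sample` are hypothetical helpers for illustration, not what `FunctionTool` or `OpenAIAgent` do internally.

```python
import inspect
from typing import Any, Callable, Dict, List


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simple description from a function's name, docstring and signature."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def dispatch(
    tools: Dict[str, Callable[..., Any]], name: str, args: Dict[str, Any]
) -> Any:
    """Run the tool the LLM asked for, with the arguments it supplied."""
    return tools[name](**args)


def constant_sample(n: int) -> List[float]:
    """Return a list of n copies of 0.5 (a deterministic stand-in for a sampler)."""
    return [0.5] * n


spec = describe_tool(constant_sample)
print(spec)

# pretend the LLM replied with this function call
output = dispatch({"constant_sample": constant_sample}, "constant_sample", {"n": 3})
print(output)
```

This is why the docstring and type hints on a tool function matter: they are the only description of the tool the LLM gets to see.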


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Sixteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/16.svg)

## Example: Agent Ingredients — Composable Memory

In [None]:
from llama_index.core.memory import (
    VectorMemory,
    SimpleComposableMemory,
    ChatMemoryBuffer,
)
from llama_index.core.agent import FunctionCallingAgentWorker

In [None]:
vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 2},
)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)

In [None]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer."""
    return a * b


def mystery(a: int, b: int) -> int:
    """Mystery function on two numbers."""
    return a**2 - b**2


multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True
)
agent = agent_worker.as_agent(memory=composable_memory)

### Execute some function calls

In [None]:
response = agent.chat("What is the mystery function on 5 and 6?")

Added user message to memory: What is the mystery function on 5 and 6?
=== Calling Function ===
Calling function: mystery with args: {"a": 5, "b": 6}
=== Function Output ===
-11
=== LLM Response ===
The result of the mystery function on 5 and 6 is -11.


In [None]:
response = agent.chat("What happens if you multiply 2 and 3?")

Added user message to memory: What happens if you multiply 2 and 3?
=== Calling Function ===
Calling function: multiply with args: {"a": 2, "b": 3}
=== Function Output ===
6
=== LLM Response ===
Multiplying 2 and 3 gives you 6.


### New Agent Session

#### Without memory

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True
)
agent_without_memory = agent_worker.as_agent()

In [None]:
response = agent_without_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
I don't have the ability to recall past interactions or outputs. If you need the result of the mystery function on 5 and 6, I can recompute it for you. Would you like me to do that?


#### With memory

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True
)
composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # using a copy here for illustration purposes
        # later will use original vector_memory again
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)

In [None]:
agent_with_memory.chat_history  # an empty chat history

[]

In [None]:
response = agent_with_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
The output of the mystery function on 5 and 6 was -11.


In [None]:
response = agent_with_memory.chat(
    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
)

Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
=== LLM Response ===
The output of the multiply function on 2 and 3 was 6.


#### Under the hood

Calling `.chat()` will invoke `memory.get()`. For `SimpleComposableMemory`, the messages retrieved from secondary sources are added to the system prompt of the primary memory.

In [None]:
composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # copy for illustrative purposes to explain what
        # happened under the hood from previous subsection
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)

In [None]:
print(
    agent_with_memory.memory.get(
        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
    )[0]
)

system: You are a helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: What is the mystery function on 5 and 6?
	ASSISTANT: None
	TOOL: -11
	ASSISTANT: The result of the mystery function on 5 and 6 is -11.


This is the end of the retrieved message dialogues.
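
To make the composition step explicit, here is a sketch that rebuilds a system prompt in the same shape as the one printed above. It only mimics the printed format; the `compose_system_prompt` helper and the `(role, content)` tuples are assumptions for illustration, not the `SimpleComposableMemory` implementation.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def compose_system_prompt(base_prompt: str, sources: List[List[Message]]) -> str:
    """Append messages retrieved from each secondary source to the system prompt."""
    parts = [
        base_prompt,
        "",
        "Below are a set of relevant dialogues retrieved from "
        "potentially several memory sources:",
        "",
    ]
    for i, messages in enumerate(sources, start=1):
        parts.append(f"=====Relevant messages from memory source {i}=====")
        parts.append("")
        parts.extend(f"\t{role.upper()}: {content}" for role, content in messages)
        parts.append("")
    parts.append("This is the end of the retrieved message dialogues.")
    return "\n".join(parts)


# messages a vector memory might return for the "mystery function" question
retrieved = [
    [
        ("user", "What is the mystery function on 5 and 6?"),
        ("tool", "-11"),
        ("assistant", "The result of the mystery function on 5 and 6 is -11."),
    ]
]

system_prompt = compose_system_prompt("You are a helpful assistant.", retrieved)
print(system_prompt)
```

Because the retrieved dialogue lives in the system prompt, the agent can answer "Don't recompute" questions without calling the tool again, even though its chat history is empty.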


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Seventeen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/17.svg)

## Example: Reflection Toxicity Reduction

Here, we'll use llama-index's `ToolInteractiveReflectionAgent` to perform reflection and correction cycles on potentially harmful text. See the full demo [here](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/agent/introspective_agent_toxicity_reduction.ipynb).

The first thing we will do here is define the `PerspectiveTool`, which our `ToolInteractiveReflectionAgent` will make use of through another agent, namely a `CritiqueAgent`.

To use Perspective's API, you will need to complete the following steps:

1. Enable the Perspective API in your Google Cloud project.
2. Generate a new set of credentials (i.e., an API key), which you will need to either set as the env var `PERSPECTIVE_API_KEY` or supply directly in the appropriate parts of the code that follows.

To perform steps 1 and 2, you can follow the instructions outlined here: https://developers.perspectiveapi.com/s/docs-enable-the-api?language=en_US.

### Perspective API as Tool

In [None]:
from llama_index.core.bridge.pydantic import Field

from googleapiclient import discovery
from typing import Dict, Optional, Tuple
import os

In [None]:
class Perspective:
    """Custom class to interact with Perspective API."""

    attributes = [
        "toxicity",
        "severe_toxicity",
        "identity_attack",
        "insult",
        "profanity",
        "threat",
        "sexually_explicit",
    ]

    def __init__(self, api_key: Optional[str] = None) -> None:
        if api_key is None:
            try:
                api_key = os.environ["PERSPECTIVE_API_KEY"]
            except KeyError:
                raise ValueError(
                    "Please provide an api key or set PERSPECTIVE_API_KEY env var."
                )

        self._client = discovery.build(
            "commentanalyzer",
            "v1alpha1",
            developerKey=api_key,
            discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
            static_discovery=False,
        )

    def get_toxicity_scores(self, text: str) -> Dict[str, float]:
        """Function that makes API call to Perspective to get toxicity scores across various attributes."""
        analyze_request = {
            "comment": {"text": text},
            "requestedAttributes": {
                att.upper(): {} for att in self.attributes
            },
        }

        response = (
            self._client.comments().analyze(body=analyze_request).execute()
        )
        try:
            return {
                att: response["attributeScores"][att.upper()]["summaryScore"][
                    "value"
                ]
                for att in self.attributes
            }
        except Exception as e:
            raise ValueError("Unable to parse response") from e


perspective = Perspective()

In [None]:
def perspective_function_tool(
    text: str = Field(
        default_factory=str,
        description="The text to compute toxicity scores on.",
    )
) -> Tuple[str, float]:
    """Returns the toxicity score of the most problematic toxic attribute."""
    scores = perspective.get_toxicity_scores(text=text)
    max_key = max(scores, key=scores.get)
    return (max_key, scores[max_key] * 100)


from llama_index.core.tools import FunctionTool

pespective_tool = FunctionTool.from_defaults(
    perspective_function_tool,
)

In [None]:
perspective_function_tool(text="friendly greetings from python")

('toxicity', 2.6028076)

### Build Agent To Reduce Toxicity of Harmful Text

In [None]:
from llama_index.agent.introspective import IntrospectiveAgentWorker
from llama_index.agent.introspective import (
    ToolInteractiveReflectionAgentWorker,
)
from llama_index.core.llms import ChatMessage, MessageRole

In [None]:
verbose = True
critique_agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=[pespective_tool], llm=OpenAI("gpt-3.5-turbo"), verbose=verbose
)
correction_llm = OpenAI("gpt-4-turbo-preview")


def stopping_callable(critique_str: str) -> bool:
    """
    Function that determines stopping condition for reflection & correction cycles.

    critique_str [str]: The response string provided by the critique agent.
    """
    return "[PASS]" in critique_str


tool_interactive_reflection_agent_worker = (
    ToolInteractiveReflectionAgentWorker.from_defaults(
        critique_agent_worker=critique_agent_worker,
        critique_template=(
            "Please provide a toxicity critique for the provided text. "
            "If the toxicity score is less than 3.0 then please "
            "write '[PASS]' otherwise write '[FAIL]'. "
            "Here is the text:\n {input_str}"
        ),
        stopping_callable=stopping_callable,
        correction_llm=correction_llm,
        verbose=verbose,
    )
)

introspective_agent_worker = IntrospectiveAgentWorker.from_defaults(
    reflective_agent_worker=tool_interactive_reflection_agent_worker,
    verbose=verbose,
)

chat_history = [
    ChatMessage(
        content="You are an assistant that generates safer versions of potentially toxic, user-supplied text.",
        role=MessageRole.SYSTEM,
    )
]

introspective_agent = introspective_agent_worker.as_agent(
    chat_history=chat_history, verbose=verbose
)
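
Before running the agent, it may help to see the shape of the loop it performs: critique the text, stop if the critique contains `[PASS]`, otherwise correct and try again. The sketch below uses stub functions in place of the critique agent and correction LLM. `reflect_and_correct`, `max_iterations`, and the toy scoring rule are assumptions for illustration, not the `ToolInteractiveReflectionAgentWorker` implementation.

```python
from typing import Callable, Tuple


def stopping_callable(critique_str: str) -> bool:
    """Same stopping rule as above: stop once the critique contains [PASS]."""
    return "[PASS]" in critique_str


def reflect_and_correct(
    text: str,
    critique: Callable[[str], str],
    correct: Callable[[str, str], str],
    max_iterations: int = 5,  # hypothetical safety limit
) -> Tuple[str, int]:
    """Alternate critique and correction until the critique passes."""
    for iteration in range(max_iterations):
        critique_str = critique(text)
        if stopping_callable(critique_str):
            return text, iteration
        text = correct(text, critique_str)
    return text, max_iterations


def toy_critique(text: str) -> str:
    """Stub critic: each '!' adds 10 points to a made-up score."""
    score = 10.0 * text.count("!")
    verdict = "[PASS]" if score < 3.0 else "[FAIL]"
    return f"score={score:.1f} {verdict}"


def toy_correct(text: str, critique_str: str) -> str:
    """Stub corrector: remove one '!' per round."""
    return text.replace("!", "", 1)


final_text, rounds = reflect_and_correct("Stop that!!", toy_critique, toy_correct)
print(final_text, rounds)
```

In the real agent, the critique comes from an LLM that calls the Perspective tool, and the correction comes from `correction_llm`.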

In [None]:
harmful_text = (
    "Those individuals who publicly tout eating no meat, "
    "especially when their stated reason has to do with caring "
    "about animals, are some of the most hypocritical people I’ve "
    "ever met. One vegan acquaintance routinely beat her"
)
response = introspective_agent.chat(harmful_text)

> Running step 6e267895-b083-46a9-8f0c-ef6986c819fd. Step input: Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
Added user message to memory: Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
> Running step 7c3dafd7-6c31-401a-8e1c-6f5e6ca2f18c. Step input: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:
 Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
Ad

In [None]:
print(response)

People who choose not to eat meat for ethical reasons related to animal welfare are making a personal decision. It's important to respect diverse perspectives and experiences.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Eighteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/18.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Nineteen](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/19.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide Twenty](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/20.svg)

## Example: Agentic RAG

In [None]:
from llama_index.core.tools import ToolMetadata
from llama_index.core.tools import QueryEngineTool

In [None]:
!mkdir vector_data
!wget "https://vectorinstitute.ai/wp-content/uploads/2024/02/Vector-Annual-Report-2022-23_accessible_rev0224-1.pdf" -O "./vector_data/Vector-Annual-Report-2022-23_accessible_rev0224-1.pdf"

In [None]:
# Build basic RAG over Vector
vector_loader = SimpleDirectoryReader(input_dir="./vector_data")
vector_documents = vector_loader.load_data()
vector_index = VectorStoreIndex.from_documents(vector_documents)
vector_query_engine = vector_index.as_query_engine(llm=mistral_llm)

In [None]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name="rbc_annual_report_2023",
            description=("Provides information about RBC in the year 2023."),
        ),
    ),
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="vector_annual_report_2023",
            description=(
                "Provides information about Vector in the year 2023."
            ),
        ),
    ),
]

In [None]:
agent = OpenAIAgent.from_tools(query_engine_tools, verbose=True)

In [None]:
response = agent.chat(query)

Added user message to memory: According to the 2023 Engagement Survey, what percentage of promotions were given to women employees?
=== Calling Function ===
Calling function: rbc_annual_report_2023 with args: {"input":"percentage of promotions given to women employees"}
Got output: 54%



In [None]:
print(response)

According to the 2023 Engagement Survey, 54% of promotions were given to women employees.


In [None]:
response = agent.chat(
    "According to Vector Institute's Annual Report 2022-2023, "
    "how many AI jobs were created in Ontario?"
)

Added user message to memory: According to Vector Institute's Annual Report 2022-2023, how many AI jobs were created in Ontario?
=== Calling Function ===
Calling function: vector_annual_report_2023 with args: {"input":"number of AI jobs created in Ontario"}
Got output: The number of AI jobs created in Ontario is 20,634.



In [None]:
print(response)

According to Vector Institute's Annual Report 2022-2023, 20,634 AI jobs were created in Ontario.


![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide TwentyOne](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/21.svg)

## Example: Multi-hop Agent (WIP)

At the time of this presentation, this is still ongoing work, but despite its unfinished status, it demonstrates the flexibility and advantages of using an agentic interface over external knowledge bases (i.e., RAG).

With the multi-hop agent, we aim to solve queries by first planning out the data elements that must be retrieved in order to answer the question. So we're really combining a few concepts here:

- planning
- structured data extraction (using a RAG tool)
- reflection/correction

![multi-hop agent](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/multi-hop-agent.excalidraw.svg)

In [None]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Document
from llama_index.core.tools import QueryEngineTool

index = VectorStoreIndex.from_documents([Document.example()])
tool = QueryEngineTool.from_defaults(
    index.as_query_engine(),
    name="dummy",
    description="dummy",
)

In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.agent.multi_hop.planner import MultiHopPlannerAgent

# create the function calling worker for reasoning
worker = FunctionCallingAgentWorker.from_tools([tool], verbose=True)

# wrap the worker in the top-level planner
agent = MultiHopPlannerAgent(worker, tools=[tool], verbose=True)

In [None]:
agent.create_plan(
    input="Who is more than just a film director, Gene Kelly or Yannis Smaragdis?"
)

=== Initial plan ===
## Structured Context
{
    "title": "StructuredContext",
    "description": "Data class for holding data requirements to answer query",
    "type": "object",
    "properties": {
        "original_query": {
            "title": "Original Query",
            "description": "The original query",
            "type": "string"
        },
        "sub_question_1": {
            "title": "Sub Question 1",
            "description": "Sub question 1: What are the notable achievements of Gene Kelly as a performer and director?",
            "type": "string"
        },
        "sub_question_2": {
            "title": "Sub Question 2",
            "description": "Sub question 2: What are the notable achievements of Yannis Smaragdis as a filmmaker?",
            "type": "string"
        }
    }
}

## Sub Tasks
original_query:
Extract this data field. -> The original query
deps: []


sub_question_1:
Extract this data field. -> Sub question 1: What are the notable achievements of

'9b5607b4-26fd-44ee-8f82-b86d136e5e79'
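
Each sub-task in the plan above carries a `deps` list, which suggests the tasks can be run in dependency order. Here is a small sketch of that ordering using the standard library's `graphlib`. The `final_answer` task and its dependencies are hypothetical additions for illustration (the printed plan only shows empty `deps`), and this is not the `MultiHopPlannerAgent` implementation.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# task name -> names of the tasks it depends on
# ("final_answer" is a hypothetical task added for this example)
plan: Dict[str, List[str]] = {
    "original_query": [],
    "sub_question_1": [],
    "sub_question_2": [],
    "final_answer": ["sub_question_1", "sub_question_2"],
}


def execution_order(deps: Dict[str, List[str]]) -> List[str]:
    """Return task names ordered so that dependencies always come first."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(plan)
print(order)
```

Tasks with empty `deps` can run in any order (or in parallel); only tasks that depend on earlier answers have to wait.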

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide TwentyThree](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/23.svg)

![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)
![Slide TwentyTwo](https://d3ddy8balm3goa.cloudfront.net/genai-philippines/v1/22.svg)