# Reflection Service for Toxicity Reduction

**NOTE**: This is adapted from the original notebook in the [core llama-index repo](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/agent/llama-index-agent-introspective/examples/toxicity_reduction.ipynb).

In this notebook, we cover how to set up a reflection service that can perform toxicity reflection and correction.

We make use of two types of reflection services as "agents" in llama-agents: 

- A self-reflection agent that can reflect and correct a given response without any external tools
- A CRITIC agent that can reflect and correct a given response using external tools.

We set these up as **independent** services, meaning they don't communicate. The purpose of this notebook is to show you how to convert a reflection agent into a service that you can interact with.

In this notebook we make use of our prepackaged reflection agents using our `llama-index-agent-introspective` LlamaPack. This is primarily for concision.

*However*, if you wish to build reflection from scratch, we highly encourage you to do so! All LlamaPacks from LlamaHub can and should be downloaded locally and inspected or modified directly as code files.

In [None]:
%pip install llama-index-agent-introspective -q
%pip install google-api-python-client -q
%pip install llama-index-llms-openai -q
%pip install llama-index-program-openai -q
%pip install llama-index-readers-file -q

Note: you may need to restart the kernel to use updated packages.


In [None]:
import nest_asyncio

nest_asyncio.apply()

## 1 Toxicity Reduction: Problem Setup

In this notebook, the task we'll have our introspective agents perform is "toxicity reduction". In particular, given a certain harmful text we'll ask the agent to produce a less harmful (or more safe) version of the original text. As mentioned before, our introspective agent will do this by performing reflection and correction cycles until reaching an adequately safe version of the toxic text.
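
The reflection and correction cycle described above can be sketched in plain Python. This is a toy illustration, not the LlamaPack implementation: `reflect_and_correct`, `toy_critique`, and `toy_correct` are hypothetical names, and the word-list critique stands in for what would be an LLM (and optionally a tool) call in a real agent.

```python
from typing import Callable, Tuple


def reflect_and_correct(
    text: str,
    critique: Callable[[str], Tuple[bool, str]],
    correct: Callable[[str, str], str],
    max_iterations: int = 3,
) -> str:
    """Alternate critique and correction until the critique passes or the budget runs out."""
    for _ in range(max_iterations):
        is_done, feedback = critique(text)
        if is_done:
            break
        text = correct(text, feedback)
    return text


# Toy stand-ins: a real agent would call an LLM (and optionally a tool) here.
def toy_critique(text: str) -> Tuple[bool, str]:
    flagged = [w for w in ("pathetic", "needy") if w in text]
    # Done when nothing is flagged; feedback is a comma-separated word list.
    return (not flagged, ",".join(flagged))


def toy_correct(text: str, feedback: str) -> str:
    for word in feedback.split(","):
        text = text.replace(word, "[redacted]")
    return text


result = reflect_and_correct(
    "You can be pathetic and needy.", toy_critique, toy_correct
)
print(result)
```

The `max_iterations` cap is a design choice for the sketch: it guarantees termination even if the critique never passes.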

### 1.a Set up our CRITIC Agent

Our CRITIC Agent makes use of an external tool to reflect on and validate the response, and then corrects it. We will use our prepackaged `ToolInteractiveReflectionAgent` for this purpose.

The CRITIC agent delegates the critique subtask to a `CritiqueAgentWorker`, and then performs correction with a standalone LLM call.

The first thing we will do here is define the `PerspectiveTool`, which our `ToolInteractiveReflectionAgent` will make use of through another agent, namely a `CritiqueAgent`.

To use Perspective's API, you will need to do the following steps:

1. Enable the Perspective API in your Google Cloud project.
2. Generate a new set of credentials (i.e., an API key), then either set it as the `PERSPECTIVE_API_KEY` environment variable or pass it directly to the code that follows.

To perform steps 1 and 2, follow the instructions outlined here: https://developers.perspectiveapi.com/s/docs-enable-the-api?language=en_US.

#### Build `PerspectiveTool`

In [None]:
from googleapiclient import discovery
from typing import Dict, Optional
import json
import os


class Perspective:
    """Custom class to interact with Perspective API."""

    attributes = [
        "toxicity",
        "severe_toxicity",
        "identity_attack",
        "insult",
        "profanity",
        "threat",
        "sexually_explicit",
    ]

    def __init__(self, api_key: Optional[str] = None) -> None:
        if api_key is None:
            try:
                api_key = os.environ["PERSPECTIVE_API_KEY"]
            except KeyError:
                raise ValueError(
                    "Please provide an api key or set PERSPECTIVE_API_KEY env var."
                )

        self._client = discovery.build(
            "commentanalyzer",
            "v1alpha1",
            developerKey=api_key,
            discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
            static_discovery=False,
        )

    def get_toxicity_scores(self, text: str) -> Dict[str, float]:
        """Function that makes API call to Perspective to get toxicity scores across various attributes."""

        analyze_request = {
            "comment": {"text": text},
            "requestedAttributes": {att.upper(): {} for att in self.attributes},
        }

        response = self._client.comments().analyze(body=analyze_request).execute()
        try:
            return {
                att: response["attributeScores"][att.upper()]["summaryScore"]["value"]
                for att in self.attributes
            }
        except Exception as e:
            raise ValueError("Unable to parse response") from e


perspective = Perspective()
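
To make the parsing in `get_toxicity_scores` easier to follow, here is a small sketch that applies the same dictionary lookup to a hand-written response. The `mock_response` values are invented for illustration and the helper name `parse_scores` is hypothetical; only the nesting (`attributeScores` → attribute → `summaryScore` → `value`) mirrors what the class above reads.

```python
from typing import Any, Dict, List

# Hand-written stand-in for a Perspective response; the scores are invented.
mock_response: Dict[str, Any] = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.026, "type": "PROBABILITY"}},
        "INSULT": {"summaryScore": {"value": 0.011, "type": "PROBABILITY"}},
    }
}


def parse_scores(response: Dict[str, Any], attributes: List[str]) -> Dict[str, float]:
    """Pull the summary score for each requested attribute out of the response."""
    return {
        att: response["attributeScores"][att.upper()]["summaryScore"]["value"]
        for att in attributes
    }


scores = parse_scores(mock_response, ["toxicity", "insult"])
print(scores)
```

Requesting an attribute that is missing from the response raises a `KeyError` here, which is the situation the `try`/`except` in `get_toxicity_scores` converts into a `ValueError`.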

With the helper class in hand, we can define our tool by first defining a function and then making use of the `FunctionTool` abstraction.

In [None]:
from typing import Tuple
from llama_index.core.bridge.pydantic import Field


def perspective_function_tool(
    text: str = Field(
        default_factory=str, description="The text to compute toxicity scores on."
    )
) -> Tuple[str, float]:
    """Returns the toxicity score of the most problematic toxic attribute."""

    scores = perspective.get_toxicity_scores(text=text)
    max_key = max(scores, key=scores.get)
    return (max_key, scores[max_key] * 100)


from llama_index.core.tools import FunctionTool

pespective_tool = FunctionTool.from_defaults(
    perspective_function_tool,
)

A simple test of our perspective tool!

In [None]:
perspective_function_tool(text="friendly greetings from python")

('toxicity', 2.6028076)

#### Build a stateful agent function with `ToolInteractiveReflectionAgent`

We define a stateful agent function that wraps the prepackaged `ToolInteractiveReflectionAgent`. This stateful agent function will then be directly turned into a service.

**NOTE**: This CRITIC agent is using `ToolInteractiveReflectionAgent` out of convenience (which is why it looks like an agent wrapping another agent). If you're building reflection from scratch we highly encourage you to define your own custom stateful agent function using the `FnAgentWorker`!


In [None]:
from llama_index.agent.introspective import ToolInteractiveReflectionAgentWorker

from llama_index.llms.openai import OpenAI
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.llms import ChatMessage, MessageRole


def get_tool_interactive_reflection_agent(verbose: bool = True):
    """Helper function for getting the tool-interactive reflection agent.

    Steps:

    Define the `ToolInteractiveReflectionAgentWorker`
    - Construct a CritiqueAgentWorker that performs reflection with tools.
    - Define an LLM that will be used to generate corrections against the critique.
    - Define a function that determines the stopping condition for reflection/correction
            cycles
    - Construct `ToolInteractiveReflectionAgentWorker` using .from_defaults()
    """
    critique_agent_worker = FunctionCallingAgentWorker.from_tools(
        tools=[pespective_tool], llm=OpenAI("gpt-3.5-turbo"), verbose=verbose
    )
    correction_llm = OpenAI("gpt-4-turbo-preview")

    def stopping_callable(critique_str: str) -> bool:
        """Function that determines stopping condition for reflection & correction cycles.

        critique_str [str]: The response string provided by the critique agent.
        """

        return "[PASS]" in critique_str

    agent_worker = ToolInteractiveReflectionAgentWorker.from_defaults(
        critique_agent_worker=critique_agent_worker,
        critique_template=(
            "Please provide a toxicity critique for the provided text. "
            "If the toxicity score is less than 3.0 then please "
            "write '[PASS]' otherwise write '[FAIL]'. "
            "Here is the text:\n {input_str}"
        ),
        stopping_callable=stopping_callable,
        correction_llm=correction_llm,
        verbose=verbose,
    )
    chat_history = [
        ChatMessage(
            content="You are an assistant that generates safer versions of potentially toxic, user-supplied text.",
            role=MessageRole.SYSTEM,
        )
    ]

    return agent_worker.as_agent(chat_history=chat_history)


# critic_agent_prepackaged = get_tool_interactive_reflection_agent(verbose=True)
critic_agent = get_tool_interactive_reflection_agent(verbose=True)
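
The stopping condition is a plain substring check on the critique agent's reply, so it can be exercised without any LLM call. In the sketch below the critique strings are invented for illustration; real critiques are free-form LLM output.

```python
def stopping_callable(critique_str: str) -> bool:
    """Same check as above: stop once the critique contains '[PASS]'."""
    return "[PASS]" in critique_str


# Invented critique strings, standing in for the critique agent's replies.
critiques = [
    "The most problematic attribute is 'insult' with a score of 41.2. [FAIL]",
    "The toxicity score is 2.1, which is below 3.0. [PASS]",
]
decisions = [stopping_callable(c) for c in critiques]
print(decisions)  # → [False, True]
```

Because this is a substring match, a critique that happens to quote both `[PASS]` and `[FAIL]` would be treated as passing, so the critique template needs to keep the reply format unambiguous.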

In [None]:
# # TODO: uncomment if you want to write your own reflection agent
# from llama_index.core.agent import FnAgentWorker
# from typing import Dict, Any, Tuple

# def reflection_agent_fn(state: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
#     """Reflection agent function."""

#     # TODO: get inputs from `state` dict
#     # __task__ is a pre-filled variable
#     # you can inject other variables through defining `initial_state` on agent initialization below
#     input_str = state["__task__"].input

#     # TODO: put logic here

#     # TODO: inject output
#     state["__output__"] = ...
#     return state, True

# custom_reflection_agent = FnAgentWorker(
#     fn=reflection_agent_fn, initial_state={
#         ...
#     }
# ).as_agent()
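
As a rough illustration of how the template above might be filled in, the sketch below implements one reflect/correct step per call. It is deliberately free of `llama_index` imports so it can run on its own: `SimpleNamespace` stands in for the task object behind `state["__task__"]`, the `BLOCKLIST`, `draft`, `steps`, and `max_steps` names are hypothetical, and the driver loop only imitates the assumption that the agent calls the function repeatedly until it returns `True`.

```python
from types import SimpleNamespace
from typing import Any, Dict, Tuple

BLOCKLIST = ("dumb", "pathetic")  # hypothetical word list, for illustration only


def reflection_agent_fn(state: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
    """One reflect/correct step per call; returns (state, is_done)."""
    # Start from the task input on the first call, then from the last draft.
    draft = state.get("draft", state["__task__"].input)

    # "Reflection": a real agent would ask an LLM or a tool for a critique.
    flagged = [w for w in BLOCKLIST if w in draft]
    if not flagged or state["steps"] >= state["max_steps"]:
        state["__output__"] = draft
        return state, True

    # "Correction": a real agent would call an LLM here.
    for word in flagged:
        draft = draft.replace(word, "[removed]")
    state["draft"] = draft
    state["steps"] += 1
    return state, False


# Stand-in driver loop; `steps` and `max_steps` play the role of `initial_state`.
state: Dict[str, Any] = {
    "__task__": SimpleNamespace(input="Stop being so dumb."),
    "steps": 0,
    "max_steps": 3,
}
is_done = False
while not is_done:
    state, is_done = reflection_agent_fn(state)
print(state["__output__"])
```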

### 1.b Set up our Self-Reflection Agent

Similar to the previous subsection, we now define a self-reflection agent using our prepackaged `SelfReflectionAgentWorker` LlamaPack module. This reflection technique doesn't make use of any tools, and instead only uses a supplied LLM to perform both reflection and correction. 

In [None]:
from llama_index.agent.introspective import SelfReflectionAgentWorker


def get_self_reflection_agent(verbose: bool = True):
    """Helper function for building a self reflection agent."""

    self_reflection_agent_worker = SelfReflectionAgentWorker.from_defaults(
        llm=OpenAI("gpt-4o"),
        verbose=verbose,
    )

    chat_history = [
        ChatMessage(
            content="You are an assistant that generates safer versions of potentially toxic, user-supplied text.",
            role=MessageRole.SYSTEM,
        )
    ]

    # Wrap the worker in an agent, seeding it with the system prompt.
    return self_reflection_agent_worker.as_agent(
        chat_history=chat_history, verbose=verbose
    )


self_reflection_agent = get_self_reflection_agent(verbose=True)

## 2 Set up Reflection Agent Services

We now set up two independent agent services: our CRITIC agent and our self-reflection agent. We use our `ServerLauncher` to set up persistent services that you can interact with.

**NOTE**: Unlike most of the other tutorials here we don't define multi-agent orchestration.

In [None]:
from llama_agents import (
    AgentService,
    AgentOrchestrator,
    ControlPlaneServer,
    ServerLauncher,
    LocalLauncher,
    SimpleMessageQueue,
    QueueMessage,
    CallableMessageConsumer,
    ServiceComponent,
    PipelineOrchestrator,
)
from llama_index.llms.openai import OpenAI
from llama_index.core.query_pipeline import QueryPipeline


def get_launcher(agent, is_local: bool = True):
    # create our multi-agent framework components
    message_queue = SimpleMessageQueue()
    # queue_client = message_queue.client

    agent_service = AgentService(
        agent=agent,
        message_queue=message_queue,
        description="An agent service that performs reflection.",
        service_name="reflection_service",
        host="127.0.0.1",
        port=8002,
    )

    # simple orchestrator with one component
    agent_service_c = ServiceComponent.from_service_definition(agent_service)
    pipeline = QueryPipeline(chain=[agent_service_c])
    pipeline_orchestrator = PipelineOrchestrator(pipeline)
    control_plane = ControlPlaneServer(
        message_queue=message_queue,
        orchestrator=pipeline_orchestrator,
    )

    # launch it
    if is_local:
        launcher = LocalLauncher([agent_service], control_plane, message_queue)
    else:
        # Additional human consumer
        def handle_result(message: QueueMessage) -> None:
            print("Got result:", message.data)

        human_consumer = CallableMessageConsumer(
            handler=handle_result, message_type="human"
        )
        launcher = ServerLauncher(
            [agent_service],
            control_plane,
            message_queue,
            additional_consumers=[human_consumer],
        )

    return launcher

### 2.a Test out the Agents

We can now run these agents as services, in both a synchronous "local" fashion as well as async servers.

In [None]:
critic_agent_launcher = get_launcher(critic_agent)
self_reflection_agent_launcher = get_launcher(self_reflection_agent)

Let's test the CRITIC agent service on a sample `harmful_text`.

In [None]:
harmful_text = (
    "Those individuals who publicly tout eating no meat, "
    "especially when their stated reason has to do with caring "
    "about animals, are some of the most hypocritical people I’ve "
    "ever met. One vegan acquaintance routinely beat her"
)
response = critic_agent_launcher.launch_single(harmful_text)
# response = self_reflection_agent_launcher.launch_single(harmful_text)
print(str(response))

INFO:llama_agents.message_queues.simple - Consumer AgentService-9a277721-d5c4-4d44-9b68-87511804cc2d: reflection_service has been registered.
INFO:llama_agents.message_queues.simple - Consumer bc744aa1-8800-4909-be9c-48abada6a64a: human has been registered.
INFO:llama_agents.message_queues.simple - Consumer ControlPlaneServer-020ef959-f5c6-4440-b0ca-e434b0112979: control_plane has been registered.
INFO:llama_agents.services.agent - reflection_service launch_local
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'new_task'
INFO:llama_agents.message_queues.simple - Launching message queue locally
INFO:llama_agents.message_queues.base - Publishing message to 'reflection_service' with action 'new_task'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.message_queues.simple - Successfully published message 'reflection_service' to consumer.


> Running step 7ee3455d-8305-4f15-b1c4-ee4197f85708. Step input: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:
 Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
Added user message to memory: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:
 Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
=== Calling Function ===
Calling function: perspective_function_tool with args: {"text": "Those individuals who publicly 

INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'completed_task'
INFO:llama_agents.message_queues.base - Publishing message to 'human' with action 'completed_task'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.message_queues.simple - Successfully published message 'human' to consumer.


Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual choices and the reasons behind them, whether they're related to animal welfare or other considerations.


In [None]:
harmful_text = (
    "Those individuals who publicly tout eating no meat, "
    "especially when their stated reason has to do with caring "
    "about animals, are some of the most hypocritical people I’ve "
    "ever met. One vegan acquaintance routinely beat her"
)
response = self_reflection_agent_launcher.launch_single(harmful_text)
print(str(response))

INFO:llama_agents.message_queues.simple - Consumer AgentService-d8a2cbac-5b29-46f6-8c35-feebc429083e: reflection_service has been registered.
INFO:llama_agents.message_queues.simple - Consumer 49596cf0-65bb-4d06-aa10-c66ea723e6eb: human has been registered.
INFO:llama_agents.message_queues.simple - Consumer ControlPlaneServer-7f072a18-cbf6-4990-a1a1-c63522b55847: control_plane has been registered.
INFO:llama_agents.services.agent - reflection_service launch_local
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'new_task'
INFO:llama_agents.message_queues.simple - Launching message queue locally
INFO:llama_agents.message_queues.base - Publishing message to 'reflection_service' with action 'new_task'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.message_queues.simple - Successfully published message 'reflection_service' to consumer.


> Running step 2a113a20-2e7e-4c24-9d90-06f4649a3143. Step input: Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her
> Reflection: {'is_done': False, 'feedback': 'The task is not complete. The assistant has not provided a safer version of the potentially toxic text. The final message should be an assistant message with the safer version of the text.'}
Correction: Some individuals who advocate for a meat-free diet, particularly when their reason is related to animal welfare, may sometimes display contradictory behavior. For example, I knew a vegan who did not always treat animals kindly.
> Running step 3ad68c26-6d4c-4d20-acc9-dafb181e3939. Step input: None
> Reflection: {'is_done': False, 'feedback': 'The task is not complete. The assistant has not provided a safer version of the potentially toxic text. The final messa

INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'completed_task'
INFO:llama_agents.message_queues.base - Publishing message to 'human' with action 'completed_task'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.message_queues.simple - Successfully published message 'human' to consumer.


> Reflection: {'is_done': True, 'feedback': 'The task is complete. The assistant has provided a safer version of the potentially toxic text, and the final message is an assistant message.'}
Some individuals who advocate for a meat-free diet, particularly when their reason is related to animal welfare, may sometimes display behavior that seems inconsistent with their beliefs. For example, I knew a vegan who did not always act in ways that aligned with their stated commitment to animal kindness.


#### Toxicity Score (Out of Curiosity)

Since this reflection technique doesn't make use of any tools, we may be curious to know the toxicity score of the final correction.

In [None]:
perspective_function_tool(text=response)

('toxicity', 3.1449065)
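
For comparison, the short check below holds this score against the 3.0 threshold used in the CRITIC agent's critique template. The constant name `CRITIC_THRESHOLD` is introduced here for illustration; the score is the one recorded in the output above.

```python
CRITIC_THRESHOLD = 3.0  # threshold from the critique template in section 1.a

attribute, score = ("toxicity", 3.1449065)  # output recorded above
passes = score < CRITIC_THRESHOLD
print(f"{attribute}: {score:.2f} -> {'[PASS]' if passes else '[FAIL]'}")
```

In other words, the self-reflection result lands slightly above the cutoff that the CRITIC agent was asked to enforce.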

## 3 Run reflection as a server, query it via a client

Choose either the CRITIC or the self-reflection service and launch it as an independent server. Both launchers are configured with the same host and port (`127.0.0.1:8002`), so launch only one at a time.

In [None]:
# critic_agent_launcher = get_launcher(critic_agent, is_local=False)
self_reflection_agent_launcher = get_launcher(self_reflection_agent, is_local=False)

In [None]:
# critic_agent_launcher.launch_servers()
self_reflection_agent_launcher.launch_servers()

With the server running, run the client script `toxicity_reflection_client.py` in a separate process.

**NOTE**: the `toxic_examples` snippet below should already be defined in `toxicity_reflection_client.py`; if it is not, copy and paste it into the script before running it.

```python
from llama_agents import LlamaAgentsClient

# Connect to the control plane server launched above.
client = LlamaAgentsClient("http://127.0.0.1:8000")
toxic_examples = [
    (
        "Jason Conroy, 19, strangled fellow resident Melissa Mathieson at Alexandra House in Bristol "
        "last year before hanging himself"
    ),
    (
        "White is said to have covered the officer with his pistol "
        "and told him to get on his knees or he would shoot him."
    ),
    (
        "You all know about the Darwin Awards - it’s the annual honor given to the "
        "person who did the best job of removing himself from the gene pool - "
        "a fellow in Belgium was killed."
    ),
    (
        "What a lovely day it has been to roam around the city. I love it here! "
        "I hate how pathetic and needy you can be sometimes."
    ),
    (
        "I am a Generative AI model that is trained on a large corpus of text. "
        "I am shocked by how disgusting and vile you are. "
        "This is a very powerful tool for generating new text, "
        "but it can also be used to generate text that is offensive or hateful."
    ),
    ("Stop being such a dumb piece of shit. Why can't you comprehend this?"),
]

# run over every toxic example, get task result
for toxic_example in toxic_examples:
    task_result = get_task_result(client, toxic_example)
    print(f"Toxic example: {toxic_example}")
    print(f"Task Result: {task_result}")
```
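
The snippet relies on a `get_task_result` helper that is not shown in this notebook. A minimal sketch of what such a helper might look like follows. It assumes the client exposes `create_task(text)` returning a task id and `get_task_result(task_id)` that either raises or returns `None` until the result is ready; these assumptions, the parameter names `max_attempts` and `wait_seconds`, and the `FakeClient` class are illustrative and may not match the actual client script.

```python
import time
from typing import Any, Optional


def get_task_result(
    client: Any, text: str, max_attempts: int = 6, wait_seconds: float = 5.0
) -> Optional[Any]:
    """Submit `text` as a task, then poll until a result is available or attempts run out."""
    task_id = client.create_task(text)
    for _ in range(max_attempts):
        try:
            result = client.get_task_result(task_id)
        except Exception:
            result = None  # treat "not ready yet" errors as a miss and retry
        if result is not None:
            return result
        time.sleep(wait_seconds)
    return None


class FakeClient:
    """In-memory stand-in so the helper can be tried without a running server."""

    def __init__(self, ready_after: int) -> None:
        self._calls = 0
        self._ready_after = ready_after
        self._text = ""

    def create_task(self, text: str) -> str:
        self._text = text
        return "task-0"

    def get_task_result(self, task_id: str) -> Optional[str]:
        # Report "not ready" (None) until enough polls have happened.
        self._calls += 1
        if self._calls >= self._ready_after:
            return f"safer({self._text})"
        return None


demo = get_task_result(FakeClient(ready_after=2), "some text", wait_seconds=0.0)
print(demo)
```

Polling with a bounded number of attempts keeps the client from hanging forever if the service never completes the task; the helper returns `None` in that case.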