# Reflection

Reflection is a design pattern in which an LLM generation is followed by a reflection, which is itself another LLM generation conditioned on the output of the first one. For example, given a task to write code, the first LLM can generate a code snippet, and the second LLM can generate a critique of that snippet.

In the context of AutoGen and agents, reflection can be implemented as a pair of agents, where the first agent generates a message and the second agent generates a response to the message. The two agents continue to interact until they reach a stopping condition, such as a maximum number of iterations or an approval from the second agent.
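Stripped of any agent framework, the generate/critique loop described above can be sketched as a plain function. This is only a minimal illustration: `reflect`, `toy_generate` and `toy_critique` are hypothetical names, and the two toy functions are deterministic stand-ins for the LLM calls.

```python
from typing import Callable


def reflect(
    task: str,
    generate: Callable[[str, list[str]], str],
    critique: Callable[[str, str], tuple[bool, str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Alternate generation and critique until approval or max_rounds reviews."""
    feedback: list[str] = []
    draft = generate(task, feedback)
    for round_no in range(1, max_rounds + 1):
        approved, comment = critique(task, draft)
        if approved:
            return draft, round_no
        # Not approved: remember the critique and produce a new draft.
        feedback.append(comment)
        draft = generate(task, feedback)
    # Stopping condition reached: return the latest (unreviewed) draft.
    return draft, max_rounds


# Hypothetical stand-ins for the two LLM generations.
def toy_generate(task: str, feedback: list[str]) -> str:
    return f"draft v{len(feedback) + 1} for: {task}"


def toy_critique(task: str, draft: str) -> tuple[bool, str]:
    approved = "v2" in draft
    return approved, ("looks good" if approved else "please handle empty input")


final_draft, rounds = reflect("sum even numbers", toy_generate, toy_critique)
print(final_draft, rounds)
```

The two stopping conditions mentioned above map to the early `return` on approval and to the `max_rounds` bound on the loop.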



The following set of messages, defined in the next code cell, makes up the protocol for our example reflection design pattern:

- The application sends a `CodeWritingTask` message to the coder agent

- The coder agent generates a `CodeReviewTask` message, which is sent to the reviewer agent

- The reviewer agent generates a `CodeReviewResult` message, which is sent back to the coder agent

- If the `CodeReviewResult` message approves the code, the coder agent sends a `CodeWritingResult` message back to the application; otherwise, it sends another `CodeReviewTask` message to the reviewer agent, and the process continues.
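To make the ordering of messages concrete, the sketch below traces which message type follows which, given a sequence of review outcomes. It uses message type names as plain strings rather than the real message classes; `Review` and `trace_protocol` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical stand-in for the outcome carried by a CodeReviewResult."""

    approved: bool


def trace_protocol(reviews: list[Review]) -> list[str]:
    """Return the sequence of message types exchanged for the given reviews."""
    # The application starts the session; the coder replies with a first draft.
    flow = ["CodeWritingTask", "CodeReviewTask"]
    for review in reviews:
        flow.append("CodeReviewResult")
        if review.approved:
            # Approval ends the loop with a result for the application.
            flow.append("CodeWritingResult")
            break
        # Rejection sends a revised draft back to the reviewer.
        flow.append("CodeReviewTask")
    return flow


print(trace_protocol([Review(approved=False), Review(approved=True)]))
```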



In [2]:
from pydantic import BaseModel


class CodeWritingTask(BaseModel):
    task: str


class CodeWritingResult(BaseModel):
    task: str
    code: str
    review: str


class CodeReviewTask(BaseModel):
    session_id: str
    code_writing_task: str
    code_writing_scratchpad: str
    code: str


class CodeReviewResult(BaseModel):
    review: str
    session_id: str
    approved: bool

### completion model

In [3]:
import os
import getpass


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("AZURE_OPENAI_API_KEY")
_set_env("AZURE_OPENAI_ENDPOINT")
_set_env("AZURE_OPENAI_API_VERSION")
_set_env("AZURE_OPENAI_DEPLOYMENT_NAME")
_set_env("AZURE_OPENAI_MODEL_NAME")

In [4]:
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

model_client: AzureOpenAIChatCompletionClient = AzureOpenAIChatCompletionClient(
    model=os.getenv("AZURE_OPENAI_MODEL_NAME"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
)

In [5]:
from autogen_core.models import UserMessage

llm_result = await model_client.create(
    messages=[
        UserMessage(content="Hi", source="user"),
    ],
    cancellation_token=None,
)
response = llm_result.content
print(response)

Hello! How can I assist you today?


### logging

We enable logging to see messages between agents.

In [6]:
import logging


logging.basicConfig(level=logging.WARNING)
logging.getLogger("autogen_core").setLevel(logging.DEBUG)
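
The log lines later in this notebook come from child loggers such as `autogen_core.events`. As a small sketch using only the standard `logging` module, the snippet below checks that a child logger inherits the `DEBUG` level set on `autogen_core`; it assumes the library does not set its own explicit level on the child logger.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logging.getLogger("autogen_core").setLevel(logging.DEBUG)

# Child loggers without an explicit level inherit from their nearest configured parent.
events_logger = logging.getLogger("autogen_core.events")
print(events_logger.getEffectiveLevel() == logging.DEBUG)
print(events_logger.isEnabledFor(logging.INFO))
```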

### agents

In [7]:
from enum import StrEnum, auto


class TopicType(StrEnum):
    DEFAULT = auto()


class AgentType(StrEnum):
    CODER = auto()
    REVIEWER = auto()

In [8]:
coder_system_prompt: str = """You are a proficient Python coder. You write code to solve problems.
Work with the reviewer to improve your code.
Always put all finished code in a single Markdown code block.
For example:
```python
def hello_world():
    print("Hello, World!")
```

Respond using the following format:

Thoughts: <Your comments>
Code: <Your code>
"""

reviewer_system_prompt: str = """You are a Python code reviewer. You focus on correctness, syntax, efficiency and safety of the code.
Respond using the following JSON format:
{
    "correctness": "<Your comments>",
    "efficiency": "<Your comments>",
    "safety": "<Your comments>",
    "approval": "<APPROVE or REVISE>",
    "suggested_changes": "<Your comments>"
}

Be meticulous, but if the result is correct, approve it after a few turns.
"""

reviewer_user_message: str = """The problem statement is: {code_writing_task}
The code is:
```
{code}
```

Previous feedback:
{previous_feedback}

Please review the code. If previous feedback was provided, see if it was addressed.
"""

In [9]:
from autogen_core import (
    RoutedAgent,
    default_subscription,
    message_handler,
    TopicId,
    MessageContext,
)
from autogen_core.models import (
    ChatCompletionClient,
    LLMMessage,
    SystemMessage,
    UserMessage,
    AssistantMessage,
)

import json
import re
import uuid
from typing import Union


@default_subscription
class CoderAgent(RoutedAgent):
    """An agent that performs code writing tasks."""

    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A code writing agent.")
        self._system_messages: list[LLMMessage] = [
            SystemMessage(
                content=coder_system_prompt,
            )
        ]
        self._model_client = model_client
        self._session_memory: dict[
            str, list[CodeWritingTask | CodeReviewTask | CodeReviewResult]
        ] = {}

    @message_handler
    async def handle_code_writing_task(
        self, message: CodeWritingTask, ctx: MessageContext
    ) -> None:
        """
        Handle new code writing task: generates a first code draft and publishes a code-review task.

        Stores tasks in temporary memory: we might need the history to fix the generated code after receiving a review.
        """

        # generate response
        response = await self._model_client.create(
            self._system_messages
            + [UserMessage(content=message.task, source=self.metadata["type"])],
            cancellation_token=ctx.cancellation_token,
        )
        assert isinstance(response.content, str)

        # extract the code from the response and create a code review task;
        # store the messages for this session (original CodeWritingTask + associated CodeReviewTask) in temporary memory
        session_id = str(uuid.uuid4())
        self._session_memory.setdefault(session_id, []).append(message)
        code_block = self._extract_code_block(response.content)
        if code_block is None:
            raise ValueError("Code block not found.")
        code_review_task = CodeReviewTask(
            session_id=session_id,
            code_writing_task=message.task,
            code_writing_scratchpad=response.content,
            code=code_block,
        )
        self._session_memory[session_id].append(code_review_task)

        # publish code-review task
        await self.publish_message(
            code_review_task, topic_id=TopicId(TopicType.DEFAULT, self.id.key)
        )

    @message_handler
    async def handle_code_review_result(
        self, message: CodeReviewResult, ctx: MessageContext
    ) -> None:
        """
        Handle code review result: if approved, publish a CodeWritingResult; otherwise, generate a new code draft and
        request a new review by publishing a new CodeReviewTask.
        """

        # store the incoming code-review result in the session memory
        self._session_memory[message.session_id].append(message)

        # get last code-review task that generated the incoming code-review result;
        # we need this to get the code that went under review, in case this is approved
        review_request = next(
            (
                m
                for m in reversed(self._session_memory[message.session_id])
                if isinstance(m, CodeReviewTask)
            ),
            None,  # default, so a missing task trips the assert instead of raising StopIteration
        )
        assert review_request is not None

        # if the code is approved, publish a CodeWritingResult
        if message.approved:
            await self.publish_message(
                CodeWritingResult(
                    code=review_request.code,
                    task=review_request.code_writing_task,
                    review=message.review,
                ),
                topic_id=TopicId(TopicType.DEFAULT, self.id.key),
            )
            print("Code Writing Result:")
            print("-" * 80)
            print(f"Task:\n{review_request.code_writing_task}")
            print("-" * 80)
            print(f"Code:\n{review_request.code}")
            print("-" * 80)
            print(f"Review:\n{message.review}")
            print("-" * 80)
            # the session is complete: do not fall through and generate another draft
            return

        # otherwise, rebuild the history of writings and reviews from temporary memory and generate new code with the model
        messages: list[LLMMessage] = [*self._system_messages]
        for m in self._session_memory[message.session_id]:
            if isinstance(m, CodeReviewResult):
                messages.append(
                    UserMessage(content=m.review, source=AgentType.REVIEWER)
                )
            elif isinstance(m, CodeReviewTask):
                messages.append(
                    AssistantMessage(
                        content=m.code_writing_scratchpad, source=AgentType.CODER
                    )
                )
            elif isinstance(m, CodeWritingTask):
                messages.append(UserMessage(content=m.task, source="User"))
            else:
                raise ValueError(f"Unexpected message type: {m}")
        response = await self._model_client.create(
            messages, cancellation_token=ctx.cancellation_token
        )
        assert isinstance(response.content, str)

        # extract newly generated code block and create a new code review task;
        # again store the messages in memory in case the newly generated code needs further review
        code_block = self._extract_code_block(response.content)
        if code_block is None:
            raise ValueError("Code block not found.")
        code_review_task = CodeReviewTask(
            session_id=message.session_id,
            code_writing_task=review_request.code_writing_task,
            code_writing_scratchpad=response.content,
            code=code_block,
        )
        self._session_memory[message.session_id].append(code_review_task)

        # publish a new code review task
        await self.publish_message(
            code_review_task, topic_id=TopicId(TopicType.DEFAULT, self.id.key)
        )

    def _extract_code_block(self, markdown_text: str) -> Union[str, None]:
        """
        Extract code block from markdown text.
        """

        pattern = r"```(\w+)\n(.*?)\n```"
        match = re.search(pattern, markdown_text, re.DOTALL)

        if match:
            return match.group(2)
        return None

In [10]:
@default_subscription
class ReviewerAgent(RoutedAgent):
    """An agent that performs code review tasks."""

    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A code reviewer agent.")
        self._system_messages: list[LLMMessage] = [
            SystemMessage(
                content=reviewer_system_prompt,
            )
        ]
        self._session_memory: dict[str, list[CodeReviewTask | CodeReviewResult]] = {}
        self._model_client = model_client

    @message_handler
    async def handle_code_review_task(
        self, message: CodeReviewTask, ctx: MessageContext
    ) -> None:

        # construct prompt for code review by LLM; look for previous
        # feedback in temporary memory for this coding session, if any
        previous_feedback = ""
        if message.session_id in self._session_memory:
            previous_review = next(
                (
                    m
                    for m in reversed(self._session_memory[message.session_id])
                    if isinstance(m, CodeReviewResult)
                ),
                None,
            )
            if previous_review is not None:
                previous_feedback = previous_review.review

        # store in temporary memory
        self._session_memory.setdefault(message.session_id, []).append(message)
        prompt: str = reviewer_user_message.format(
            code_writing_task=message.code_writing_task,
            code=message.code,
            previous_feedback=previous_feedback,
        )

        # generate response
        response = await self._model_client.create(
            self._system_messages
            + [UserMessage(content=prompt, source=self.metadata["type"])],
            cancellation_token=ctx.cancellation_token,
            json_output=True,
        )
        assert isinstance(response.content, str)

        # parse json response
        review = json.loads(response.content)

        # construct review result and store in temporary memory
        review_text = "Code review:\n" + "\n".join(
            [f"{k}: {v}" for k, v in review.items()]
        )
        approved = review["approval"].lower().strip() == "approve"
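        # force approval once this session has received more than 5 review requests,
        # so that the coder/reviewer loop always terminates
        review_requests = sum(
            isinstance(m, CodeReviewTask)
            for m in self._session_memory[message.session_id]
        )
        if review_requests > 5:
            approved = True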
        result = CodeReviewResult(
            review=review_text,
            session_id=message.session_id,
            approved=approved,
        )
        self._session_memory[message.session_id].append(result)

        # publish result
        await self.publish_message(
            result, topic_id=TopicId(TopicType.DEFAULT, self.id.key)
        )
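
The parsing step in the reviewer can be tried in isolation. The sketch below mirrors how the JSON reply is flattened into a review text and an approval flag; `parse_review` is a hypothetical helper, and the `.get` fallback for a missing `approval` key is an addition not present in the agent above.

```python
import json


def parse_review(raw: str) -> tuple[str, bool]:
    """Turn the reviewer's JSON reply into (review_text, approved)."""
    review = json.loads(raw)
    review_text = "Code review:\n" + "\n".join(f"{k}: {v}" for k, v in review.items())
    # Assumption: treat a missing "approval" key as "not approved" instead of failing.
    approved = str(review.get("approval", "")).lower().strip() == "approve"
    return review_text, approved


sample = json.dumps(
    {
        "correctness": "Handles empty lists.",
        "approval": " APPROVE ",
        "suggested_changes": "None.",
    }
)
text, approved = parse_review(sample)
print(approved)
print(text)
```

Normalising with `lower()` and `strip()` means small formatting variations in the model's answer, such as surrounding whitespace or different casing, still count as approval.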

### runtime

In [11]:
from autogen_core import DefaultTopicId, SingleThreadedAgentRuntime

runtime = SingleThreadedAgentRuntime()
await ReviewerAgent.register(
    runtime,
    AgentType.REVIEWER,
    lambda: ReviewerAgent(model_client=model_client),
)
await CoderAgent.register(
    runtime,
    AgentType.CODER,
    lambda: CoderAgent(model_client=model_client),
)


runtime.start()
await runtime.publish_message(
    message=CodeWritingTask(
        task="Write a function to find the sum of all even numbers in a list."
    ),
    topic_id=DefaultTopicId(),
)

# Keep processing messages until idle.
await runtime.stop_when_idle()

INFO:autogen_core:Publishing message of type CodeWritingTask to all subscribers: {'task': 'Write a function to find the sum of all even numbers in a list.'}
INFO:autogen_core.events:{"payload": "{\"task\":\"Write a function to find the sum of all even numbers in a list.\"}", "sender": null, "receiver": "default/default", "kind": "MessageKind.PUBLISH", "delivery_stage": "DeliveryStage.SEND", "type": "Message"}
INFO:autogen_core:Calling message handler for reviewer with message type CodeWritingTask published by Unknown
INFO:autogen_core.events:{"payload": "{\"task\":\"Write a function to find the sum of all even numbers in a list.\"}", "sender": null, "receiver": null, "kind": "MessageKind.PUBLISH", "delivery_stage": "DeliveryStage.DELIVER", "type": "Message"}
INFO:autogen_core:Calling message handler for coder with message type CodeWritingTask published by Unknown
INFO:autogen_core.events:{"payload": "{\"task\":\"Write a function to find the sum of all even numbers in a list.\"}", "send

CancelledError: 

INFO:autogen_core.events:{"type": "LLMCall", "messages": [{"content": "You are a Python code reviewer. You focus on correctness, syntax, efficiency and safety of the code.\nRespond using the following JSON format:\n{\n    \"correctness\": \"<Your comments>\",\n    \"efficiency\": \"<Your comments>\",\n    \"safety\": \"<Your comments>\",\n    \"approval\": \"<APPROVE or REVISE>\",\n    \"suggested_changes\": \"<Your comments>\"\n}\n\nBe meticuluos but if the result is correct, approve it after a few turns.\n", "role": "system"}, {"content": "The problem statement is: Write a function to find the sum of all even numbers in a list.\nThe code is:\n```\nfrom typing import List, Union\n\ndef sum_of_even_numbers(numbers: List[Union[int, float]]) -> float:\n    if not isinstance(numbers, list):\n        raise ValueError(\"Input must be a list.\")\n        \n    for num in numbers:\n        if not isinstance(num, (int, float)):\n            raise ValueError(f\"Element '{num}' is not a numeric 