# Self-reflection pattern

The self-reflection pattern uses at least two prompts: one that performs the task itself and one that reflects on the previous response. Both prompts are sent to the same LLM, in separate calls.
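The control flow can be sketched without any framework. The snippet below is a minimal illustration, not the implementation used later in this notebook: `LLM`, `TASK_PROMPT`, `REFLECTION_PROMPT`, `self_reflect`, and `echo_llm` are all hypothetical names introduced here, and the toy model only exists to make the two calls visible.

```python
from typing import Callable

# Hypothetical stand-in for any chat model: takes a prompt, returns text.
LLM = Callable[[str], str]

# Illustrative prompts, not the ones used later in this notebook.
TASK_PROMPT = "Solve the task.\n\nTask: {query}"
REFLECTION_PROMPT = (
    "Review the answer below and return a corrected version.\n\n"
    "Task: {query}\n\nAnswer: {draft}"
)


def self_reflect(llm: LLM, query: str, rounds: int = 1) -> str:
    """Run the task prompt once, then let the same model revise the draft `rounds` times."""
    draft = llm(TASK_PROMPT.format(query=query))
    for _ in range(rounds):
        draft = llm(REFLECTION_PROMPT.format(query=query, draft=draft))
    return draft


def echo_llm(prompt: str) -> str:
    """Toy model: reports which of the two prompts it received."""
    return "reviewed" if prompt.startswith("Review") else "draft"


print(self_reflect(echo_llm, "Reverse a string"))
```

With a real model, `llm` would wrap a chat call; the important point is that the task call and the reflection call are separate invocations of the same model.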

In this example, we use the LangChain API with a locally hosted Ollama model.

## Setup
Host an Ollama instance, either by running the Docker container or by running it as a service, following the instructions in Ollama's guide.
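Before running the cells below, it can help to confirm that the server answers. The following is a small sketch using only the standard library; `ollama_is_reachable` is a hypothetical helper, and the default URL assumes Ollama's usual port 11434 on the local machine, so adjust it to wherever your instance is hosted.

```python
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on the server root succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers malformed URLs such as a missing scheme.
        return False
```

Calling `ollama_is_reachable("http://<your-host>:11434")` should return `True` once the instance is up. A `False` result usually points to a wrong address, a closed port, or a service that has not started yet.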

In [None]:
import os
# Change root directory
os.chdir("../../src/")


In [2]:
from agent_design_pattern.agent import AgentMessage, LLMChain
from agent_design_pattern.orchestration import ReflectionAgent
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
import re



In [3]:
class CasualOllamaChain(LLMChain):
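    def __init__(self, model, system_prompt, user_prompt_template="{query}", **kwargs):
        super().__init__()
        # Accept either a model name (build a new ChatOllama) or an existing chat model instance (reuse it).
        self.llm = ChatOllama(model=model, reasoning=False, **kwargs) if isinstance(model, str) else model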
        self.prompt = ChatPromptTemplate([
            ("system", system_prompt),
            ("human", user_prompt_template)
        ])
        self.chain = self.prompt | self.llm | StrOutputParser()

    def invoke(self, message: AgentMessage, **kwargs) -> AgentMessage:
        response = self.chain.invoke(message.to_dict(), **kwargs)
        # Snippet to remove thinking
        if "</think>" in response and "<think>" not in response:
            response = "<think>" + response
        response = re.sub(r'<think>.*?</think>', '', response, flags=re.DOTALL)
        print(response)
        message.response = response

        message.execution_result = "success"
        return message
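The thinking-removal step inside `invoke` can be exercised on its own. The sketch below extracts the same logic into a hypothetical helper, `strip_thinking`; the trailing `.strip()` is an addition that is not in the class above, included only to drop the whitespace left behind by the removed block.

```python
import re


def strip_thinking(text: str) -> str:
    """Remove <think>...</think> blocks, including one whose opening tag is missing."""
    # Some models emit only the closing tag; restore the opening tag so the regex matches.
    if "</think>" in text and "<think>" not in text:
        text = "<think>" + text
    # Non-greedy match with DOTALL so multi-line reasoning blocks are removed too.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


print(strip_thinking("<think>plan the loop</think>def f(): pass"))
print(strip_thinking("plan\nmore planning</think>\nanswer"))
```

The missing-opening-tag branch matters because everything before `</think>` is then treated as reasoning and discarded, which is the behavior the class relies on.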

In [4]:
system_prompt_task = """/no_think You are a helpful coding assistant.
Your task is to write a Python function and return the implementation of the function.
Some requirements:
- The logic is clear and easy to understand.
- The function arguments and return values (if any) should be typed.
- If the function is too long (for example greater than 80 lines), split the logic into multiple smaller functions.
- All functions should have a docstring. The docstring should include a simple example that illustrates the function and how to call it.
- The response should contain only the function(s) with docstrings. DO NOT include any explanation outside of the code.
"""
user_prompt_task = "{query}"
task_chain = CasualOllamaChain("qwen3:4b-thinking-2507-q4_K_M", system_prompt_task, user_prompt_task, base_url="http://192.168.55.1:11434")

system_prompt_reflection = """/no_think You are an excellent code reviewer and refactorer.
Given a function implementation and its explanation, your task is to review the code and correct it if it contains any mistakes.
Some notes:
- For the implementation, check whether the suggested implementation matches the original query.
- Check whether the code contains any syntax errors.
- For the explanation, verify that the docstring follows the Google docstring style.
- In the docstring, make sure there is an example of how to call the function.

Make sure the final output contains only the full function code, inline code comments, and docstring, nothing else."""
user_prompt_reflection = "Input query: {query}\n\nFunction implementation: {context_response}"
# Reuse the task chain's LLM instance so that the task and the self-reflection run on the same model.
# No base_url is needed here: connection arguments are only used when a model name (str) is passed.
reflection_chain = CasualOllamaChain(task_chain.llm, system_prompt_reflection, user_prompt_reflection)

def state_callback(state: str):
    print(f"agent state: {state}")
reflection_agent = ReflectionAgent(task_chain, reflection_chain, state_change_callback=state_callback)
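To make the reflection prompt concrete, the sketch below fills the `{query}` and `{context_response}` placeholders by hand. It uses plain `str.format` as a stand-in for the f-string templating that `ChatPromptTemplate` applies by default, and the `payload` dictionary is hypothetical: it assumes `AgentMessage.to_dict()` exposes keys with these names, which the template implies but this notebook does not show.

```python
user_prompt_reflection = "Input query: {query}\n\nFunction implementation: {context_response}"

# Hypothetical payload; in the notebook the values come from AgentMessage.to_dict().
payload = {
    "query": "Write a function that adds two integers.",
    "context_response": "def add(a: int, b: int) -> int:\n    return a + b",
}

# Each placeholder is replaced by the value stored under the same key.
filled = user_prompt_reflection.format(**payload)
print(filled)
```

One consequence of this templating style is that literal braces in a prompt (for example a dict literal inside an instruction) must be doubled as `{{` and `}}`; otherwise they are read as placeholders.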

In [5]:
# Use a LeetCode problem as an example. Source: https://leetcode.com/problems/palindrome-partitioning-ii/description/
query = """Write python function(s) to solve the following problem:
Given a string s, partition s such that every substring of the partition is a palindrome.
Return the minimum cuts needed for a palindrome partitioning of s.

Example 1:
Input: s = "aab"
Output: 1
Explanation: The palindrome partitioning ["aa","b"] could be produced using 1 cut.

Example 2:
Input: s = "a"
Output: 0

Example 3:
Input: s = "ab"
Output: 1

Constraints:
1 <= s.length <= 2000
s consists of lowercase English letters only."""

final_message = reflection_agent.execute(AgentMessage(query=query))
print(final_message)
print("final response")
print(final_message.response)

agent state: running




def min_palindrome_cuts(s: str) -> int:
    """
    Return the minimum cuts needed for a palindrome partitioning of the string s.

    The function partitions the string into substrings such that each substring is a palindrome.
    The minimum number of cuts is returned (e.g., for "aab", the partition ["aa", "b"] requires 1 cut).

    Args:
        s (str): Input string of length between 1 and 2000.

    Returns:
        int: Minimum number of cuts needed.

    Example:
        >>> min_palindrome_cuts("aab")
        1
        >>> min_palindrome_cuts("a")
        0
        >>> min_palindrome_cuts("ab")
        1
    """
    n = len(s)
    is_pal = [[False] * n for _ in range(n)]
    
    for i in range(n-1, -1, -1):
        for j in range(i, n):
            if i == j:
                is_pal[i][j] = True
            elif j == i + 1:
                is_pal[i][j] = (s[i] == s[j])
            else:
                is_pal[i][j] = (s[i] == s[j] and is_pal[i+1][j-1])
    
    dp = [float('in