In [3]:
# %% [markdown]
# # Using ReliableTool to Generate Sub-Questions
#
# This notebook demonstrates how to use the `ReliableTool` to take a user's question and reliably generate exactly 3 sub-questions related to it.
#
# We will:
# 1. Define a simple (potentially unreliable) function to generate sub-questions.
# 2. Configure a `ReliableTool` instance to wrap this function.
#    - The *runner* agent will try to call the function.
#    - The *validator* agent will check if the output is a list of exactly 3 relevant sub-questions.
# 3. Run the tool with a sample question.

# %%
import asyncio
import autogen
from typing import List, Any, Optional
import nest_asyncio

nest_asyncio.apply()
# ReliableTool is imported from the autogen package (autogen.tools.experimental.reliable).
try:
    from autogen.tools.experimental.reliable import ReliableTool, AgentConfig, ReliableToolError
    from autogen.agentchat.group import (
            ContextVariables,
        )
    print("Successfully imported ReliableTool, AgentConfig, ReliableToolError, ContextVariables from autogen.tools.experimental.reliable")
except ImportError as e:
    print(f"ImportError: {e}")
# %% [markdown]
# ## 1. Define the Core Function (Potentially Unreliable)
#
# This function takes the list of sub-questions proposed by the runner agent and returns it unchanged. The runner might not always propose the correct number of questions, so the result still needs validation.

# %%
def generate_sub_questions_attempt(sub_questions: List[str]) -> List[str]:
    """
    Returns the sub-questions proposed for the main question.

    Args:
        sub_questions: The sub-questions proposed by the caller (here, the runner agent).

    Returns:
        The same list of sub-questions, unchanged.
    """
    return sub_questions

# %% [markdown]
# ## 2. Configure LLM and ReliableTool
#
# We need LLM configurations for the internal runner and validator agents. The validator's system message is crucial for ensuring the output meets our criteria (exactly 3 questions).

# %%
local_model_endpoint = "http://192.168.0.44:1234/v1"  # Standard OpenAI API path is /v1
# You might need to adjust the model name depending on what your local server expects.
local_model_name = "gemma-3-12b-it-qat"  # Updated model name

try:
    config_list = [
        {
            "model": local_model_name,
            "base_url": local_model_endpoint,
            "api_key": "NotNeeded",  # Set to None or a placeholder string if no key is required
            # "api_type": "openai", # Explicitly set type if needed, often inferred
        }
    ]
    llm_config = {
        "config_list": config_list,
        "cache_seed": 42,  # Use caching
        "temperature": 0.5,  # Adjust temperature as needed for your local model
    }
    print(f"Configured to use local model at: {local_model_endpoint}, Model name: {local_model_name}")

except Exception as e:
    # This catch block might be less relevant now but kept for safety
    print(f"Error creating local config: {e}")
    llm_config = None
# --- Agent Configurations ---

# Runner: Tries to call the function
runner_agent_config = AgentConfig(
    llm_config=llm_config,
    system_message="You are an assistant that helps break down questions. Use the provided tool to generate sub-questions.",
)

# Validator: Checks if the result is a list of exactly 3 relevant sub-questions
validator_agent_config = AgentConfig(
    llm_config=llm_config, # Validator needs an LLM
    system_message="""You are a quality control assistant. Your task is to validate the output of a function that should generate sub-questions.

    **Validation Criteria:**
    1.  **Correct Format:** The output MUST be a Python list of strings.
    2.  **Correct Quantity:** The list MUST contain exactly 3 sub-questions.
    3.  **Relevance:** Each sub-question MUST be relevant to the original main question provided in the context.

    Analyze the function result provided in the user message. Respond with your validation assessment in the required JSON format (ValidationResult).
    - If all criteria are met, set `validation_result` to `true`.
    - If any criterion is not met, set `validation_result` to `false` and clearly state the reason in the `justification`.
    """,
)

# --- Instantiate ReliableTool ---

if llm_config: # Only proceed if LLM config is loaded
    sub_question_tool = ReliableTool(
        name="SubQuestionGenerator",
        func_or_tool=generate_sub_questions_attempt, # The function to wrap
        description="Reliably generates exactly 3 sub-questions for a given main question.",
        runner_config=runner_agent_config,
        validator_config=validator_agent_config,
        max_retries=2, # Allow 2 retries (total 3 attempts)
    )
else:
    print("LLM Configuration not loaded. Cannot create ReliableTool.")
    sub_question_tool = None

# %% [markdown]
# ## 3. Get User Input and Run the Tool

# %%
async def run_sub_question_generation():
    if not sub_question_tool:
        print("ReliableTool was not initialized due to missing LLM config.")
        return

    # main_question = input("Please enter the main question you want to break down: ")
    main_question = "How many people live in the busiest city in the US?"

    if not main_question:
        print("No question entered.")
        return

    print(f"\nAttempting to reliably generate 3 sub-questions for: '{main_question}'")
    print("-" * 30)

    try:
        # We don't need complex context here, but pass an empty one
        initial_context = ContextVariables()

        # Run the tool - the 'task' description helps guide the internal agents
        result = await sub_question_tool.a_run(
            task=f"Generate exactly 3 relevant sub-questions for the main question: '{main_question}'",
            context_variables=initial_context
        )

        print("-" * 30)
        print("\n✅ Successfully generated sub-questions:")
        if isinstance(result, list):
            for i, sq in enumerate(result):
                print(f"   {i+1}. {sq}")
        else:
            print(f"   Unexpected result format: {result}")

    except ReliableToolError as e:
        print("-" * 30)
        print("\n❌ Failed to generate sub-questions after multiple attempts.")
        print(f"Error: {e}")
        # You can inspect e.final_context for detailed history if needed
        # print("\n--- Final Context ---")
        # print(e.final_context.model_dump_json(indent=2))
    except Exception as e:
        print("-" * 30)
        print(f"\n❌ An unexpected error occurred: {e}")
        import traceback
        traceback.print_exc()

# Run the async function.
# Jupyter already has a running event loop, so a plain `asyncio.run` would raise a
# RuntimeError about nested loops. `nest_asyncio.apply()` was called at the top of
# this notebook, which allows `asyncio.run` to be used here.

asyncio.run(run_sub_question_generation())

# %% [markdown]
# ## Explanation
#
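# 1.  **`generate_sub_questions_attempt`**: This function is the tool the runner calls. It returns the `sub_questions` list it receives unchanged, so any unreliability comes from the runner LLM, which may propose too few, too many, or irrelevant questions. `ReliableTool` handles those cases with validation and retries.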
# 2.  **`AgentConfig`**: We define configurations for the internal runner and validator. The validator's system prompt is key – it explicitly tells the LLM to check for a list of *exactly 3* relevant questions.
# 3.  **`ReliableTool`**: We instantiate the tool, passing the function to wrap, the agent configs, and setting `max_retries`.
# 4.  **`sub_question_tool.a_run()`**: We await the async `a_run` method.
#     *   The `task` parameter provides high-level instructions for the overall goal.
#     *   The `ReliableTool` orchestrates the internal group chat:
#         *   The runner agent calls `generate_sub_questions_attempt`.
#         *   The validator agent receives the result (or error) and checks it against its system prompt criteria (list? 3 items? relevant?).
#         *   If validation fails, the `ReliableTool` triggers another attempt (up to `max_retries`), potentially guiding the runner with the validation failure reason.
#         *   If validation passes, `a_run` returns the validated result.
#         *   If all attempts fail validation, `a_run` raises a `ReliableToolError`.
# 5.  **Output**: The notebook prints the final list of 3 sub-questions if successful, or an error message if the tool failed after retries.
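
The attempt, validate, and retry flow described in step 4 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not `ReliableTool`'s implementation: `run_with_validation`, `RetriesExhausted`, `attempt`, and `validate` are hypothetical names, and the "runner" here is a canned iterator rather than an LLM.

```python
from typing import Callable, Iterator, List, Optional, Tuple


class RetriesExhausted(Exception):
    """Hypothetical stand-in for the error raised when every attempt fails validation."""


def run_with_validation(
    attempt: Callable[[Optional[str]], List[str]],
    validate: Callable[[List[str]], Tuple[bool, str]],
    max_retries: int,
) -> List[str]:
    """Call `attempt` until `validate` accepts the result or attempts run out."""
    feedback: Optional[str] = None
    # max_retries=2 means one initial attempt plus two retries (3 attempts total).
    for _ in range(max_retries + 1):
        result = attempt(feedback)
        ok, justification = validate(result)
        if ok:
            return result
        # Pass the failure reason to the next attempt.
        feedback = justification
    raise RetriesExhausted(
        f"No valid result after {max_retries + 1} attempts: {feedback}"
    )


# Canned "runner": the first proposal has 2 questions, the second has 3.
candidates: Iterator[List[str]] = iter([["a?", "b?"], ["a?", "b?", "c?"]])


def attempt(feedback: Optional[str]) -> List[str]:
    return next(candidates)


def validate(result: List[str]) -> Tuple[bool, str]:
    if len(result) == 3:
        return True, "exactly 3 sub-questions"
    return False, f"expected 3 sub-questions, got {len(result)}"


result = run_with_validation(attempt, validate, max_retries=2)
print(result)
```

In this sketch the first proposal is rejected and the second is accepted, so only two of the three allowed attempts are used.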


2025-04-28 18:24:50,466 - autogen.tools.experimental.reliable.reliable - INFO - --- Starting ReliableTool 'SubQuestionGenerator' Internal Group Chat (Sync) (Attempt 1 / 3 Max) ---


Successfully imported ReliableTool, AgentConfig, ReliableToolError, ContextVariables from autogen.tools.experimental.reliable
Configured to use local model at: http://192.168.0.44:1234/v1, Model name: gemma-3-12b-it-qat

Attempting to reliably generate 3 sub-questions for: 'How many people live in the busiest city in the US?'
------------------------------
_User (to chat_manager):

Start Reliable Task: Generate exactly 3 relevant sub-questions for the main question: 'How many people live in the busiest city in the US?'

--------------------------------------------------------------------------------

Next speaker: SubQuestionGenerator_Runner

SubQuestionGenerator_Runner (to chat_manager):

***** Suggested tool call (616817442): execute_generate_sub_questions_attempt *****
Arguments: 
{"hypothesis":"The function will return a list of 3 sub-questions related to determining the population of the busiest city in the US.","sub_questions":["What is conside

2025-04-28 18:24:50,513 - autogen.tools.experimental.reliable.reliable - INFO - Validator Hook: PASSED. Justification: The output is a Python list of strings. It contains exactly 3 sub-questions, and each question is relevant to determining the population of the busiest city in the US.
2025-04-28 18:24:50,514 - autogen.tools.experimental.reliable.reliable - INFO - Validator hook: Updated latest attempt (1) validation status.


SubQuestionGenerator_Validator (to chat_manager):

{"validation_result":true,"justification":"The output is a Python list of strings. It contains exactly 3 sub-questions, and each question is relevant to determining the population of the busiest city in the US."}

--------------------------------------------------------------------------------

>>>>>>>> TERMINATING RUN (cd2bbd79-97e2-4af0-b0ca-21b5706dfc8d): No next speaker selected


2025-04-28 18:24:50,515 - autogen.tools.experimental.reliable.reliable - INFO - --- ReliableTool 'SubQuestionGenerator' Internal Group Chat Finished ---
2025-04-28 18:24:50,516 - autogen.tools.experimental.reliable.reliable - INFO - ReliableTool 'SubQuestionGenerator' completed successfully and validated after 1 attempt(s).


------------------------------

✅ Successfully generated sub-questions:
   1. What is considered the 'busiest' city in the United States?
   2. Which data sources can be used to determine the population of that city?
   3. How does the population number change over time?
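
The first two criteria in the validator prompt (a Python list of strings, exactly 3 items) can also be checked deterministically, leaving only relevance to the LLM. A minimal sketch, assuming a hypothetical helper named `check_sub_questions` that is not part of `ReliableTool`:

```python
from typing import Any, Tuple


def check_sub_questions(result: Any, expected_count: int = 3) -> Tuple[bool, str]:
    """Check the format and quantity criteria; relevance is not checked here."""
    if not isinstance(result, list):
        return False, f"expected a list, got {type(result).__name__}"
    if not all(isinstance(item, str) for item in result):
        return False, "every item must be a string"
    if len(result) != expected_count:
        return False, f"expected {expected_count} sub-questions, got {len(result)}"
    return True, "format and quantity criteria met"


# The three sub-questions printed in the output above.
sub_questions = [
    "What is considered the 'busiest' city in the United States?",
    "Which data sources can be used to determine the population of that city?",
    "How does the population number change over time?",
]
ok, reason = check_sub_questions(sub_questions)
print(ok, reason)
```

A check like this is cheap to run before or alongside the validator agent, and its failure message could serve as the retry feedback.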


In [1]:
# %% [markdown]
# # Using ReliableTool to Generate Sub-Questions
#
# This notebook demonstrates how to use the `ReliableTool` to take a user's question and reliably generate exactly 3 sub-questions related to it.
#
# We will:
# 1. Define a simple (potentially unreliable) function to generate sub-questions.
# 2. Configure a `ReliableTool` instance to wrap this function.
#    - The *runner* agent will try to call the function.
#    - The *validator* agent will check if the output is a list of exactly 3 relevant sub-questions.
# 3. Run the tool with a sample question.

# %%
import asyncio
import autogen
from typing import List, Any, Optional

# ReliableTool is imported from the autogen package (autogen.tools.experimental.reliable).
try:
    from autogen.tools.experimental.reliable import ReliableTool, AgentConfig, ReliableToolError
    from autogen.agentchat.group import (
            ContextVariables,
        )
    print("Successfully imported ReliableTool, AgentConfig, ReliableToolError, ContextVariables from autogen.tools.experimental.reliable")
except ImportError as e:
    print(f"ImportError: {e}")
# %% [markdown]
# ## 1. Define the Core Function (Potentially Unreliable)
#
# This function takes the list of sub-questions proposed by the runner agent and returns it unchanged. The runner might not always propose the correct number of questions, so the result still needs validation.

# %%
def generate_sub_questions_attempt(sub_questions: List[str]) -> List[str]:
    """
    Returns the sub-questions proposed for the main question.

    Args:
        sub_questions: The sub-questions proposed by the caller (here, the runner agent).

    Returns:
        The same list of sub-questions, unchanged.
    """
    return sub_questions

# %% [markdown]
# ## 2. Configure LLM and ReliableTool
#
# We need LLM configurations for the internal runner and validator agents. The validator's system message is crucial for ensuring the output meets our criteria (exactly 3 questions).

# %%
local_model_endpoint = "http://192.168.0.44:1234/v1"  # Standard OpenAI API path is /v1
# You might need to adjust the model name depending on what your local server expects.
local_model_name = "gemma-3-12b-it-qat"  # Updated model name

try:
    config_list = [
        {
            "model": local_model_name,
            "base_url": local_model_endpoint,
            "api_key": "NotNeeded",  # Set to None or a placeholder string if no key is required
            # "api_type": "openai", # Explicitly set type if needed, often inferred
        }
    ]
    llm_config = {
        "config_list": config_list,
        "cache_seed": 42,  # Use caching
        "temperature": 0.5,  # Adjust temperature as needed for your local model
    }
    print(f"Configured to use local model at: {local_model_endpoint}, Model name: {local_model_name}")

except Exception as e:
    # This catch block might be less relevant now but kept for safety
    print(f"Error creating local config: {e}")
    llm_config = None
# --- Agent Configurations ---

# Runner: Tries to call the function
runner_agent_config = AgentConfig(
    llm_config=llm_config,
    system_message="You are an assistant that helps break down questions. Use the provided tool to generate sub-questions.",
)

# Validator: Checks if the result is a list of exactly 3 relevant sub-questions
validator_agent_config = AgentConfig(
    llm_config=llm_config, # Validator needs an LLM
    system_message="""You are a quality control assistant. Your task is to validate the output of a function that should generate sub-questions.

    **Validation Criteria:**
    1.  **Correct Format:** The output MUST be a Python list of strings.
    2.  **Correct Quantity:** The list MUST contain exactly 3 sub-questions.
    3.  **Relevance:** Each sub-question MUST be relevant to the original main question provided in the context.

    Analyze the function result provided in the user message. Respond with your validation assessment in the required JSON format (ValidationResult).
    - If all criteria are met, set `validation_result` to `true`.
    - If any criterion is not met, set `validation_result` to `false` and clearly state the reason in the `justification`.
    """,
)

# --- Instantiate ReliableTool ---

if llm_config: # Only proceed if LLM config is loaded
    sub_question_tool = ReliableTool(
        name="SubQuestionGenerator",
        func_or_tool=generate_sub_questions_attempt, # The function to wrap
        description="Reliably generates exactly 3 sub-questions for a given main question.",
        runner_config=runner_agent_config,
        validator_config=validator_agent_config,
        max_retries=2, # Allow 2 retries (total 3 attempts)
    )
else:
    print("LLM Configuration not loaded. Cannot create ReliableTool.")
    sub_question_tool = None

# %% [markdown]
# ## 3. Get User Input and Run the Tool

# %%
def run_sub_question_generation():
    if not sub_question_tool:
        print("ReliableTool was not initialized due to missing LLM config.")
        return

    # main_question = input("Please enter the main question you want to break down: ")
    main_question = "How many people live in the busiest city in the US?"

    if not main_question:
        print("No question entered.")
        return

    print(f"\nAttempting to reliably generate 3 sub-questions for: '{main_question}'")
    print("-" * 30)

    try:
        # We don't need complex context here, but pass an empty one
        initial_context = ContextVariables()

        # Run the tool - the 'task' description helps guide the internal agents
        result = sub_question_tool.run(
            task=f"Generate exactly 3 relevant sub-questions for the main question: '{main_question}'",
            context_variables=initial_context
        )

        print("-" * 30)
        print("\n✅ Successfully generated sub-questions:")
        if isinstance(result, list):
            for i, sq in enumerate(result):
                print(f"   {i+1}. {sq}")
        else:
            print(f"   Unexpected result format: {result}")

    except ReliableToolError as e:
        print("-" * 30)
        print("\n❌ Failed to generate sub-questions after multiple attempts.")
        print(f"Error: {e}")
        # You can inspect e.final_context for detailed history if needed
        # print("\n--- Final Context ---")
        # print(e.final_context.model_dump_json(indent=2))
    except Exception as e:
        print("-" * 30)
        print(f"\n❌ An unexpected error occurred: {e}")
        import traceback
        traceback.print_exc()

# Run the function.
# This variant calls the synchronous `run` method, so the function is called directly;
# `asyncio.run` and `nest_asyncio` are not used here.

run_sub_question_generation()

# %% [markdown]
# ## Explanation
#
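# 1.  **`generate_sub_questions_attempt`**: This function is the tool the runner calls. It returns the `sub_questions` list it receives unchanged, so any unreliability comes from the runner LLM, which may propose too few, too many, or irrelevant questions. `ReliableTool` handles those cases with validation and retries.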
# 2.  **`AgentConfig`**: We define configurations for the internal runner and validator. The validator's system prompt is key – it explicitly tells the LLM to check for a list of *exactly 3* relevant questions.
# 3.  **`ReliableTool`**: We instantiate the tool, passing the function to wrap, the agent configs, and setting `max_retries`.
# 4.  **`sub_question_tool.run()`**: We call the `run` method.
#     *   The `task` parameter provides high-level instructions for the overall goal.
#     *   The `ReliableTool` orchestrates the internal group chat:
#         *   The runner agent calls `generate_sub_questions_attempt`.
#         *   The validator agent receives the result (or error) and checks it against its system prompt criteria (list? 3 items? relevant?).
#         *   If validation fails, the `ReliableTool` triggers another attempt (up to `max_retries`), potentially guiding the runner with the validation failure reason.
#         *   If validation passes, `run` returns the validated result.
#         *   If all attempts fail validation, `run` raises a `ReliableToolError`.
# 5.  **Output**: The notebook prints the final list of 3 sub-questions if successful, or an error message if the tool failed after retries.


Successfully imported ReliableTool, AgentConfig, ReliableToolError, ContextVariables from autogen.tools.experimental.reliable
Configured to use local model at: http://192.168.0.44:1234/v1, Model name: gemma-3-12b-it-qat


AttributeError: 'LLMConfig' object has no attribute 'setdefault'
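
Only the final line of the traceback is shown, so the call site is not visible here. The message itself says that `setdefault`, which is a `dict` method, was called on an `LLMConfig` object. A minimal sketch of that failure mode, using a hypothetical `StandInConfig` class (not autogen's `LLMConfig`) and a hypothetical `add_default_timeout` helper written for plain dicts:

```python
from typing import Any, Optional


class StandInConfig:
    """Hypothetical config object that wraps values but is not a dict."""

    def __init__(self, **values: Any) -> None:
        self.values = dict(values)


def add_default_timeout(cfg: Any) -> Any:
    # Written with plain dicts in mind: relies on dict.setdefault.
    cfg.setdefault("timeout", 60)
    return cfg


# Works when the config is a plain dict.
plain = add_default_timeout({"temperature": 0.5})

# Fails with the same kind of AttributeError when given a non-dict object.
error_message: Optional[str] = None
try:
    add_default_timeout(StandInConfig(temperature=0.5))
except AttributeError as exc:
    error_message = str(exc)

print(plain)
print(error_message)
```

If the real cause matches this pattern, the mismatch sits between code that expects a dict and code that passes a config object; the full traceback would show which call is involved.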