## Task Generator

In this notebook, we add support for a task generator for the user simulation study in Tool Sandbox. The study is split into two components:
- Natural Language Task Generator: takes the current randomized state of the environment and all available tools, and prepares a prompt for the model to generate a natural language task.
- User Trajectory Generator: takes a natural language task and tries to complete it in the tool sandbox. Most importantly, we strip the "thinking" outputs from this class's output, so an ideal trajectory is a sequence of tool calls and system messages to the user.
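
The stripping step for the trajectory generator can be sketched as below. This is only a minimal illustration: the `Step` class, its `kind` labels, and `strip_thinking` are hypothetical names, not part of Tool Sandbox, and the real generator may represent messages differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One entry in a raw trajectory (hypothetical structure)."""

    kind: str  # assumed labels: "thinking", "tool_call", "system_message"
    content: str


def strip_thinking(steps: list[Step]) -> list[Step]:
    """Keep only tool calls and system messages, preserving their order."""
    return [step for step in steps if step.kind != "thinking"]


raw = [
    Step("thinking", "I should add the contact before messaging."),
    Step("tool_call", "add_contact(name='Alex')"),
    Step("system_message", "Contact Alex was added."),
]
trajectory = strip_thinking(raw)
print([step.kind for step in trajectory])
```

Filtering on an explicit label keeps the order of the remaining steps intact, which matters if the trajectory is later compared against an expected sequence of tool calls.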


Additionally, we want to evaluate the *feasibility* of the task generator and the *accuracy* of the trajectory generator. Feasibility can be measured as *how many models are able to complete the task*. Accuracy of the trajectory generator can be judged using milestones and minefields. These should also be generated automatically, but then we need a way to judge their accuracy as well. LLM-as-a-judge should be doable, but that part is still unresolved.
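
One rough way to turn these two measures into code is sketched below. It rests on several assumptions: feasibility is normalized to the fraction of models that complete the task, milestones are treated as tool names that must appear in order, and minefields as tool names that must never appear. The function names and the tool names other than `add_contact` are hypothetical, and Tool Sandbox's own milestone matching is richer than this.

```python
from typing import Mapping, Sequence


def feasibility(completed_by_model: Mapping[str, bool]) -> float:
    """Fraction of models that completed the task (assumed normalization)."""
    if not completed_by_model:
        raise ValueError("need at least one model result")
    return sum(completed_by_model.values()) / len(completed_by_model)


def passes_checks(
    tool_calls: Sequence[str],
    milestones: Sequence[str],
    minefields: Sequence[str],
) -> bool:
    """True if every milestone occurs in order and no minefield occurs."""
    if any(call in minefields for call in tool_calls):
        return False
    remaining = iter(tool_calls)
    # `in` on an iterator consumes it up to the match, which enforces ordering.
    return all(milestone in remaining for milestone in milestones)


# Hypothetical results for one generated task.
results = {"model_a": True, "model_b": False, "model_c": True, "model_d": True}
calls = ["add_contact", "send_message"]
milestones = ["add_contact", "send_message"]
minefields = ["remove_contact"]

print(feasibility(results))
print(passes_checks(calls, milestones, minefields))
```

Reporting a fraction rather than a raw count makes scores comparable across runs that use different numbers of models.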

In [None]:
%load_ext autoreload
%autoreload 2


In [None]:
from tool_sandbox.common.execution_context import DatabaseNamespace, RoleType, get_current_context

from tool_sandbox.common.message_conversion import Message
from tool_sandbox.common.tool_conversion import convert_to_openai_tool, convert_to_openai_tools, convert_python_function_to_openai_function


In [None]:
current_context = get_current_context()

sandbox_database = current_context.get_database(DatabaseNamespace.SANDBOX, get_all_history_snapshots=True, drop_sandbox_message_index=False)


In [None]:
print(sandbox_database)


In [None]:
available_tools = current_context.get_available_tools(scrambling_allowed=True)
print(available_tools)


In [None]:
convert_to_openai_tool(available_tools["add_contact"])


In [None]:
convert_python_function_to_openai_function("add_contact", available_tools["add_contact"])


In [None]:
# Get all tools and render their documentation as natural-language strings
from tool_sandbox.common.tool_discovery import get_all_tools, ToolBackend
from tool_sandbox.common.tool_conversion import convert_python_function_to_nl_tool_string, get_tool_docs_natural_language

# TODO: Write a utility function to convert openai_tools to natural language strings for use with LLM.

all_tools = get_all_tools(ToolBackend.DEFAULT)

tool_docs = get_tool_docs_natural_language(all_tools)
print(tool_docs)




In [None]:
from tool_sandbox.common.tool_discovery import get_tool_categories_info



In [None]:
get_tool_categories_info()


In [None]:
from dotenv import load_dotenv
import random
from tool_sandbox.common.execution_context import ExecutionContext
from tool_sandbox.scenarios.openai_task_generator import OpenAITaskGenerator
from tool_sandbox.common.tool_discovery import ToolBackend

load_dotenv()

execution_context = ExecutionContext()


In [None]:
# Create OpenAI task generator
task_generator = OpenAITaskGenerator(
    model_name="o3-mini",
    execution_context=execution_context,
    max_retries=3,
    populate_sample_data=False  # Skipping for MVP
)


In [None]:
generated_tasks = []
tool_categories = ["contact", "messaging", "reminder", "setting", "utilities"]

# Generate 10 tasks, each restricted to a random subset of at least 3 tool categories
for _ in range(10):
    num_categories = random.randint(3, len(tool_categories))
    allowed_categories = random.sample(tool_categories, num_categories)
    generated_task = task_generator.generate_task(
        allowed_tool_categories=allowed_categories,
        preferred_tool_backend=ToolBackend.DEFAULT,
    )
    generated_tasks.append(generated_task)


In [None]:
# dump the generated tasks to a jsonl file
import json
from attrs import asdict
filename = "../data/o3_mini_generated_tasks.jsonl"
with open(filename, 'w', encoding='utf-8') as f:
    for task in generated_tasks:
        # Convert the @define object to a dictionary
        task_dict = asdict(task)
        # Write each task as a JSON line
        f.write(json.dumps(task_dict) + '\n')


In [None]:
import csv
import json  # needed by json.loads below; keeps this cell runnable on its own


def jsonl_to_csv(jsonl_file: str, csv_file: str) -> None:
    """Convert a JSONL file of generated tasks to CSV format.

    Args:
        jsonl_file: Path to the input JSONL file
        csv_file: Path to the output CSV file
    """
    # Load tasks from JSONL
    tasks = []
    with open(jsonl_file, 'r', encoding='utf-8') as f:
        for line in f:
            if line.strip():
                task_dict = json.loads(line.strip())
                tasks.append(task_dict)

    if not tasks:
        print(f"No tasks found in {jsonl_file}")
        return

    # Write to CSV (excluding task_id)
    fieldnames = ['description', 'required_tools', 'tools_category']

    with open(csv_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()

        for task in tasks:
            # Convert lists to semicolon-separated strings for CSV compatibility
            csv_row = {
                'description': task['description'],
                'required_tools': '; '.join(task['required_tools']),
                'tools_category': '; '.join(task['tools_category'])
            }
            writer.writerow(csv_row)

    print(f"✅ Converted {len(tasks)} tasks from {jsonl_file} to {csv_file}")


In [None]:
jsonl_to_csv("../data/gpt_4o_generated_tasks.jsonl", "../data/gpt_4o_generated_tasks.csv")
