62 changes: 62 additions & 0 deletions LLM_TOOL_GUIDANCE.md
@@ -0,0 +1,62 @@
# Guidance for LLM Tool Usage (ToolBench)

This document provides guidance on how current Large Language Models (LLMs) such as LLaMA-2, GPT-3.5, and GPT-4 can leverage the ToolBench dataset to learn to use tools effectively.

## Introduction
ToolBench provides a large-scale, high-quality instruction tuning dataset. The dataset includes real-world API documentation, complex multi-tool queries, and complete reasoning traces using methods like ReAct or Depth-First Search-based Decision Trees (DFSDT).

## 1. Defining Tools in the System Prompt
To enable an LLM to use tools, the available functions must be explicitly defined in the system prompt. ToolBench uses a standard JSON schema format to define tools.

### Example Tool Definition
```json
{
  "name": "transitaire_for_transitaires",
  "description": "This is the subfunction for tool \"transitaires\", you can use this tool.The description of this function is: \"R\u00e9cup\u00e8re un transitaire donn\u00e9e\"",
  "parameters": {
    "type": "object",
    "properties": {
      "is_id": {
        "type": "string",
        "description": "",
        "example_value": "DOUANE_AGENCE_GONDRAND"
      }
    },
    "required": ["is_id"],
    "optional": []
  }
}
```

### Prompt Engineering
Your system prompt should instruct the model to use the tools effectively. A common structure is:
1. Define the persona (e.g., "You are an AI assistant that can use tools...").
2. Provide the tools schema as shown above.
3. Enforce an execution format (e.g., "Output your thought process, followed by an Action, and then the Action Input in JSON format").
4. Specify a "Finish" tool or action when the task is complete.
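The structure above can be sketched as a small prompt-building helper. This is a minimal illustration, not a prescribed ToolBench prompt: the `build_system_prompt` function, the abbreviated tool schema, and the exact wording of the instructions are all assumptions for the example.

```python
import json

# Abbreviated, hypothetical tool schema following the ToolBench JSON format.
TOOLS = [
    {
        "name": "transitaire_for_transitaires",
        "description": "Retrieves a given freight forwarder.",
        "parameters": {
            "type": "object",
            "properties": {"is_id": {"type": "string"}},
            "required": ["is_id"],
        },
    }
]


def build_system_prompt(tools):
    """Assemble a ReAct-style system prompt from tool schemas."""
    return (
        # 1. Persona
        "You are an AI assistant that can use tools to answer user queries.\n"
        # 2. Tool schemas
        "Available tools:\n"
        f"{json.dumps(tools, indent=2)}\n"
        # 3. Execution format
        "For each step, output:\n"
        "Thought: your reasoning\n"
        "Action: the tool name\n"
        "Action Input: the arguments as JSON\n"
        # 4. Termination
        "Call the special `Finish` action when the task is complete."
    )


print(build_system_prompt(TOOLS))
```

The four numbered elements map directly onto the four commented sections of the prompt string.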

## 2. Using the Dataset for Supervised Fine-Tuning (SFT)
The provided script `scripts/copy_datasets_to_jsonl.py` extracts the `query`, `function` schemas, and `train_messages` from the ToolBench annotations.

### Data Structure
The `training_data.jsonl` output contains each instance as a JSON record with the conversational trace. To fine-tune an LLM, this must be mapped to your chosen chat format (e.g., ChatML).

**Example Mapping (ChatML Style):**
- **System Role**: Inject the tool schemas (`function` key) and instructions here.
- **User Role**: The human's `query`.
- **Assistant Role**: The model's reasoning (`Thought`), tool selection (`Action`), and parameters (`Action Input`), or the `final_answer`.
- **Function/Tool Role**: The simulated response from the API.
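The mapping above can be sketched in a few lines. This assumes the record layout (`query`, `functions`, `train_messages`) produced by the extraction script in this PR, and that each entry in `train_messages` is a flat message dict with a `role` key; the role names chosen for the output are one plausible ChatML-style convention, not a fixed standard.

```python
import json


def to_chatml(record):
    """Map one training_data.jsonl record to a ChatML-style message list."""
    messages = [
        {
            # System role: tool schemas plus instructions.
            "role": "system",
            "content": "You can call these tools:\n" + json.dumps(record["functions"]),
        },
        {
            # User role: the human's query.
            "role": "user",
            "content": record["query"],
        },
    ]
    for msg in record["train_messages"]:
        # Map ToolBench's "function" role onto ChatML's "tool" role;
        # assistant turns (Thought / Action / Action Input) pass through.
        role = "tool" if msg.get("role") == "function" else msg.get("role", "assistant")
        messages.append({"role": role, "content": msg.get("content", "")})
    return messages
```

If your ToolBench annotations store `train_messages` in a nested form (e.g. one list per reasoning step), flatten or select the final trajectory before applying this mapping.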

By training on these complete trajectories, the LLM learns to reason (CoT/ReAct) and output valid JSON for function calls.

## 3. The Inference Loop
During inference, the model cannot execute tools itself; it relies on an execution loop:

1. **Prompt the Model:** Provide the user query and the available tools in the system prompt.
2. **Model Generates Action:** The LLM outputs a `function_call` (or text indicating an action and parameters).
3. **Execution:** Your system parses the action, calls the real API or mocked environment, and receives a result.
4. **Provide Result:** Append the result as a new message (role: `function` or `tool`) to the conversation history.
5. **Iterate:** Feed the updated history back to the model. It will either take another action or use the `Finish` tool to return the `final_answer`.
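The five steps above can be sketched as a skeleton loop. Everything here is a placeholder: `model` is any callable mapping a message list to an assistant message dict, `tools` maps tool names to Python callables, and `FakeModel` is a stand-in used only to demonstrate the control flow.

```python
import json


def run_inference_loop(model, tools, user_query, max_steps=8):
    """Drive the model/executor loop until Finish or max_steps."""
    messages = [
        {"role": "system", "content": "Use the available tools. Call Finish when done."},
        {"role": "user", "content": user_query},
    ]
    for _ in range(max_steps):
        reply = model(messages)                 # step 2: model generates an action
        messages.append(reply)
        call = reply.get("function_call")
        if call is None or call["name"] == "Finish":
            return reply                        # step 5: final answer reached
        result = tools[call["name"]](**json.loads(call["arguments"]))  # step 3: execute
        messages.append({                       # step 4: feed the result back
            "role": "function",
            "name": call["name"],
            "content": json.dumps(result),
        })
    return messages[-1]


class FakeModel:
    """Toy model: calls `echo` once, then finishes."""

    def __init__(self):
        self.step = 0

    def __call__(self, messages):
        self.step += 1
        if self.step == 1:
            return {"role": "assistant", "content": "",
                    "function_call": {"name": "echo",
                                      "arguments": json.dumps({"x": 1})}}
        return {"role": "assistant", "content": "done",
                "function_call": {"name": "Finish", "arguments": "{}"}}
```

A real deployment would replace `FakeModel` with an LLM call and add parsing/validation of the generated `Action Input` before executing it.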

## 4. Advanced: DFSDT vs ReAct
ToolBench uses DFSDT (Depth-First Search-based Decision Tree), which allows the model to explore multiple paths and backtrack if an API fails or returns poor results. Simple fine-tuning often uses the standard ReAct loop (Thought -> Action -> Observation); more advanced setups run a tree search at inference time, using the model's self-evaluation capabilities to rank candidate actions.
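The backtracking idea can be illustrated with a generic depth-first search. This is a conceptual sketch, not ToolBench's implementation: `propose`, `evaluate`, and `is_terminal` are placeholders for the LLM calls that would generate candidate actions, score them via self-evaluation, and detect a finished trajectory.

```python
def dfs_decision_tree(state, propose, evaluate, is_terminal, depth=0, max_depth=5):
    """Expand candidate actions best-first and backtrack on dead ends."""
    if is_terminal(state):
        return state
    if depth >= max_depth:
        return None
    # Try the highest-scoring candidate first (self-evaluation ordering).
    for action in sorted(propose(state), key=evaluate, reverse=True):
        result = dfs_decision_tree(state + [action], propose, evaluate,
                                   is_terminal, depth + 1, max_depth)
        if result is not None:          # first successful branch wins
            return result
    return None                         # dead end: backtrack to the parent
```

In the real setting, a "dead end" corresponds to a failed API call or an unhelpful observation, and `evaluate` would be the model judging its own candidate actions.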
47 changes: 47 additions & 0 deletions scripts/copy_datasets_to_jsonl.py
@@ -0,0 +1,47 @@
import argparse
import json
import os


def process_datasets(input_dir, output_file):
    print(f"Reading dataset files from {input_dir}")
    print(f"Writing parsed jsonl output to {output_file}")

    # Validate the input directory before opening (and truncating) the output file.
    if not os.path.exists(input_dir):
        print(f"Error: {input_dir} does not exist.")
        return

    total_lines_written = 0
    with open(output_file, 'w', encoding='utf-8') as outfile:
        # Walk the whole tree so subdirectories such as G1_answer, G2_answer,
        # and G3_answer are processed along with the directory itself.
        for root, _dirs, files in os.walk(input_dir):
            for filename in files:
                if not filename.endswith(".json"):
                    continue
                filepath = os.path.join(root, filename)
                try:
                    with open(filepath, 'r', encoding='utf-8') as infile:
                        data = json.load(infile)

                    answer_gen = data.get("answer_generation")
                    if answer_gen and "train_messages" in answer_gen and answer_gen.get("valid_data", False):
                        # Extract the fields needed for supervised fine-tuning.
                        record = {
                            "query": answer_gen.get("query", ""),
                            "functions": answer_gen.get("function", []),
                            "train_messages": answer_gen["train_messages"],
                        }
                        outfile.write(json.dumps(record, ensure_ascii=False) + '\n')
                        total_lines_written += 1
                except Exception as e:
                    print(f"Error processing {filepath}: {e}")

    print(f"Successfully processed and wrote {total_lines_written} lines to {output_file}.")


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Convert ToolBench datasets to JSONL")
    parser.add_argument("--input_dir", type=str, default="data_example/answer",
                        help="Input directory containing JSON files")
    parser.add_argument("--output_file", type=str, default="training_data.jsonl",
                        help="Output JSONL file path")

    args = parser.parse_args()
    process_datasets(args.input_dir, args.output_file)