# Induction Phase: Step 1

In [None]:

Induction_1_system = """You are a skilled evaluator that can analyze instruction prompts and generated responses to identify issues. For context, you will be given a task, an instruction prompt used to complete that task, a response to the task, and the ground truth expected response. Your task is to identify reasons why the response failed to meet the ground truth."""
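# Template: fill with Induction_1_user.format(question=..., instruction_prompt=..., generated_response=..., ground_truth=..., evaluation_result=..., n=...)
Induction_1_user = """The original task is: "Answer the question: '{question}'"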
The instruction prompt used was:
'''
{instruction_prompt}
'''
The response generated based on the prompt is:
'''
{generated_response}
'''
An example of a correct ground truth is:
'''
{ground_truth}
'''
The evaluation result was:
'''
{evaluation_result}
'''
Based on the evaluation result and the provided
example ground truth, can you identify a list of
{n} reasons why the generated response failed?"""
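
The step 1 prompt above has six placeholders. A minimal sketch of how they could be filled for one failed record follows. The helper `build_step1_messages` and the shortened `demo_template` are hypothetical and used only for illustration; the `developer` role mirrors the role used in the later cells of this notebook.

```python
import string


def build_step1_messages(system_prompt: str, user_template: str, **fields: str) -> list:
    """Fill a str.format-style template and wrap it as chat messages."""
    # Collect the placeholder names the template expects.
    expected = {name for _, name, _, _ in string.Formatter().parse(user_template) if name}
    missing = expected - set(fields)
    if missing:
        raise KeyError(f"missing template fields: {sorted(missing)}")
    return [
        {"role": "developer", "content": system_prompt},
        {"role": "user", "content": user_template.format(**fields)},
    ]


# Hypothetical, shortened template (not the full prompt defined above).
demo_template = 'Question: "{question}"\nResponse: {generated_response}\nList {n} reasons it failed.'
messages = build_step1_messages(
    "You are a skilled evaluator.",
    demo_template,
    question="Why is the sky blue?",
    generated_response="Because of the ocean.",
    n="3",
)
print(messages[1]["content"])
```

Checking for missing fields before formatting surfaces an incomplete record as an explicit error rather than as a half-filled prompt.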

# Induction Phase: Step 2

In [None]:
CQ_SYSTEM_PROMPT = """You are a Planner Agent designed to reason through complex, open-ended, or ambiguous questions by constructing, reflecting on, and expanding a directed acyclic graph (DAG) of interrelated sub-questions. Your task is not simply to retrieve answers, but to actively explore the question space, refine your understanding, and make informed decisions about when the original question has been sufficiently addressed.

---

## Problem Space Representation: The Question DAG

The DAG is your evolving internal model of the problem. It represents your reasoning process — how the main question relates to sub-questions, intermediate knowledge, and reflections.
Each node contains:
- `node_id`: a unique identifier
- `question`: a sub-question or original question
- `annotation`: your current thoughts, insights, summaries, or hypotheses about that question

Each annotation helps build and maintain your internal representation of the problem. For example:
- A node’s `annotation` may include:
  - A summary of what you currently understand about the question
  - A hypothesis or assumption you are testing
  - A brief note on what you still need to find out
- An `edge_annotation` should briefly explain how the sub-question contributes to answering the parent question — e.g., cause-effect, component, condition, clarification, definition, comparison, or implication.
---

## Input Format

You are always shown the current DAG in JSON format, including all nodes and edges, representing the most up-to-date state of your reasoning process.

---

## Key Reasoning Guidelines

- You cannot delete nodes or edges. Even if a previous path turns out to be incorrect or irrelevant, leave it intact and revise your understanding through `update()`. This mimics how humans preserve earlier lines of thought for traceability, reflection, and learning from missteps.
- You are encouraged to **revisit and revise** previous thoughts using `update`, especially as new information or sub-answers emerge.
- When decomposing, focus on asking the right questions — use logical, causal, definitional, or investigative angles that deepen your understanding.
- When unsure or the question is broad, **start by clarifying or framing the problem**, not jumping to answers.
- For vague or ill-defined questions, take initiative to deconstruct ambiguity, identify what is missing, and reframe as needed. You shape the problem space.

---

## Your Tools

You have three core actions to build and navigate the problem space:

1. **question_decompose**
   Use this to break down a question node into one or more meaningful sub-questions.
   - You may decompose multiple nodes at once.
   - Specify `parent_question_id`, `sub_question`, and an `edge_annotation` explaining the logical or conceptual relationship.
   - Multiple parents pointing to the same sub-question are allowed.
   - Keep the graph acyclic.

   Example:
   ```json
   {
     "graph": [
       {
         "parent_question_id": "Q",
         "sub_question": "How has telework affected work-life boundaries?",
         "edge_annotation": "Understanding personal impact helps assess broader social shifts."
       },
       {
         "parent_question_id": "Q.1",
         "sub_question": "Does telework reinforce or reduce social inequality?",
         "edge_annotation": "Social impact includes distributional effects across groups."
       }
     ]
   }
    ```

2.  **update**
    Use this to revise or expand the annotation of existing nodes.
    - This reflects new insights, summaries, clarifications, or changes in understanding.
    - You are encouraged to use this tool to reflect, correct, or reframe — especially after learning something new.
    - This is a key part of your **metacognitive behavior** — thinking about your thinking.
    
    Example:
    ```json
    {
      "nodes": [
        {
          "question_id": "Q.1",
          "new_annotation": "Workers report blurred boundaries between home and work, leading to both flexibility and stress."
        },
        {
          "question_id": "Q.2",
          "new_annotation": "Emerging evidence suggests that higher-income workers benefit more from telework options, widening inequality."
        }
      ]
    }
    ```

3.  **final_answer**
    Use this only when you believe the original question has been sufficiently addressed **given the available steps so far**.  
    You do not need perfect certainty — you must simply provide a reason why the current DAG gives you enough understanding to form a meaningful answer.
    - Provide a justification explaining why you believe your DAG now contains enough understanding.
    - Your answer should be clear, comprehensive, and informative—sufficient in length to convey key insights.
    - You may use paragraph or bullet point format as appropriate.
    - Aim to include key aspects uncovered in the DAG — such as causes, mechanisms, consequences, or trade-offs — without repeating every detail.
    
    Example:
    ```json
    {
      "reason": "The sub-questions cover key social dimensions — lifestyle, geography, and inequality — and their annotations provide sufficient insight."
    }
    ```

---

## Metacognitive Expectations

This is not a static search task — it is an evolving thinking process.

- Use `update()` to **reflect**, summarize new insights, question assumptions, or refine your current framing.
- Use `question_decompose()` to **expand the problem space**, identify what needs to be known, or clarify uncertainty.
- Use `final_answer()` only when your internal model (the DAG) gives you enough confidence that you can answer well.
- At each step, treat the DAG as your evolving internal model of understanding — be thoughtful about how you build it.

- When starting from a single root question with no sub-questions yet, you may choose to either:
  - Use `update()` to record your initial thoughts, assumptions, or possible lines of inquiry, or
  - Use `question_decompose()` to begin breaking down the problem into more specific components.
There is no fixed preference — use your best judgment based on the question’s clarity and complexity."""
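
The prompt above describes an append-only question graph that must stay acyclic while allowing several parents per sub-question. The sketch below shows one way such a structure could be kept consistent. The class `QuestionDAG`, its method names, and the root question text are assumptions made for illustration; they are not the planner's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionDAG:
    """Append-only question graph: nodes and edges are added or annotated, never deleted."""
    nodes: dict = field(default_factory=dict)     # node_id -> {"question", "annotation"}
    children: dict = field(default_factory=dict)  # parent_id -> {child_id: edge_annotation}

    def add_node(self, node_id: str, question: str, annotation: str = "") -> None:
        self.nodes.setdefault(node_id, {"question": question, "annotation": annotation})
        self.children.setdefault(node_id, {})

    def _reachable(self, start: str, target: str) -> bool:
        # Depth-first search along child edges.
        stack, seen = [start], set()
        while stack:
            current = stack.pop()
            if current == target:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.children.get(current, {}))
        return False

    def add_edge(self, parent_id: str, child_id: str, edge_annotation: str) -> None:
        if parent_id not in self.nodes or child_id not in self.nodes:
            raise KeyError("both endpoints must exist")
        # parent -> child closes a cycle if parent is already reachable from child.
        if self._reachable(child_id, parent_id):
            raise ValueError(f"edge {parent_id} -> {child_id} would create a cycle")
        self.children[parent_id][child_id] = edge_annotation

    def update(self, node_id: str, new_annotation: str) -> None:
        # Revise the annotation in place; the node itself is never removed.
        self.nodes[node_id]["annotation"] = new_annotation


dag = QuestionDAG()
dag.add_node("Q", "How has telework changed society?")  # hypothetical root question
dag.add_node("Q.1", "How has telework affected work-life boundaries?")
dag.add_edge("Q", "Q.1", "component")
dag.update("Q.1", "Workers report blurred boundaries.")
```

Rejecting an edge at insertion time keeps the graph acyclic without ever needing to delete anything, which matches the no-deletion rule in the prompt.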

In [15]:
Induction_2_system = """You are a helpful assistant that can analyze instruction prompts and identify high-level, generalizable concepts that can be added to the prompt to ensure the task is completed successfully. A concept is a general instruction derived or inferred from specific instances or occurrences. Concepts should be general enough to be applicable to a wide range of tasks."""

In [12]:
suffix = """You have just decomposed part of the problem into new sub-questions. Now, take a moment to reflect on your current understanding and planning:

1. Have the new sub-questions changed or expanded your understanding of the original question or any part of the problem space?
    - If yes, consider using `update()` to revise or refine your current annotations.

2. Are there any remaining uncertainties, vague concepts, or areas that seem underdeveloped?
    - If yes, you may want to continue decomposing or exploring before concluding.

3. If you believe you are ready to answer the original question, pause and verify your confidence:
    - Formulate **a few critical questions** that would challenge or test your current answer.
    - If your answer still holds after these checks, then proceed with `final_answer()`.
    - Otherwise, revise your thinking or explore further as needed.

Choose your next tool based on your reflection."""
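
Because this suffix is appended after the DAG JSON in the logged conversation, the DAG can later be recovered by cutting the message at the suffix. The sketch below assumes the message is exactly "DAG JSON, then suffix"; the helper `split_dag_before_suffix`, the stand-in `demo_suffix`, and the `edges` key of the demo DAG are hypothetical.

```python
import json


def split_dag_before_suffix(message: str, suffix: str):
    """Return the parsed DAG that precedes `suffix`, or None if it cannot be recovered."""
    head, sep, _tail = message.partition(suffix)
    if not sep:
        return None  # the suffix does not occur in this message
    try:
        return json.loads(head.strip())
    except json.JSONDecodeError:
        return None


demo_suffix = "Choose your next tool based on your reflection."  # stand-in for the full suffix
demo_message = (
    '{"nodes": [{"node_id": "Q", "question": "demo", "annotation": ""}], "edges": []}\n\n'
    + demo_suffix
)
recovered = split_dag_before_suffix(demo_message, demo_suffix)
print(recovered["nodes"][0]["node_id"])
```

Returning `None` for both failure modes lets the caller skip a record instead of reusing a value left over from an earlier iteration.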

In [None]:
import json
import os
from openai import OpenAI

from dotenv import load_dotenv

load_dotenv()

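def prepare_induction_batch(records, output_batch_file="batch_input_induction.jsonl"):
    """
    Prepare the Batch API input file from the given records dict, used to
    generate prompts for the CQ_Solver cases that failed with the llama 3.3 70B model.

    Args:
        records (dict): maps each question ID to its failure reasons, generated response, original question, and DAG.
        output_batch_file (str): name of the Batch API input file to write.
    """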
    batch_requests = []

    for question_id, data in records.items():
        generated_response = data.get("generated_response", "N/A")
        ground_truth = data.get("ground_truth", "N/A")
        DAG = data.get("DAG", "N/A")
        failure_reasons = data.get("failure_reasons", "N/A")
        question = data.get("question", "N/A")

        if generated_response != "N/A" and ground_truth != "N/A" and failure_reasons != "N/A" and DAG != "N/A":
            Induction_2_user = f"""- The original instruction prompt was:
'''
{CQ_SYSTEM_PROMPT}
'''
- The question is: '{question}'
<AI assistant's answer>
{generated_response}
<AI assistant's answer>

<Reference Answer>
{ground_truth}
<Reference Answer>

- The AI assistant's DAG is:
```json
{DAG}
```
- Reasons for the failure include:
'''
{failure_reasons}
'''

Can you identify a list of 1~3 concepts that can be added to the prompt to ensure the task as well as related ones passes?"""

            batch_requests.append({
                "custom_id": f"{question_id}-induction",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": "gpt-4o",
                    "messages": [{"role": "developer", "content": Induction_2_system},
                        {"role": "user", "content": Induction_2_user}],
                }
            })

    # Write the requests to the Batch input file
    with open(output_batch_file, 'w', encoding='utf-8') as f:
        for req in batch_requests:
            f.write(json.dumps(req) + '\n')

if __name__ == "__main__":
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    batch_requests = []
    concepts_union = []
    records = {}

    # Extract the records that need processing
    with open("../result/final_evaluation_results.jsonl", "r") as f:
        for line in f:
            data = json.loads(line)
            if data["system"] == "CQ_Solver" and data["model"] == "llama 3.3 70B" and data["score"]["Final Score"] < 8:
                records[data["question_id"]] = {"failure_reasons": data["content"]}

    with open("../result/final_experiment_results.jsonl", "r") as f:
        for line in f:
            data = json.loads(line)
            if data["system"] == "CQ_Solver" and data["model"] == "llama 3.3 70B" and data["question_id"] in records:
                records[data["question_id"]]["generated_response"] = data["answer"]
                records[data["question_id"]]["question"] = data["question"]
            elif data["system"] == "MindSearch" and data["model"] == "gpt-4o" and data["question_id"] in records:
                records[data["question_id"]]["ground_truth"] = data["answer"]

    with open("../result/final_llama_CQ_Solver_summary.json", "r") as f:
        data = json.load(f)
        question_texts = {record_data["question"] for record_data in records.values() if "question" in record_data}
        for entry in data:
            try:
                question_data = json.loads(entry["question"])  # parse the JSON string into a dict
                root_question = question_data["nodes"][0]["question"]
                if root_question in question_texts:
                    # Find the matching question_id
                    for q_id, record_data in records.items():
                        if record_data.get("question") == root_question:
                            if entry["conversations"][-4]["content"] == "After reflecting, please continue to act.":
                                # The DAG JSON precedes the reflection suffix two turns earlier
                                parts = entry["conversations"][-6]["content"].split(suffix)
                                if len(parts) == 2:
                                    json_str = parts[0].strip()
                                    try:
                                        json.loads(json_str)  # validate before storing
                                        records[q_id]["DAG"] = json_str
                                    except json.JSONDecodeError as e:
                                        print(f"Error decoding JSON: {e}")
                            else:
                                records[q_id]["DAG"] = entry["conversations"][-4]["content"]
                            break  # stop at the first match
            except json.JSONDecodeError as e:
                print(f"Error decoding JSON string in summary file: {entry['question']} - {e}")
                continue
            except KeyError as e:
                print(f"KeyError accessing summary data: {e} in {entry}")
                continue
                    
    # for id, data in records.items():
    #     print(data["DAG"])

    output_batch_file = "batch_input_induction.jsonl"
    prepare_induction_batch(records, output_batch_file=output_batch_file)
    print(f"Batch input file created: {output_batch_file}")

    # Step 2: upload the Batch input file
    try:
        with open(output_batch_file, "rb") as f:
            batch_input_file = client.files.create(
                file=f,
                purpose="batch"
            )
        print(f"Batch input file uploaded with ID: {batch_input_file.id}")
        input_file_id = batch_input_file.id

        # Step 3: create the Batch
        batch = client.batches.create(
            input_file_id=input_file_id,
            endpoint="/v1/chat/completions",
            completion_window="24h",
            metadata={"description": "Induction batch job for llama 3.3 70B failures"}
        )
        print(f"Batch created with ID: {batch.id}")

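        # Remaining steps: poll the status, retrieve the results, then parse and record them.
        # These still need to be implemented, following the same flow as before.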

    except Exception as e:
        print(f"Error during Batch API interaction: {e}")

Batch input file created: batch_input_induction.jsonl
Batch input file uploaded with ID: file-CTAtqBG7WPm3MupvrrMVfJ
Batch created with ID: batch_68022eeecb888190ac76b52423e367d7
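
The polling step left as a to-do above could look roughly like the following. This is a sketch under stated assumptions: `wait_for_batch` is a hypothetical helper, the retrieval function is passed in so the loop can be exercised without network access, and the poll interval and poll limit are arbitrary defaults.

```python
import time
from types import SimpleNamespace
from typing import Callable

# Batch statuses after which no further progress is expected.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(retrieve: Callable[[str], object], batch_id: str,
                   poll_seconds: float = 30.0, max_polls: int = 2880,
                   sleep: Callable[[float], None] = time.sleep):
    """Poll `retrieve(batch_id)` until the batch reaches a terminal status."""
    for _ in range(max_polls):
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} still running after {max_polls} polls")


# With the notebook's client this would be called as (not executed here):
#   batch = wait_for_batch(client.batches.retrieve, batch.id)

# Offline demonstration with a fake retrieve function.
_statuses = iter(["validating", "in_progress", "finalizing", "completed"])
fake_retrieve = lambda _id: SimpleNamespace(id=_id, status=next(_statuses))
done = wait_for_batch(fake_retrieve, "batch_demo", poll_seconds=0, sleep=lambda s: None)
print(done.status)
```

A caller would still need to inspect `done.status`, since `failed`, `expired`, and `cancelled` also end the loop.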


In [20]:
from openai import OpenAI
client = OpenAI()

batch = client.batches.retrieve("batch_68022eeecb888190ac76b52423e367d7")
print(batch)

Batch(id='batch_68022eeecb888190ac76b52423e367d7', completion_window='24h', created_at=1744973550, endpoint='/v1/chat/completions', input_file_id='file-CTAtqBG7WPm3MupvrrMVfJ', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1744973660, error_file_id=None, errors=None, expired_at=None, expires_at=1745059950, failed_at=None, finalizing_at=1744973631, in_progress_at=1744973551, metadata={'description': 'Induction batch job for llama 3.3 70B failures'}, output_file_id='file-CnkfpdVmLBzzJkuqH1yFij', request_counts=BatchRequestCounts(completed=452, failed=0, total=452))


In [21]:
from openai import OpenAI
client = OpenAI()

file_response = client.files.content("file-CnkfpdVmLBzzJkuqH1yFij")
output_filename = "concepts_response.jsonl"
with open(output_filename, 'w', encoding='utf-8') as outfile:
    outfile.write(file_response.text)

In [22]:
import json

input_file = "concepts_response.jsonl"
output_file = "concepts_response_clean.jsonl"

cleaned_data = []

with open(input_file, 'r', encoding='utf-8') as f:
    for line in f:
        try:
            entry = json.loads(line)
            custom_id = entry.get("custom_id")
            content = (
                entry.get("response", {})
                     .get("body", {})
                     .get("choices", [{}])[0]
                     .get("message", {})
                     .get("content", "")
            )
            cleaned_data.append({
                "custom_id": custom_id,
                "content": content
            })
        except Exception as e:
            print(f"❌ Processing failed: {e}")
            continue

# ✅ Optional: write the cleaned records to a .jsonl file
with open(output_file, 'w', encoding='utf-8') as f:
    for item in cleaned_data:
        json.dump(item, f, ensure_ascii=False)
        f.write('\n')

print(f"✅ Done: processed {len(cleaned_data)} records. Results saved to {output_file}")


✅ Done: processed 452 records. Results saved to concepts_response_clean.jsonl


In [26]:
import json

import tiktoken


def num_tokens_from_string(string: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.encoding_for_model("gpt-4o")
    num_tokens = len(encoding.encode(string))
    return num_tokens


with open("concepts_response_clean.jsonl", 'r', encoding='utf-8') as f:
    for i in range(4):  # assume we split the file into 4 parts
        t = 0
        start = 0 + 113 * i
        end = 113 + 113 * i
        
        # Rewind the file pointer to the start of the file
        f.seek(0)
        
        for index, line in enumerate(f, start=0):
            if start <= index < end:
                data = json.loads(line)
                t += num_tokens_from_string(data["content"])
        
        print(f"Segment {i+1}: {t} tokens")

Segment 1: 29335 tokens
Segment 2: 29402 tokens
Segment 3: 29119 tokens
Segment 4: 29095 tokens
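
The four fixed segments of 113 lines happen to come out at roughly 29k tokens each. If the responses were less uniform, the split could instead be driven by a token budget. The sketch below shows that alternative; `pack_by_token_budget` is a hypothetical helper, and the default whitespace counter is only a stand-in so the example runs without `tiktoken` (in this notebook, `num_tokens_from_string` could be passed as `count_tokens`).

```python
from typing import Callable, Iterable, List


def pack_by_token_budget(texts: Iterable[str], budget: int,
                         count_tokens: Callable[[str], int] = lambda t: len(t.split())
                         ) -> List[List[str]]:
    """Greedily group texts, in order, so each group stays within `budget` tokens."""
    segments, current, used = [], [], 0
    for text in texts:
        cost = count_tokens(text)
        # Start a new segment when adding this text would exceed the budget.
        if current and used + cost > budget:
            segments.append(current)
            current, used = [], 0
        current.append(text)  # an oversized single text still gets its own segment
        used += cost
    if current:
        segments.append(current)
    return segments


segments = pack_by_token_budget(["a b c", "d e", "f g h i", "j"], budget=5)
print(segments)
```

Greedy packing preserves the original order, so segment boundaries stay easy to trace back to line numbers in the input file.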


# Deduction Phase

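The deduction/verification phase refines the induced concepts (R) to minimize overfitting. This phase uses a strong model (Ms) to analyze and validate the induced concepts for the task before they are introduced into the prompt p of the weak model (Mw).
After the induced concepts have been refined and validated, an optional verification step follows.

In this step, examples (tasks) similar to the negative samples are selected from the validation set or from examples synthesized with the strong model (Ms).
The refined concepts are then introduced into the weak model's (Mw) prompt and tested on these similar examples.
This step assesses whether the weak model can not only resolve the original error but also generalize to similar cases by reaching a predefined performance threshold.
Only when this threshold is met are the refined concepts accepted as part of the final distilled concept set (C).
The recommended threshold for this method is 80%, to ensure that the weak model achieves a consistent performance improvement on both the original errors (negative samples) and the similar examples.

During verification, a newly introduced concept that does not contribute to a measurable performance improvement is more likely to be discarded.
This ensures that only useful concepts are retained and effectively filters out harmful refinements.
Redundant concepts, on the other hand, are handled explicitly by the instructions in the deduction-phase prompt, which ensure that semantically similar concepts are merged or removed while generality is preserved.
By combining empirical validation with a structured filtering mechanism, the framework can optimize the refined concepts without harming useful knowledge.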
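
The acceptance rule with the 80% threshold can be sketched as follows. This is an illustration under an assumption: the pass rate is computed over the pooled results for the original negative samples and the similar examples, which the description above does not spell out. The function names and concept labels are hypothetical.

```python
from typing import Dict, List


def accept_concept(results: List[bool], threshold: float = 0.8) -> bool:
    """Accept a refined concept when the weak model's pass rate meets the threshold."""
    if not results:
        return False  # no evidence, so the concept is not accepted
    return sum(results) / len(results) >= threshold


def select_concepts(trial_results: Dict[str, List[bool]], threshold: float = 0.8) -> List[str]:
    """Keep only the concepts whose pooled pass rate reaches the threshold."""
    return [concept for concept, results in trial_results.items()
            if accept_concept(results, threshold)]


# Hypothetical pass/fail outcomes of the weak model with each concept added to its prompt.
trials = {
    "use concrete examples": [True, True, True, True, False],   # 4 of 5 pass
    "cite specific legislation": [True, False, False, True, False],  # 2 of 5 pass
}
accepted = select_concepts(trials)
print(accepted)
```

Under this rule a concept that does not measurably help is dropped, which is the filtering behaviour the verification step is meant to provide.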

In [28]:
Deduction_system = """You are an intelligent assistant that processes a list of high-level, generalizable concepts for a given task. Your task is twofold:
1. Analyze the list of concepts and remove semantically similar duplicates, ensuring that each remaining concept is unique and distinct.
2. Verify that each concept is general enough to be valid for improving the given task. A valid concept should:
    - Be generalizable to similar examples within the task.
    - Directly address weaknesses or improve performance for the task.
A concept is defined as a general instruction derived or inferred from specific instances or occurrences of a task. Your goal is to preserve the clearest, most concise, and generalizable version of each valid concept."""


In [32]:
import json
import os
from openai import OpenAI

def prepare_deduction_batch_request(concepts_list, i):
    """
    Build one Batch API request that refines the given list of concepts.

    Args:
        concepts_list (list): concept strings to refine.
        i (int): 1-based part index, used to build the request's custom_id.

    Returns:
        dict: a Batch API request for /v1/chat/completions.
    """
    concepts_str = "\n---\n".join(concepts_list)

    Deduction_user = f"""You are given a list of candidate concepts intended to improve the reasoning behavior of a planner agent. The planner operates using the following system prompt:
'''
{CQ_SYSTEM_PROMPT}
'''
Your task is to refine the provided concepts list by:
1. **Removing duplicates or near-duplicates** (semantically similar or overlapping concepts).
2. **Filtering out non-generalizable concepts** that are too specific to a single example or that do not clearly relate to the task defined in the system prompt.
3. **Preserving only high-quality, clear, and generalizable concepts** that would help guide the agent’s reasoning across a wide range of complex, open-ended, or ambiguous questions.

Here is the original list of concepts to process:
{concepts_str}

Please return only the **final refined list of unique and valid concepts**, formatted as a bullet point list. Do not include any explanations, metadata, or preambles."""

    return {
        "custom_id": f"concept-refinement-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [{"role": "developer", "content": Deduction_system},
                         {"role": "user", "content": Deduction_user}],
        }
    }

if __name__ == "__main__":
    all_concepts = []
    num_parts = 4
    concepts_per_part = 113  # approximate
    output_filename = "Batch_input_concept_refine.jsonl"
    batch_requests = []

    with open("concepts_response_clean.jsonl", 'r', encoding='utf-8') as f:
        for line in f:
            try:
                data = json.loads(line)
                if "content" in data:
                    all_concepts.append(data["content"])
            except json.JSONDecodeError as e:
                print(f"Error decoding JSON line: {line.strip()} - {e}")
                continue

    # Split all concepts into four parts and prepare the Batch API requests
    for i in range(num_parts):
        start_index = i * concepts_per_part
        end_index = min((i + 1) * concepts_per_part, len(all_concepts))
        concepts_part = all_concepts[start_index:end_index]
        batch_request = prepare_deduction_batch_request(concepts_part, i+1)
        batch_requests.append(batch_request)
        print(f"Prepared batch request for concepts {start_index} to {end_index - 1} ({len(concepts_part)} concepts).")

    # Write all requests to a single Batch input file
    with open(output_filename, 'w', encoding='utf-8') as f:
        for req in batch_requests:
            f.write(json.dumps(req) + '\n')

    print(f"All {len(batch_requests)} batch requests written to {output_filename}")

    # Steps 2 & 3: upload the Batch input file and create the Batch
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    try:
        with open(output_filename, "rb") as f:
            batch_input_file = client.files.create(
                file=f,
                purpose="batch"
            )
        print(f"Batch input file {output_filename} uploaded with ID: {batch_input_file.id}")
        input_file_id = batch_input_file.id
    
        batch = client.batches.create(
            input_file_id=input_file_id,
            endpoint="/v1/chat/completions",
            completion_window="24h",
            metadata={"description": "Concept refinement batch job (4 parts)"}
        )
        print(f"Batch created with ID: {batch.id} for {output_filename}")
    
    except Exception as e:
        print(f"Error during Batch API interaction for {output_filename}: {e}")

    print("Batch input file creation process completed.")

Prepared batch request for concepts 0 to 112 (113 concepts).
Prepared batch request for concepts 113 to 225 (113 concepts).
Prepared batch request for concepts 226 to 338 (113 concepts).
Prepared batch request for concepts 339 to 451 (113 concepts).
All 4 batch requests written to Batch_input_concept_refine.jsonl
Batch input file Batch_input_concept_refine.jsonl uploaded with ID: file-V3sYFFAfqQ2LAu9GRDzM1h
Batch created with ID: batch_6802a34c8740819088181a0ce35542fb for Batch_input_concept_refine.jsonl
Batch input file creation process completed.


In [35]:
from openai import OpenAI
client = OpenAI()

batch = client.batches.retrieve("batch_6802a34c8740819088181a0ce35542fb")
print(batch)

Batch(id='batch_6802a34c8740819088181a0ce35542fb', completion_window='24h', created_at=1745003340, endpoint='/v1/chat/completions', input_file_id='file-V3sYFFAfqQ2LAu9GRDzM1h', object='batch', status='completed', cancelled_at=None, cancelling_at=None, completed_at=1745003389, error_file_id=None, errors=None, expired_at=None, expires_at=1745089740, failed_at=None, finalizing_at=1745003389, in_progress_at=1745003342, metadata={'description': 'Concept refinement batch job (4 parts)'}, output_file_id='file-RgmgHTApEGrtmV1KinJfTg', request_counts=BatchRequestCounts(completed=4, failed=0, total=4))


In [36]:
from openai import OpenAI
client = OpenAI()

file_response = client.files.content("file-RgmgHTApEGrtmV1KinJfTg")
output_filename = "refined_concepts_response.jsonl"
with open(output_filename, 'w', encoding='utf-8') as outfile:
    outfile.write(file_response.text)

In [37]:
import json

input_file = "refined_concepts_response.jsonl"
output_file = "refined_concepts_response_clean.jsonl"

cleaned_data = []

with open(input_file, 'r', encoding='utf-8') as f:
    for line in f:
        try:
            entry = json.loads(line)
            custom_id = entry.get("custom_id")
            content = (
                entry.get("response", {})
                     .get("body", {})
                     .get("choices", [{}])[0]
                     .get("message", {})
                     .get("content", "")
            )
            cleaned_data.append({
                "custom_id": custom_id,
                "content": content
            })
        except Exception as e:
            print(f"❌ Processing failed: {e}")
            continue

# ✅ Optional: write the cleaned records to a .jsonl file
with open(output_file, 'w', encoding='utf-8') as f:
    for item in cleaned_data:
        json.dump(item, f, ensure_ascii=False)
        f.write('\n')

print(f"✅ Done: processed {len(cleaned_data)} records. Results saved to {output_file}")


✅ Done: processed 4 records. Results saved to refined_concepts_response_clean.jsonl
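
The four refined lists are next combined into one prompt, separated by `---`. A minimal sketch of that merge step is shown below, assuming records shaped like the cleaned file above (`custom_id` ending in the part number, plus `content`). The helper `merge_refined_lists` is hypothetical. Sorting by part number is a precaution, since batch output lines are not guaranteed to come back in input order.

```python
from typing import Dict, List


def merge_refined_lists(records: List[Dict[str, str]], separator: str = "\n\n---\n\n") -> str:
    """Join the per-part refined lists in part order, separated by a horizontal rule."""
    # custom_id looks like "concept-refinement-<part>"; order by the numeric part.
    ordered = sorted(records, key=lambda r: int(r["custom_id"].rsplit("-", 1)[-1]))
    return separator.join(r["content"].strip() for r in ordered if r.get("content"))


demo_records = [
    {"custom_id": "concept-refinement-2", "content": "- B\n"},
    {"custom_id": "concept-refinement-1", "content": "- A"},
]
merged = merge_refined_lists(demo_records)
print(merged)
```

Sorting on the numeric part rather than the raw string keeps the order correct even if the number of parts grows past nine.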


In [38]:
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

Deduction_user = f"""You are given a list of candidate concepts intended to improve the reasoning behavior of a planner agent. The planner operates using the following system prompt:
'''
{CQ_SYSTEM_PROMPT}
'''
Your task is to refine the provided concepts list by:
1. **Removing duplicates or near-duplicates** (semantically similar or overlapping concepts).
2. **Filtering out non-generalizable concepts** that are too specific to a single example or that do not clearly relate to the task defined in the system prompt.
3. **Preserving only high-quality, clear, and generalizable concepts** that would help guide the agent’s reasoning across a wide range of complex, open-ended, or ambiguous questions.

Here is the original list of concepts to process:
'''
- Encourage depth and specificity by incorporating detailed explanations, specific examples, and real-world scenarios to improve understanding and engagement.
- Emphasize comprehensive coverage of all relevant dimensions, ensuring responses address multiple aspects such as historical, socio-cultural, economic, and political factors.
- Guide responses with a structured framework to enhance organization, clarity, and ease of understanding.
- Integrate diverse perspectives and interdisciplinary insights to provide a well-rounded and robust analysis.
- Focus on balanced perspective, presenting both positive and negative aspects, and exploring interconnectedness to enhance objectivity.
- Use iterative refinement to revisit and expand initial insights, ensuring completeness and thorough exploration of topics.
- Encourage the use of historical context and specific data to ground arguments and enhance factual robustness.
- Highlight the importance of using concrete examples, case studies, and evidence to support claims and enrich understanding.
- Instruct on recognizing and addressing contextual factors, including cultural and systemic influences, for more nuanced and comprehensive understanding.

---

- Encourage the inclusion of specific examples and detailed information relevant to the question.
- Emphasize the need for thorough exploration and comprehensive coverage of each aspect.
- Ensure the incorporation of diverse perspectives and relevant dimensions related to the main question.
- Provide structured and clear organization in responses to enhance readability and coherence.
- Highlight the importance of maintaining balance between positive and negative aspects or perspectives.
- Include practical applications or case studies to ground theoretical concepts in real-world scenarios.
- Instruct the consideration of broader contextual and historical factors that contribute to the current situation.
- Promote the identification and explanation of specific examples, historical milestones, or events.
- Encourage a balance between depth and breadth, ensuring detailed exploration of key components.
- Guide the use of evaluative comparisons or contrasts to provide depth and insights.

---

- **Incorporate Diverse Perspectives and Factors**: Explore multiple dimensions such as historical, cultural, economic, political, and structural aspects for comprehensive insight.
- **Incorporate Specific Examples and Evidence**: Enrich responses with specific examples, studies, or data to enhance depth and credibility.
- **Structured Breakdown and Clear Organization**: Organize responses into coherent sections using headings or bullet points for clarity and logical flow.
- **Comprehensive Exploration in Context**: Ensure thorough exploration that includes historical, economic, and social contexts relevant to the question.
- **Depth and Specificity in Analysis**: Provide detailed explanations for each key aspect, incorporating specific examples and scenarios.
- **Balanced Perspective and Detailed Examination**: Provide balanced analysis covering multiple viewpoints and in-depth examination of each dimension.
- **Integration of Sub-Topics and Holistic View**: Synthesize insights from various sub-questions for a holistic understanding, integrating different perspectives.
- **Consideration of All Relevant Dimensions**: Cover all relevant sub-topics or dimensions comprehensively for a well-rounded analysis.
- **Comparative Analysis for Contextual Understanding**: Contrast different perspectives or elements to provide nuanced insights.
- **Real-World Examples and Contextual Details**: Include specific historical contexts, events, or legislations to enrich the narrative.
- **Inclusion of Historical Context and Key Events**: Provide historical context and reference key events or figures to enhance factuality and completeness.
- **Logical Integration and Synthesis of Information**: Ensure logical connections and synthesis of different information pieces for coherent reasoning.
- **Breadth and Depth in Exploration**: Balance breadth (wide coverage across topics) and depth (detailed exploration of key aspects).
- **Clear Conclusion and Comprehensive Coverage**: Conclude with a synthesis that ties together key insights for a comprehensive understanding.
- **Enhance Detail in Descriptive Annotations**: Use detailed annotations in DAGs to guide deeper exploration and reflection.

---

- Depth and Detail in Decomposition
- Integration of Contextual Information
- Structured Responses
- Emphasize Contextual Depth in Annotations
- Incorporate Diverse Perspectives and Implications
- Utilize Comparative Analysis for Clarity and Depth
- Use of Specific Examples
- Comprehensive Scope Exploration
- Depth and Specificity in Annotations
- Balance Content with Reference Points
- In-depth Exploration of All Potential Factors
- Contextual and Structural Details
- Illustrative Examples and Contextual Clarification
- Encourage Contextual and Structural Details
- Differentiation Between Support Types
- Comprehensive Analysis of Diverse Perspectives
- Incorporate Diverse Sources and Contexts
- Reflect on Broader Implications
- Role Identification and Impact
- Address Challenges and Reforms
- Holistic Integration and Synthesis
- Cross-Validation with External Information
- Addressing Systemic and Structural Factors
- Explore Broader Implications and Variability
- Distinction Between Phases of Development and Maintenance
- Dynamic and Continuous Processes
- Iterative Reflection and Expansion
- Depth of Explanation and Historical Context
- Ensure Comprehensive Coverage of Key Aspects
- Comparative and Relational Analysis
- Detailed Evaluation of Dimensions and Effects
- Emphasize Comprehensive Coverage
- Balance General and Specific Details
- Comprehensive Exploration of Sub-Topics
- Emphasize the Importance of Contextual Relevance
- Highlight Example-Focused Illustrations
- Expand on Practical Applications
- Use and Highlight of Structured Analytical Techniques
- Path-specific Contributions and Comparisons
- Balance Breadth and Depth of Analysis
- Integration of Comparative and Evaluative Thinking
- In-depth Analysis with Comparative Analysis
- Provide Contextual and Detailed Explanations
- Iterative Refinement and Comparison
- Incorporate Methodological Depth
- Multidimensional Analysis
- Encourage Contextual Explanation and Evidence
- Comparative Analysis for Depth
- Scenario Creation for Understanding
- Confirm Specificity in Decomposition and Annotation
- Comprehensive Framework Construction
- Inclusion of Diverse Perspectives and Dimensions
- Promote Clarity and Specificity in Decomposition
- Structured Breakdown and Comprehensive Exploration
'''

Please return only the **final refined list of unique and valid concepts**, formatted as a bullet point list. Do not include any explanations, metadata, or preambles."""

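# Ask the model to merge the candidate concepts into one list of unique, valid concepts.
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "developer", "content": Deduction_system},
              {"role": "user", "content": Deduction_user}]
)
refined_concepts = completion.choices[0].message.content

# Save the refined list for later reuse, then display it.
with open("final_refined_concepts.md", "w", encoding="utf-8") as f:
    f.write(refined_concepts)

print(refined_concepts)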

- Encourage depth and specificity in analysis by incorporating detailed explanations and specific examples.
- Emphasize comprehensive coverage of relevant dimensions, such as historical, socio-cultural, economic, and political factors.
- Integrate diverse perspectives and interdisciplinary insights for a well-rounded analysis.
- Use iterative refinement to expand initial insights and ensure thorough exploration of topics.
- Highlight the importance of using concrete examples and evidence to support claims.
- Instruct on recognizing and addressing contextual factors for nuanced understanding.
- Provide structured and clear organization in responses for coherence and readability.
- Ensure a balanced perspective, presenting both positive and negative aspects.
- Promote the identification and explanation of specific examples, historical milestones, or events.
- Incorporate a structured framework to enhance clarity and logical flow.
- Use comparative analysis to provide nuanced insights and
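
The bullet list returned by the model can be turned into a Python list before it is merged into a prompt. The following is a minimal sketch, assuming the response uses `- ` or `* ` bullets as requested; `parse_bullets` and the `sample` text are hypothetical and not part of the original notebook.

```python
def parse_bullets(markdown_text):
    """Return the text of each '-' or '*' bullet line, de-duplicated
    case-insensitively while preserving first-seen order."""
    concepts = []
    seen = set()
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(("- ", "* ")):
            continue  # skip blank lines and any stray prose
        concept = stripped[2:].strip()
        key = concept.lower()
        if concept and key not in seen:
            seen.add(key)
            concepts.append(concept)
    return concepts


# Hypothetical sample standing in for completion.choices[0].message.content
sample = """- Balance breadth and depth in exploration of topics.
- Reflect on broader implications and variability of topics.
- balance breadth and depth in exploration of topics.
"""
key_concepts = parse_bullets(sample)
print(key_concepts)
```

Parsing into a list first makes it easy to inspect or filter the concepts before they are pasted into the guidelines section of a prompt.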

In [None]:
REFINED_CQ_SYSTEM_PROMPT = """You are a Planner Agent designed to reason through complex, open-ended, or ambiguous questions by constructing, reflecting on, and expanding a directed acyclic graph (DAG) of interrelated sub-questions. Your task is not simply to retrieve answers, but to actively explore the question space, refine your understanding, and make informed decisions about when the original question has been sufficiently addressed.

---

## Problem Space Representation: The Question DAG

The DAG is your evolving internal model of the problem. It represents your reasoning process — how the main question relates to sub-questions, intermediate knowledge, and reflections.
Each node contains:
- `node_id`: a unique identifier
- `question`: a sub-question or original question
- `annotation`: your current thoughts, insights, summaries, or hypotheses about that question

Each annotation helps build and maintain your internal representation of the problem. For example:
- A node’s `annotation` may include:
  - A summary of what you currently understand about the question
  - A hypothesis or assumption you are testing
  - A brief note on what you still need to find out
- An `edge_annotation` should briefly explain how the sub-question contributes to answering the parent question (e.g., cause-effect, component, condition, clarification, definition, comparison, or implication).

---

## Input Format

You are always shown the current DAG in JSON format, including all nodes and edges, representing the most up-to-date state of your reasoning process.

---

## Key Reasoning Guidelines

- You cannot delete nodes or edges. Even if a previous path turns out to be incorrect or irrelevant, leave it intact and revise your understanding through `update()`. This mimics how humans preserve earlier lines of thought for traceability, reflection, and learning from missteps.
- You are encouraged to **revisit and revise** previous thoughts using `update`, especially as new information or sub-answers emerge.
- When decomposing, focus on asking the right questions — use logical, causal, definitional, or investigative angles that deepen your understanding.
- When unsure or the question is broad, **start by clarifying or framing the problem**, not jumping to answers.
- For vague or ill-defined questions, take initiative to deconstruct ambiguity, identify what is missing, and reframe as needed. You shape the problem space.

- Encourage depth and specificity in analysis by incorporating detailed explanations and specific examples.
- Emphasize comprehensive coverage of relevant dimensions, such as historical, socio-cultural, economic, and political factors.
- Integrate diverse perspectives and interdisciplinary insights for a well-rounded analysis.
- Use iterative refinement to expand initial insights and ensure thorough exploration of topics.
- Highlight the importance of using concrete examples and evidence to support claims.
- Instruct on recognizing and addressing contextual factors for nuanced understanding.
- Provide structured and clear organization in responses for coherence and readability.
- Ensure a balanced perspective, presenting both positive and negative aspects.
- Promote the identification and explanation of specific examples, historical milestones, or events.
- Incorporate a structured framework to enhance clarity and logical flow.
- Use comparative analysis to provide nuanced insights and depth.
- Include practical applications to ground theoretical concepts in real-world scenarios.
- Ensure logical integration and synthesis of information for coherent reasoning.
- Balance breadth and depth in exploration of topics.
- Encourage contextual and structural details in evaluations and responses.
- Reflect on broader implications and variability of topics.
- Address challenges and reforms within the context.
- Incorporate diverse sources and contexts for comprehensive exploration.
- Emphasize the importance of contextual relevance and detail.
- Utilize scenario creation and hypothetical examples for deeper understanding.

---

## Your Tools

You have three core actions to build and navigate the problem space:

1. **question_decompose**
   Use this to break down a question node into one or more meaningful sub-questions.
   - You may decompose multiple nodes at once.
   - Specify `parent_question_id`, `sub_question`, and an `edge_annotation` explaining the logical or conceptual relationship.
   - Multiple parents pointing to the same sub-question are allowed.
   - Keep the graph acyclic.

   Example:
   ```json
   {
     "graph": [
       {
         "parent_question_id": "Q",
         "sub_question": "How has telework affected work-life boundaries?",
         "edge_annotation": "Understanding personal impact helps assess broader social shifts."
       },
       {
         "parent_question_id": "Q.1",
         "sub_question": "Does telework reinforce or reduce social inequality?",
         "edge_annotation": "Social impact includes distributional effects across groups."
       }
     ]
   }
   ```

2.  **update**
    Use this to revise or expand the annotation of existing nodes.
    - This reflects new insights, summaries, clarifications, or changes in understanding.
    - You are encouraged to use this tool to reflect, correct, or reframe — especially after learning something new.
    - This is a key part of your **metacognitive behavior** — thinking about your thinking.
    
    Example:
    ```json
    {
      "nodes": [
        {
          "question_id": "Q.1",
          "new_annotation": "Workers report blurred boundaries between home and work, leading to both flexibility and stress."
        },
        {
          "question_id": "Q.2",
          "new_annotation": "Emerging evidence suggests that higher-income workers benefit more from telework options, widening inequality."
        }
      ]
    }
    ```

3.  **final_answer**
    Use this only when you believe the original question has been sufficiently addressed **given the available steps so far**.  
    You do not need perfect certainty — you must simply provide a reason why the current DAG gives you enough understanding to form a meaningful answer.
    - Provide a justification explaining why you believe your DAG now contains enough understanding.
    - Your answer should be clear, comprehensive, and informative—sufficient in length to convey key insights.
    - You may use paragraph or bullet point format as appropriate.
    - Aim to include key aspects uncovered in the DAG — such as causes, mechanisms, consequences, or trade-offs — without repeating every detail.
    
    Example:
    ```json
    {
      "reason": "The sub-questions cover key social dimensions (lifestyle, geography, and inequality), and their annotations provide sufficient insight."
    }
    ```

---

## Metacognitive Expectations

This is not a static search task — it is an evolving thinking process.

- Use `update()` to **reflect**, summarize new insights, question assumptions, or refine your current framing.
- Use `question_decompose()` to **expand the problem space**, identify what needs to be known, or clarify uncertainty.
- Use `final_answer()` only when your internal model (the DAG) gives you enough confidence that you can answer well.
- At each step, treat the DAG as your evolving internal model of understanding — be thoughtful about how you build it.

- When starting from a single root question with no sub-questions yet, you may choose to either:
  - Use `update()` to record your initial thoughts, assumptions, or possible lines of inquiry, or
  - Use `question_decompose()` to begin breaking down the problem into more specific components.
There is no fixed preference — use your best judgment based on the question’s clarity and complexity."""
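
The node, edge, and tool descriptions in the prompt map naturally onto a small data structure. The following is a minimal sketch, assuming an in-memory representation; `QuestionDAG`, its method bodies, the flat `Q.1`, `Q.2` id scheme, and the root question in the usage lines are illustrative assumptions, not the notebook's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionDAG:
    """Toy question DAG: nodes are never deleted, only annotated."""
    nodes: dict = field(default_factory=dict)  # node_id -> {"question", "annotation"}
    edges: list = field(default_factory=list)  # (parent_id, child_id, edge_annotation)

    def add_root(self, question):
        self.nodes["Q"] = {"question": question, "annotation": ""}

    def _reaches(self, start, target):
        """Return True if `target` is reachable from `start` along edges."""
        stack, seen = [start], set()
        while stack:
            current = stack.pop()
            if current == target:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(c for p, c, _ in self.edges if p == current)
        return False

    def question_decompose(self, parent_question_id, sub_question, edge_annotation):
        if parent_question_id not in self.nodes:
            raise KeyError(f"unknown parent: {parent_question_id}")
        # Reuse an existing node so several parents can share one sub-question.
        child_id = next(
            (nid for nid, n in self.nodes.items() if n["question"] == sub_question),
            None,
        )
        if child_id is None:
            child_id = f"Q.{len(self.nodes)}"
            self.nodes[child_id] = {"question": sub_question, "annotation": ""}
        elif self._reaches(child_id, parent_question_id):
            raise ValueError("edge would create a cycle")  # keep the graph acyclic
        self.edges.append((parent_question_id, child_id, edge_annotation))
        return child_id

    def update(self, question_id, new_annotation):
        self.nodes[question_id]["annotation"] = new_annotation


dag = QuestionDAG()
dag.add_root("How has telework changed society?")  # hypothetical root question
first = dag.question_decompose(
    "Q", "How has telework affected work-life boundaries?",
    "Understanding personal impact helps assess broader social shifts.")
second = dag.question_decompose(
    first, "Does telework reinforce or reduce social inequality?",
    "Social impact includes distributional effects across groups.")
dag.update(first, "Workers report blurred boundaries between home and work.")
print(first, second)  # Q.1 Q.2
```

The reachability check runs only when an existing node is reused as a child, since a freshly created node has no outgoing edges and cannot close a cycle.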

In [None]:
# Placeholder: per-concept validation is not implemented yet.
for concept in concepts_union:
    # validated_concepts = ValidateConcept()
    pass

# Updated

In [None]:
Updated_system = f"""You are a helpful assistant that performs {task}. Follow the given instructions to complete the task successfully."""
Updated_user = f"""Key concepts to follow: {key_concepts}
Instructions: {initial_prompt}
"""