This project constructs causal graphs from object attributes and affordances, and then generates task, emergency, and recovery text based on those graphs. It supports automatic object discovery, multiple LLM-assisted causal graph construction strategies, and optional visualization.
```bash
conda create -n data-recipe python=3.11
conda activate data-recipe
pip install -r requirements.txt
```

Common dependencies include:

- `requests`
- `networkx`
- `matplotlib` (optional, for visualization)

The end-to-end agent automatically performs:
Object discovery → causal graph construction → text generation
This is the recommended workflow if you do not want to manually specify objects.
If you use a cloud LLM, configure the following environment variables:
```bash
export QWEN_API_KEY="<your-key>"
export QWEN_API_BASE="https://dashscope.aliyuncs.com/compatible-mode/v1"  # optional
export QWEN_MODEL="qwen2.5-7b-instruct"  # optional
```

In PowerShell:

```powershell
$env:QWEN_API_KEY="<your-key>"
$env:QWEN_API_BASE="https://dashscope.aliyuncs.com/compatible-mode/v1"  # optional
$env:QWEN_MODEL="qwen2.5-7b-instruct"  # optional
```

Important:

- `run_agent.sh` and `pipeline.py` read API credentials only from environment variables.
- Do NOT hardcode API keys in scripts or source code.

If USE_API=true is enabled, the system automatically maps available credentials to a unified interface:
| Unified Variable | Source Priority |
|---|---|
| `CAUSAL_LLM_API_KEY` | `OPENAI_API_KEY` / `AZURE_OPENAI_API_KEY` / `CAUSAL_LLM_API_KEY` / `QWEN_API_KEY` / `DASHSCOPE_API_KEY` |
| `CAUSAL_LLM_API_BASE` | `CAUSAL_LLM_API_BASE` / `QWEN_API_BASE` / default DashScope endpoint |
| `CAUSAL_LLM_MODEL` | `CAUSAL_LLM_MODEL` / `QWEN_MODEL` / `qwen-max-2025-01-25` |

➡️ Setting only QWEN_API_KEY is sufficient.
The pipeline falls back to local or non-API modes only if no valid key is found.
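
The priority mapping in the table can be sketched in a few lines of Python. This is an illustration only, not the project's actual code: the helper names `first_set` and `resolve_llm_env` are hypothetical, and the default endpoint is assumed to be the DashScope URL shown in the export example above.

```python
import os

# Source priority per unified variable, copied from the table above.
KEY_SOURCES = ["OPENAI_API_KEY", "AZURE_OPENAI_API_KEY", "CAUSAL_LLM_API_KEY",
               "QWEN_API_KEY", "DASHSCOPE_API_KEY"]
BASE_SOURCES = ["CAUSAL_LLM_API_BASE", "QWEN_API_BASE"]
MODEL_SOURCES = ["CAUSAL_LLM_MODEL", "QWEN_MODEL"]

# Assumed defaults (the endpoint is taken from the export example in this README).
DEFAULT_BASE = "https://dashscope.aliyuncs.com/compatible-mode/v1"
DEFAULT_MODEL = "qwen-max-2025-01-25"


def first_set(env, names, default=None):
    """Return the first non-empty value among `names`, else `default`."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return default


def resolve_llm_env(env=None):
    """Map whatever credentials are present onto the unified variables."""
    env = os.environ if env is None else env
    return {
        "CAUSAL_LLM_API_KEY": first_set(env, KEY_SOURCES),
        "CAUSAL_LLM_API_BASE": first_set(env, BASE_SOURCES, DEFAULT_BASE),
        "CAUSAL_LLM_MODEL": first_set(env, MODEL_SOURCES, DEFAULT_MODEL),
    }


if __name__ == "__main__":
    resolved = resolve_llm_env()
    # A missing key is the condition under which the pipeline would fall back
    # to local or non-API modes.
    print("API key found:", resolved["CAUSAL_LLM_API_KEY"] is not None)
```

Passing a plain dictionary instead of `os.environ` makes it easy to check how a given combination of variables would resolve without touching the real environment.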
Key variables (auto-discovery enabled by default):
- Object Discovery
  - `AUTO_DISCOVER=true`
  - `OBJECT_PROMPT="your task or scenario description"`
  - `DISCOVER_NUM_OBJECTS`
  - `DISCOVER_TEMPERATURE`
  - `DISCOVER_TOP_P`
- Optional Manual Inputs
  - `OBJECTS_TEXT_FILE`: one object hint per line
  - `OBJECTS_JSON`: fully structured object definitions
- Causal Graph Strategy
  - `USE_QWEN_API_GRAPH`: one-shot graph generation
  - `MICRO_FIRST`: micro-graph incremental stitching
  - `USE_API`: use the OpenAI-compatible API interface

```bash
bash run_agent.sh
```

The pipeline performs:
- Automatic object discovery
- Quality filtering and deduplication
- Causal graph construction (Qwen one-shot / micro-graph stitching / legacy mode)
- Task case generation
Output (default):
output/task_case.json
You can visualize the generated causal graph as a DOT or PNG file:
```bash
python visualize_logic_graph.py \
  --task_json output/task_case.json \
  --dot output/logic_graph.dot \
  --png output/logic_graph.png
```

Use `batch_run.py` to process a scenarios file (JSONL or JSON array) and generate one TaskCase JSON per scenario. Filenames are auto-generated from the scenario/task name for clarity.

```json
{"task_id":"T001","task_name":"Kitchen prep","object_prompt":"Prepare a sandwich and drink","discover_num_objects":4,"use_qwen_api_graph":true,"micro_first":true,"use_api":true}
{"task_id":"T002","task_name":"Office cleaning","object_prompt":"Clean a small office space","discover_num_objects":3,"micro_first":true,"use_api":true}
```

Also supported: a JSON array, `[{...},{...}]`.

Field hints (all optional unless noted):
- `task_id` / `task_name`: used in TaskCase metadata and the output filename slug.
- `object_prompt`: description for LLM discovery (if `objects` / `objects_text` are absent).
- `discover_num_objects`, `discover_temperature`, `discover_top_p`: sampling settings for object discovery.
- `use_qwen_api_graph`, `micro_first`, `use_api`: graph strategy and API toggle (Qwen keys are auto-mapped to OpenAI-compatible environment variables).
- `objects`: structured objects list (category/name/attributes/affordances/logic_graph) to bypass LLM discovery.
- `objects_text`: plain text list; each entry becomes an object category/name.

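
To illustrate the two accepted file shapes, here is a minimal sketch of a loader that handles both JSONL and a JSON array. It does not reproduce how `batch_run.py` actually parses its input; the function name `parse_scenarios` is hypothetical.

```python
import json


def parse_scenarios(text):
    """Parse scenario definitions from JSONL text or a JSON array."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # JSON array form: [{...}, {...}]
        data = json.loads(stripped)
        if not all(isinstance(item, dict) for item in data):
            raise ValueError("every scenario must be a JSON object")
        return data
    # JSONL form: one JSON object per non-empty line
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


if __name__ == "__main__":
    jsonl = '{"task_id": "T001", "task_name": "Kitchen prep"}\n{"task_id": "T002"}\n'
    for scenario in parse_scenarios(jsonl):
        # All fields are optional, so read them defensively.
        print(scenario.get("task_id"), scenario.get("task_name", "<unnamed>"))
```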
```bash
python batch_run.py \
  --scenarios_file scenarios.jsonl \
  --output_dir output/batch \
  --use_api \
  --micro_first
```

Output: one JSON file per scenario, named `slug(task_name)_taskId.json` (e.g., `kitchen_prep_T001.json`), written to `output/batch` (override with `--output_dir`).

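
The naming pattern can be reproduced with a small helper. The exact slug rules used by `batch_run.py` are not documented here, so the sketch assumes lowercasing and replacing each run of non-alphanumeric characters with an underscore, which matches the `kitchen_prep_T001.json` example; `slugify` and `output_filename` are hypothetical names.

```python
import re


def slugify(name):
    """Lowercase `name` and collapse non-alphanumeric runs into underscores (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return slug or "task"  # assumed fallback for empty names


def output_filename(task_name, task_id):
    """Build a filename following the slug(task_name)_taskId.json pattern."""
    return f"{slugify(task_name)}_{task_id}.json"


if __name__ == "__main__":
    print(output_filename("Kitchen prep", "T001"))
```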
You can directly let Qwen generate the entire causal graph in a single API call, then run downstream text generation.
Environment variables:

- `QWEN_API_KEY` or `DASHSCOPE_API_KEY`
- `QWEN_API_BASE` (optional)
- `QWEN_MODEL` (optional)

```bash
python pipeline.py \
  --use_qwen_api_graph \
  --qwen_temperature 0.4 \
  --qwen_top_p 0.9 \
  --task_id T001 \
  --task_name DemoTask
```

- Existing edges defined in each `CausalObject` are preserved.
- New edges are merged and filtered to known nodes only.
- If the API call fails or returns an empty graph, the system automatically falls back to other enabled strategies.
- Optional post-verification by Qwen can add cross-object edges for improved global consistency.
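
The merge rules listed above (keep existing edges, add new ones, drop anything that references an unknown node) can be sketched as follows. This is an illustration under assumptions, not the project's implementation: edges are modeled as plain `(source, target)` tuples, and `merge_edges` is a hypothetical name.

```python
def merge_edges(nodes, existing_edges, proposed_edges):
    """Merge LLM-proposed edges into existing ones, keeping only known nodes."""
    known = set(nodes)
    merged = list(existing_edges)  # existing edges are preserved as-is
    seen = set(merged)
    for src, dst in proposed_edges:
        edge = (src, dst)
        # Drop edges that touch unknown nodes or duplicate an existing edge.
        if src in known and dst in known and edge not in seen:
            merged.append(edge)
            seen.add(edge)
    return merged


if __name__ == "__main__":
    nodes = ["knife.sharp", "bread.sliced", "sandwich.assembled"]
    existing = [("knife.sharp", "bread.sliced")]
    proposed = [
        ("bread.sliced", "sandwich.assembled"),  # new and valid
        ("knife.sharp", "bread.sliced"),         # duplicate of an existing edge
        ("bread.sliced", "toaster.on"),          # unknown node, filtered out
    ]
    for edge in merge_edges(nodes, existing, proposed):
        print(edge)
```

An empty result from the proposal step leaves `merged` equal to the existing edges, which is the situation in which a fallback strategy would take over.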
This mode builds a global DAG by stitching together small, high-confidence causal subgraphs.
```bash
python pipeline.py \
  --micro_first \
  --micro_group_size 3 \
  --micro_groups_per_object 6 \
  --micro_min_confidence 0.6
```

- Each micro-graph contains 2–3 nodes.
- Micro-graphs are cached at `database/micro_graphs.json`.

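
One way to picture the stitching step is a greedy merge that accepts edges in descending confidence order and skips any edge that would close a cycle, so the result stays a DAG. The sketch below is a generic illustration of that idea, not the algorithm in `pipeline.py`; the micro-graph layout (`edges` with `source`, `target`, `confidence`) is an assumed shape, not the actual cache format.

```python
def has_path(adj, start, goal):
    """Depth-first search: is `goal` reachable from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj.get(node, ()))
    return False


def stitch(micro_graphs, min_confidence=0.6):
    """Greedily stitch micro-graph edges into one acyclic adjacency map."""
    adj = {}
    candidates = sorted(
        (e for g in micro_graphs for e in g["edges"]
         if e["confidence"] >= min_confidence),
        key=lambda e: -e["confidence"],  # most confident edges first
    )
    for e in candidates:
        src, dst = e["source"], e["target"]
        # Adding src -> dst creates a cycle if dst already reaches src.
        if src == dst or has_path(adj, dst, src):
            continue
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())
    return adj


if __name__ == "__main__":
    graphs = [
        {"edges": [{"source": "a", "target": "b", "confidence": 0.9},
                   {"source": "b", "target": "c", "confidence": 0.8}]},
        {"edges": [{"source": "c", "target": "a", "confidence": 0.7},   # would close a cycle
                   {"source": "a", "target": "c", "confidence": 0.5}]},  # below threshold
    ]
    print(stitch(graphs))
```

Processing high-confidence edges first means that when two micro-graphs disagree on direction, the more confident one wins.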
To use the original full discovery algorithm:
- Do not enable `--micro_first`
- Do not enable `--use_qwen_api_graph`

```bash
python pipeline.py --use_api
```

This mode is driven by either a local LLM or an API backend, depending on configuration.

| Artifact | Description |
|---|---|
| `task_case.json` | Final task/emergency/recovery representation |
| `logic_graph.dot` | Causal graph in DOT format |
| `logic_graph.png` | Rendered causal graph |