Description
codegraph_context is documented as the primary tool for understanding "how does X work" questions. In practice, it systematically returns type/struct definitions rather than implementation logic, which makes it unreliable for architecture-understanding tasks.
Observed Behavior
When asked "How does cr_task scheduling work?" or "How does sleep/timeout work?", codegraph_context returns things like:
- Struct definitions (e.g., TaskStatus_t, TimeoutConfig, cr_task_tcb_t)
- FreeRTOS-related type declarations
- Enum definitions (e.g., LOCK_MODE, DOORLOCK_WORKMODE)
What it misses:
- Actual implementation functions (e.g., cr_timer_loop, cr_task_schedule, cr_set_wakeup, cr_is_time_to_sleep)
- The entry points that drive the feature
- The callgraph/flow between functions
Test Results (embedded C codebase, 983 files)
| Query | codegraph_context returned | What was missed |
| --- | --- | --- |
| "How does cr_task scheduling work?" | cr_task_tcb_t struct, TaskStatus_t enum | cr_timer_loop, cr_task_schedule, _cr_loop_handler, scheduler init |
| "How does cr_task sleep/timeout?" | cr_timer_set, TimeoutConfig struct, cr_keep_wakeup | _cr_time2sleep, cr_set_wakeup, cr_is_time_to_sleep |
| "Find state machines" | DOORLOCK_WORKMODE, LOCK_MODE enums | taskMenuSysEleMachine, taskFingerWait switch patterns, lwIP PPP FSM |
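The results above could be turned into a small regression check that measures how many of the expected implementation functions a query actually surfaces. This is only a sketch: the symbol names are copied by hand from the table, the helper names (recall, score_all) are hypothetical, and nothing here calls codegraph itself. Entries that are not plain symbol names ("scheduler init", the state-machine patterns) are left out.

```python
# Hypothetical regression check built from the table above.
# RETURNED is transcribed by hand; it is NOT produced by calling codegraph.
EXPECTED = {
    "How does cr_task scheduling work?": {
        "cr_timer_loop", "cr_task_schedule", "_cr_loop_handler",
    },
    "How does cr_task sleep/timeout?": {
        "_cr_time2sleep", "cr_set_wakeup", "cr_is_time_to_sleep",
    },
}

RETURNED = {
    "How does cr_task scheduling work?": ["cr_task_tcb_t", "TaskStatus_t"],
    "How does cr_task sleep/timeout?": [
        "cr_timer_set", "TimeoutConfig", "cr_keep_wakeup",
    ],
}


def recall(expected, returned):
    """Fraction of expected symbols that appear in the returned list."""
    if not expected:
        return 1.0
    return len(expected & set(returned)) / len(expected)


def score_all(expected=EXPECTED, returned=RETURNED):
    """Per-query recall of the expected implementation functions."""
    return {q: recall(expected[q], returned.get(q, [])) for q in expected}
```

With the data as transcribed, both queries score 0.0, since none of the expected functions appear in the returned sets.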
Expected Behavior
codegraph_context for "how does X work" should be biased toward returning functions that drive the feature:
- The main entry point functions
- The key functions in the call chain
- The control flow between them
These should rank above struct/enum type definitions, which describe data rather than behavior.
Suggested Fix
The ranking/scoring in codegraph_context seems to heavily weight struct type nodes over function nodes. Perhaps add a bias factor:
- function/method nodes get higher relevance for behavioral queries
- struct/type/enum nodes are deprioritized when they lack behavioral code
Or add a query mode parameter: context(..., mode="behavior") vs context(..., mode="data")
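To make the suggestion concrete, here is a minimal sketch of a kind-based bias applied on top of an existing relevance score. Everything in it is an assumption for illustration: the Node shape, the KIND_BIAS multipliers, and the rerank function are hypothetical and do not reflect codegraph's actual internals or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    kind: str     # e.g. "function", "method", "struct", "type", "enum"
    score: float  # base relevance from the existing ranker (assumed to exist)


# Hypothetical multipliers; real values would need tuning.
KIND_BIAS = {
    "behavior": {"function": 1.5, "method": 1.5,
                 "struct": 0.6, "type": 0.6, "enum": 0.6},
    "data": {"function": 0.8, "method": 0.8,
             "struct": 1.3, "type": 1.3, "enum": 1.3},
}


def rerank(nodes, mode="behavior"):
    """Sort nodes by base score times the per-kind bias for the given mode.

    Kinds without an entry in the bias table keep their base score.
    """
    bias = KIND_BIAS[mode]
    return sorted(nodes,
                  key=lambda n: n.score * bias.get(n.kind, 1.0),
                  reverse=True)


# Illustrative input; the scores are made up, not measured.
SAMPLE = [
    Node("cr_task_tcb_t", "struct", 0.9),
    Node("cr_task_schedule", "function", 0.7),
    Node("TaskStatus_t", "enum", 0.8),
]
```

In this sketch, rerank(SAMPLE, mode="behavior") puts cr_task_schedule first even though its base score is the lowest, while mode="data" keeps the struct and enum on top. A multiplier is only one option; a separate mode parameter could equally filter by node kind before ranking.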