fix: Scope boundaries for JS/TS entity extraction #35

Open
c22 wants to merge 1 commit into Ataraxy-Labs:main from c22:fix/scope-boundary-types

c22 commented Mar 17, 2026

The entity extractor recursed into JS/TS function expression bodies (arrow functions, function expressions, generator functions) and extracted local variables (const, let, var) as top-level entities. These spurious entities are unstable: small structural changes to surrounding code could cause different locals to be extracted, producing different entity sets from logically equivalent code.

Downstream consumers such as weave would interpret the resulting differences in entity sets as intentional additions or deletions, which could cause code to be silently dropped or mangled during merge conflict resolution.

Scope boundaries are now selectively transparent: local variable declarations inside function expression bodies are suppressed, but inner class and function declarations are still extracted as entities.

This keeps entity extraction stable across versions while preserving granularity for real semantic units.
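As an illustration of the behavior described above, here is a small hypothetical input file annotated with the expected outcome for each declaration. The names (`makeCounter`, `increment`, `Snapshot`) are invented for this example, and the top-level `makeCounter` binding is assumed to remain an entity as before.

```typescript
// Hypothetical input file; the comments state what the description above implies.
const makeCounter = () => {
  const start = 0; // local inside an arrow function body: suppressed
  let count = start; // local inside an arrow function body: suppressed

  // Inner function declaration: still extracted as an entity.
  function increment(): number {
    const next = count + 1; // local inside a named function body: suppressed
    count = next;
    return count;
  }

  // Inner class declaration: still extracted as an entity.
  class Snapshot {
    value: number;
    constructor(value: number) {
      this.value = value;
    }
  }

  return { increment, snapshot: () => new Snapshot(count) };
};
```

Renaming or reordering `start` and `count` should therefore no longer change the extracted entity set, while edits to `increment` or `Snapshot` remain visible as entity-level changes.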

Implementation uses two mechanisms configured through LanguageConfig:

1. scope_boundary_types
   When the general recursion in visit_node encounters one of these node types, it propagates the boundary as the suppression_context rather than skipping the subtree entirely.

2. suppressed_nested_entities
   Rules keyed on suppression_context filter out lexical_declaration and variable_declaration inside scope boundary types and named function/method bodies.

Both currently apply only to JavaScript/TypeScript.
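The interaction of the two mechanisms can be sketched roughly as follows. This is a simplified TypeScript model, not the code from this PR: the `SyntaxNode` shape, the `entityTypes` field, the camelCase field names, and the choice to let any node type that has suppression rules open a context (which stands in for named function/method bodies) are all assumptions made for illustration.

```typescript
// Hypothetical, simplified syntax node; the real extractor presumably walks a parser tree.
interface SyntaxNode {
  type: string;
  name?: string;
  children: SyntaxNode[];
}

// Field names follow the description above; their shapes are assumptions.
interface LanguageConfig {
  entityTypes: Set<string>;
  scopeBoundaryTypes: Set<string>;
  // suppression context (a node type) -> declaration types dropped inside it
  suppressedNestedEntities: Map<string, Set<string>>;
}

function visitNode(
  node: SyntaxNode,
  config: LanguageConfig,
  suppressionContext: string | null,
  out: string[],
): void {
  const suppressed =
    suppressionContext !== null &&
    (config.suppressedNestedEntities.get(suppressionContext)?.has(node.type) ?? false);

  if (config.entityTypes.has(node.type) && !suppressed) {
    out.push(`${node.type}:${node.name ?? "<anonymous>"}`);
  }

  // A boundary changes the context for descendants; the subtree is still walked,
  // so inner class and function declarations remain reachable.
  const opensContext =
    config.scopeBoundaryTypes.has(node.type) ||
    config.suppressedNestedEntities.has(node.type);
  const childContext = opensContext ? node.type : suppressionContext;

  for (const child of node.children) {
    visitNode(child, config, childContext, out);
  }
}

function extract(root: SyntaxNode, config: LanguageConfig): string[] {
  const out: string[] = [];
  visitNode(root, config, null, out);
  return out;
}

// Assumed JS/TS configuration for the sketch.
const locals = new Set(["lexical_declaration", "variable_declaration"]);
const jsConfig: LanguageConfig = {
  entityTypes: new Set([
    "lexical_declaration",
    "variable_declaration",
    "function_declaration",
    "class_declaration",
  ]),
  scopeBoundaryTypes: new Set(["arrow_function", "function_expression", "generator_function"]),
  suppressedNestedEntities: new Map([
    ["arrow_function", locals],
    ["function_expression", locals],
    ["generator_function", locals],
    ["function_declaration", locals],
  ]),
};

const leaf = (type: string, name: string, children: SyntaxNode[] = []): SyntaxNode => ({
  type,
  name,
  children,
});

// Models: const makeCounter = () => { let count; function increment() { const next; } class Snapshot {} }
const sampleTree: SyntaxNode = {
  type: "program",
  children: [
    leaf("lexical_declaration", "makeCounter", [
      {
        type: "arrow_function",
        children: [
          leaf("lexical_declaration", "count"),
          leaf("function_declaration", "increment", [leaf("lexical_declaration", "next")]),
          leaf("class_declaration", "Snapshot"),
        ],
      },
    ]),
  ],
};

const entities = extract(sampleTree, jsConfig);
```

In this model `entities` contains `makeCounter`, `increment`, and `Snapshot`, while `count` and `next` are filtered out. Passing the context down instead of pruning the subtree is what keeps the inner declarations visible.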
