Context
Follow-up to the out-of-scope diagnostic cleanup spec (2026-04-22). That spec intentionally keeps the undefined-variable rewrite path narrow: a variable-kind reference only matches an out-of-scope `variable`, never a `scalar`, `matrix`, or `program`.
This issue tracks an optional future enhancement: make the analyzer / out-of-scope matcher scalar-aware.
The idea
In real Stata code, a bare identifier in an expression may be acting as a scalar, not a dataset variable — for example `display x + 1` where `x` was defined via `scalar x = 42`. The current analyzer only emits `UNDEFINED_VARIABLE` for varlist positions, so expression-position bare identifiers are not diagnosed today. A scalar-aware mode could:
- Detect bare identifiers in expression positions with enough context that they are clearly scalar-like (e.g., `display`, `assert`, `di`, RHS of `generate`/`replace`).
- Treat such references as scalar references for the purpose of undefined-symbol and out-of-scope matching — closer in spirit to global-macro analysis than to variable analysis.
- Ship behind a new user-facing setting (e.g., `sight.diagnostics.scalarAwareness: 'off' | 'strict'`) so users can opt into the stricter signal.
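As a rough illustration of the first bullet, the sketch below pulls bare identifiers out of the argument of an expression-taking command. Everything in it is an assumption made for illustration: the `EXPRESSION_COMMANDS` set, the `scalar_like_identifiers` name, and the regex-based approach are hypothetical and not part of the current analyzer. It also ignores `display` directives (`as text`, `_newline`), `if`/`in` qualifiers, and the RHS of `generate`/`replace`, all of which a real implementation would have to handle.

```python
import re

# Hypothetical: commands whose argument is an expression, not a varlist.
EXPRESSION_COMMANDS = {"display", "di", "assert"}

# An identifier that is not part of a number (1e5), a longer name,
# or a macro expansion (`x', $x, ${x}).
IDENTIFIER = re.compile(r"(?<![A-Za-z0-9_.`${])[A-Za-z_][A-Za-z0-9_]*")
STRING_LITERAL = re.compile(r'"[^"]*"')


def scalar_like_identifiers(line: str) -> list[str]:
    """Return bare identifiers found in expression position on one command line."""
    parts = line.strip().split(None, 1)
    if len(parts) < 2 or parts[0] not in EXPRESSION_COMMANDS:
        return []
    # Blank out double-quoted strings so their contents are not scanned.
    expression = STRING_LITERAL.sub(" ", parts[1])
    names = []
    for match in IDENTIFIER.finditer(expression):
        # A name directly followed by "(" is a function call, not a reference.
        if expression[match.end():].lstrip().startswith("("):
            continue
        names.append(match.group())
    return names
```

Under these assumptions, `display x + 1` yields `x` as a candidate scalar reference, while `summarize x` yields nothing because `summarize` takes a varlist.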
Why deferred
- Requires new analyzer work (identifying scalar-like positions) that is out of scope for the current cleanup.
- The scope-resolver already carries scalar out-of-scope entries; lighting up the matcher for them would produce false positives today because the ref-kind classifier would have no way to distinguish variable-kind from scalar-kind references.
- Bare-identifier disambiguation has corner cases. For example, a user could shadow a scalar with an identically named variable (bad practice but legal), in which case Stata resolves the bare name to the variable unless the `scalar()` pseudofunction is used. We do not need to resolve every pathological case before shipping a useful strictness option.
Scope when revisited
- Analyzer: detect scalar-like reference positions and emit a new diagnostic code (e.g., `UNDEFINED_SCALAR`) or reuse the existing code with a side-band kind.
- Provider: extend `classify_reference_kind` to return `'scalar'`; extend `out_of_scope_type_matches_reference` to allow `'scalar'` references to match `'scalar'` out-of-scope entries. No change to variable-kind matching.
- Config: add the new severity / opt-in setting with a conservative default.
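The provider change could look roughly like the sketch below. Only the two function names and the `'off' | 'strict'` values come from this issue. The parameters, the type aliases, and the idea of passing an `in_expression_position` flag are assumptions, since the real signatures are not shown here.

```python
from typing import Literal

# Hypothetical type aliases; the real provider may model these differently.
ReferenceKind = Literal["variable", "scalar"]
OutOfScopeType = Literal["variable", "scalar", "matrix", "program"]
ScalarAwareness = Literal["off", "strict"]


def classify_reference_kind(
    in_expression_position: bool,
    scalar_awareness: ScalarAwareness = "off",
) -> ReferenceKind:
    """Classify a bare identifier; scalar-kind only when the user opted in."""
    if scalar_awareness == "strict" and in_expression_position:
        return "scalar"
    # Conservative default: behave exactly as today.
    return "variable"


def out_of_scope_type_matches_reference(
    entry_type: OutOfScopeType,
    reference_kind: ReferenceKind,
) -> bool:
    """A reference matches only an out-of-scope entry of the same kind.

    Variable-kind references still match `variable` entries only, and
    `matrix` / `program` entries never match either kind.
    """
    return entry_type == reference_kind
```

With the setting at `'off'`, every reference stays variable-kind, so the scalar entries already carried by the scope-resolver remain inert and today's behavior is unchanged.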
Related