Skip to content

fix(evaluator): Cross-sheet ref to formula cell evaluates correctly (#161)#162

Merged
arcaputo3 merged 2 commits intomainfrom
fix/gh-161-cross-sheet-formula-eval
Jan 21, 2026
Merged

fix(evaluator): Cross-sheet ref to formula cell evaluates correctly (#161)#162
arcaputo3 merged 2 commits intomainfrom
fix/gh-161-cross-sheet-formula-eval

Conversation

@arcaputo3
Copy link
Copy Markdown
Contributor

Summary

Problem

When a formula referenced a cell in another sheet that contained a formula without a cached value, the evaluator returned 0 instead of recursively evaluating the formula chain.

# Before (broken)
xl eval "=Data!A3"  # where A3 = "=A1+A2" with no cache → 0

# After (fixed)
xl eval "=Data!A3"  # where A3 = "=A1+A2" → correctly evaluates to A1+A2

Root Cause

TExprDecoders.decodeResolvedValue returned CellValue.Number(0) when a Formula cell had cached = None. The fix detects this case in the SheetRef evaluation and recursively evaluates the formula.

Test plan

  • 4 new tests in CrossSheetFormulaSpec pass
  • All evaluator tests pass
  • E2E verification with CLI
xl new /tmp/test.xlsx
xl -f /tmp/test.xlsx -o /tmp/test.xlsx add-sheet Data
xl -f /tmp/test.xlsx -s Data -o /tmp/test.xlsx put A1 100
xl -f /tmp/test.xlsx -s Data -o /tmp/test.xlsx put A2 200
xl -f /tmp/test.xlsx -s Data -o /tmp/test.xlsx putf A3 "=A1+A2"
xl -f /tmp/test.xlsx -s Sheet1 -o /tmp/test.xlsx putf B1 "=Data!A3"
xl -f /tmp/test.xlsx -s Sheet1 view B1 --eval
# Output: 300 ✅

Fixes #161

🤖 Generated with Claude Code

…161)

When a formula references a cell in another sheet that contains a formula
without a cached value, the evaluator now recursively parses and evaluates
that formula instead of returning 0.

Before: `=Sheet1!A3` where A3=`=A1+A2` → 0
After:  `=Sheet1!A3` where A3=`=A1+A2` → correctly evaluates A1+A2

Changes:
- Add `evalCrossSheetFormula` helper in Evaluator companion object
- Modify SheetRef case to detect Formula cells with `cached = None`
- Add 4 test cases to CrossSheetFormulaSpec

Fixes #161

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Jan 21, 2026

Code Review - PR #162

Summary

This PR fixes a critical bug (GH-161) where cross-sheet references to formula cells without cached values incorrectly returned 0 instead of recursively evaluating the formula. The fix adds a new helper method evalCrossSheetFormula and updates the SheetRef evaluation logic to detect and handle uncached formulas.

✅ Strengths

  1. Correct Root Cause Analysis: The fix correctly identifies that decodeResolvedValue returns Number(0) for uncached formulas (line 104 in TExprDecoders.scala), and intercepts this case before the decoder is called.

  2. Well-Documented: Excellent inline comments explaining the fix(evaluator): Cross-sheet reference to formula cell returns 0 instead of evaluating #161 fix at both the helper method definition and usage site. The comments clearly explain why this special handling is needed.

  3. Comprehensive Test Coverage: Four new test cases covering:

    • Basic uncached formula evaluation
    • Nested formula chains
    • Cross-sheet formulas with SUM function
    • Cross-sheet formulas with arithmetic operations

    All tests follow the existing CrossSheetFormulaSpec patterns and use the helper functions (sheetWith, workbookWith).

  4. Pure Functional Design: The fix maintains the codebase's purity commitments - no side effects, total functions, and proper error handling via Either.

  5. Recursion Safety: The fix correctly passes the workbook context through to handle nested cross-sheet references in the evaluated formula.

⚠️ Issues & Concerns

1. Type Safety Violation (Critical)

Location: Evaluator.scala:230

Problem: evalCrossSheetFormula returns Either[EvalError, CellValue], but the SheetRef case needs Either[EvalError, A] where A is the GADT-encoded type parameter. The asInstanceOf cast is unsafe and could cause runtime ClassCastException.

Example: If SheetRef expects a BigDecimal (via decodeNumeric), but the formula evaluates to CellValue.Text("foo"), the cast will succeed at runtime but violate type safety guarantees.

Recommended Fix: Change evalCrossSheetFormula to accept and use the decode function passed in the SheetRef. This removes the unsafe cast and properly applies the decoder expected by the caller. The result should be wrapped in a temporary Cell and passed through the decoder.

2. Potential Infinite Recursion (High Risk)

The fix does not guard against circular reference chains across sheets:

Sheet1!A1 = Sheet2!B1
Sheet2!B1 = Sheet1!A1

Current Behavior: Stack overflow (unbounded recursion)

Note: The PR description mentions "Handle circular reference detection across this recursion," but the code does not implement this. The existing DependencyGraph cycle detection only works for same-sheet formulas during batch evaluation.

Recommended Fix: Add a recursion depth limit (e.g., 100) or track visited cells in a Set parameter to detect cycles early.

3. Incomplete Pattern Match in Type Conversion

Lines 156-164 have a catch-all case other => CellValue.Text(other.toString) which could hide bugs. The GADT should guarantee that result is one of the expected types, but this catch-all makes debugging harder.

Recommendation: Either make it exhaustive or add a comment explaining what types might hit the catch-all case.

4. Minor: Missing Test for Error Propagation

The new tests do not cover error cases like:

  • Formula with parse errors
  • Nested cross-sheet reference to non-existent sheet
  • Circular reference scenarios

Recommendation: Add at least one negative test case.

5. WartRemover Suppression

Line 174 already has @SuppressWarnings(Array("org.wartremover.warts.AsInstanceOf")), but the new asInstanceOf at line 230 relies on this method-level suppression. This is acceptable for GADT handling but makes the unsafe cast at line 230 less visible.

Recommendation: Add a comment at line 230 explaining why the cast is sound (or fix it per recommendation #1).

🔒 Security Concerns

No security issues identified. The fix maintains input validation and error handling.

⚡ Performance Considerations

  1. Parse Overhead: Each uncached cross-sheet formula reference triggers a full parse + evaluation. For workbooks with many uncached formulas, this could be slower than reading cached values.

    Note: This is unavoidable and acceptable - correctness trumps performance. Users can cache values by opening/saving in Excel if performance becomes an issue.

  2. No Memoization: If multiple formulas reference the same uncached formula cell, it will be re-evaluated each time.

    Status: Acceptable for v0.5.1, but consider memoization in future work.

📋 Checklist

🎯 Recommendation

Request Changes - The PR solves the core bug but has a critical type safety issue and lacks circular reference protection. The asInstanceOf cast should be replaced with proper decoder usage, and at minimum a recursion depth limit should be added.

📝 Suggested Changes

  1. Fix type safety: Remove asInstanceOf by adding decode parameter to evalCrossSheetFormula
  2. Add cycle protection: Implement recursion depth limit or visited cell tracking
  3. Add negative test: Test formula with parse error or circular reference
  4. Documentation: Update STATUS.md to note the fix(evaluator): Cross-sheet reference to formula cell returns 0 instead of evaluating #161 fix in the v0.5.1 section

👍 Overall Assessment

Strong diagnostic work and a well-structured solution. The implementation correctly identifies the root cause and addresses it in the right place. With the type safety fix and cycle protection, this will be an excellent addition to the codebase.

Great work on the comprehensive test coverage and clear documentation! 🚀

…e protection

- Fix type safety violation: Replace unsafe `asInstanceOf[Either[EvalError, A]]`
  with proper decoder application via `flatMap` + `Cell(at, evaluatedValue)`
- Add recursion depth limit (max 100) to prevent stack overflow on circular
  cross-sheet references
- Thread depth through evaluation chain via `EvaluatorWithDepth` class
- Add negative test cases for circular reference detection and parse errors

Addresses code review feedback on PR #162.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Jan 21, 2026

PR Review: Cross-Sheet Formula Evaluation Fix

Summary

This PR successfully fixes GH-161 by implementing recursive evaluation for cross-sheet references to formula cells without cached values. The implementation is solid, well-tested, and follows XL's purity/totality principles.

✅ Strengths

1. Correct Problem Identification

The root cause analysis is spot-on: when formulas are created via CLI (without Excel opening them), there are no cached values. The fix properly detects CellValue.Formula(formulaStr, None) and recursively evaluates.

2. Robust Cycle Detection

  • MaxCrossSheetRecursionDepth = 100 prevents stack overflow on circular references
  • EvaluatorWithDepth class elegantly tracks depth through recursive calls
  • Test coverage includes circular reference detection (lines 672-695 in CrossSheetFormulaSpec)

3. Type Safety

The decoder is correctly applied to the evaluated result (line 250-251):

val resultCell = Cell(at, evaluatedValue)
decode(resultCell).left.map(...)

This preserves type safety through the GADT system.

4. Excellent Test Coverage

6 new tests covering:

  • Basic formula evaluation without cache
  • Nested formula chains
  • SUM/arithmetic operations
  • Circular reference detection
  • Parse error handling

The E2E CLI verification in the PR description is thorough.

5. Follows XL Philosophy

  • ✅ Pure/total: Either for errors, no exceptions
  • ✅ Deterministic: recursion depth limit ensures termination
  • ✅ Law-governed: maintains eval semantics
  • ✅ Strong typing: GADT preservation through decoder

🔍 Observations & Minor Suggestions

1. Type Conversion Match Completeness (Evaluator.scala:171-179)

The type conversion logic handles most cases well, but consider:

case other => CellValue.Text(other.toString)

Question: Can eval return types beyond those listed? The GADT should constrain this, but if other case is unreachable, could add a comment or ??? with explanation.

Recommendation: Add comment clarifying exhaustiveness:

case other => 
  // Should be unreachable - GADT constrains eval results to above types
  CellValue.Text(other.toString)

2. Performance: Evaluator Instance Creation (line 169)

Each recursive call creates new EvaluatorWithDepth(depth + 1). For deep formula chains, this could allocate many short-lived objects.

Observation: Unlikely to be a bottleneck given:

  • MaxCrossSheetRecursionDepth = 100 caps allocations
  • Cross-sheet formula chains are typically shallow (2-5 levels)
  • Evaluators are lightweight (no heavy state)

Recommendation: No change needed now, but if profiling shows this as hot path, consider passing depth as parameter instead of creating new instances.

3. Error Message Clarity (line 154-155)

Current:

"Cross-sheet formula recursion depth exceeded (max: $MaxCrossSheetRecursionDepth). Possible circular reference."

Suggestion: Include the formula chain for debugging:

s"Cross-sheet formula recursion depth exceeded (max: $MaxCrossSheetRecursionDepth) evaluating '$formulaStr'. Possible circular reference."

This helps users identify which formula triggered the cycle.

4. Test Case: Depth = 100 (Missing)

Tests cover circular references but not the exact boundary condition.

Recommendation: Add test for depth-100 chain to verify MaxCrossSheetRecursionDepth triggers correctly:

test("GH-161: depth limit triggers at 100 levels") {
  // Create 101-level chain: Sheet1!A1 -> Sheet2!A1 -> Sheet3!A1 -> ...
  // Verify error message contains "recursion depth exceeded"
}

5. Documentation: private[formula] Visibility

evalCrossSheetFormula is marked private[formula] for testing. Consider adding a comment explaining why this is exposed beyond package-private (likely for test access).


🛡️ Security & Correctness

✅ No Security Concerns

  • No unvalidated external input
  • Recursion depth bounded
  • No resource leaks

✅ Correctness Verified

  • Handles all CellValue types correctly
  • Preserves decoder semantics
  • Error propagation is sound

📊 Performance

Expected Impact: Negligible to positive

  • Before: Returned 0 immediately (incorrect)
  • After: Recursively evaluates (correct, typical 1-3 levels deep)
  • Overhead: Parser + evaluator instantiation per level (~microseconds)

For 99% of real-world formulas, this adds <1ms latency. Worth it for correctness.


🧪 Test Coverage Assessment

Current: 6 new tests + existing 767 tests
Coverage:

  • ✅ Happy path (basic, nested, SUM, arithmetic)
  • ✅ Error cases (circular ref, parse error)
  • ⚠️ Missing: exact depth-100 boundary test (minor)

Verdict: Test coverage is excellent for a bug fix PR.


🎯 Final Verdict

LGTM with minor suggestions

This PR is production-ready. The implementation is:

  • Correct and complete
  • Well-tested
  • Performance-neutral
  • Maintains XL's architectural principles

The minor suggestions above are optional improvements, not blockers.


🚀 Merge Recommendation

Approve and merge after considering:

  1. Whether to add depth-100 boundary test (nice-to-have)
  2. Whether to enhance error message with formula context (minor UX improvement)

Neither is critical for merge. Excellent work! 🎉


Review conducted with reference to: docs/design/style-guide.md, docs/design/purity-charter.md, CLAUDE.md

@arcaputo3 arcaputo3 merged commit b000758 into main Jan 21, 2026
2 checks passed
@arcaputo3 arcaputo3 deleted the fix/gh-161-cross-sheet-formula-eval branch January 21, 2026 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(evaluator): Cross-sheet reference to formula cell returns 0 instead of evaluating

1 participant