Skip to content

Python: skip Any_to_bool for bool-typed assert conditions (#1102)#1119

Open
julesmt wants to merge 2 commits into
strata-org:mainfrom
julesmt:fix/isinstance-assert-1102
Open

Python: skip Any_to_bool for bool-typed assert conditions (#1102)#1119
julesmt wants to merge 2 commits into
strata-org:mainfrom
julesmt:fix/isinstance-assert-1102

Conversation

@julesmt
Copy link
Copy Markdown
Member

@julesmt julesmt commented May 5, 2026

When an assert condition translates to a .Hole (e.g. a call to an unmodelled function like isinstance), the assert translator hoists it into a fresh bool-typed variable. Unconditionally wrapping that variable in Any_to_bool is ill-typed since Any_to_bool : Any -> bool, producing:

Impossible to unify (arrow Any bool) with (arrow bool $__ty…)

Check the invariant directly: if the translated condition is already a reference to a bool-typed local, skip the Any_to_bool coercion. This handles the Hole-hoisting path and any future path producing a bool-typed identifier uniformly.

Adds regression fixtures under StrataTestExtra/Languages/Python/Issue1102/ and a Lean test harness that asserts the translation pipeline produces no type-check or Laurel-to-Core crash on the minimal fuzz reproductions from issue #1102 (seed 1777894483, fuzz_semantic_0004).

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Copy Markdown
Contributor

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix is in the right place and the diagnosis is correct. The hoisted variable is declared TBool, so wrapping it in Any_to_bool : Any → bool produces the (arrow Any bool) vs (arrow bool $__ty…) unification error from #1102. Skipping the coercion when the condition is already a bool-typed local is a minimal, surgical fix. My concerns are about how the check is structured (fragile), the scope of the .Var (.Local …) predicate (narrow), and the strength of the new regression tests (thin).

1. The check is done via a runtime variableTypes.find? + == PyLauType.Bool string comparison on a name the same block just created. The Hole branch at lines 1745–1750 literally writes TBool plus (freshVar, "bool") into the context, so the code already knows at that point that the hoisted variable is bool-typed. Re-deriving that knowledge a few lines later via a linear lookup and a string equality is unnecessary work, and it makes the check vulnerable to drift — if PyLauType.Bool is ever renamed, or if the context insertion ever uses a different casing/constant, the lookup silently fails and we're back to ill-typed Any_to_bool(bool). Inline comment suggests returning an alreadyBool : Bool flag from the pattern match directly.

2. The check's pattern is narrower than the invariant it's enforcing. What we really mean is "don't apply Any_to_bool to something whose type is already bool." The check only catches .Var (.Local name) where name's recorded type is bool. That happens to be exactly what the Hole-hoisting path produces, so the current callers are all covered. But:

  • Any future translation path that produces a bool-typed non-.Var expression — a comparison primitive, a bool-valued .StaticCall, an .IfThenElse over bools, etc. — would silently hit Any_to_bool(bool) again.
  • .Var (.Field self bool_field) for a bool-typed self field also wouldn't match, though that's fixable.

Not a blocker — the check is correct for today's translator. But a short doc comment on the predicate (e.g. -- Today only the Hole-hoisting branch above produces this shape; if a future translation path produces a bool-typed non-Var, extend the predicate) documents the assumption.

3. The same Any_to_bool condExpr wrapping exists for .If (line 1701) and .While (line 1715). Neither currently does Hole-hoisting (PR #608's 77318d1f8 removed it from If/While but kept it for Assert), so they don't hit #1102's bug today. But if Hole-hoisting is ever reintroduced for If or While — or if a non-Hole translation path produces a bool — the same unification error will resurface. The fix here could be factored into a tiny helper so all three condition sites use the same logic:

private def coerceToBool (ctx : TranslationContext) (e : StmtExprMd) : StmtExprMd :=
  match e.val with
  | .Var (.Local name) =>
    match ctx.variableTypes.find? (fun (n, _) => n == name) with
    | some (_, ty) => if ty == PyLauType.Bool then e else Any_to_bool e
    | none => Any_to_bool e
  | _ => Any_to_bool e

All three of IfThenElse (Any_to_bool condRef), While (Any_to_bool condRef), and Assert { condition := Any_to_bool finalCondExpr } then become coerceToBool ctx condRef/finalCondExpr. Not blocking; follow-up-able.

4. The new test detects "no pipeline crash" via a string-prefix match on diagnostic messages. Lines 32–34 of Issue1102Test.lean:

private def isPipelineCrash (msg : String) : Bool :=
  msg.startsWith "❌ Type checking error" ||
  msg.startsWith "Laurel to Core translation failed"

Two problems with this:

  • Fragile. Rewording either of those diagnostics (something this repo does frequently — e.g. PR #1104) silently turns the test from "no crash" to "match nothing, pass vacuously", at which point a #1102 regression would land green. Matching on a structural property (diagnostic type / severity, or the presence of DiagnosticType.StrataBug) is more robust if the processing pipeline exposes it.
  • Passes vacuously on the happy path. The whole regression is a type-unification failure; the assertion itself is already unprovable (since isinstance is unmodelled). Nothing in the test distinguishes "fix worked, so translation ran through and verifier reported the assertion as unprovable" from "fix silently dropped the assertion". A stronger check would #guard_msgs on the actual verifier output: the expected line for assert isinstance(5, int) should report the assertion as "unprovable" (or unknown), not pass or vanish.

This repo has existing precedent for this stronger pattern in StrataTest/Languages/Python/expected_laurel/*.expected — one snapshot per fixture, matched via a test runner. Worth adding.

5. Coverage gaps in the test fixtures. Two fixtures, both of the form assert isinstance(v, T) at top level of main. Things that would materially strengthen coverage:

  • Non-Hole path. Add a fixture like assert x == 1 where x : Any (so the condExpr is a bool-typed comparison, not a Hole). Confirms the _ => … arm at line 1748 still wraps in Any_to_bool correctly. Today this case already works, but the fix introduces a conditional that could in principle mis-fire and silently drop the wrapper.
  • Negated / combined. assert not isinstance(x, int) and assert isinstance(x, int) and x > 0. Both exercise the Hole-hoisting path differently (the and routes through .BoolOp which wraps each argument; the not is a .UnaryOp).
  • Inside a conditional. if isinstance(x, int): assert x > 0 — exercises the .If translation's handling of Hole (which doesn't hoist, so Any_to_bool(Hole) is emitted directly). Confirms my observation above that If/While don't hit the bug today.
  • Property summary. assert isinstance(x, int), "x must be int" — the Hole-hoisting plus summary extraction happen in the same .Assert arm. The summary path isn't exercised by either current fixture.

Comment thread Strata/Languages/Python/PythonToLaurel.lean Outdated
Comment thread StrataTestExtra/Languages/Python/Issue1102Test.lean Outdated
…#1102)

When an assert condition translates to a .Hole (e.g. a call to an
unmodelled function like isinstance), the assert translator hoists it
into a fresh bool-typed variable. Unconditionally wrapping that
variable in Any_to_bool is ill-typed since Any_to_bool : Any -> bool,
producing:

  Impossible to unify (arrow Any bool) with (arrow bool $__ty…)

Check the invariant directly: if the translated condition is already a
reference to a bool-typed local, skip the Any_to_bool coercion. This
handles the Hole-hoisting path and any future path producing a
bool-typed identifier uniformly.

Adds regression fixtures under StrataTestExtra/Languages/Python/Issue1102/
and a Lean test harness that asserts the translation pipeline produces
no type-check or Laurel-to-Core crash on the minimal fuzz reproductions
from issue strata-org#1102 (seed 1777894483, fuzz_semantic_0004).
@julesmt julesmt force-pushed the fix/isinstance-assert-1102 branch from 55adf3c to 8a1c7b0 Compare May 5, 2026 17:37
@julesmt
Copy link
Copy Markdown
Member Author

julesmt commented May 5, 2026

@tautschnig I think I understood what you wanted, I modified the Test.lean, is it ok now?

Copy link
Copy Markdown
Contributor

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread on isBoolTypedVar refactor (PythonToLaurel.lean:1751–1762). ⚠️ Thread marked resolved without a code change. The variableTypes.find? + ty == PyLauType.Bool pattern is still present. The fix is correct as written, so this is non-blocking — but the design point I raised still stands: the predicate's soundness depends on (a) PyLauType.Bool's string constant not being renamed and (b) no pass between hoisting and the check ever filtering variableTypes. If either assumption breaks in the future, the predicate silently falls back to emitting Any_to_bool(bool) — the exact shape this PR is fixing. Extracting the alreadyBool : Bool flag directly from the pattern match (as in my earlier suggestion block) removes both fragilities. I accept the author's judgment to leave it and won't push further; flagging here so the trade-off is visible for future readers.

The limitations I flagged earlier still hold — chained-access style predicates, parallel .If / .While sites at lines 1701 / 1715 that don't hoist today but would hit the same bug class if they ever do — but they're out of scope for a #1102 fix.

Replaces the downstream variableTypes.find? + string comparison with a
Bool produced directly by the match arm that constructs the bool-typed
ref. This removes two fragilities flagged on strata-org#1102:

  - Renaming PyLauType.Bool no longer affects the Any_to_bool skip.
  - A future pass filtering/replacing variableTypes between hoisting
    and the check also can't silently regress back to Any_to_bool(bool).

Also swaps the string literal "bool" for PyLauType.Bool in the
variableTypes insertion for consistency.
@julesmt
Copy link
Copy Markdown
Member Author

julesmt commented May 6, 2026

Fail the CI, because it comes from a fork...

Copy link
Copy Markdown
Contributor

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A handful of smaller suggestions inline: one stylistic simplification of the fix, additional fixtures (summary-extraction path + a "should still coerce" case + xfail for the adjacent StrataBug), and a note on where a property test / small lemma would pin down the invariant this PR relies on.

Pre-existing, not introduced here — at PythonToLaurel.lean:1766 the assert branch calls getExceptionCheckPreamble ctx condExpr s!"$assert_exc_…" but discards the returned exprRef with _. Compare the .If / .While sites at :1700 / :1714, which thread the substituted ref into the enclosing Any_to_bool condRef. In the assert branch the assertion uses condExpr directly, so when the condition contains a user call, the preamble binds the call into $assert_exc_N but the assert then re-evaluates the call — double-eval. The Hole-hoisting path this PR touches sits adjacent to that logic. Worth fixing in the same shape as the alreadyBool refactor or filing a separate issue; the current test fixtures don't exercise it.

Refactor: the file-wide variableTypes : List (String × String) association list (Strata/Languages/Python/PythonToLaurel.lean:90) is scanned linearly at ~10 sites in this file (e.g. :934, :1195, :1386, :1552, :1682). Out of scope for this PR, but worth a follow-up issue — a Std.HashMap String PyLauType (or similar) would give O(1) lookup and a single canonical insertion API, and would have made the current fix unnecessary to begin with (since the "type of var" question would be cheap to ask).

Comment on lines +1744 to +1758
let (condStmts, finalCondExpr, condCtx, alreadyBool) :=
match condExpr.val with
| .Hole =>
let freshVar := s!"assert_cond_{test.toAst.ann.start.byteIdx}"
let varType := mkHighTypeMd .TBool
let varDecl := mkVarDeclInit freshVar varType condExpr
let varRef := mkStmtExprMd (StmtExpr.Var (.Local freshVar))
([varDecl], varRef, { ctx with variableTypes := ctx.variableTypes ++ [(freshVar, "bool")] })
| _ => ([], condExpr, ctx)

let assertStmt := mkStmtExprMdWithLoc (StmtExpr.Assert { condition := Any_to_bool finalCondExpr, summary }) md
([varDecl], varRef, { ctx with variableTypes := ctx.variableTypes ++ [(freshVar, PyLauType.Bool)] }, true)
| _ => ([], condExpr, ctx, false)

-- Skip Any_to_bool when the condition is already a bool-typed local;
-- wrapping it would produce ill-typed Any_to_bool(bool). See #1102.
let coercedCond :=
if alreadyBool then finalCondExpr else Any_to_bool finalCondExpr
let assertStmt := mkStmtExprMdWithLoc (StmtExpr.Assert { condition := coercedCond, summary }) md
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistic simplification: alreadyBool is used in exactly one if. You can push the Any_to_bool decision directly into the match arms and drop both the flag and the separate conditional. This is shorter, and the "bool-typed reference needs no coercion" invariant then lives at the site that establishes it rather than being re-asserted three lines later:

Suggested change
let (condStmts, finalCondExpr, condCtx, alreadyBool) :=
match condExpr.val with
| .Hole =>
let freshVar := s!"assert_cond_{test.toAst.ann.start.byteIdx}"
let varType := mkHighTypeMd .TBool
let varDecl := mkVarDeclInit freshVar varType condExpr
let varRef := mkStmtExprMd (StmtExpr.Var (.Local freshVar))
([varDecl], varRef, { ctx with variableTypes := ctx.variableTypes ++ [(freshVar, "bool")] })
| _ => ([], condExpr, ctx)
let assertStmt := mkStmtExprMdWithLoc (StmtExpr.Assert { condition := Any_to_bool finalCondExpr, summary }) md
([varDecl], varRef, { ctx with variableTypes := ctx.variableTypes ++ [(freshVar, PyLauType.Bool)] }, true)
| _ => ([], condExpr, ctx, false)
-- Skip Any_to_bool when the condition is already a bool-typed local;
-- wrapping it would produce ill-typed Any_to_bool(bool). See #1102.
let coercedCond :=
if alreadyBool then finalCondExpr else Any_to_bool finalCondExpr
let assertStmt := mkStmtExprMdWithLoc (StmtExpr.Assert { condition := coercedCond, summary }) md
-- Hoist unmodelled conditions (e.g. isinstance(...)) into a fresh
-- bool-typed variable, and pick the coerced-condition expression for
-- the assert up-front so the "already bool, don't re-wrap" invariant
-- lives at the point where it is established. Wrapping a bool-typed
-- ref in `Any_to_bool : Any → bool` produces ill-typed
-- `Any_to_bool(bool)` — see #1102.
let (condStmts, coercedCond, condCtx) :=
match condExpr.val with
| .Hole =>
let freshVar := s!"assert_cond_{test.toAst.ann.start.byteIdx}"
let varType := mkHighTypeMd .TBool
let varDecl := mkVarDeclInit freshVar varType condExpr
let varRef := mkStmtExprMd (StmtExpr.Var (.Local freshVar))
-- `varRef` is declared `TBool`; no `Any_to_bool` wrap needed.
([varDecl], varRef,
{ ctx with variableTypes := ctx.variableTypes ++ [(freshVar, PyLauType.Bool)] })
| _ =>
([], Any_to_bool condExpr, ctx)
let assertStmt := mkStmtExprMdWithLoc (StmtExpr.Assert { condition := coercedCond, summary }) md

Keeps the same behaviour — happy to leave as-is if you prefer the explicit flag for future extensibility (e.g. a second hoisting path that also produces a bool-typed ref), but in that case the comment should say so.

Comment on lines +52 to +75
-- `assert isinstance(5, int)` — the minimal fuzz reproduction. Before
-- the fix this raised a Laurel type error; after the fix the pipeline
-- completes and the verifier reports the assertion as unprovable
-- (expected, since `isinstance` is unmodelled).
/--
info: isinstance_int.py: Strata.DiagnosticType.UserError @ 2:4-29 :: assertion could not be proved
-/
#guard_msgs in
#eval withPython fun pythonCmd => snapshotDiags pythonCmd "isinstance_int.py"

-- `assert isinstance([1], list)` — the original fuzz_semantic_0004 shape.
/--
info: isinstance_list.py: Strata.DiagnosticType.UserError @ 2:4-32 :: assertion could not be proved
-/
#guard_msgs in
#eval withPython fun pythonCmd => snapshotDiags pythonCmd "isinstance_list.py"

-- `if isinstance(x, int): assert x > 0` — the non-hoisting .If path,
-- included to show the fix does not affect the path that already worked.
/--
info: isinstance_if.py: Strata.DiagnosticType.UserError @ 4:8-20 :: assertion could not be proved
-/
#guard_msgs in
#eval withPython fun pythonCmd => snapshotDiags pythonCmd "isinstance_if.py"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three fixtures cover the main Hole-hoisting paths well. A few more that would make this suite robust to adjacent refactors:

  1. assert isinstance(5, int), "must be int" — exercises both the Hole-hoisting path and the summary-extraction branch (line 1737's msg.val match) simultaneously. If someone ever reorders let summary := and the hoisting match, or introduces a dependency between them, this catches it. The summary "must be int" should show up in the verifier's property summary.

  2. "Should still coerce" case — the current suite only confirms the fix doesn't break working paths. Nothing confirms it doesn't over-apply. A fixture whose condition is a bool-typed local not introduced by Hole-hoisting — e.g. assert (x > 0) where x > 0 translates via the non-.Hole arm and is legitimately an Any-typed expression in the current pipeline — pins down that the _ => arm still emits Any_to_bool. This is cheap insurance against a future refactor that infers alreadyBool from variableTypes instead of from the match arm and then silently accepts false positives.

  3. assert isinstance(x, int) and x > 0 / assert not isinstance(x, int) — your header comment correctly notes these hit a separate StrataBug. Worth adding them as xfail-style fixtures here (asserting that the StrataBug is produced) so the day that separate bug gets fixed, this test file is the one that tells the fixer to come back and update the snapshot. A tracking issue number in the comment would also help.

Concrete proposal for (3): snapshot the current behaviour explicitly:

-- `assert isinstance(x, int) and x > 0` — tracks the separate
-- "block expression should have been lowered" StrataBug mentioned in
-- the header. When this fixture starts producing a UserError instead,
-- that separate bug has been fixed and this snapshot needs updating.
/--
info: isinstance_and.py: Strata.DiagnosticType.StrataBug @ … :: …
-/
#guard_msgs in
#eval withPython fun pythonCmd => snapshotDiags pythonCmd "isinstance_and.py"

(For this one, you'd also want to drop the "any StrataBug is a regression" guard in snapshotDiags, or parameterise it with an allowStrataBug : Bool knob, since by construction this fixture is producing one on purpose.)

Comment on lines +39 to +50
/-- Load a fixture and print each diagnostic on its own line, prefixed
with the fixture name. Prefer this over ad-hoc prose filters: it asserts
structurally on `type` and keeps the full message in the snapshot. -/
private def snapshotDiags (pythonCmd : System.FilePath) (pyFile : String) : IO Unit := do
let path := fixtureDir / pyFile
let source ← IO.FS.readFile path
let diags ← processPythonFile pythonCmd (stringInputContext pyFile source)
for d in diags do
IO.println s!"{pyFile}: {renderDiag d}"
-- Belt-and-braces structural check: any StrataBug is a regression.
if diags.any (fun d => d.type == Strata.DiagnosticType.StrataBug) then
throw <| .userError s!"{pyFile}: unexpected StrataBug diagnostic"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proof / property-coverage suggestion. The correctness of this fix depends on one invariant of translateStmt .Assert:

If the .Hole arm fires, the resulting coercedCond does not contain the subterm Any_to_bool (Var (Local freshVar)) where variableTypes[freshVar] = PyLauType.Bool.

This is straightforwardly checkable in Lean via pattern recognition on the returned StmtExprMd. Adding one #guard that inspects the translated AST directly (rather than routing through the verifier diagnostics) would turn the load-bearing invariant into a machine-checked property and wouldn't depend on the Python runtime being present (so it'd also run under plain lake build, not just lake test):

/-- Reject any `Any_to_bool` applied to a reference to a bool-typed local.
    Traverses the Laurel AST; conservatively over-approximates "bool-typed"
    by consulting the passed-in `variableTypes`. -/
def containsAnyToBoolOfBoolVar (vt : List (String × String)) : StmtExprMd → Bool := …

#guard
  let src := "def main():\n    assert isinstance(5, int)\n\nmain()\n"
  let ast ← … (run translator up to translateStmt on the Assert) …
  !containsAnyToBoolOfBoolVar ast.variableTypes ast.stmt

Alternatively — and I realise this is more involved — a small inductive lemma on the translator would nail it formally, roughly:

theorem translateAssert_no_Any_to_bool_of_bool
    (ctx : TranslationContext) (test msg : …)
    (h : someWellFormednessHypothesis ctx) :
    let (_, stmts) := translateStmt ctx (.Assert _ test msg)
    ∀ s ∈ stmts, ¬ containsAnyToBoolOfBoolVar ctx.variableTypes s

I don't expect either proof in this PR (the translator is partial and doesn't currently admit this style of reasoning without groundwork), but a minimal AST-inspection #guard is low-cost and would outlive any future message/diagnostic reword. Flagging as the testing direction most aligned with this PR's intent — diagnostic snapshots catch "it doesn't crash", AST inspection catches "the shape we expect is what we produce".

expression should have been lowered in a separate pass") that is
independent of #1102 and should be tracked in its own issue. -/

private meta def fixtureDir : System.FilePath :=
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header comment is exactly the kind of documentation reviewers-of-the-future will thank you for — it explicitly names the adjacent bug ("block expression should have been lowered in a separate pass") and which shapes are in / out of scope. One small add: a tracking issue number for that separate bug, if one exists, so the reader can look it up. If it doesn't exist yet, filing it now and linking it here keeps this test's scope honest (so someone doesn't later "fix" this file by adding the compound cases and being surprised they still fail).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants