testImportModule flakes in-suite with an empty-call-graph failure #693

Description

@khatchad

TestTensorflow2Model.testImportModule intermittently fails with "Function must exist in call graph" in ~0.1s when the full suite (or full class) runs. It passes in isolation, in pairs, and in most full-class runs, including a run with the identical method order that had just failed. Occurrences so far: once on a full-suite run of the #691 branch (2026-07-03) and once on a full-suite run of the #690 fix branch; an immediately following identical full-class run passed 945/0.

The sub-second timing indicates that the engine produced a call graph with the target function missing without doing any real analysis work. That is consistent with cross-test interference on shared JVM state rather than anything about the import fixture itself. Two leads:

  • The root POM sets the parallel property to both (surefire reads it via its user-property alias) with threadCount=1 and reuseForks=true, so suite runs execute JUnit 4 tests with intra-JVM concurrency; any mutable static in the front end or engine is then a race candidate.
  • A related deterministic poisoning was just documented in #692 ("Multi-file mode binds script-toplevel class declarations and reads to different globals"): a test that calls PythonCAstToIRTranslator.setSingleFileAnalysis(false) breaks 13 later tests across five classes even after restoring the flag, which shows that translation mutates static state that survives engine construction. The flake may be the concurrent flavor of the same substrate.
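
The poisoning pattern in the second lead can be illustrated with a minimal Java sketch. Every name in it (StaticPoisoningSketch, bindings, translate, existsInCallGraph, reset) is hypothetical and not taken from the project. The sketch only assumes that translation caches something in a static structure while the flag is flipped, which may or may not match the real mechanism.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticPoisoningSketch {
    // Hypothetical stand-in for the flag behind setSingleFileAnalysis(...).
    static boolean singleFileAnalysis = true;

    // Hypothetical static cache filled during translation; this is the
    // state that outlives the flag being restored.
    static final Map<String, String> bindings = new HashMap<>();

    /** "Translates" a function once and caches the name it was bound under. */
    static String translate(String function) {
        return bindings.computeIfAbsent(function,
                f -> (singleFileAnalysis ? "global/" : "module/") + f);
    }

    /** Mimics the failing lookup: the function must be bound under the single-file name. */
    static boolean existsInCallGraph(String function) {
        return translate(function).equals("global/" + function);
    }

    /** Clears all shared state, as a per-test reset would. */
    static void reset() {
        singleFileAnalysis = true;
        bindings.clear();
    }

    public static void main(String[] args) {
        // Earlier test: flips the flag, translates, then restores the flag.
        singleFileAnalysis = false;
        translate("f");
        singleFileAnalysis = true;

        // Later test: the flag looks clean, but the cached binding is stale.
        System.out.println(existsInCallGraph("f")); // prints "false"

        // Only clearing the derived state fixes the lookup.
        reset();
        System.out.println(existsInCallGraph("f")); // prints "true"
    }
}
```

Restoring the flag is not enough here because the damage lives in the derived cache, not in the flag. Under intra-JVM concurrency the same stale entry could be written by a test running on another thread, which would make the failure order-independent and intermittent.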

Next step when picked up: rerun the suite with -Dparallel=none a few times; if the flake disappears, bisect the shared static.
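
Before bisecting, it may help to enumerate the mutable statics on a suspect class so the candidate list is explicit. The following is a rough sketch using standard java.lang.reflect calls. FakeTranslator is a hypothetical stand-in rather than a project class, and the heuristic (a static counts as constant only when it is final and primitive or String) is an assumption that will over-report immutable final objects.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MutableStaticScan {
    /** Hypothetical class standing in for a front-end class with shared state. */
    static class FakeTranslator {
        static boolean singleFileAnalysis = true;                  // non-final static: candidate
        static final int VERSION = 1;                              // final primitive: skipped
        static final List<String> seenModules = new ArrayList<>(); // final reference, mutable contents: candidate
        int instanceField;                                         // not static: ignored
    }

    /** Returns the sorted names of static fields that could carry state between tests. */
    static List<String> raceCandidates(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (!Modifier.isStatic(mods) || field.isSynthetic()) {
                continue;
            }
            // Assumed heuristic: final primitives and final Strings cannot change.
            boolean constant = Modifier.isFinal(mods)
                    && (field.getType().isPrimitive() || field.getType() == String.class);
            if (!constant) {
                names.add(field.getName());
            }
        }
        // getDeclaredFields() makes no ordering guarantee, so sort for stable output.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(raceCandidates(FakeTranslator.class));
        // prints "[seenModules, singleFileAnalysis]"
    }
}
```

Running such a scan over the front-end and engine classes would give a list of fields to reset or guard one at a time while rerunning the suite.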
