Proof Scripts in JML by mattulbrich · Pull Request #3657 · KeYProject/key

mattulbrich · 2025-09-08T19:13:22Z

Related Issue

For quite some time, we have a prototype for JML proof scripts which we finally want to bring to the mater branch.

Intended Change

Proof scripts can be written into JML comments and can be obeyed during automatic verification.
The major benefit is that the localisation of the proof node to which a proof script should be applied is natural.
The other benefit is that one can use JML expressions and thus does not have to leave the realm of JML to do script-guided proofs.

This relies on #3587 and needs #3654 merged.

Plan

Type of pull request

New feature (non-breaking change which adds functionality)
There are changes to the (Java) code
There are changes to the taclet rule base (possibly small adaptations)

Ensuring quality

WIP:

I made sure that introduced/changed code is well documented (javadoc and inline comments). ✔️
I made sure that new/changed end-user features are well documented (https://github.com/KeYProject/key-docs). ✔️
I added new test case(s) for new functionality. ✔️
I have tested the feature as follows: ... examples and test cases
I have checked that runtime performance has not deteriorated. not for non-script proofs.

Additional information and contact(s)

Planned to be done Mid 09/25

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

wadoon · 2025-09-09T02:15:30Z

I have written an integer split command.

An induction command is on the way with support for int, user-data types, bsum(?).

Commit: 4644331

mattulbrich · 2025-09-27T06:20:08Z

The history of this PR was totally changed while I was working on it.
I will have to force-push my changes of the last few days.
In case it is still needed: e841235

wadoon · 2026-02-03T14:46:22Z

@WolframPfeifer Could you please revert the merge of main into this PR (and do a rebase instead, or just let to @mattulbrich for the sake of confusion avoidance)?

This was a merge-free PR, which results in an easier-to-read git history.

WolframPfeifer · 2026-02-06T11:36:47Z

@WolframPfeifer Could you please revert the merge of main into this PR (and do a rebase instead, or just let to @mattulbrich for the sake of confusion avoidance)?

This was a merge-free PR, which results in an easier-to-read git history.

Ok, I reverted the merge. I see the point about a clean history. However, I am not sure what a good workflow would be. Should we do a merge with main only at the end of the feature development, before the merge of the PR?

wadoon · 2026-02-23T12:22:12Z

35160      DEBUG Test worker     d.u.i.k.s.ProofScriptEngine Commands: macro  "autopilot-prep";
tryclose  ;
macro  "simp-upd";
rule  seqPermFromSwap;
rule  andRight;
auto  ;
instantiate with: (i_0) var: "iv" hide;
instantiate with: j_0 var: jv hide;
auto  ;
macro  "simp-upd";
rule  seqPermFromSwap;
rule  andRight;
auto  ;
instantiate with: (i_0) var: "iv" hide;
instantiate with: (to) var: "jv" hide;
auto  ; 

Error while executing script: Could not convert value 'j_0' from type 'ProofScriptExpressionContext' to type 'JTerm'

Command: instantiate with: j_0 var: jv hide;
Position: [file:/home/weigl/work/key/key.core/../key.ui/examples/heap/quicksort/split.key:30/2]

de.uka.ilkd.key.scripts.ScriptException: Error while executing script: Could not convert value 'j_0' from type 'ProofScriptExpressionContext' to type 'JTerm'

Command: instantiate with: j_0 var: jv hide;
Position: [file:/home/weigl/work/key/key.core/../key.ui/examples/heap/quicksort/split.key:30/2]

	at de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:162)
	at de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:80)
	at de.uka.ilkd.key.proof.runallproofs.ProveTest.autoMode(ProveTest.java:194)

5028      INFO  Test worker     d.u.i.k.p.r.ProveTest     (parallelGcd.proof|1412243253) Proof will be saved to: /home/weigl/work/key/key.core/../key.ui/examples/heap/verifyThis15_2_ParallelGcd/parallelGcd.proof.proof 
25028      INFO  Test worker     d.u.i.k.p.r.ProveTest     (parallelGcd.proof|1412243253) Start proving 
25028      INFO  Test worker     d.u.i.k.p.i.AbstractProblemLoader Loading environment from /home/weigl/work/key/key.core/../key.ui/examples/heap/verifyThis15_2_ParallelGcd/parallelGcd.proof 
25028      DEBUG Test worker     d.u.i.k.p.i.AbstractProblemLoader Environment load took 150.53ns 
25030      INFO  Test worker     d.u.i.k.p.i.AbstractProblemLoader Creating init config 
26664      DEBUG Test worker     d.u.i.k.p.i.AbstractProblemLoader Init config took 1.63s 
26895      DEBUG Test worker     d.u.i.k.p.i.IntermediateProofReplayer Proof replay took 194.78ms 
26895      INFO  Test worker     d.u.i.k.p.i.AbstractProblemLoader Replay result: Proof replayed successfully.Errors while reading the proof. Not all branches could be load successfully. 
26896      INFO  Test worker     d.u.i.k.p.r.ProveTest     (parallelGcd.proof|1412243253) Proving done 

Loading problem file failed
Expected :false
Actual   :true
<Click to see difference>

Macro 'script-auto' raised an exception: Error while executing script: Multiple or no formulas modified or added in last step, cannot identify sequent formula to skolemize.

Command: __obtain from_goal: true var: x ;
Position: [file:/home/weigl/work/key/key.core/../key.ui/examples/heap/BoyerMoore/src/BoyerMoore.java:74/18]

de.uka.ilkd.key.scripts.ScriptException: Macro 'script-auto' raised an exception: Error while executing script: Multiple or no formulas modified or added in last step, cannot identify sequent formula to skolemize.

Command: __obtain from_goal: true var: x ;
Position: [file:/home/weigl/work/key/key.core/../key.ui/examples/heap/BoyerMoore/src/BoyerMoore.java:74/18]

	at app//de.uka.ilkd.key.scripts.MacroCommand.execute(MacroCommand.java:117)
	at app//de.uka.ilkd.key.scripts.AbstractCommand.execute(AbstractCommand.java:79)
	at app//de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:129)
	at app//de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:80)
	at app//de.uka.ilkd.key.proof.runallproofs.ProveTest.autoMode(ProveTest.java:194)
	at app//de.uka.ilkd.key.proof.runallproofs.ProveTest.runKey(ProveTest.java:135)
	at app//de.uka.ilkd.key.proof.runallproofs.ProveTest.assertProvability(ProveTest.java:63)
	at app//de.uka.ilkd.key.proof.runallproofs.gen._example_algos.testBM_bm(_example_algos.java:36)
	at java.base@25.0.2/java.lang.reflect.Method.invoke(Method.java:565)
	at java.base@25.0.2/java.util.ArrayList.forEach(ArrayList.java:1604)
	at java.base@25.0.2/java.util.ArrayList.forEach(ArrayList.java:1604)
Caused by: de.uka.ilkd.key.scripts.ScriptException: Error while executing script: Multiple or no formulas modified or added in last step, cannot identify sequent formula to skolemize.

Command: __obtain from_goal: true var: x ;
Position: [file:/home/weigl/work/key/key.core/../key.ui/examples/heap/BoyerMoore/src/BoyerMoore.java:74/18]

	at app//de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:162)
	at app//de.uka.ilkd.key.macros.ApplyScriptsMacro.applyTo(ApplyScriptsMacro.java:210)
	at app//de.uka.ilkd.key.macros.AbstractProofMacro.applyTo(AbstractProofMacro.java:72)
	at app//de.uka.ilkd.key.macros.SequentialProofMacro.applyTo(SequentialProofMacro.java:99)
	at app//de.uka.ilkd.key.macros.AbstractProofMacro.applyTo(AbstractProofMacro.java:72)
	at app//de.uka.ilkd.key.scripts.MacroCommand.execute(MacroCommand.java:113)
	... 10 more
Caused by: java.lang.IllegalStateException: Multiple or no formulas modified or added in last step, cannot identify sequent formula to skolemize.
	at de.uka.ilkd.key.scripts.ObtainCommand.identifySequentFormula(ObtainCommand.java:139)
	at de.uka.ilkd.key.scripts.ObtainCommand.executeFromGoal(ObtainCommand.java:91)
	at de.uka.ilkd.key.scripts.ObtainCommand.execute(ObtainCommand.java:77)
	at de.uka.ilkd.key.scripts.AbstractCommand.execute(AbstractCommand.java:79)
	at de.uka.ilkd.key.scripts.ProofScriptEngine.execute(ProofScriptEngine.java:129)
	... 15 more

mattulbrich · 2026-06-29T09:48:00Z

I have no idea why the Windows test cases fail. I even used my Windows PC at home to run the tests: And they succeed. ...

Thanks for investigating, Richard @unp1

Apparently (at least C's analysis) line endings on Windows. It suggests two fixes:

Who is C? Or is it Claude?

This branch is on a reduced github workflow. It seems that the parameters are parsed identically. (see details below). In the current version, all \r are filtered out anyways.

What definitely is different is the order in which the files are investigated. I added a sort routine to make this order deterministic.

However, for some reason there are now again 8 merge conflicts even though I merged only last week; that is why the tests do not run. I do not know if I will find the time to do this merge in the next days. :(

Grep results showing that parameters are identically parsed.

AutoOnly2 fails on Windows, not on ubuntu.

linux:2026-06-27T06:25:25.3047846Z     !!! Parameters for /home/runner/work/key/key/key.core/build/resources/test/de/uka/ilkd/key/scripts/jml/AutoOnly2.java: Parameters{shouldClose=false, method='null', exception='null', deleteTmpDir=true, settings=null}
windows:2026-06-27T06:26:33.9999167Z     !!! Parameters for D:\a\key\key\key.core\build\resources\test\de\uka\ilkd\key\scripts\jml\AutoOnly2.java: Parameters{shouldClose=false, method='null', exception='null', deleteTmpDir=true, settings=null}

# Conflicts: # key.core/src/main/java/de/uka/ilkd/key/java/loader/JP2KeYConverter.java # key.core/src/main/java/de/uka/ilkd/key/scripts/ProofScriptEngine.java # key.ncore/src/main/antlr/KeYCommonParser.g4

# Conflicts: # key.core/src/main/java/de/uka/ilkd/key/scripts/ProofScriptEngine.java

unp1 · 2026-06-29T10:15:10Z

I have no idea why the Windows test cases fail. I even used my Windows PC at home to run the tests: And they succeed. ...

Thanks for investigating, Richard @unp1

Apparently (at least C's analysis) line endings on Windows. It suggests two fixes:

Who is C? Or is it Claude?
yupp, at the moment I have still the license (not for long anzmore :-)

This branch is on a reduced github workflow. It seems that the parameters are parsed identically. (see details below). In the current version, all \r are filtered out anyways.

What definitely is different is the order in which the files are investigated. I added a sort routine to make this order deterministic.

However, for some reason there are now again 8 merge conflicts even though I merged only last week; that is why the tests do not run. I do not know if I will find the time to do this merge in the next days. :(

Grep results showing that parameters are identically parsed.

If you want I can try to get Claude doing the resolution and we both check afterwards that everything is correct?
Only problem is I don't have access to a Windows machine, so tests will run on CI.

mattulbrich · 2026-06-29T10:59:20Z

If C could do the merge and you review it, that would be splendid.
The history on this branch is a bit broken since I modified the github workflows for quicker testing. So history rewrite would be required here. I was able to partially merge intermediate masters into the branch, but it is rather painful.

My bet and hope is that c0a7ae8 will solve the problem. ..

mattulbrich · 2026-06-29T11:01:54Z

Only problem is I don't have access to a Windows machine, so tests will run on CI.

I do have a Windows machine: And it goes through there 🫢 . But not on the CI machine. Therefore, I assume the indeterministic file order may be the problem.

This of course raises another issue: Why does this order matter? At least on could investigate then. ... if that is the reason.

Resolve conflicts from the source-location move to key.ncore (KeYProject#3870): Position/Location now live in org.key_project.util.parsing. Updated the stale imports (JP2KeYConverter, KeyAst, ScriptCommandAst, JmlFacade, JMLTransformer, ApplyScriptsMacro) and moved the branch's Location.fromPositionInfo helper into key.core's JavaSourceLocations (it uses PositionInfo, so it cannot live in the now JavaParser-free ncore Location).

…ripts The character loop's `case '\r' -> { }` did nothing, so a carriage return fell through to the `cmdBuilder.append(...)` below and ended up inside the command -- e.g. in a quoted value that spans a CRLF line break. Script files are marked `text` in .gitattributes and are therefore checked out with CRLF on Windows (and depending on the CI runner's eol config), so scripts parsed there carried stray '\r's and could fail to match. Skip the carriage return with `continue`. Regression test (ScriptLineParser-level) added in ProofScriptParserTest; it fails without the fix and passes with it. Created with AI tooling

wadoon · 2026-06-29T16:19:48Z

I do have a Windows machine: And it goes through there 🫢 .

Windows != Windows. On Github, it is a Windows Server (2023 or 2025, I think.)
Behavior on these machines can slightly differ.

In the old days, e.g., JVM decided for -server mode on Windows Server resulting into different behavior on volatile fields.

… -- REVERT Logs the applied-rule sequence + proof stats + line separator per case so the Windows-vs-ubuntu CI output can be diffed to find where automatic proof search diverges. To be reverted once the cause is found. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ect#3657) -- REVERT Prints per-goal whether a JML assert/script was found, the rendered command lines (with \r/\n visualized), and any exception thrown by pse.execute, to locate the Windows-only failure. To be reverted with the JmlScriptTest one. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

unp1 · 2026-06-29T19:20:17Z

I do have a Windows machine: And it goes through there 🫢 . But not on the CI machine. Therefore, I assume the indeterministic file order may be the problem.

This of course raises another issue: Why does this order matter? At least on could investigate then. ... if that is the reason.

The reason is actually quite interesting as a reminder of the subtleties between the platforms:

The problem was in SymbolicExecutionMacro:

ADMITTED_RULES = Arrays.asList(
               Streams.toString(SymbolicExecutionOnlyMacro.class
                       .getResourceAsStream("SymbolicExecutionOnlyMacro.admittedRules.txt"))
                       .split("\n"));
ADMITTED_RULE_SETS = Arrays.asList(
               Streams.toString(SymbolicExecutionOnlyMacro.class
                       .getResourceAsStream("SymbolicExecutionOnlyMacro.admittedRuleSets.txt"))
                       .split("\n"));

depending on the line endings "simplify\n" --> "simplify" and a later equals check to "simplify" when deciding whether to apply the taclet or not succeeded. While when the line endings in the txt file were
"simplify\r\n" --> "simplify\r" and the later equals check with "simplify" became false and no rules were applied.

The difference between the CI and your machine is most likely the git configuration that converts / does not convert the line endings in the files. Even so, not necessary, the fix for the deprecated ScriptLineParser is also in as additional hardening.

The merge and the commit that fixes the test are now on the branch.

SymbolicExecutionOnlyMacro read its admitted rule / rule-set lists from the bundled .txt resources and split them on "\n". On a Windows checkout these resources have CRLF line endings (.gitattributes marks them as text), so every entry kept a trailing '\r' ("assignment\r", "simplify_prog\r", ...). isAdmittedRule compares the actual rule name ("assignment") against those entries via List.contains, which then never matched. As a result no symbolic- execution rule was admitted, the macro applied nothing, ApplyScriptsMacro never reached a goal carrying a JmlAssert, and all goals fell through to the TryCloseMacro fallback -- making JmlScriptTest cases close (AutoOnly2) or fail to close (NestedAssert, ObtainFromGoal) only on Windows. Read the lists via String.lines() and strip each entry so the names match regardless of the checkout's line endings. Reproduced locally by converting the two resources to CRLF (3 cases fail), fixed by this change (all 7 pass). with AI tooling support

mattulbrich added JavaJMLParser JML (Semantics) P:HIGH Proof Scripts labels Sep 8, 2025

wadoon self-requested a review September 8, 2025 23:22

wadoon assigned mattulbrich Sep 8, 2025

wadoon added this to the v2.12.4 milestone Sep 8, 2025

wadoon reviewed Sep 9, 2025

View reviewed changes

Comment thread key.core/src/main/antlr4/JmlParser.g4

wadoon force-pushed the jmlScripts branch from 5cad3f1 to e841235 Compare September 26, 2025 16:26

mattulbrich force-pushed the jmlScripts branch 2 times, most recently from 96a6a98 to 2382655 Compare October 4, 2025 01:01

mattulbrich force-pushed the jmlScripts branch 4 times, most recently from 2066973 to ae8b141 Compare October 11, 2025 00:01

mattulbrich marked this pull request as ready for review October 11, 2025 00:11

mattulbrich added the Review Request Waiting for review label Oct 11, 2025

mattulbrich requested a review from WolframPfeifer October 11, 2025 00:11

mattulbrich force-pushed the jmlScripts branch from 07792f4 to 2bc3a36 Compare October 15, 2025 08:07

mattulbrich force-pushed the jmlScripts branch from d5d10bf to 46567da Compare October 24, 2025 13:22

WolframPfeifer modified the milestones: v2.13.0, v3.0.0 Jan 7, 2026

WolframPfeifer force-pushed the jmlScripts branch from ff1ce97 to ec5929d Compare February 6, 2026 11:26

wadoon force-pushed the jmlScripts branch from 16e79e7 to 0d62fa7 Compare April 9, 2026 13:40

KeYProject deleted a comment from codecov-commenter Apr 9, 2026

mattulbrich force-pushed the jmlScripts branch from ad8a384 to 442af76 Compare June 27, 2026 06:00

mattulbrich marked this pull request as draft June 27, 2026 06:01

TEMPORARY COMMIT TO INVESTIGATE TEST CASE FAILURE

e769768

mattulbrich force-pushed the jmlScripts branch from 442af76 to e769768 Compare June 27, 2026 06:07

mattulbrich added 3 commits June 27, 2026 08:24

TEMPORARY COMMIT TO INVESTIGATE TEST CASE FAILURE

4dd1303

TEMPORARY COMMIT TO INVESTIGATE TEST CASE FAILURE

c0a7ae8

Merge commit '7828992f4c97e7d893bb0bdbdd5f7981e6501799' into jmlScripts

4e988fb

mattulbrich added 3 commits June 29, 2026 11:56

Merge commit 'db471336507c9337d04879a4c13830afbdeac0fe' into jmlScripts

05fba5e

# Conflicts: # key.core/src/main/java/de/uka/ilkd/key/java/loader/JP2KeYConverter.java # key.core/src/main/java/de/uka/ilkd/key/scripts/ProofScriptEngine.java # key.ncore/src/main/antlr/KeYCommonParser.g4

Merge commit 'b83c89af98b5df800369ec523e4159d7f2172a97' into jmlScripts

13ec05b

# Conflicts: # key.core/src/main/java/de/uka/ilkd/key/scripts/ProofScriptEngine.java

Deprecating unused legacy script parser classes

f99d061

unp1 added 2 commits June 29, 2026 16:49

unp1 force-pushed the jmlScripts branch from 1fbd343 to 816e434 Compare June 29, 2026 15:30

Merge branch 'main' into jmlScripts

1763318

unp1 force-pushed the jmlScripts branch from 5f4c536 to db571a1 Compare June 29, 2026 19:22

unp1 force-pushed the jmlScripts branch from db571a1 to fcb7fa8 Compare June 29, 2026 19:29

restore Github workflows

7b88ca5

wadoon marked this pull request as ready for review June 29, 2026 21:55

wadoon added this pull request to the merge queue Jun 29, 2026

Merged via the queue into KeYProject:main with commit 993820e Jun 29, 2026
36 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Proof Scripts in JML#3657

Proof Scripts in JML#3657
wadoon merged 104 commits into
KeYProject:mainfrom
mattulbrich:jmlScripts

mattulbrich commented Sep 8, 2025 •

edited

Loading

Uh oh!

Uh oh!

wadoon commented Sep 9, 2025

Uh oh!

mattulbrich commented Sep 27, 2025

Uh oh!

wadoon commented Feb 3, 2026

Uh oh!

WolframPfeifer commented Feb 6, 2026

Uh oh!

wadoon commented Feb 23, 2026 •

edited

Loading

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

unp1 commented Jun 29, 2026 •

edited

Loading

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

wadoon commented Jun 29, 2026

Uh oh!

unp1 commented Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

mattulbrich commented Sep 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Intended Change

Plan

Type of pull request

Ensuring quality

Additional information and contact(s)

Uh oh!

Uh oh!

wadoon commented Sep 9, 2025

Uh oh!

mattulbrich commented Sep 27, 2025

Uh oh!

wadoon commented Feb 3, 2026

Uh oh!

WolframPfeifer commented Feb 6, 2026

Uh oh!

wadoon commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

unp1 commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

mattulbrich commented Jun 29, 2026

Uh oh!

wadoon commented Jun 29, 2026

Uh oh!

unp1 commented Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mattulbrich commented Sep 8, 2025 •

edited

Loading

wadoon commented Feb 23, 2026 •

edited

Loading

unp1 commented Jun 29, 2026 •

edited

Loading