Implement non-empty clause counting whi… by eladrion · Pull Request #129 · Workflomics/ape

eladrion · 2025-09-25T12:21:17Z

…ch fixes an empty clause error when having only one tool in the workflow

Pull Request Overview

Fixes #119, which arises because APE generates an empty clause that minisat does not accept and therefore does not count as a clause. As a result, the clause count computed by APE does not match the number of clauses recognised by minisat.

Related Issue

#119

Changes Introduced

Since (in the case of a workflow of length 1) the output of op 1 cannot be reused, connectedModules seemingly yields an empty list of CNF literals. Thus, we check whether the constraints are empty and only set a clause delimiter (0) if there are constraints.
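The guard described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual change in APE: the class `ClauseWriter` and the method `appendClause` are hypothetical names, and the literals are modelled as a plain list of integers.

```java
import java.util.List;

/** Hypothetical helper for illustration; not APE's actual code. */
final class ClauseWriter {

    /**
     * Appends one DIMACS clause for the given literals.
     * An empty literal list produces no output at all, so no bare "0"
     * (an empty clause) ends up in the encoding.
     */
    static void appendClause(StringBuilder cnf, List<Integer> literals) {
        if (literals.isEmpty()) {
            return; // nothing to constrain: skip the clause delimiter as well
        }
        for (int literal : literals) {
            cnf.append(literal).append(' ');
        }
        cnf.append("0\n"); // clause delimiter, written only for non-empty clauses
    }
}
```

With this guard, a workflow of length 1 (where there is nothing to connect) simply contributes no clause instead of a lone `0`.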

How Has This Been Tested?

Successfully tested locally with config_bug119.json

Checklist

I have referenced a related issue.
I have followed the project’s style guidelines.
My changes include tests, if applicable.
All tests pass locally.


eladrion · 2025-09-25T12:24:16Z

@vedran-kasalica I am currently not sure whether something comparable can happen in other situations, since both useModuleInput and useModuleOutput also append a 0 outside of any loop. If necessary, I can add the same check there.

eladrion · 2025-09-25T12:28:13Z

By the way, the (first 3) generated workflows

…mat and warn if empty clauses are found.

eladrion · 2025-09-25T13:21:08Z

@vedran-kasalica I have now added a clause counting function based on the DIMACS format that counts the non-empty clauses and warns if an empty one is found. With this function, minisat is more forgiving and ignores the empty clauses, since the correct count is provided.
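A minimal sketch of such a token-based count, assuming the encoding contains only integers and whitespace (no `p ...` preamble, no comment lines). The class and method names are hypothetical, and the body is a guess at the approach rather than the code in this PR.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/** Hypothetical sketch of a token-based clause counter; not the PR's code. */
final class TokenClauseCounter {

    /** Counts clauses that contain at least one literal before their terminating 0. */
    static int countNonEmptyClauses(InputStream cnfEncoding) {
        int nonEmpty = 0;
        int clauseIndex = 0;      // running number of all clauses, empty ones included
        int literalsInClause = 0; // literals seen since the last terminator
        // Note: closing the Scanner also closes the underlying stream.
        try (Scanner scanner = new Scanner(cnfEncoding, StandardCharsets.UTF_8.name())) {
            // hasNextInt() stops at the first token that is not an integer.
            while (scanner.hasNextInt()) {
                int token = scanner.nextInt();
                if (token != 0) {
                    literalsInClause++;
                    continue;
                }
                clauseIndex++;
                if (literalsInClause == 0) {
                    System.err.println("Warning: empty clause at position " + clauseIndex);
                } else {
                    nonEmpty++;
                }
                literalsInClause = 0;
            }
        }
        return nonEmpty;
    }
}
```

Literals that are not followed by a terminating `0` are not counted as a clause in this sketch.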

eladrion · 2025-09-25T13:25:54Z

Woop, still working on that one. Something is still wrong

…mat, warn if empty clauses are found and dump the respective clause number.

eladrion · 2025-09-25T14:18:00Z

@vedran-kasalica: behaviour stabilised and checks are passing. Skipped one line in the CNF file which made the count wrong. Ready for review

vedran-kasalica · 2025-09-26T10:16:35Z

src/main/java/nl/uu/cs/ape/utils/APEUtils.java

+     * @param cnfEncoding the CNF encoding
+     * @return the count of non-empty clauses
+     */
+    public static int countCNFClauses(InputStream cnfEncoding) {


Have you compared the performance of this method against countLines? I’m concerned that using hasNextInt may slow down solving when dealing with 1M+ clauses, since it relies on regex matching (e.g., nextInt). In countLines I used byte-wise comparison for efficiency. Could we reuse countLines and extend it for this case?

Just did that:

I tested the above JSON config with a maximum workflow length of 20. At workflow lengths 4 and 11 we get

for the countLines variant

Total problem setup time: 3.019 sec (1258226 clauses).
...
Total problem setup time: 9.667 sec (4030028 clauses).

and for the countCNFClauses variant

Total problem setup time: 4.999 sec (1258203 clauses).
...
Total problem setup time: 17.289 sec (4030004 clauses)

So it definitely is slower to use countCNFClauses.

Thanks for testing. Do you think it's doable to update countLines to fix the bug?

Let me take a short look...

Well, if I get that correctly, countLines actually does more or less nothing different (concerning complexity class), since it regexes for \n. In order to recognise in a stable way that a clause has ended, we need the pattern " " >> "0" >> space, where the trailing space may, but does not have to, be a newline (according to the DIMACS CNF spec on sat4J). To implement the desired behaviour in countLines, we would have to scan line by line. But a line could have n clause endings, with n being arbitrary. So the implementation would be more complex (but still doable).
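To make the pattern concrete, here is a rough byte-wise sketch that recognises a clause end wherever a standalone `0` token is followed by whitespace or end of input, independent of line breaks. It is an illustration under assumptions, not code from this PR: the names are hypothetical, and it assumes the input has no `p ...` preamble or comment lines.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical byte-wise clause counter; several clauses per line are allowed. */
final class ByteClauseCounter {

    static int countNonEmptyClauses(InputStream in) throws IOException {
        InputStream buffered = new BufferedInputStream(in);
        int nonEmpty = 0;
        int literals = 0;            // literals seen in the current clause
        int tokenLength = 0;         // bytes read for the current token
        boolean tokenIsZero = false; // true only while the token is exactly "0"
        int b;
        do {
            b = buffered.read();
            boolean separator = b == -1 || b == ' ' || b == '\t' || b == '\n' || b == '\r';
            if (!separator) {
                // A token is the terminator only if it is the single byte '0'.
                tokenIsZero = (tokenLength == 0 && b == '0');
                tokenLength++;
                continue;
            }
            if (tokenLength > 0) {
                if (tokenIsZero) {
                    if (literals > 0) {
                        nonEmpty++; // empty clauses are skipped, not counted
                    }
                    literals = 0;
                } else {
                    literals++;
                }
            }
            tokenLength = 0;
            tokenIsZero = false;
        } while (b != -1);
        return nonEmpty;
    }
}
```

The sketch does not validate that tokens are numeric; it only distinguishes the terminator from everything else.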

What we can do is assume that if a line ends with 0, it is a clause, and if its length is 2, it is an empty clause, which is not counted. I can implement that easily and test it. But I would then propose not to call this function countLines.

I will implement and test that.

Small correction:

Usually each clause is listed on a separate line, using spaces between each of the literals and the final zero.
Sometimes long clauses use multiple lines.

So we can assume that a clause always ends at the end of a line.
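Under that assumption, a line-based count could look roughly like the sketch below. The names are hypothetical and the details (trimming, treating a lone `0` as an empty clause unless continuation lines precede it, single spaces as separators) are assumptions for illustration, not the implementation in this PR.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical line-based clause counter; assumes every clause ends at a line end. */
final class LineClauseCounter {

    static int countNonEmptyClauses(InputStream in) throws IOException {
        int count = 0;
        boolean pendingLiterals = false; // earlier lines of a multi-line clause had content
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                boolean endsClause = trimmed.equals("0") || trimmed.endsWith(" 0");
                if (!endsClause) {
                    pendingLiterals = true; // continuation line of a long clause
                    continue;
                }
                if (trimmed.equals("0") && !pendingLiterals) {
                    System.err.println("Warning: empty clause found, not counted");
                } else {
                    count++;
                }
                pendingLiterals = false;
            }
        }
        return count;
    }
}
```

As with countLines, nothing here verifies that the tokens are numbers; the check is purely on how each line ends.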

Okay, the new variant has the following timings for depth 4 and 11:

Total problem setup time: 3.06 sec (1258226 clauses).
Total problem setup time: 9.072 sec (4030007 clauses).

This is comparable to countLines but there is no guarantee that the count result is really a count of clauses conforming to the format specification.

If we want to make sure that each clause really is a list of numbers, we have to introduce more checks, and we will then have more or less the performance of countCNFClauses.

What do you prefer, @vedran-kasalica?

…ways end with ` 0`, followed by newline.

…x check can be optionally chosen.

eladrion · 2025-09-29T10:57:28Z

Hi @vedran-kasalica, after thinking a bit about a "good" solution, I have the following proposal: we provide countCNFClauses in two variants. The one-parameter variant always checks the syntax and is thus slower. In the two-parameter variant, the caller decides via the second (boolean) parameter whether the syntax is checked. Internally, we use the non-checking variant, since we know that the syntax is correct as long as we did not do something wrong ourselves. How does that sound?

vedran-kasalica · 2025-09-29T12:07:24Z

Hi @eladrion, thanks for checking this. If the new 2-parameter countCNFClauses works well, I’d make it the default. I’m not sure overloading is the best fit here, since I’d normally expect the version with more parameters to be the specialised one, not the default. But this is a minor comment, and we can merge it as it is, so as not to drag the PR out further.

The following comment is more of a discussion-, rather than an action-point.
If we look at the 1-param version in detail, it should be faster to implement it as you mentioned before, by extending the existing 2-param countCNFClauses with a check: if len() == 2, log a warning, or if the first char is ? That is quite fast to compute, whereas parsing every int per clause (nextInt) takes time because it parses the whole row. In addition, it doesn't provide real syntax checking: we are not checking whether the ints are higher than the number of variables (that would be a syntax error), and nextInt reads ints, so it would skip syntax errors such as non-numeric characters. Those are rare and are caught by MiniSAT's error handling, so the check might not be needed anyway.

eladrion · 2025-09-29T12:20:06Z

Hi @vedran-kasalica, IIRC, countLines and also the new countCNFClauses are used only on CNF encodings without the p ... preamble. So the number of variables is not really fixed in that case. If we wanted to support scanning the SAT encodings, we would have to check the preamble, too. But you have a point that there are no constraints on the numbers themselves. The speed of countCNFClauses without the syntax check is practically the same as that of countLines. And you have a point that it is counterintuitive to use the two-parameter function in order to not check. I think this can be resolved quite easily, and the warning for empty clauses should also be quite easy to implement.

eladrion · 2025-09-29T12:30:15Z

I have now separated the functionality a bit to resolve the counter-intuitive part: we now have countCNFClauses and checkCNFClauses, both taking only the CNF encoding as parameter. I think we can leave it as is now.

vedran-kasalica · 2025-09-29T13:28:40Z

@eladrion thank you, looks good! Feel free to merge the PR. Should we make it part of the 2.5.3 release?

eladrion · 2025-09-29T13:36:52Z

Hi @vedran-kasalica, yes, we should include this in 2.5.3, since it resolves a functional flaw. I will merge.

[FIX] Omit trailing 0 for empty constraint in connected modules whi…

7d76a85

…ch fixes an empty clause error when having only one tool in the workflow

eladrion self-assigned this Sep 25, 2025

eladrion requested a review from vedran-kasalica September 25, 2025 12:22

[FEAT/FIX] Add functionality for counting clauses based on DIMACS for…

5b78f96

…mat and warn if empty clauses are found.

[FEAT/FIX] Add functionality for counting clauses based on DIMACS for…

9927b74

…mat, warn if empty clauses are found and dump the respective clause number.

eladrion changed the title ~~[FIX] Omit trailing 0 for empty constraint in connected modules whi…~~ Implement non-empty clause counting whi… Sep 25, 2025

vedran-kasalica reviewed Sep 26, 2025

eladrion added 2 commits September 26, 2025 13:17

Implement an alternative clause counting that assumes that clauses al…

9a82361

…ways end with ` 0`, followed by newline.

Implement a unified clause counting functionality where the CNF synta…

690e34f

…x check can be optionally chosen.

vedran-kasalica approved these changes Sep 29, 2025

Separate clause counting and clause checking.

0b86a93

eladrion merged commit 112b831 into main Sep 29, 2025
1 check passed

vedran-kasalica deleted the fix_119_empty_clause_creation branch September 29, 2025 14:07
