Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: terminate early on first counterexample with --early-exit #284

Closed
wants to merge 1 commit into from

Conversation

saurabhchalke
Copy link

Description

This pull request fixes the issue with the --early-exit flag not terminating Halmos immediately after finding the first counterexample, as reported in #243.

Changes

The following changes have been made to address the issue:

  1. Introduced an early_exit flag variable to keep track of whether an early exit is requested.

  2. In the future_callback function, when a counterexample is found and the --early-exit flag is set, the early_exit flag is set to True.

  3. The future_callback function now checks the early_exit flag at the beginning, and if it is True, it immediately returns without processing the result. This prevents any further counterexamples from being processed after an early exit is requested.

  4. The thread pool is shut down after all the submitted futures have completed or been canceled, ensuring a clean termination.

Issue

Closes #243

Testing

The updated code has been tested with the example contract provided in the issue description. Here's the updated output:

(.venv) 86fb4ce7caf3% python3 -m halmos --function test_manyCexes --early-exit
[⠊] Compiling...
[⠢] Compiling 28 files with 0.8.25
[⠔] Solc 0.8.25 finished in 2.15s
Compiler run successful!
Running 1 tests for test/Test60.t.sol:Test60
Counterexample:
p_x_uint256 = 0x8000000000000000000000000000000000000000000000000000000000000008 (57896044618658097711785492504343953926634992332820282019728792003956564819976)
[FAIL] test_manyCexes(uint256) (paths: 32, time: 0.08s, bounds: [])
Symbolic test result: 0 passed; 1 failed; time: 0.09s

As demonstrated, with the --early-exit flag set, Halmos now reports only the first counterexample found and terminates early.

Please review the changes and provide any feedback or suggestions. @karmacoma-eth @daejunpark

@saurabhchalke saurabhchalke changed the title fix: terminate early on first counterexample with --early-exit (#243) fix: terminate early on first counterexample with --early-exit May 5, 2024
@karmacoma-eth karmacoma-eth self-requested a review May 6, 2024 17:10
@karmacoma-eth
Copy link
Collaborator

thanks for taking a stab at this! I'm taking a look

@karmacoma-eth
Copy link
Collaborator

karmacoma-eth commented May 6, 2024

ok so I think the problem with this is that while it does improve the output, it does not actually exit early.

To demonstrate, we can make the assertion in manyCexes harder:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.15;

import "forge-std/Test.sol";
import {SymTest} from "halmos-cheatcodes/SymTest.sol";

contract Test60 is Test, SymTest {
    function test_manyCexes(uint256 x) external {
        uint256 i = 0;
        if (x & 1 == 0) ++i;
        if (x & 2 == 0) ++i;
        if (x & 4 == 0) ++i;
        if (x & 8 == 0) ++i;

        uint256 y = svm.createUint256("y");
        uint256 z = svm.createUint256("z");
        assertNotEq(y * z, 4096);  // can make this harder by changing the value here
    }
}

What happens is this:

  1. we spawn many solver threads (the hard work is in gen_model_from_sexpr)
  2. one thread finds an answer, its future_callback runs and sets counterexample_found = True and early_exit = True
  3. all the other threads keep running (⚠️ this is the issue)
  4. when they also find the counterexample, they don't print it because early_exit has been set to True

@saurabhchalke if you could help us figure out how to safely interrupt the other running threads as soon as a counterexample is found that would be amazing

@karmacoma-eth
Copy link
Collaborator

apparently we can't interrupt tasks that are running in ThreadPoolExecutor, but we can interrupt a solver context (solver.ctx.interrupt()), we could try that out

Copy link
Collaborator

@karmacoma-eth karmacoma-eth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comments, let's figure out how to solve the underlying problem before merging 🙏

@saurabhchalke
Copy link
Author

saurabhchalke commented May 8, 2024

Thank you @karmacoma-eth for the guidance and valuable feedback! Your suggestions have been really helpful in identifying the underlying issue and finding the right approach to solve it.

I have updated the code based on your recommendations. Here's a summary of the changes:

  • Introduced a global list solver_contexts to store the solver contexts created in the solve function.
  • Modified the solve function to append the solver context to the global list before returning.
  • Stored the list of futures in a variable before submitting them to the thread pool.
  • In the future_callback function, when args.early_exit is true and a counterexample is found:
    • All the solver contexts in the global list are interrupted using ctx.interrupt().
    • The remaining futures are canceled using future.cancel() to stop unnecessary computations.
    • The global lists are cleared to prevent memory leaks.
  • Added exception handling to the interruption process to avoid segmentation faults.

With these changes, the --early-exit flag should now work as expected, interrupting all the running solver contexts and canceling the remaining futures when a counterexample is found, allowing Halmos to terminate early.

I have a question regarding the use of ThreadPoolExecutor here. Have you considered using ProcessPoolExecutor instead? Since the solver computations are CPU-bound tasks, using separate processes instead of threads might lead to better performance and resource utilization. It would be interesting to explore this option and see if it brings any improvements.

Please let me know if you have any further suggestions or if there's anything else I can improve in this pull request.

Thank you for your time and support!

…#243)

The --early-exit flag is now properly handled to ensure that Halmos
terminates immediately after finding the first counterexample, avoiding
unnecessary solver computations.

The previous implementation allowed solver threads to continue running
even after a counterexample was found, leading to inefficient resource
usage and delayed termination.

The updated code introduces the following changes:

- A global list `solver_contexts` is introduced to store the solver
  contexts created in the `solve` function.

- The `solve` function is modified to append the solver context to the
  global list before returning.

- The list of futures is stored in a variable before submitting them to
  the thread pool.

- In the `future_callback` function, if `args.early_exit` is true and a
  counterexample is found:
  - All the solver contexts in the global list are interrupted using
    `ctx.interrupt()`.
  - The remaining futures are canceled using `future.cancel()` to stop
    unnecessary computations.
  - The global lists are cleared to prevent memory leaks.

- Exception handling is added to the interruption process to avoid
  segmentation faults.

With these changes, when `args.early_exit` is true and a counterexample
is found, the `future_callback` function interrupts all the running
solver contexts and cancels the remaining futures, effectively stopping
the unnecessary computations and allowing Halmos to terminate early.
@karmacoma-eth
Copy link
Collaborator

karmacoma-eth commented May 13, 2024

unfortunately this still doesn't seem to work as expected. I see the calls to ctx.interrupt(), but the other threads keep running anyway.

Here is another example that really shows what we need, it generates increasingly difficult problems (for z3 at least) and we want to see the runtime improved, it should really exit as soon as the first counterexample is found:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.15;

import "forge-std/Test.sol";
import {SymTest} from "halmos-cheatcodes/SymTest.sol";

contract Test60 is Test, SymTest {
    /// @custom:halmos --loop 12
    function test_manyCexes() external {
        for(uint256 i = 1; i <= 12; ++i) {
            uint256 y = svm.createUint256("y");
            uint256 z = svm.createUint256("z");
            assertNotEq(y * z, 1 << i);
        }
    }
}

@karmacoma-eth
Copy link
Collaborator

@saurabhchalke thank you for the attempt, I'll close this PR since unfortunately ctx.interrupt() doesn't cut it. It looks like the solution will be to move away from a ThreadPoolExecutor and use external processes instead

@saurabhchalke
Copy link
Author

@karmacoma-eth Thank you for your feedback and for highlighting the limitations with ctx.interrupt(). I understand the need to move away from ThreadPoolExecutor and appreciate the guidance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--early-exit doesn't work as expected
2 participants