Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Timeout in str.indexof -1 #83

Closed
leonbett opened this issue Jul 31, 2023 · 6 comments
Closed

[BUG] Timeout in str.indexof -1 #83

leonbett opened this issue Jul 31, 2023 · 6 comments

Comments

@leonbett
Copy link

Describe the bug
(str.indexof <buffer> "a") should return the offset of the first occurence of "a" in <buffer>, or -1 if "a" is not in <buffer>.

Queries that constrain str.indexof to -1 hit a timeout ("Satisfiability of [...] could not be decided"), even when setting timeout_ms to a higher value like 5000 in z3_helpers.py.

To Reproduce

Run the following program:

    GRAMMAR = {
        "<start>": ["<entry>"],
        "<entry>": ["<A><X><A>"],

        "<A>": ["<number>"],
        "<number>": ["<digit>",  "<digit><number>"],
        "<digit>": ["0" , "1" , "2" , "3" , "4" , "5" , "6" , "7" , "8" , "9"],

        "<X>": ["x"],
    }

    constraint = '(= (str.indexof <entry> "x") (- 1))'
    solver = ISLaSolver(GRAMMAR, constraint)
    sol = solver.solve()
    print('Solution: ', sol)

Behavior
Expected behavior: ISLa should return unsat because every string derived from the grammar contains x.
Actual behavior: ISLa returns unsat, but due to hitting a timeout: "Satisfiability of [...] could not be decided".

System/Installation Specs:

  • ISLa Version: ISLa version 1.13.9
  • Python Version: 3.11.2
  • OS: macOS
@rindPHI
Copy link
Owner

rindPHI commented Aug 7, 2023

Hi @leonbett,

This a Z3 problem and not "ISLa's fault." ISLa reports

Satisfiability of ['IndexOf(<entry>_1, "x", 0) == -1', 'InRe(<entry>_1,
     re.++(re.++(re.++(Star(Range("0", "9")),
                       Range("0", "9")),
                 Re("x")),
           re.++(Star(Range("0", "9")), Range("0", "9"))))'] could not be decided
Traceback (most recent call last):
  File "/Users/dsteinhoefel/PycharmProjects/InputConstraints/playground.py", line 16, in <module>
    sol = solver.solve()
  File "/Users/dsteinhoefel/PycharmProjects/InputConstraints/src/isla/solver.py", line 712, in solve
    raise StopIteration()
StopIteration

which means that Z3 failed to evaluate the logic constraint. That's a pity, but there's not much to be done here from our side. ISLa already tries to solve SMT constraints with and without Z3's parallel mode and with different timeouts. It works if you remove one <A> from the definition of <entry>.

Would you happen to have any idea on how to improve this situation? Unsatisfiability can also not be answered by generating random values, which we could do for satisfiability in the case of ISLa (given enough time—semi-decidability).

I'm closing this for now; please feel free to follow up if you have any more ideas.

@rindPHI rindPHI closed this as completed Aug 7, 2023
@leonbett
Copy link
Author

leonbett commented Aug 7, 2023

Thanks Dominic!

In this particular example, one idea would be to transform:

(= (str.indexof <entry> "x") (- 1))

into

 (= (str.indexof <entry>.<A> "x") (- 1)) and
(= (str.indexof <entry>.<X> "x") (- 1)) and
(= (str.indexof <entry>.<A> "x") (- 1))

and solve each subexpression separately.

My intuition is that Z3 would return unsat for (= (str.indexof <entry>.<X> "x") (- 1)), which implies unsat for the full expression. Unfortunately I can‘t test this currently as I am replying from my phone.

@rindPHI
Copy link
Owner

rindPHI commented Aug 7, 2023

Right, <entry>.<X> works. Your idea is good; it's basically a "grammatical" case distinction that can be more generally expressed as $P(\mathit{nonterminal}) \iff \bigwedge_{c\text{ is a direct child of }\mathit{nonterminal}}P(c)$. However, this does not work for all $P$. For example, if the index you're comparing to in your example is not the "special" -1, it does not work. We would have to pin down this strategy more systematically; in the worst case, we would need another list of symbols /patterns with a semantics supporting this case distinction strategy.

@rindPHI
Copy link
Owner

rindPHI commented Aug 7, 2023

Added a feature request. I can't say when or whether I can address it, though. Contributions welcome :)

@leonbett
Copy link
Author

leonbett commented Aug 8, 2023

Thanks! I agree that it is more complicated for the non -1 case.

For constant -1 str.indexof equality checks, another idea would be to have a fast path that dispatches to not str.contains (which might or might not work out of the box :)). Similarly, (>= (str.indexof x y) 0) could be dispatched to (str.contains x y).

@rindPHI
Copy link
Owner

rindPHI commented Aug 8, 2023

That could be a feasible solution in some instances. I'm not sure whether I want to go this path, though. ISLa is a constraint solver, and it is nothing unusual that different formulations of semantically equivalent constraints work differently well. On the one hand, that continues beyond the SMT level, so we might end up with an extensive list of ad-hoc rewrites; on the other hand, it's not even obvious if such rewrites are always beneficial. Z3 might react differently for different constraint combinations; there may be situations where a rewrite works well and others where it doesn't. Automatic rewriting would take the possibility of manual constraint optimization away from the users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants