Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

study_chunk recursion #17743

Closed
hvds opened this issue Apr 23, 2020 · 1 comment
Closed

study_chunk recursion #17743

hvds opened this issue Apr 23, 2020 · 1 comment

Comments

@hvds
Copy link
Contributor

hvds commented Apr 23, 2020

This is a placeholder ticket for consideration of a theoretically possible bug.

In #16947 we found that study_chunk reinvokes itself in two ways - by simple recursion, and by enframing. In some cases that involves restudying regexp ops multiple times, whereas in other cases the reinvocation is the only time the relevant ops are studied. The primary results of studying are a) to capture global information about the regexp that will be used for optimization at runtime; b) to make in-place modifications to the ops for optimization (optional but desirable); and c) to make mandatory modifications to the ops, replacing temporary compile-time-only ops that the runtime engine does not know how to handle.

Because of (c) it is required that every op is studied at least once.

When ops are studied multiple times that can cause problems: the first invocation may capture information about the program, then reinvoke, then attempt to use the captured information assuming it has not changed.

The conclusion is that mutation of ops must happen only once, at the outermost level of reinvocation that will act on the relevant ops.

As far as I was able to discover the only case in which ops are studied multiple times is in the handling of GOSUB, which reinvokes by enframing. In #16947 this was resolved by recording in each frame whether it, or any outer frame, represented the handling of a GOSUB, and suppressing all mutating changes if so (confident that the same ops will be studied at some point in some outer frame that is not within the handling of a GOSUB).

When we reinvoke by recursion, however, any frames used by the caller are not visible to the callee; as such it may still be possible to trigger the same types of problem if reinvocation involves a mix of enframing and recursion.

Extending the fix from 089ad25 to handle this case would involve adding an extra boolean argument was_mutate_ok to study_chunk. All principal calls would pass this in as 0; all recursive calls would pass in the local value of mutate_ok; and the setting of mutate_ok would change to:

    bool mutate_ok = (was_mutate_ok && (!frame || !frame->in_gosub));

I don't intend to make such a change unless we find a testcase to show this is a real rather than a theoretical problem.

steve-m-hay pushed a commit that referenced this issue Jun 1, 2020
As described in #17743, study_chunk can re-enter itself either by
simple recursion or by enframing. 089ad25 used the new mutate_ok
variable to track whether we were within the framing scope of GOSUB,
and to disallow mutating changes to ops if so.

This commit extends that logic to reentry by recursion, passing in
the current state as was_mutate_ok.

(CVE-2020-12723)

(cherry picked from commit 3445383845ed220eaa12cd406db2067eb7b8a741)
@hvds
Copy link
Contributor Author

hvds commented Sep 15, 2020

I don't intend to make such a change unless we find a testcase to show this is a real rather than a theoretical problem.

As it happens a testcase was found by fuzzing soon after; it was assigned CVE-2020-12723 and resolved via 3a1df45 using the approach described here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant