New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Soliciting feedback on a (currently failing) attempt to improve "stability" by tweaking the compiler #2
Comments
I realized that I had more questions while writing this. @stedolan, why did you choose to perform the afl-fuzz instrumentation at the
Or is the latter point actually an advantage, that you cover more of the "implicit" control-flow of the program? Intuitively I would rather expect that instrumentation should correspond to user-visible control flow, and indeed all stability examples come from compiler-inserted control flow being instrumented "too much", but maybe instrumenting at the Lambda level would instrument "not enough"? (do we have an example?) (Also, if it's important to instrument more, then why isn't the C runtime instrumented as well? Would different allocation behaviors generate way too many paths within the GC for usable fuzzing?) One implementation approach that I could see would be to introduce an (I think it would actually be even nicer to separate the instrumentation action from the instrumentation "location", so that we have the ability to share the instrumentation location across several distinct source positions. So it could be val new_instrumentation_point : unit -> instrumentation_point
val instrument_afl : instrumentation_point -> unit ) |
Thanks for diving into this tricky problem, and thanks for the excellent write-up! Your understanding of the problem matches mine. I do prefer the static to the dynamic approach here, (although maybe with a simpler implementation!). For testing, it's worth exploring the I think that your repro case is right, although it's dependent on when the GC runs. Here is a reproduction case that's a bit easier to reproduce, as a patch against the unit tests in trunk.
This is a very good question. The honest answer is that this matches most closely what afl itself does (when instrumenting C++), and I didn't really understand the system all that well at the time. I think adding instrumentation at the Lambda level might well be a better approach. I think your |
Thanks for the repro case. I played with your testsuite a bit (it certainly served as inspiration for my own reproduction attempts), but I'm mildly suspicious of it because, unless I'm mistaken, "classes" is currently failing in trunk. |
That's news to me! It works on my machine. Do you have some details of a configuration where it fails? |
It fails when I configure the compiler with testsuite/tests/afl-instrumentation$
OCAMLOPT="../../../ocamlopt.opt -nostdlib -I ../../../stdlib" sh test.sh | grep '\(passed\|failed\)'
lists: passed.
manylists: passed.
exceptions: passed.
objects: passed.
classes: failed:
obj_ordering: passed. (current trunk state is 7a7bf0580d8ad84a44e2b495c400ce0e1e3a095c) |
Ah, so it does! I was compiling without |
I was thinking about the The ideal place to put instrumentation would be somewhere after pattern matches are expanded to branches, but before objects, Lazy, etc. are compiled away. Unfortunately, those are all done at the same time in the compiler. |
We could have a primitive available at several places in the pipeline, with the social contract that producing a new atomic control-flow branch (if or plain switch) requires thinking about whether the destination of the branch should be instrumented or not. The difficulty is that I don't know how to think of instrumentation in a modular way. As a code producer, how do I know whether I was called in a context where instrumentation should or should not happen? Should this intent (coming from my caller) be expressed as additional configuration parameters in the compiler codebase, or as special instructions to scope instrumentation in the emitted code? In either case, can/should this information cross function calls or other abstraction boundaries? |
(while I'm still open to continuing this discussion, I'm closing the issue because I don't think it's a thing to be fixed in this repo) |
I noticed that afl-persistent and crowbar have run into issues with
"stability" that are to be handled at the level of the compiler code
generation strategy:
ocaml/ocaml#1345
(disabling class caching in afl-instrument mode)
and
stedolan/crowbar#14
(yet-unfixed stability issue with `Lazy.force)
are instances of this.
Energized by my use of Crowbar during the excellent Mirage retreat
this year, I have wondered if it was something I could help with, and
tried to implement pieces of a solution at the level of the OCaml
compiler. Specifically I would like to make it easy to say, in the
compiler, "don't instrument this branch" or something like that, and
then this opt-out mechanism could be used to solve those stability
issues. This is the dream; what I have right now is a branch that does
what I wanted it to do, but does not solve stability issues in any way
that I was able to measure (so maybe I wanted the wrong thing). So,
very Work In Progress. I'm posting here to solicit feedback and
comments (cc @stedolan and @yomimono); in particular, I'm curious if
someone think that this approach has a chance to ever work.
The branch is
afl
on my personal fork:https://github.com/gasche/ocaml/tree/afl
there are commits that describe what it does
https://github.com/gasche/ocaml/commits/afl
https://github.com/gasche/ocaml/compare/trunk...gasche:afl?expand=1
Understanding of the problem
The stability issues, if I understand correctly (but then I don't
actually understand what "stability" means in afl-fuzz parlance), come
from code that will run into different control path if they are
executed several times from the same input, when the change-of-control
mechanism are outside the view of the tool. (In the classes case, it
is caching; in the Lazy.force case, if @stedolan's analysis of it is
correct, it is the non-deterministic effect of a GC removing a Forward
tag before or after the user tries to force the already-evaluated
thunk for a second time).
So if we don't want to change how the compiler produces code (I would
rather not), we have to find how to selectively disable the
instrumentation point that are involved in these un-stable
conditionals. For class caching, I had the impression that if we could
make sure that neither the "has the method table already been
computed" test nor any of the method-table-computation code was
instrumented, we would recover stability. For the lazy forcing, I had
the impression that if the only instrumentation points to run were the
ones involved in forcing the trunk, and one when the final value is
returned, then that would be correct (I'm not convinced anymore).
Design of a solution attempt
It seems impossible or at least extremely fragile to try to annotate
individual control-flow expressions at the level of OCaml or Lambda
code, and hope to robustly control instrumentation with that, because
instrumentation happens at the Cmm level and further control-flow
constructs may be inserted by the compiler in the middle.
So I think one would need a way to disable all instrumentation within
one portion (defined as static code zone or dynamic fragment of
execution) of the program. (It would also be useful to be able to
share instrumentation points across two program locations, but that
sounded more difficult to implement so I haven't worked on it.)
My solution attempt is to introduce two primitives that would look
essentially as follows:
val suspend_afl : unit -> unit
val restore_afl : unit -> unit
Between a call to suspend_afl and the nest call to restore_afl, no
instrumentation happens.
I implement these two primitives in the commits
(this propagates the primitives from the frontend to cmm,
where I expect that they will be given a semantics in afl_instrument.ml)
(this ignores then in instruction selection, that is after cmm)
(I'm using commit names instead of hashes as I expect the hashes to
change as I keep rebasing the PR.)
Attempt 1: disabling instrumentation dynamically
The dumbest possible implementation for this is to have a piece of
global state, a boolean, to indicate whether instrumentation is
currently active or suspended:
the {suspend,restore} operations are compiled into writes that
change that global state
the instrumentation code is changed to first check this global flag
before performing any action
I implemented this approach in
(I realize as I'm writing this report that this implementation does
not bracket well, calling "suspend" twice and then "restore" only once
restores execution, and that incrementing and decrementing an integer
would work better. Easy to implement.)
Attempt 2: disabling instrumentation statically
When I started thinking about {suspend,restore}_afl, I had in mind
something closer to static annotations that would direct the
non-emission of afl instrumentation code. Basically the idea is that
after you encounter
suspend_afl
, you stop emitting insturmentationcode, and you start again on the next
restore_afl
.The naive implementation of this assumes that function calls always
start with instrumentation enabled, so that instrumentation always
restarts during inner function calls (so it is not equivalent to the
static semantics above). This is not good for the class caching
scenario, when we want the no-insturmentation zone to cross function
calls. But it would be easy to add a blacklist of specific functions
from CamlinternalOO to start in suspended mode, or to just add
well-placed (suspendafl) instructions at the beginning of those
functions (as a ppx extension or what not).
The way this "static" policy is implemented may qualify as a "cool
hack", as in: I think it's cool, but it is also a horrible
hack. Anyway, this was just for experiment, and I checked the output
with -dcmm and it seems to be doing what I want:
Negative results
I haven't tried to measure stability of object method table
caching. I tried to measure stability of thunk forcing, and the
results are terrible: none of what I have done improves stability, in
fact it decreases it.
I made two attempts at artfully placing {suspend,restore}_afl
instructions in the Lazy.force generated code:
The code emitted by the compiler for Lazy.force looks as follows:
and I tried to change it as follows (in the first commit):
or as follows (in the second commit)
Finally, I decided that what I really wanted was to have both the
Forward_tag and the
_
branches assigned the same instrumentationpoint, so I wrote a different version of the lazy-forcing code
generation strategy that looks like this:
None of that seems to work: without my changes, my reproduction code
(below) sports a stability of 57.14%, and some of what I've done bring
stability down to 0%, or 33%, or at best 52.94%.
It was not easy for me to find code to reproduce the stability issue
for laziness, below is my repro-case. Is it sensible?
The text was updated successfully, but these errors were encountered: