Extended trace function #9

ph4r05 · 2018-06-13T15:37:00Z

Hi!

Thanks for this nice AFL python binding!

I needed a few changes, so here is my PR. The motivation is that we need to fuzz smartcards with AFL with some instrumentation enabled. As the smartcard is a black box for us, we can use neither classical AFL instrumentation nor Python sys.settrace instrumentation.

On the other hand, after the card responds we can recover some information from the run. Naive sources are: return code, return data, timing of the operation. Later some other side channels can be added (e.g., power traces - working on that right now). The point is that we need to add some more info to the shared memory bitmap after the fuzzed input is processed.

For that we added a few new methods to python-afl so we can manually do something like:

afl.trace_buff(data)  # data buffer is being hashed
afl.trace_offset(direct_offset_to_memory)  # for direct access to the memory

Please take a look at the changes and consider merging.

Thanks!

jwilk · 2018-06-20T16:27:31Z

I like the idea in principle, but I'm not sure if the proposed API is the right tool for the job. I need to think about it.

Is the code that uses the new API available? That would help me understand what's actually needed.

> Naive sources are: return code, return data, timing of the operation. Later some other side channels can be added

Timing and other side-channels are not fully deterministic. How did you address this?

ph4r05 · 2018-06-20T18:41:20Z

Hi, thanks for the response!

The code using the API is here:

https://github.com/petrs/pyAPDUFuzzer/blob/7a99eb09735fc1b0500f95edda185638236be348/apdu_fuzzer/main_afl.py#L360

It's a prototype version for simple blackbox APDU fuzzing for smart cards. I've tested it and it works well - AFL is able to find interesting inputs. We were also able to partially recover the payload structure for some commands. The published version uses only the (error code, response buffer, timing) tuple. We are currently working on incorporating a power-trace side channel, which is not public at the moment (we are writing a paper).

The idea is to use this python-afl project as a low-level binding for AFL and shared memory access, because of the challenges that blackbox smartcard fuzzing brings. We just need access to the shared memory segment and a basic interface to AFL, such as afl.loop().

I think any higher-level abstraction is highly application dependent, so it is difficult to design a robust API at the python-afl layer. The architect/pen-tester probably knows how to map application behavior to the shared memory bitmap read by AFL. The same goes for the side channels you mentioned. In our case, the timing is quite an accurate side channel. Non-determinism is not a problem if we use bins 10 ms wide, or more clever postprocessing techniques for power traces (feature extraction). We also need direct access to the shared memory bitmap so we have flexibility in transforming our features to bitmaps.
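A minimal sketch of the binning idea, for illustration only (this is not the pyAPDUFuzzer code). The helper names `timing_bin` and `feature_offset` are hypothetical, `zlib.crc32` is just a stand-in hash, and `MAP_SIZE` is assumed to be AFL's default of 65536:

```python
import zlib

MAP_SIZE = 1 << 16  # AFL's default bitmap size (assumed unchanged here)
BIN_WIDTH_MS = 10   # bin width mentioned above


def timing_bin(elapsed_ms: float, width_ms: int = BIN_WIDTH_MS) -> int:
    """Quantize a noisy timing so that small jitter lands in the same bin."""
    return int(elapsed_ms // width_ms)


def feature_offset(status_word: int, response: bytes, elapsed_ms: float) -> int:
    """Map an (error code, response, timing bin) tuple to a bitmap offset.

    zlib.crc32 is only a stand-in hash for this sketch (hypothetical choice).
    """
    key = status_word.to_bytes(2, "big") + response
    key += timing_bin(elapsed_ms).to_bytes(4, "big")
    return zlib.crc32(key) % MAP_SIZE
```

With this, `feature_offset(0x6A80, b"", 41.2)` and `feature_offset(0x6A80, b"", 48.9)` give the same offset because both timings fall into bin 4, so jitter inside one bin does not look like new coverage to AFL. The resulting offset could then be passed to something like afl.trace_offset.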

We were using afl.trace_buff variants, but then we switched to afl.trace_offset together with the xxhash function. The Fowler-Noll-Vo hash function did not perform well on buffers starting with zero (same hash for all buffers with the zero prefix).

Thanks for considering the PR.

jwilk · 2018-06-22T10:47:25Z

> We also need direct access to shared memory bitmap so we have flexibility in transforming our features to bitmaps.

How about #10? I think it's closer to what you need. :-)

> The Fowler-Noll-Vo hash function did not perform well on buffers starting with zero (same hash for all buffers with the zero prefix).

Hmm, do you have an example when FNV misbehaves?
(I didn't spend much time testing it, though I guess hash quality is not that important for python-afl proper.)

ph4r05 · 2018-06-22T11:30:22Z

Hmm, direct access to the trace map is nice, maybe for further work. But for now, we are fine with the current API as I have it implemented in my fork. The wrapping methods provide simple memory protection (offsets are taken modulo the map size, MAP_SIZE) and make it possible to take the previous location / offset into consideration without creating additional wrapper methods. I personally like a simple method that increments a particular offset, as it abstracts memory sizes and is consistent with the original trace mechanism. I'm not sure how Cython behaves if I access an invalid memory offset.
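To make the semantics concrete, here is a rough pure-Python model of such a wrapper. It is a sketch under assumptions, not the code from the fork: the `TraceMap` class and the `use_prev` flag are hypothetical names, and the previous-location handling simply copies the XOR-and-shift scheme that AFL documents for its own edge coverage.

```python
MAP_SIZE = 1 << 16  # AFL's default trace bitmap size


class TraceMap:
    """Pure-Python stand-in for AFL's shared-memory bitmap (illustration only)."""

    def __init__(self, size: int = MAP_SIZE) -> None:
        self.size = size
        self.bitmap = bytearray(size)
        self.prev_location = 0

    def trace_offset(self, offset: int, use_prev: bool = False) -> int:
        """Increment one counter; the modulo keeps any offset inside the map."""
        location = offset % self.size
        if use_prev:
            # Mix in the previous location, as AFL does for edges.
            index = (location ^ self.prev_location) % self.size
            # The shift makes the transitions A->B and B->A hit different cells.
            self.prev_location = location >> 1
        else:
            index = location
        # Counters are single bytes, so they wrap around at 256.
        self.bitmap[index] = (self.bitmap[index] + 1) & 0xFF
        return index
```

An out-of-range offset such as `MAP_SIZE + 5` ends up in cell 5 instead of touching memory outside the map, which is the protection described above.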

I could remove trace_buff due to FNV being inappropriate for this task, but I plan to continue using trace_offset.

Regarding FNV, try the following. All have the same hash:

afl.hash32(bytes([0,1]))
afl.hash32(bytes([0,2]))
afl.hash32(bytes([0,255]))
afl.hash32(bytes([0,255,255,255]))
# 2166136261

From the first occurrence of a zero byte in the buffer onward, the remaining bytes do not affect the hash:

afl.hash32(bytes([1,1,0,1]))
afl.hash32(bytes([1,1,0,2]))
# 3967033079
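The values above can be reproduced outside python-afl with the sketch below. It assumes that the hash is 32-bit FNV-1a applied to the buffer as a NUL-terminated C string; that is an assumption consistent with the numbers in this comment, not a statement about the actual python-afl code. The function names are made up for the example.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # 2166136261, the hash of an empty input
FNV_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Plain 32-bit FNV-1a over every byte of the buffer."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def fnv1a_32_cstring(data: bytes) -> int:
    """FNV-1a that stops at the first zero byte, like a strlen-based C loop."""
    return fnv1a_32(data.split(b"\x00", 1)[0])


print(fnv1a_32_cstring(bytes([0, 1])))        # 2166136261
print(fnv1a_32_cstring(bytes([0, 255, 255])))  # 2166136261
print(fnv1a_32_cstring(bytes([1, 1, 0, 1])))   # 3967033079
print(fnv1a_32_cstring(bytes([1, 1, 0, 2])))   # 3967033079
```

Hashing the full buffer with `fnv1a_32` gives different values for `bytes([0, 1])` and `bytes([0, 2])`, so under this assumption the collisions come from the truncation at the zero byte rather than from FNV itself.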

ph4r05 · 2018-06-22T12:11:00Z

And yes, FNV is OK for the original usage, as a file name cannot contain a zero byte and the offset computation XORs the whole offset in the first iteration, so it seems OK.

jwilk · 2018-09-04T18:01:42Z

Relevant thread on afl-users:
https://groups.google.com/d/topic/afl-users/Hv57XTZ-sVw

ph4r05 force-pushed the generalization branch from 45c3e2b to 10fa45f (June 13, 2018 15:42)

jwilk mentioned this pull request Jun 22, 2018

[WIP] Expose trace map to Python code #10

ph4r05 force-pushed the generalization branch from de20a17 to d090e5b (June 22, 2018 12:18)

extended trace function

e469e1a

- add result from remote service to the SHM trace bitmap

ph4r05 force-pushed the generalization branch from d090e5b to e469e1a (June 22, 2018 12:34)
