Extended trace function #9

Open
wants to merge 1 commit into
base: master


@ph4r05 ph4r05 commented Jun 13, 2018

Hi!

Thanks for this nice AFL python binding!

I needed a few changes, so here is my PR. The motivation is that we need to fuzz smartcards with AFL with some instrumentation enabled. Because the smartcard is a black box for us, we can use neither classical AFL instrumentation nor Python's sys.settrace instrumentation.

On the other hand, after the card responds we can recover some information from the run. Naive sources are: return code, return data, timing of the operation. Later some other side channels can be added (e.g., power traces - working on that right now). The point is we need to add some more info to the shared memory bitmap after the fuzzed input is processed.

For that we added a few new methods to python-afl so we can manually do something like:

afl.trace_buff(data)  # data buffer is being hashed
afl.trace_offset(direct_offset_to_memory)  # for direct access to the memory

Please take a look at the changes and consider merging.

Thanks!

@jwilk
Owner

jwilk commented Jun 20, 2018

I like the idea in principle, but I'm not sure if the proposed API is the right tool for the job. I need to think about it.

Is the code that uses the new API available? That would help me understand what's actually needed.

> Naive sources are: return code, return data, timing of the operation. Later some other side channels can be added

Timing and other side-channels are not fully deterministic. How did you address this?

@ph4r05
Author

ph4r05 commented Jun 20, 2018

Hi, thanks for the response!

The code using the API is here:

https://github.com/petrs/pyAPDUFuzzer/blob/7a99eb09735fc1b0500f95edda185638236be348/apdu_fuzzer/main_afl.py#L360

It's a prototype for simple black-box APDU fuzzing of smart cards. I've tested it and it works well - AFL is able to find interesting inputs. We were also able to partially recover the payload structure for some commands. The published version uses only the (error code, response buffer, timing) tuple. We are currently working on incorporating a power-trace side channel, which is not public at the moment (we are writing a paper).

The idea is to use this python-afl project as a low-level binding for AFL and shared memory access, because of the challenges that black-box smartcard fuzzing brings. We just need access to the shared memory segment and a basic interface to AFL, such as afl.loop().

I think a higher-level abstraction is highly application-dependent, so it is difficult to design a robust API at the python-afl layer. The architect/pen-tester probably knows how to map application behavior to the shared memory bitmap read by AFL. The same goes for the side channels you mentioned. In our case, timing is quite an accurate side channel. Non-determinism is not a problem if we use bins 10 ms wide, or more clever postprocessing techniques for power traces (feature extraction). We also need direct access to shared memory bitmap so we have flexibility in transforming our features to bitmaps.
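To illustrate the binning idea, here is a minimal sketch of how an (error code, response buffer, timing) tuple could be reduced to a bitmap offset. It is not the code from pyAPDUFuzzer: the names `timing_bin` and `feature_offset` are hypothetical, `MAP_SIZE` is assumed to be AFL's default of 65536, and `hashlib.blake2b` is only a standard-library stand-in for whatever hash the real harness uses.

```python
import hashlib

MAP_SIZE = 1 << 16  # assumption: AFL's default bitmap size (64 KiB)


def timing_bin(elapsed_s: float, bin_ms: int = 10) -> int:
    """Quantize a duration into bins that are bin_ms milliseconds wide."""
    return int(elapsed_s * 1000) // bin_ms


def feature_offset(status_word: int, response: bytes, elapsed_s: float) -> int:
    """Map an (error code, response, timing bin) tuple to a bitmap offset.

    blake2b is a stand-in hash chosen because it ships with Python;
    any well-mixing hash would serve the same purpose.
    """
    h = hashlib.blake2b(digest_size=4)
    h.update(status_word.to_bytes(2, "big"))
    h.update(timing_bin(elapsed_s).to_bytes(4, "big"))
    h.update(response)
    return int.from_bytes(h.digest(), "big") % MAP_SIZE


# Two runs that differ only by timing jitter inside one 10 ms bin
# land on the same offset.
a = feature_offset(0x9000, b"\x01\x02", 0.0123)
b = feature_offset(0x9000, b"\x01\x02", 0.0187)
print(a == b)
```

Measurements that fall close to a bin boundary can still land in different bins, so the bin width has to be chosen with the observed jitter in mind.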

We were using afl.trace_buff variants but then we switched to afl.trace_offset together with xxhash function. The Fowler-Noll-Vo hash function did not perform well on buffers starting with zero (same hash for all buffers with the zero prefix).

Thanks for considering the PR.

@jwilk jwilk mentioned this pull request Jun 22, 2018
2 tasks
@jwilk
Owner

jwilk commented Jun 22, 2018

> We also need direct access to shared memory bitmap so we have flexibility in transforming our features to bitmaps.

How about #10? I think it's closer to what you need. :-)

> The Fowler-Noll-Vo hash function did not perform well on buffers starting with zero (same hash for all buffers with the zero prefix).

Hmm, do you have an example when FNV misbehaves?
(I didn't spend much time testing it, though I guess hash quality is not that important for python-afl proper.)

@ph4r05
Author

ph4r05 commented Jun 22, 2018

Hmm, direct access to the trace map is nice, maybe for further work. But for now, we are fine with the current API as I have it implemented in my fork. The wrapping methods provide simple memory protection (offsets are taken modulo the map size, MAP_SIZE) and make it possible to take the previous location / offset into consideration without the need to create additional wrapper methods. I personally like a simple method that increments a particular offset, as it abstracts away memory sizes and is consistent with the original trace mechanism. I am not sure how Cython behaves if I access an invalid memory offset.
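The behavior described here can be modeled in pure Python. The following is only a sketch under stated assumptions, not the fork's Cython code: the `TraceMap` class and the `use_prev` parameter are hypothetical, the map is a plain `bytearray` rather than AFL's shared memory segment, and the previous-location mixing follows the `cur ^ prev`, `prev = cur >> 1` scheme that AFL documents for its own edge coverage.

```python
MAP_SIZE = 1 << 16  # assumption: AFL's default bitmap size


class TraceMap:
    """In-process model of an AFL-style trace bitmap."""

    def __init__(self, size: int = MAP_SIZE) -> None:
        self.size = size
        self.bitmap = bytearray(size)
        self.prev = 0  # previous location, shifted right by one bit

    def trace_offset(self, offset: int, use_prev: bool = False) -> int:
        """Increment one hit counter and return the index that was touched."""
        cur = offset % self.size  # modulo keeps every access inside the map
        idx = (cur ^ self.prev) % self.size if use_prev else cur
        # Counters are single bytes, so they wrap around after 255.
        self.bitmap[idx] = (self.bitmap[idx] + 1) & 0xFF
        self.prev = cur >> 1
        return idx


trace_map = TraceMap()
print(trace_map.trace_offset(MAP_SIZE + 5))  # out-of-range offset wraps to 5
```

Because the index is always reduced modulo the map size, a caller cannot write outside the bitmap, which is the memory protection the wrapper methods are meant to give.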

I could remove trace_buff due to FNV being inappropriate for this task, but I plan to continue using trace_offset.

Regarding FNV, try the following. All have the same hash:

afl.hash32(bytes([0,1]))
afl.hash32(bytes([0,2]))
afl.hash32(bytes([0,255]))
afl.hash32(bytes([0,255,255,255]))
# 2166136261

From the first occurrence of a zero byte in the buffer onward, the contents no longer change the hash:

afl.hash32(bytes([1,1,0,1]))
afl.hash32(bytes([1,1,0,2]))
# 3967033079
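The values above are what one would expect if the hash stops at the first zero byte, as a function that measures its input with C's strlen() would. The sketch below is a pure-Python reference under that assumption, not python-afl's actual code; `fnv1a_32` and `fnv1a_32_cstring` are hypothetical names. 2166136261 is the 32-bit FNV offset basis (0x811C9DC5), which is the result of hashing zero bytes.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # 2166136261
FNV_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a over every byte of the buffer."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def fnv1a_32_cstring(data: bytes) -> int:
    """FNV-1a that stops at the first zero byte, as strlen() would."""
    return fnv1a_32(data.split(b"\x00", 1)[0])


print(fnv1a_32_cstring(bytes([0, 1])))        # 2166136261
print(fnv1a_32_cstring(bytes([1, 1, 0, 2])))  # 3967033079
print(fnv1a_32(bytes([0, 1])) != fnv1a_32(bytes([0, 2])))  # True
```

With the full buffer length passed in, FNV-1a does distinguish inputs that share a zero prefix, which suggests the collisions come from how the buffer length is determined rather than from the hash function itself.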

@ph4r05
Author

ph4r05 commented Jun 22, 2018

And yes, FNV is OK for the original usage: the file name cannot contain a zero byte, and the offset computation XORs the whole offset in the first iteration, so it seems fine there.

- add result from remote service to the SHM trace bitmat
@jwilk
Owner

jwilk commented Sep 4, 2018

Relevant thread on afl-users:
https://groups.google.com/d/topic/afl-users/Hv57XTZ-sVw
