Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AFLnwe seeds timestamps are non-sequential #6

Open
acidghost opened this issue Mar 12, 2021 · 9 comments
Open

AFLnwe seeds timestamps are non-sequential #6

acidghost opened this issue Mar 12, 2021 · 9 comments

Comments

@acidghost
Copy link

While doing some testing I noticed that the timestamps of the seeds stored in the queue folder of AFLnwe runs are non-sequential:

$ ll bftpd/out-bftpd-aflnwe-3/queue | head -n50
-rw-rw-r-- 1 acidghost acidghost   86 Mar  3 17:26 id:000000,orig:seed_1.raw
-rw------- 1 acidghost acidghost  108 Mar  5 13:05 id:000001,orig:seed_10.raw
-rw-rw-r-- 1 acidghost acidghost  109 Mar  3 17:26 id:000002,orig:seed_11.raw
-rw------- 1 acidghost acidghost  108 Mar  5 13:17 id:000003,orig:seed_12.raw
-rw------- 1 acidghost acidghost   56 Mar  5 13:07 id:000004,orig:seed_13.raw
-rw-rw-r-- 1 acidghost acidghost  102 Mar  3 17:26 id:000005,orig:seed_2.raw
-rw------- 1 acidghost acidghost  107 Mar  5 13:08 id:000006,orig:seed_3.raw
-rw-rw-r-- 1 acidghost acidghost  120 Mar  3 17:26 id:000007,orig:seed_4.raw
-rw------- 1 acidghost acidghost  139 Mar  5 13:10 id:000008,orig:seed_5.raw
-rw------- 1 acidghost acidghost   44 Mar  5 13:10 id:000009,orig:seed_6.raw
-rw------- 1 acidghost acidghost   44 Mar  5 13:10 id:000010,orig:seed_7.raw
-rw------- 1 acidghost acidghost   40 Mar  5 13:11 id:000011,orig:seed_8.raw
-rw------- 1 acidghost acidghost   40 Mar  5 13:11 id:000012,orig:seed_9.raw
-rw------- 1 acidghost acidghost    1 Mar  5 13:04 id:000013,src:000000,op:havoc,rep:64,+cov
-rw------- 1 acidghost acidghost   67 Mar  5 13:04 id:000014,src:000000,op:havoc,rep:8,+cov
-rw------- 1 acidghost acidghost   90 Mar  5 13:04 id:000015,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  108 Mar  5 13:04 id:000016,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   78 Mar  5 13:23 id:000017,src:000000,op:havoc,rep:4
-rw------- 1 acidghost acidghost   65 Mar  5 13:04 id:000018,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   57 Mar  5 13:04 id:000019,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000020,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   73 Mar  5 13:04 id:000021,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000022,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  125 Mar  5 13:04 id:000023,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   58 Mar  5 13:11 id:000024,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   85 Mar  5 13:04 id:000025,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000026,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost  113 Mar  5 13:04 id:000027,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  122 Mar  5 13:04 id:000028,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   62 Mar  5 13:04 id:000029,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   90 Mar  5 13:04 id:000030,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000031,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost  104 Mar  5 13:04 id:000032,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   76 Mar  5 13:04 id:000033,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   77 Mar  5 13:04 id:000034,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   91 Mar  5 13:04 id:000035,src:000000,op:havoc,rep:64
-rw------- 1 acidghost acidghost   97 Mar  5 13:04 id:000036,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000037,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   14 Mar  5 13:04 id:000038,src:000000,op:havoc,rep:128
-rw------- 1 acidghost acidghost   80 Mar  5 13:11 id:000039,src:000000,op:havoc,rep:16,+cov
-rw------- 1 acidghost acidghost   76 Mar  5 13:04 id:000040,src:000000,op:havoc,rep:4
-rw------- 1 acidghost acidghost   63 Mar  5 13:04 id:000041,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   16 Mar  5 13:04 id:000042,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost    7 Mar  5 13:04 id:000043,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   36 Mar  5 13:04 id:000044,src:000000,op:havoc,rep:64
-rw------- 1 acidghost acidghost  100 Mar  5 13:04 id:000045,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   78 Mar  5 13:12 id:000046,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   89 Mar  5 13:04 id:000047,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000048,src:000000,op:havoc,rep:2

The timestamps for the initials seeds (i.e. ending in .raw) are also not updated while pivoting them from the input seeds folder (see above as the dates are 3 and 5 Mar):
https://github.com/aflnet/aflnwe/blob/113102a3ba552028e6fb0193cc2039503def7ef4/afl-fuzz.c#L3303

When generating plots the cut-off time may turn out wrong because it's based on the timestamp of the first initial (linked) seed:

start = df2.iloc[0, 0]
#get all rows given a cutoff time
df3 = df2[df2['time'] <= start + time*60]

for f in $(echo $folder/$testdir/*.raw); do
time=$(stat -c %Y $f)

for f in $(echo $folder/$testdir/*.raw); do
time=$(stat -c %Y $f)

for f in $(echo $folder/$testdir/*.raw); do
time=$(stat -c %Y $f)

(this is the same for all targets/subjects)

This bug can cause the plots for AFLnwe to be totally wrong as they're cut too short. In the example above the difference in timestamps from the first seed and the others is in days (the run was 1 hour long), causing the plot for AFLnwe to only pick up some of the initial seeds.

This does not seem to be the case for AFLnet as it uses the replayable-queue where files are created from scratch.

@thuanpv
Copy link
Collaborator

thuanpv commented Mar 15, 2021

Hi @acidghost,

Thank you for reporting these issues! As you pointed out, the old timestamp issue is caused by the link_or_copy function -- if a seed is linked to the queue folder, its timestamp is not updated. I have tried to modify the function so that it always copies seeds and the issue seems to be fixed.

Edited:

  • Regarding the non-sequential timestamps issue, I have observed the same issue in the decompressed folders (stored inside results-* folder on host computer) but the timestamps of the original files in the Docker containers seem correct -- the these timestamps are used for generating the CSV files so I think the code coverage results are OK. There might be an issue while compressing/decompressing the result folders leading to non-sequential timestamps. We need to investigate a bit more.

Thuan

@thuanpv
Copy link
Collaborator

thuanpv commented Mar 15, 2021

I have completed an experiment for LightFTP, using the same commands written in README, and the graph shows a bit better result for AFLNwe. See the attached figure.
cov_over_time

@acidghost
Copy link
Author

acidghost commented Mar 15, 2021

I can confirm that the timestamps from the ls above was from an extracted archive and not from the container (cannot verify the contents of the container as I have deleted them).

If I try to check the timestamps in the generated CSV, I still see incoherent values (at least in the beginning):
window_2021-03-15-14-44-25

Edit: I count 17 unexpected timestamps in this specific run

❯ awk -F, 'BEGIN {cnt=0;prev=0}; $4 == 3 && $3 == "aflnwe" && $5 == "b_abs" {if (prev > $1) { print(strftime("%T %F", $1)); cnt++} prev=$1}; END {print cnt}' bftpd/results.csv
17:26:56 2021-03-03
13:07:01 2021-03-05
17:26:56 2021-03-03
17:26:56 2021-03-03
13:07:01 2021-03-05
13:04:08 2021-03-05
13:04:13 2021-03-05
13:04:21 2021-03-05
13:04:41 2021-03-05
13:05:28 2021-03-05
13:07:07 2021-03-05
13:07:28 2021-03-05
13:09:17 2021-03-05
13:11:48 2021-03-05
13:16:50 2021-03-05
13:43:37 2021-03-05
13:49:36 2021-03-05
17

@thuanpv
Copy link
Collaborator

thuanpv commented Mar 16, 2021

Can you please rebuild the Docker image and run the experiments again? And please let me know if you still observe the issues. Thanks.

@acidghost
Copy link
Author

acidghost commented Mar 19, 2021

@thuanpv It seems like the issue is still there. The seeds don't seem to have extremely old timestamps, but the timestamps are still non-sequential (even some seeds have wrong timestamp):

$ for f in queue/id*; do echo "$(date -r $f +'%F %T %N') : $f"; done
2021-03-19 11:58:43 000000000 : queue/id:000000,orig:seed_1.raw
2021-03-19 12:00:35 000000000 : queue/id:000001,orig:seed_10.raw
2021-03-19 11:58:43 000000000 : queue/id:000002,orig:seed_11.raw
2021-03-19 11:58:43 000000000 : queue/id:000003,orig:seed_12.raw
2021-03-19 12:01:44 000000000 : queue/id:000004,orig:seed_13.raw
2021-03-19 11:58:43 000000000 : queue/id:000005,orig:seed_2.raw
2021-03-19 12:03:35 000000000 : queue/id:000006,orig:seed_3.raw
2021-03-19 11:58:43 000000000 : queue/id:000007,orig:seed_4.raw
2021-03-19 11:58:43 000000000 : queue/id:000008,orig:seed_5.raw
2021-03-19 12:05:52 000000000 : queue/id:000009,orig:seed_6.raw
2021-03-19 12:06:03 000000000 : queue/id:000010,orig:seed_7.raw
2021-03-19 11:58:43 000000000 : queue/id:000011,orig:seed_8.raw
2021-03-19 11:58:43 000000000 : queue/id:000012,orig:seed_9.raw
2021-03-19 11:58:52 000000000 : queue/id:000013,src:000000,op:havoc,rep:128,+cov
2021-03-19 11:58:52 000000000 : queue/id:000014,src:000000,op:havoc,rep:16,+cov
2021-03-19 11:58:53 000000000 : queue/id:000015,src:000000,op:havoc,rep:8,+cov
2021-03-19 12:06:42 000000000 : queue/id:000016,src:000000,op:havoc,rep:32,+cov
2021-03-19 11:58:53 000000000 : queue/id:000017,src:000000,op:havoc,rep:32
2021-03-19 12:06:53 000000000 : queue/id:000018,src:000000,op:havoc,rep:8
2021-03-19 11:58:54 000000000 : queue/id:000019,src:000000,op:havoc,rep:8
2021-03-19 11:58:54 000000000 : queue/id:000020,src:000000,op:havoc,rep:32
2021-03-19 11:58:55 000000000 : queue/id:000021,src:000000,op:havoc,rep:2,+cov
2021-03-19 11:58:55 000000000 : queue/id:000022,src:000000,op:havoc,rep:8
2021-03-19 11:58:56 000000000 : queue/id:000023,src:000000,op:havoc,rep:4
2021-03-19 11:58:56 000000000 : queue/id:000024,src:000000,op:havoc,rep:2
2021-03-19 11:58:57 000000000 : queue/id:000025,src:000000,op:havoc,rep:64

@thuanpv
Copy link
Collaborator

thuanpv commented Mar 21, 2021

Thanks @acidghost. Interesting! In AFLnwe, we did not make any changes to AFL regarding its code to save test cases. Let us look into it more deeply.

@rnatella
Copy link
Collaborator

rnatella commented Apr 30, 2021

I did some investigation on this issue. When the file is created in the "queue" folder for the first time, it has the correct ctime. When AFL saves a new interesting input, the ctime for the "src" input file also gets updated. Note that the "src" input file is mmap-ed (even if the fd is read-only), and that it is munmap-ed by abandon_entry() when the new interesting input is saved.

It seems that timestamps with mmap are not really reliable, see also https://yarchive.net/comp/linux/mtime_mmap.html or https://apenwarr.ca/log/20181113 . I did not yet check why this issue surfaced with AFLnwe. As for the solution, we may want to save a second copy of the interesting inputs in another folder, or we may include the timestamp in the file name (but this may break some script).

@rnatella
Copy link
Collaborator

rnatella commented May 3, 2021

After more debugging, it turns out that there is no weird problem with mmap() and timestamps (fortunately!).
When a new interesting input is found, it is immediately enqueued, but it is trimmed only when it is dequeued for further fuzzing. The trimmed test case is saved back on the disk, thus updating the timestamp.

I wrote a patch for AFLnwe to add a new option to take control of this. If you pass -z '-trimmed', it will save the trimmed test case in a separate file, with the same name of the original file suffixed with '-trimmed'. If you pass -z '', the trimmed test case is not saved. We may want to pass -z '' in run.sh when calling afl-fuzz.

@thuanpv can you kindly check the attached patch, and commit it on AFLnwe? I don't have permissions on the repo for a pull request.

save-trimmed-test-cases.patch.zip

@maxammann
Copy link

Uh this made me spend hours today on investigating. Sadly I only found this now.

Here is the patch for better visibility: tlspuffin/aflnwe@69cd1eb

So as I can see by setting -z to an empty string no more trimmed test cases are written. That should work! I'll test this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants