New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

envoy: modified identification of corpus path #1607

Merged
merged 4 commits into from Jul 18, 2018

Conversation

Projects
None yet
5 participants
@anirudhmurali
Copy link
Contributor

anirudhmurali commented Jul 11, 2018

[envoy] - A python script (find_corpus.py) along with build.sh and Dockerfile which reads the BUILD file to identify the exact corpus path for a specific fuzzer target, as of now it does an append _corpus which prevents from using custom corpus directory names or sharing of corpus among fuzz targets.

Signed-off-by: Anirudh M m.anirudh18@gmail.com

envoy: modified identification of corpus path
Signed-off-by: Anirudh M <m.anirudh18@gmail.com>
@googlebot

This comment has been minimized.

Copy link

googlebot commented Jul 11, 2018

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers
@anirudhmurali

This comment has been minimized.

Copy link
Contributor

anirudhmurali commented Jul 11, 2018

I signed it!

@googlebot

This comment has been minimized.

Copy link

googlebot commented Jul 11, 2018

CLAs look good, thanks!

@kcc

This comment has been minimized.

Copy link
Contributor

kcc commented Jul 11, 2018

@htuch

This comment has been minimized.

Copy link
Contributor

htuch commented Jul 11, 2018

@kcc ACK, will review by tomorrow.

@htuch
Copy link
Contributor

htuch left a comment

I can think of a bunch of ways of doing this cleaner, e.g. reading the BUILD file in as a Python file and mocking out the macros, or doing a multi-line repeated regex on envoy_cc_fuzz_target patterns, but what you have here looks like a good solution for unblocking the shared corpus work, so let's go with it with the suggested fixes. Thanks.

@@ -22,6 +22,8 @@ export CXXFLAGS="$CXXFLAGS -fno-sanitize=vptr"
declare -r FUZZER_TARGETS_CC=$(find . -name *_fuzz_test.cc)
declare -r FUZZER_TARGETS="$(for t in ${FUZZER_TARGETS_CC}; do echo "${t:2:-3}"; done)"

echo FUZZER_TARGETS_CC

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

I don't think this does anything useful; it should be echo "${FUZZER_TARGETS_CC}" if you want to keep it around.

@@ -61,10 +63,11 @@ bazel build --verbose_failures --dynamic_mode=off --spawn_strategy=standalone \
# Copy out test binaries from bazel-bin/ and zip up related test corpuses.
for t in ${FUZZER_TARGETS}
do
CORPUS_TARGET=`python $SRC/find_corpus.py $t`

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

I would write this CORPUS_TARGET=$(python "${SRC}"/find_corpus.py "$t") to ensure proper escaping.

@@ -0,0 +1,15 @@
#!/usr/bin/python
import os, sys, re

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Should ideally use Google Python style here; https://github.com/google/styleguide/blob/gh-pages/pyguide.md. For the spacing, I would write:

#!/usr/bin/python

import os
import re
import sys

<rest of code>
@@ -0,0 +1,15 @@
#!/usr/bin/python
import os, sys, re
fuzzer_target = sys.argv[1].split('/')

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Use "/" quote style to be consistent with elsewhere in this file.

#!/usr/bin/python
import os, sys, re
fuzzer_target = sys.argv[1].split('/')
directory, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

s/directory/directory_segments/

import os, sys, re
fuzzer_target = sys.argv[1].split('/')
directory, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]
directory_string = "/".join(directory)

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

s/directory_string/directory/

fuzzer_target = sys.argv[1].split('/')
directory, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]
directory_string = "/".join(directory)
path = "../envoy/" + directory_string + "/BUILD"

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

This is a bit evil (what if the caller moves to a different relative path before invoking?), but it's fine for the duct tape we're doing in the oss-fuzz Docker image.

directory_string = "/".join(directory)
path = "../envoy/" + directory_string + "/BUILD"
with open(path, "r") as f:
searchlines = f.readlines()

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Please use 2-space indents.

with open(path, "r") as f:
searchlines = f.readlines()
for i, line in enumerate(searchlines):
if fuzz_test in line:

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Do you want to make this stricter? E.g. if "name = \"fuzz_test\"" in line?

This comment has been minimized.

@anirudhmurali

anirudhmurali Jul 12, 2018

Contributor

Well, fuzz_test is not string, it's a variable which has something like h1_capture_fuzz_test, I look for its occurrences as I traverse the file. Adding name = wouldn't be needed here. It can't be any more strict than this, this gets the exact occurrence.

for l in searchlines[i:]:
if "corpus =" in l:
corpus_path = l
break

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

If you fail to find a corpus line, we should probably raise some error or assert here.

addressed comments
Signed-off-by: Anirudh M <m.anirudh18@gmail.com>
corpus_path = l
break
if not corpus_path:
raise Exception("No corpus path for the given fuzz target")

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Two space indents

import re

fuzzer_target = sys.argv[1].split("/")
(directory_segments, fuzz_test) = (fuzzer_target[:-1], fuzzer_target[-1])

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

No need for parens.


with open(path, 'r') as f:
searchlines = f.readlines()
for (i, line) in enumerate(searchlines):

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

No need for parens.

import sys
import re

fuzzer_target = sys.argv[1].split("/")

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

There's still inconsistency in this file in the use of ' vs. '"`. Please stick to one consistent style.

if "corpus =" in l:
corpus_path = l
break
#!/usr/bin/pxython

This comment has been minimized.

@htuch

htuch Jul 12, 2018

Contributor

Looks like there is a typo now pxython.

addressed comments
Signed-off-by: Anirudh M <m.anirudh18@gmail.com>
@htuch

htuch approved these changes Jul 12, 2018

Copy link
Contributor

htuch left a comment

LGTM, thanks.

import re

fuzzer_target = sys.argv[1].split('/')
directory_segments, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]

This comment has been minimized.

@inferno-chromium

inferno-chromium Jul 13, 2018

Collaborator

please use pythonic functions :) os.path.dirname and os.path.basename

fuzzer_target = sys.argv[1].split('/')
directory_segments, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]
directory = '/'.join(directory_segments)
path = '../envoy/' + directory + '/BUILD'

This comment has been minimized.

@inferno-chromium

inferno-chromium Jul 13, 2018

Collaborator

please use os.path.join('..', 'envoy', ......)

directory_segments, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]
directory = '/'.join(directory_segments)
path = '../envoy/' + directory + '/BUILD'
corpus_path = ""

This comment has been minimized.

@inferno-chromium

inferno-chromium Jul 13, 2018

Collaborator

better to use None

@@ -61,10 +61,11 @@ bazel build --verbose_failures --dynamic_mode=off --spawn_strategy=standalone \
# Copy out test binaries from bazel-bin/ and zip up related test corpuses.
for t in ${FUZZER_TARGETS}
do
CORPUS_TARGET=$(python "${SRC}"/find_corpus.py "$t")

This comment has been minimized.

@inferno-chromium

inferno-chromium Jul 13, 2018

Collaborator

TARGET_CORPUS sounds better for name.

import re

fuzzer_target = sys.argv[1].split('/')
directory_segments, fuzz_test = fuzzer_target[:-1], fuzzer_target[-1]

This comment has been minimized.

@inferno-chromium

inferno-chromium Jul 13, 2018

Collaborator

s/fuzz_test/fuzz_target_name or fuzzer_target_name

addressed comments
Signed-off-by: Anirudh M <m.anirudh18@gmail.com>
@htuch

This comment has been minimized.

Copy link
Contributor

htuch commented Jul 18, 2018

@inferno-chromium any chance you could take another pass? Thanks.

@inferno-chromium inferno-chromium merged commit cd8e557 into google:master Jul 18, 2018

1 check passed

cla/google All necessary CLAs are signed

tmatth added a commit to tmatth/oss-fuzz that referenced this pull request Oct 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment