unit tests sometimes have sporadic failures #7

Closed
brarcher opened this issue Dec 23, 2015 · 4 comments

@brarcher
Contributor

On OSX, at least, some of the unit tests fail sporadically. Below are some example failures as output by tests/run:

-n o tests/ts1.sh: 
sort: string comparison failed: Illegal byte sequence
sort: Set LC_ALL='C' to work around the problem.
sort: The strings compared were `\'v\217\347' and `\376"i\3317.\345\250V\333w>\346\311\203\034\316=\337~\233n\320\325\005\371\320Sp\301|\247"\036\024\221\247\016\213\222;\256=<c&\3224'.
-n  o tests/tr2.sh: 
sort: string comparison failed: Illegal byte sequence
sort: Set LC_ALL='C' to work around the problem.
sort: The strings compared were `6\030v\317\3133о\366|\263htv9\0173\340\2421\275F\a\232\360DJ\017\233\037:)\241\023\375\350.\272\r<6\201\002\330\203+\005\221#\355$\343\321F\357T\036\264g[>]\344\200Ę\265s\236\031E\302-\220ǰܺob<\210\004\32415\246\300{ˏ\030\270xrژ\335/\243_ވ8\255y\a\177\362\234!\251N\336\322\371\325p\024\f\241\353&#6\371\204\313\020V\031\311\210V\302\004\\\237\374\316\215!i\357s\231,P\373+\346\303\310tX\300\355\177\247R\347u:3bA-\2148\03114\361\271k\241\376\247/\033\271S\\|,\a#\200w\237\374\002\232!\024\316\346\371C\017\370\354˕\343\241\301\244\025\2763\000iÜP\340\021.\001\301\246\304\363\233\266\022!\030\232L\024\204\311K\030\340\3249.\310\354\a_\t\374{j.$0\021q\267\252<\021\023\260\301Z\235m\005\330H\342~\016\t\242\310\303Oڏ\210S\311\177\275\240\345AwQ g\334\370\302\336\021\207\r}`;8\326Ҵ\270.\363q6\325J,\234(\253QƼ\226V\310\301$W\231A\273\033\000\251\274ѥ\321\322\027\320\000뚦{@\277~-\205\343݅E\200\341\032\203\240\027\3338\366z\351CM6\177C\201\312(N\273\346\201d\200\032\177\371*\177\sort: string comparison failed: Illegal byte sequence
sort: Set LC_ALL='C' to work around the problem.
sort: The strings compared were `\rX\266\204(a) (b)' and `\rX\266\204'.
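
The workaround that sort itself suggests can be sketched as below. This is only an illustration of forcing the C locale for a single sort invocation so that bytes are compared as raw values; the helper name sort_bytes is hypothetical and is not part of tests/run.

```shell
# Hypothetical helper (not from the radamsa test suite): run sort in the
# C locale so that arbitrary bytes are compared as raw byte values
# instead of being decoded as multibyte characters.
sort_bytes() {
   LC_ALL=C sort "$@"
}

# Usage: input containing bytes that are not valid UTF-8.
printf 'b\377\n\376a\nab\n' | sort_bytes > /dev/null
```

Setting LC_ALL only as a prefix to the command keeps the change local to that one sort call, so the rest of the script still runs in the user's locale.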

Some of the unit tests emit no output when they fail, which makes the failure hard to diagnose. Here is an example that invokes tests/ab.sh directly:

$ rc=0
$ attempt=1
$ while [ $rc -eq 0 ]; do tests/ab.sh bin/radamsa ; rc=$?; echo $attempt; attempt=$((attempt+1)); done
1
2
3
4
5
6
7
8
9
10
11
12
$

The 12th attempt failed, but no reason for the failure was emitted.

Presumably the unit test results are expected to be consistent. If the current revision in git is under development and the sporadic failures are expected, or some cleanup is still underway, kindly ignore this. I was unable to determine whether release v0.4's unit tests hit sporadic failures, because issue #5 affects the v0.4 release.

By comparison, release v0.3's unit tests passed consistently.

(As a side note, at least on OSX the built-in echo command in sh does not support the -n option. This is why "-n" is printed before every test in tests/run. Consider reworking how echo is used in that script so that the -n option is unnecessary, if relevant.)
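
One portable way to avoid echo -n is printf, which does not append a newline unless asked to. The sketch below is an assumption about what such a rework could look like, not the actual contents of tests/run; the function name print_label and the exact label layout are hypothetical.

```shell
# Hypothetical replacement for an `echo -n " o $test: "` style call.
# printf never adds a trailing newline on its own, so no -n flag is needed.
print_label() {
   printf ' o %s: ' "$1"
}

# Usage: the label and the result end up on the same line.
print_label tests/ts1.sh
echo "ok"
```

Passing the test name as an argument to a fixed format string also avoids surprises if a name ever contains a percent sign or a backslash.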

@aoh
Owner

aoh commented Dec 23, 2015

Hi,

Great! Build issues are very welcome. I'll fix this soonish when I have spare time. My *BSD buildbots are currently offline, so it might be that I haven't noticed some issues on BSDish platforms.

There have been many issues with unit tests on OSX due to minor differences in arguments etc. It might make sense to also use the version of owl used for building for sorting and echoing in the tests.

@aoh
Owner

aoh commented Dec 23, 2015

So the behavior is intended: when the input is very short, radamsa is supposed to pad it with random data with a low probability. This is done to improve coverage of tests with tiny fixed inputs. The only issue is OSX sort getting confused by non-textual data. That does not matter, because the tests are probabilistic and are expected to fail on occasion, in which case they are retried many times before tests/run considers them failed. Sort's stderr is now redirected to /dev/null, so it doesn't get in the way.
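
The retry idea can be illustrated with a minimal sketch. This is not how tests/run is actually written; the function name retry and the attempt limit are assumptions made for the example.

```shell
# Hypothetical sketch of retrying a probabilistic test: run the given
# command up to $1 times and succeed as soon as one attempt passes.
retry() {
   max=$1
   shift
   attempt=1
   while [ "$attempt" -le "$max" ]; do
      if "$@"; then
         return 0
      fi
      attempt=$((attempt + 1))
   done
   return 1   # every attempt failed
}

# Usage (illustrative only): retry 20 tests/ab.sh bin/radamsa
```

With a wrapper like this, a single unlucky run is absorbed silently and only a test that fails on every attempt is reported as a failure.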

Does the build work otherwise on OSX?

@brarcher
Contributor Author

> The only issue is with OSX sort getting confused by non-textual data.

Oh, I did not realize that the output was not representative of a failure. Sorry for the false alarm on those tests.

I've modified ab.sh as follows to determine why it is sometimes failing:

# check bad string insertion happens as intended (more likely within quoted area)
mkdir -p tmp

echo '-----------------------------------------------------------------""---------------------------------------------------------------------------' \
   | "$@" -m ab -p od -n 20 > tmp/ab.sh.tmp
grep -q '^-*".*%.*"-*$' tmp/ab.sh.tmp
rc=$?
if [ $rc -ne 0 ]; then
   echo "Unexpected output:"
   cat tmp/ab.sh.tmp
   exit 1
fi

Attached is the output from one such failed test run.

ab.sh.tmp

> This does not matter, because the tests are probabilistic and are expected to fail on occasion, in which case they are tried again many times before tests/run considers them to fail.

The unit tests run automatically when Radamsa is built. As unit tests may sporadically fail, could they be moved into their own make target (e.g. "make check") so that they do not run automatically? Otherwise, one may build Radamsa, see a failure, and wonder whether the sporadic failure indicates an issue with Radamsa on one's platform.

Sure, I can understand that Radamsa, being probabilistic, may not always fuzz as expected. However, is it possible to modify the unit tests which may fail with some probability so that the tests will be more deterministic?

> Does the build work otherwise on OSX?

Seems to work so far. I've not tried the TCP stuff yet, which looks interesting.

@aoh
Owner

aoh commented Dec 24, 2015

Makes sense. There is now a separate test build target, which runs the tests if necessary. I also added a thanks-section to readme.md.

aoh closed this as completed Dec 24, 2015