move survey results into tests
Dave Pacheco committed Jul 30, 2012
1 parent adb5504 commit f2931e6
Showing 11 changed files with 75 additions and 68 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
tests/StringSplitTest.class
tests/java.csv
tests/perl.csv
tests/python.csv
tests/js-strsplit.csv
26 changes: 23 additions & 3 deletions Makefile
@@ -22,23 +22,43 @@ NPM = npm
 #
 # Files
 #
-JS_FILES := $(shell find lib tests survey -name '*.js')
+JS_FILES := $(shell find lib tests -name '*.js')
 JSL_CONF_NODE = tools/jsl.node.conf
 JSL_FILES_NODE = $(JS_FILES)
 JSSTYLE_FILES = $(JS_FILES)

+TEST_FILES = java.csv perl.csv python.csv js-strsplit.csv
+TEST_OUTPUTS = $(TEST_FILES:%=tests/%)
+CLEAN_FILES += $(TEST_OUTPUTS) tests/StringSplitTest.class
+
 #
 # Repo-specific targets
 #
 .PHONY: all
-all:
+all: $(TEST_OUTPUTS)
 	$(NPM) install

-test:
+.PHONY: test
+test: $(TEST_OUTPUTS)
 	tests/tst.strsplit.sh
 	tests/tst.strpatterns.js
 	@echo All tests passed.

+tests/java.csv: tests/testcases.csv tests/StringSplitTest.class
+	java -cp tests StringSplitTest < $< > $@
+
+tests/StringSplitTest.class: tests/StringSplitTest.java
+	javac $^
+
+tests/js-strsplit.csv: tests/testcases.csv tests/strsplit.js
+	tests/strsplit.js < $< > $@
+
+tests/perl.csv: tests/testcases.csv tests/strsplit.pl
+	tests/strsplit.pl < $< > $@
+
+tests/python.csv: tests/testcases.csv tests/strsplit.py
+	tests/strsplit.py < $< > $@
+
 DISTCLEAN_FILES += node_modules

 include ./Makefile.targ
40 changes: 40 additions & 0 deletions README.md
@@ -71,3 +71,43 @@ and here's strsplit:
[ 'alpha', 'bravo', 'charlie delta' ]

This is the behavior implemented by `split` in Perl, Java, and Python.

## Background: survey of "split" in Java, Perl, and Python

The tests directory contains test cases and test programs in Java, Perl, and
Python for figuring out what each language's string split function does.
Specifically, these functions are:

* Java: String.split.
* Perl: split.
* Python: re.split. While the "split" method on strings may be more common, it
does not handle regular expressions, while the Java and Perl counterparts do.

For comparison, there's also a test case for this implementation of "strsplit"
in JavaScript.

The test cases here test both a simple string as a splitter (a space) and a
simple regular expression (`\s+`, indicating some non-zero number of whitespace
characters), as well as various values of the optional "limit" parameter.

In summary, in all of the cases tried, the Java and Perl implementations are
identical. The Python implementation differs in a few ways:

* The "limit" argument is off-by-one relative to the Java and Perl APIs. It
represents the maximum number of splits to be made, rather than the maximum
number of returned fields.
* -1 for "limit" is not special, and seems to mean that at most -1 splits will
be made, meaning the string is not split at all. In Java and Perl, -1 means
there is no limit to the number of returned fields.
* Java and Perl strip trailing empty fields when "limit" is 0. Python never
strips trailing empty fields.
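
The Python differences listed above can be reproduced directly with `re.split`. This is a minimal sketch, not one of the test programs in the tests directory; the input strings and variable names are made up for illustration.

```python
import re

text = "alpha bravo charlie delta"

# maxsplit counts splits, not fields: 2 splits produce 3 fields.
# The Java/Perl "limit" for the same result would be 3.
two_splits = re.split(r"\s+", text, maxsplit=2)
print(two_splits)   # ['alpha', 'bravo', 'charlie delta']

# A negative maxsplit is not "unlimited": no split is made at all.
negative = re.split(r"\s+", text, maxsplit=-1)
print(negative)     # ['alpha bravo charlie delta']

# Trailing empty fields are kept, even with the default (unlimited) maxsplit.
trailing = re.split(" ", "alpha bravo  ")
print(trailing)     # ['alpha', 'bravo', '', '']
```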

JavaScript has a "split" method, but it behaves substantially differently from
all of these implementations when "limit" is specified. This implementation of
"strsplit" for JavaScript mirrors the Java and Perl implementations, as the
differences in Python seem neither substantial nor better.
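
To make the Java and Perl "limit" rules concrete, here is a rough sketch of them layered on top of Python's `re.split`. The helper name `java_style_split` is hypothetical and is not part of this repository. It only covers the three rules described in this section, and it ignores edge cases such as capturing groups in the pattern or input that contains no match.

```python
import re

def java_style_split(pattern, text, limit=0):
    """Hypothetical helper: split text using Java/Perl-style "limit" rules."""
    if limit > 0:
        # "limit" is the maximum number of fields, so allow limit - 1 splits.
        if limit == 1:
            # maxsplit=0 means "no limit" in Python, so handle this case here.
            return [text]
        return re.split(pattern, text, maxsplit=limit - 1)

    # Zero or negative limit: split as many times as possible.
    fields = re.split(pattern, text)
    if limit == 0:
        # Only a limit of exactly 0 strips trailing empty fields.
        while fields and fields[-1] == "":
            fields.pop()
    return fields

print(java_style_split(r"\s+", "alpha bravo charlie delta", 3))
# ['alpha', 'bravo', 'charlie delta']
```

With this sketch, a limit of 0 on `"alpha bravo  "` split on a single space drops the two trailing empty fields, while a limit of -1 keeps them.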

The remaining use case that would be nice to address is splitting fields the way
awk(1) and bash(1) do, which is to strip leading whitespace. Python's *string*
split also does this, but only if you specify None as the pattern. strsplit
doesn't support this; just trim the string first if you want that behavior.
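
The leading-whitespace difference is easy to see in Python itself. The following is a small illustration using a made-up input line: the string method with `None` as the separator drops the leading empty field, the regular-expression split does not, and trimming first gives the awk-like result.

```python
import re

line = "  alpha bravo charlie"

# String split with None as the separator ignores leading whitespace.
awk_like = line.split(None)
print(awk_like)       # ['alpha', 'bravo', 'charlie']

# Regular-expression split treats the leading whitespace as a separator,
# which leaves an empty first field.
regex_fields = re.split(r"\s+", line)
print(regex_fields)   # ['', 'alpha', 'bravo', 'charlie']

# Trimming the string first gives the awk-like behavior.
trimmed = re.split(r"\s+", line.strip())
print(trimmed)        # ['alpha', 'bravo', 'charlie']
```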
19 changes: 0 additions & 19 deletions survey/Makefile

This file was deleted.

39 changes: 0 additions & 39 deletions survey/README.md

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
14 changes: 7 additions & 7 deletions tests/tst.strsplit.sh
@@ -2,18 +2,18 @@

 #
 # The main test suite for strsplit is to run the body of test cases in
-# ../survey/ and compare the output to that of Java and Perl, whose
-# implementations we intend to mirror exactly. errexit will cause this script
-# to exit with failure if any of these operations fail.
+# testcases.csv and compare the output to that of Java and Perl, whose
+# implementations we intend to mirror exactly. All of these outputs have been
+# generated automatically by "make test". errexit will cause this script to
+# exit with failure if any of these operations fail.
 #
 set -o errexit

-surveydir=$(dirname $0)/../survey
+cd $(dirname $0)

 set -o xtrace
-make -C $surveydir perl.csv java.csv js-strsplit.csv
-diff $surveydir/js-strsplit.csv $surveydir/perl.csv > /dev/null
-diff $surveydir/js-strsplit.csv $surveydir/java.csv > /dev/null
+diff js-strsplit.csv perl.csv > /dev/null
+diff js-strsplit.csv java.csv > /dev/null
 set +o xtrace

 echo "Test PASSED"
