Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

End-to-End Tests #70

Merged
merged 7 commits into from Jul 5, 2018
Merged

End-to-End Tests #70

merged 7 commits into from Jul 5, 2018

Conversation

niklas88
Copy link
Member

@niklas88 niklas88 commented Jun 29, 2018

This PR adds a simple End-to-End testing mechanism to start tackling #68. Using a bash script as the test driver together with a simple query tool written in python. It tests the entire data set to query pipeline on the scientist-collection

The queries are stored in an easy to edit YAML file and include per query checks for the result. At the moment I have only added a small set of example queries. I plant to significantly expand these soon.

One particular check that is still missing is one that verifies order.

@niklas88
Copy link
Member Author

niklas88 commented Jul 2, 2018

@floriankramer can you give this a quick look before I merge? This doesn't change anything on QLever so should be very safe. I'm not sure whether we should merge with the current set of test queries and then add new ones or add them before the merge. Currently I'm trending towards merging before since this would mean that the other pending PRs get the End-to-End Test treatment as soon as possible and we can immediately add specific example queries for the introduced features together with that PR. On the other hand there are still some existing features without corresponding test queries. Especially patterns and GROUP_CONCAT

Copy link
Member

@floriankramer floriankramer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me all in all apart from a couple of minor things. I would second merging this as soon as possible and then adding more tests in later commits.

README.md Outdated

**Note**: This does not include compilation and unit tests, though these are
also run on Travis CI. Refer to the previous section for Unit Tests and
compilation. Also this does assume
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence appears to be missing its second half.

e2e/e2e.sh Outdated

# Travis CI is super cool but also uses ancient OS images
if [ -f "/usr/bin/python3.6" ]; then
export PYTHON_BINARY="/usr/bin/python3.6"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this check specifically for the python 3.6 executable, and not just python3? On a normal setup python3 should point to the newest python3 available and python3 versions should be backwards compatible with other python3 versions.

Copy link
Member Author

@niklas88 niklas88 Jul 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Travis CI runs Ubuntu 14.04 and the system Python can't cope with the typing package. So the .travis.yml installs python3.6 from the deadsnakes repo but that doesn't override the system python.

e2e/e2e.sh Outdated
cd build && ./IndexBuilderMain -a -l -i "../$INDEX" \
-n "../e2e_data/scientist-collection/scientists.nt" \
-w "../e2e_data/scientist-collection/scientists.wordsfile.tsv" \
-d "../e2e_data/scientist-collection/scientists.docsfile.tsv"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To enable tests than use the patterns trick either -P or --patterns is required during index creation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

e2e/e2e.sh Outdated

# Launch the Server using the freshly baked index
(
cd build && ./ServerMain -i "../$INDEX" -p 9099 -t -a -l &> server_log.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs the -P or --patterns flag if it should use the patterns trick.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

e2e/queryit.py Outdated
import json
import yaml

class BCol:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use the builtin Enum class for proper enum like behaviour. (https://docs.python.org/3.6/library/enum.html)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the problem that then I can no longer concatenate BCol.BOLD directly to a string because it would then convert to "BCol.BOLD"

e2e/e2e.sh Outdated
fi


mkdir -p "e2e_data"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the script is called from the root folder of the repo. When using the script for manual testing it might be easier to be able to call the script from anywhere and then have the script cd to the scripts location here (e.g. using ${BASH_SOURCE[0]}.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't know about ${BASH_SOURCE[0]} very handy, thanks.

Add a simple query script, bash script for driving the test and a set of
queries against the scientists-collection.

Currently we are only checking the results for sucess and minimal
fields. Further checks will come in a later commit
This adds a way to add checks to the Query YAML including checks for:

- Number of rows in the result
- Number of columns in the result
- Exactly matched result
- Result contains row

Checks for the exact result and contained row allow ignoring cells (by
using null in the gold result) and matching floats to an epsilon.

See the e2e/scientists_queries.yaml for examples
Also we add an option to skip creation of the index
The directory is created by Travis' caching mechanism
We use FILTER(?x < ?y) to make sure we only get one direction for each
marriage relation.
Automatically detect the project directory using ${BASH_SOURCE[0]}.

Don't use subshells for starting the ServerMain, it introduced a subtle
bug since the SERVER_PID was actually the PID of the subshell.  Turns
out there's no easy way to get the correct PID out of the subshell, so
instead we just use pushd/popd and no subshell.
@niklas88 niklas88 merged commit 2cbf8d1 into ad-freiburg:master Jul 5, 2018
@niklas88 niklas88 deleted the e2e-test branch July 5, 2018 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants