Add integration tests and make tests easier to develop #66

jameslamb · 2018-05-22T04:11:01Z

This PR adds a bunch of integration tests and makes it easier to test locally.

added creation of an empty index so we can test the "your query was valid but returned nothing" code
wrapped all return(NULL) in return(invisible(NULL)) to avoid phantom NULLs in logs
changed strategy for scrolling requests. Scrolling queries are not working on ES6 right now. They changed ES6 to this structure.
related to that...moved stuff needed for scroll_url down into .keep_on_pullin so that thing can figure out how to format scrolls. This is a cleaner place to handle that anyway
added a bunch of integration tests. Including an actual test on scrolling which we somehow never had before
added a docker-compose.yml and supporting gnarly bash scripts to make it easier to develop tests locally

Test coverage as of this PR:

I think it's important we get this PR reviewed and working before the next release. Addresses #33 , #36 , and to some extent #43

jameslamb · 2018-05-22T04:23:53Z

uuuugh as I feared, ES 1.x and 2.x have a different strategy for scrolling. ES 5.x must have supported both, and ES6.x only supports the strategy I changed to in this PR.

I don't love this but I have an idea... I think I need to just hit the cluster, parse the version, and use the right method based on major version.

I hate everything.

austin3dickey · 2018-05-22T13:20:50Z

README.md

+
+### Running Tests Locally
+
+When developing on this package, you may want to run Elasticsearch locally to speed up the testing cycle. We've provided some gross bash scripts at the root of this repo to help!


austin3dickey · 2018-05-22T13:24:45Z

setup_es.sh

+curl --silent \
+     -X PUT "http://${ES_HOST}/empty_index" \
+     -H 'Content-Type: application/json' \
+     -d @shakespeare_mapping.json


I'm confused; what makes this index empty? It looks like you're uploading shakespeare_mapping.json again with the same code. Same in the .travis.yml.

we talked but responding here for other viewers on the PR. This is empty in the sense that we never write any docs to it. Like every query you execute against it is going to return 0 docs.

This seemed like the safest way to get test coverage on the code that handles "you wrote a valid query but it happened to match nothing"

austin3dickey · 2018-05-22T13:25:44Z

tests/testthat/test-integration.R

@@ -16,22 +16,181 @@ if (!identical(loggerOptions, list())){
 }
 futile.logger::flog.threshold(0)

-#--- 1. es_search
+#--- es_search


austin3dickey · 2018-05-22T13:26:38Z

tests/testthat/test-integration.R

+
+        # NOTE: Creating an intentionally empty index is the safest way to test
+        #       this functionality. Any other test would involve writing a query
+        #       and I want to avoid exposing our tests to changes in the query DSL


Yeah I like this idea

austin3dickey · 2018-05-22T13:34:11Z

Are you sure that scrolling is causing the problems? Diving into the failed builds, it looks like the errors could be unrelated to scrolling. Some erroring versions don't even use scrolling in the failed tests.

austin3dickey

Overall I love this PR. It looks great! I'll review again once you fix the failing build.

jameslamb · 2018-05-23T05:46:57Z

@austin3dickey this is not ready for a second yet. Just want to let you know what I did tonight:

Just added a tiny bit of test data to the repo (instead of pulling all of that shakespeare data). ES6 doesn't allow multiple _types for docs within one index. So I just intentionally saved a small sample of the data that's all a single type
The scrolling difference is definitely a thing...it just wasn't showing up in the logs because other changes I made to support ES 6.2 broke things (like types in indexes). So I implemented a switch that changes scroll request strategies.

Also I rebased to current master after merging #62 . Will keep working on this! I think the package will be in a more stable place after this PR finally gets working

jameslamb · 2018-05-28T23:11:57Z

ok fixed SOME of the issues causing this to fail. Basically the bulk endpoint was rejecting my test data because of a weird ordering of my curl arguments or newlines in the data file "or something".

Now I need to fix the lingering test failures in legacy ES versions, which I think are a combination of the cluster taking too slow to respond (gonna try adding a _refresh post) and slight changes in the aggs API.

updates to follow

… bugs for scrolling behavior in ES6.

…s ES 6

…tring types in ES6

…m slow starts

…ving localhost on Travis

…g environment to allow easy build of any ES version

…S versions

…buildignore

jameslamb · 2018-06-02T19:13:07Z

ok @austin3dickey Travis is working right now but I'm feeling confident that it's gonna build. I had to do a lot of crazy stuff today to get it working, but they're all good changes. Like stuff that was actually broken that this testing exposed. TL;DR uptasticsearch as it stands on master is just barely working for legacy versions and doesn't really work for ES6

changes since you last reviewed:

when you don't have any aliases set up for a cluster, in ES1.x _cat/aliases returns "[]". In later versions, it returns a NULL or the empty string "". Changed our code in get_fields() to account for the "[]" case.
Fixed a wild bug in .major_version(). a CRAZY error was causing major_version() to resolve to version information about the host machine. Like "x86-apple-darwin". That function takes an argument version_string but I was referencing version internally. base::version is available in R by default and gives information on the version of R you're running. FML. added some tests on .major_version and .get_es_version to catch this kind of stuff in the future
Added a separate mapping file for ES5. In ES5, Elasticsearch changed the default behavior for how fielddata is handled. You can read more here. Also added that fielddata change to the mapping for ES6
Added a _refresh call in test setup. Looking back, it doesn't seem like the absence of this was an issue, but it is still a thing we should do. It's defense against occasional random failing builds.
Changed types in ES6 mapping from "keyword" to "text" so results would be the same as the default behavior of "string" type in legacy versions. Basically our aggs tests where failing because in legacy versions with "string" type, "king henry" would show up as the terms "king" and "henry" (separate entries) in a terms agg. With "keyword" type in ES6, they show up as a single thing.
Fixed number-of-rows check for aggregation queries. I was checking for the wrong number of rows in the aggregation tests (should have been 4, not 12)
upgraded all our ES testing versions to the last release on each major version (e.g. 5.0.2 instead of 5.0.0). No one should be ignoring bugfix releases.
removed all the defaults on internal functions. Since we're passing values through from the higher functions to these, we should drop the defaults to ensure that we catch issues where parameters are not getting passed through. This actually caught something!!! We weren't passing ignore_scroll_restriction through to .fetch_all.
updated .Rbuildignore to ignore all the new files that might be laying around as a result of local testing
removed use of docker-compose and simplified the "how to test stuff locally" setup. Our tests all assume you're only running on version of ES, so docker-compose was overkill. See the updated README for the setup. I can confirm I used the new setup MANY times to test issues specific to one version of another and it was super easy to work with

I think that when we merge this PR, we'll be able to close #36 . There are some lines that are gonna be super tough to cover in tests, so we should leave #33 open for now. Our coverage after merging this will be 85%-ish, but remember that we aren't testing get_counts and that that function is going to be removed in future releases. So our coverage of core functionality is upwards of 95%

Let me know if you have other questions!

codecov-io · 2018-06-02T19:17:54Z

Codecov Report

Merging #66 into master will increase coverage by 17.09%.
The diff coverage is 91.17%.

@@             Coverage Diff             @@
##           master      #66       +/-   ##
===========================================
+ Coverage   68.82%   85.92%   +17.09%     
===========================================
  Files           9        9               
  Lines         555      604       +49     
===========================================
+ Hits          382      519      +137     
+ Misses        173       85       -88

Impacted Files	Coverage Δ
R/get_counts.R	`0% <0%> (ø)`	⬆️
R/get_fields.R	`98.27% <100%> (+16.45%)`	⬆️
R/es_search.R	`92.5% <100%> (+38.76%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a4ad753...2c916f8. Read the comment docs.

jameslamb · 2018-06-02T19:19:33Z

HEY IT ACTUALLY BUILT

austin3dickey

Looks amazing. What a :hero: you are

jameslamb · 2018-06-05T23:14:47Z

Hey thanks @austin3dickey for looking it over!! Gonna merge it up

jameslamb requested review from ngparas and austin3dickey May 22, 2018 04:11

austin3dickey reviewed May 22, 2018

View reviewed changes

austin3dickey requested changes May 22, 2018

View reviewed changes

jameslamb force-pushed the testing_stuff branch from 8bb6c3b to b210f1d Compare May 23, 2018 05:43

jameslamb force-pushed the testing_stuff branch 7 times, most recently from 745b9b7 to 3b0cb8d Compare May 28, 2018 22:57

jameslamb mentioned this pull request May 29, 2018

Add deprecation warning on get_counts #69

Merged

jameslamb force-pushed the testing_stuff branch 3 times, most recently from f2d8cf0 to 90255c3 Compare May 31, 2018 03:01

James Lamb and others added 5 commits June 2, 2018 11:06

Added a bunch of integration tests and infra for local testing. Fixed…

4b4f1da

… bugs for scrolling behavior in ES6.

Simplified data loading to get around differences in older versions v…

cf900ba

…s ES 6

Split testing indexes into legacy vs new to deal with the change in s…

053920b

…tring types in ES6

Added a _refresh call to testing scripts to avoid false positives fro…

9fe9a22

…m slow starts

Changed hostname in integration tests to deal with weirdness in resol…

6081b87

…ving localhost on Travis

jameslamb force-pushed the testing_stuff branch from 90255c3 to 6081b87 Compare June 2, 2018 16:15

Fixed bugs in get_fields and .major_version

e188682

jameslamb force-pushed the testing_stuff branch from 140db8c to e188682 Compare June 2, 2018 17:00

jameslamb added 4 commits June 2, 2018 13:09

Added separate test mapping for Elasticsearch 5. Updated local testin…

d417872

…g environment to allow easy build of any ES version

Changed ES6 test mapping to give consistent aggs results across all E…

37d9701

…S versions

Removed defaults on internal functions. Added misc testing files to R…

1604214

…buildignore

Added tests on internal functions used to determine ES version

2c916f8

austin3dickey approved these changes Jun 5, 2018

View reviewed changes

jameslamb merged commit b07ccd9 into uptake:master Jun 5, 2018

This was referenced Jun 5, 2018

Integration tests are very minimal right now #36

Closed

R package unit test coverage is not 100% #33

Open

jameslamb deleted the testing_stuff branch June 23, 2018 18:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add integration tests and make tests easier to develop #66

Add integration tests and make tests easier to develop #66

jameslamb commented May 22, 2018

jameslamb commented May 22, 2018

austin3dickey May 22, 2018

austin3dickey May 22, 2018

jameslamb May 23, 2018

austin3dickey May 22, 2018

austin3dickey May 22, 2018

austin3dickey commented May 22, 2018

austin3dickey left a comment

jameslamb commented May 23, 2018

jameslamb commented May 28, 2018

jameslamb commented Jun 2, 2018 •

edited

Loading

codecov-io commented Jun 2, 2018 •

edited

Loading

jameslamb commented Jun 2, 2018

austin3dickey left a comment

jameslamb commented Jun 5, 2018


		### Running Tests Locally

		When developing on this package, you may want to run Elasticsearch locally to speed up the testing cycle. We've provided some gross bash scripts at the root of this repo to help!

Add integration tests and make tests easier to develop #66

Add integration tests and make tests easier to develop #66

Conversation

jameslamb commented May 22, 2018

jameslamb commented May 22, 2018

austin3dickey May 22, 2018

Choose a reason for hiding this comment

austin3dickey May 22, 2018

Choose a reason for hiding this comment

jameslamb May 23, 2018

Choose a reason for hiding this comment

austin3dickey May 22, 2018

Choose a reason for hiding this comment

austin3dickey May 22, 2018

Choose a reason for hiding this comment

austin3dickey commented May 22, 2018

austin3dickey left a comment

Choose a reason for hiding this comment

jameslamb commented May 23, 2018

jameslamb commented May 28, 2018

jameslamb commented Jun 2, 2018 • edited Loading

codecov-io commented Jun 2, 2018 • edited Loading

Codecov Report

jameslamb commented Jun 2, 2018

austin3dickey left a comment

Choose a reason for hiding this comment

jameslamb commented Jun 5, 2018

jameslamb commented Jun 2, 2018 •

edited

Loading

codecov-io commented Jun 2, 2018 •

edited

Loading