Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add integration tests and make tests easier to develop #66

Merged
merged 10 commits into from
Jun 5, 2018

Conversation

jameslamb
Copy link
Collaborator

This PR adds a bunch of integration tests and makes it easier to test locally.

  • added creation of an empty index so we can test the "your query was valid but returned nothing" code
  • wrapped all return(NULL) in return(invisible(NULL)) to avoid phantom NULLs in logs
  • changed strategy for scrolling requests. Scrolling queries are not working on ES6 right now. They changed ES6 to this structure.
  • related to that...moved stuff needed for scroll_url down into .keep_on_pullin so that thing can figure out how to format scrolls. This is a cleaner place to handle that anyway
  • added a bunch of integration tests. Including an actual test on scrolling which we somehow never had before
  • added a docker-compose.yml and supporting gnarly bash scripts to make it easier to develop tests locally

Test coverage as of this PR:

image

I think it's important we get this PR reviewed and working before the next release. Addresses #33 , #36 , and to some extent #43

@jameslamb
Copy link
Collaborator Author

uuuugh as I feared, ES 1.x and 2.x have a different strategy for scrolling. ES 5.x must have supported both, and ES6.x only supports the strategy I changed to in this PR.

I don't love this but I have an idea... I think I need to just hit the cluster, parse the version, and use the right method based on major version.

I hate everything.


### Running Tests Locally

When developing on this package, you may want to run Elasticsearch locally to speed up the testing cycle. We've provided some gross bash scripts at the root of this repo to help!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

setup_es.sh Outdated
curl --silent \
-X PUT "http://${ES_HOST}/empty_index" \
-H 'Content-Type: application/json' \
-d @shakespeare_mapping.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused; what makes this index empty? It looks like you're uploading shakespeare_mapping.json again with the same code. Same in the .travis.yml.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we talked but responding here for other viewers on the PR. This is empty in the sense that we never write any docs to it. Like every query you execute against it is going to return 0 docs.

This seemed like the safest way to get test coverage on the code that handles "you wrote a valid query but it happened to match nothing"

@@ -16,22 +16,181 @@ if (!identical(loggerOptions, list())){
}
futile.logger::flog.threshold(0)

#--- 1. es_search
#--- es_search
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice


# NOTE: Creating an intentionally empty index is the safest way to test
# this functionality. Any other test would involve writing a query
# and I want to avoid exposing our tests to changes in the query DSL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like this idea

@austin3dickey
Copy link
Collaborator

Are you sure that scrolling is causing the problems? Diving into the failed builds, it looks like the errors could be unrelated to scrolling. Some erroring versions don't even use scrolling in the failed tests.

Copy link
Collaborator

@austin3dickey austin3dickey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I love this PR. It looks great! I'll review again once you fix the failing build.

@jameslamb
Copy link
Collaborator Author

@austin3dickey this is not ready for a second yet. Just want to let you know what I did tonight:

  1. Just added a tiny bit of test data to the repo (instead of pulling all of that shakespeare data). ES6 doesn't allow multiple _types for docs within one index. So I just intentionally saved a small sample of the data that's all a single type
  2. The scrolling difference is definitely a thing...it just wasn't showing up in the logs because other changes I made to support ES 6.2 broke things (like types in indexes). So I implemented a switch that changes scroll request strategies.

Also I rebased to current master after merging #62 . Will keep working on this! I think the package will be in a more stable place after this PR finally gets working

@jameslamb jameslamb force-pushed the testing_stuff branch 7 times, most recently from 745b9b7 to 3b0cb8d Compare May 28, 2018 22:57
@jameslamb
Copy link
Collaborator Author

ok fixed SOME of the issues causing this to fail. Basically the bulk endpoint was rejecting my test data because of a weird ordering of my curl arguments or newlines in the data file "or something".

Now I need to fix the lingering test failures in legacy ES versions, which I think are a combination of the cluster taking too slow to respond (gonna try adding a _refresh post) and slight changes in the aggs API.

updates to follow

@jameslamb
Copy link
Collaborator Author

jameslamb commented Jun 2, 2018

ok @austin3dickey Travis is working right now but I'm feeling confident that it's gonna build. I had to do a lot of crazy stuff today to get it working, but they're all good changes. Like stuff that was actually broken that this testing exposed. TL;DR uptasticsearch as it stands on master is just barely working for legacy versions and doesn't really work for ES6

changes since you last reviewed:

  • when you don't have any aliases set up for a cluster, in ES1.x _cat/aliases returns "[]". In later versions, it returns a NULL or the empty string "". Changed our code in get_fields() to account for the "[]" case.
  • Fixed a wild bug in .major_version(). a CRAZY error was causing major_version() to resolve to version information about the host machine. Like "x86-apple-darwin". That function takes an argument version_string but I was referencing version internally. base::version is available in R by default and gives information on the version of R you're running. FML. added some tests on .major_version and .get_es_version to catch this kind of stuff in the future
  • Added a separate mapping file for ES5. In ES5, Elasticsearch changed the default behavior for how fielddata is handled. You can read more here. Also added that fielddata change to the mapping for ES6
  • Added a _refresh call in test setup. Looking back, it doesn't seem like the absence of this was an issue, but it is still a thing we should do. It's defense against occasional random failing builds.
  • Changed types in ES6 mapping from "keyword" to "text" so results would be the same as the default behavior of "string" type in legacy versions. Basically our aggs tests where failing because in legacy versions with "string" type, "king henry" would show up as the terms "king" and "henry" (separate entries) in a terms agg. With "keyword" type in ES6, they show up as a single thing.
  • Fixed number-of-rows check for aggregation queries. I was checking for the wrong number of rows in the aggregation tests (should have been 4, not 12)
  • upgraded all our ES testing versions to the last release on each major version (e.g. 5.0.2 instead of 5.0.0). No one should be ignoring bugfix releases.
  • removed all the defaults on internal functions. Since we're passing values through from the higher functions to these, we should drop the defaults to ensure that we catch issues where parameters are not getting passed through. This actually caught something!!! We weren't passing ignore_scroll_restriction through to .fetch_all.
  • updated .Rbuildignore to ignore all the new files that might be laying around as a result of local testing
  • removed use of docker-compose and simplified the "how to test stuff locally" setup. Our tests all assume you're only running on version of ES, so docker-compose was overkill. See the updated README for the setup. I can confirm I used the new setup MANY times to test issues specific to one version of another and it was super easy to work with

I think that when we merge this PR, we'll be able to close #36 . There are some lines that are gonna be super tough to cover in tests, so we should leave #33 open for now. Our coverage after merging this will be 85%-ish, but remember that we aren't testing get_counts and that that function is going to be removed in future releases. So our coverage of core functionality is upwards of 95%

Let me know if you have other questions!

@codecov-io
Copy link

codecov-io commented Jun 2, 2018

Codecov Report

Merging #66 into master will increase coverage by 17.09%.
The diff coverage is 91.17%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #66       +/-   ##
===========================================
+ Coverage   68.82%   85.92%   +17.09%     
===========================================
  Files           9        9               
  Lines         555      604       +49     
===========================================
+ Hits          382      519      +137     
+ Misses        173       85       -88
Impacted Files Coverage Δ
R/get_counts.R 0% <0%> (ø) ⬆️
R/get_fields.R 98.27% <100%> (+16.45%) ⬆️
R/es_search.R 92.5% <100%> (+38.76%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a4ad753...2c916f8. Read the comment docs.

@jameslamb
Copy link
Collaborator Author

HEY IT ACTUALLY BUILT

Copy link
Collaborator

@austin3dickey austin3dickey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing. What a :hero: you are

@jameslamb
Copy link
Collaborator Author

Hey thanks @austin3dickey for looking it over!! Gonna merge it up

@jameslamb jameslamb merged commit b07ccd9 into uptake:master Jun 5, 2018
@jameslamb jameslamb deleted the testing_stuff branch June 23, 2018 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants