Rollover max docs should only count primaries #24977

Merged
merged 10 commits into from
Jun 13, 2017

@fred84 fred84 commented May 31, 2017

max_doc condition for index rollover should use document count only from primary shards

Fixes #24217

@elasticmachine
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

karmi commented May 31, 2017

Hi @fred84, we have found your signature in our records, but it seems you signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile?

@bleskes bleskes left a comment

Thank you for the PR. The change looks good but I would love to move the test to be a unit test. I left a suggestion on how to do it.

- do:
    indices.create:
      index: logs-1
      wait_for_active_shards: all

Contributor

I would much prefer it if you added a unit test in TransportRolloverActionTests instead. The REST yaml tests are there to make sure we pass requests and read responses correctly, but they carry a big overhead when used to test simple inner behavior. To do so, you can add an overload of evaluateConditions that takes an IndicesStatsResponse and translates it to a call to an evaluateConditions which gets the number of docs (it seems that's the only stat we use in here).
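
The overload suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions: `StatsSketch` and `MaxDocsSketch` are hypothetical stand-ins, not the real `IndicesStatsResponse` or condition classes, and the condition is assumed to match once the count reaches its limit. The point is that the stats-taking overload only selects the primaries count and delegates, so the core logic can be unit tested with a plain number.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; NOT the real Elasticsearch classes.
final class RolloverOverloadSketch {

    /** Simplified stats holder: doc counts for primaries only and for all shard copies. */
    static final class StatsSketch {
        final long primaryDocs;
        final long totalDocs;

        StatsSketch(long primaryDocs, long totalDocs) {
            this.primaryDocs = primaryDocs;
            this.totalDocs = totalDocs;
        }
    }

    /** A max_docs-style condition; assumed to match once the count reaches the limit. */
    static final class MaxDocsSketch {
        final long limit;

        MaxDocsSketch(long limit) {
            this.limit = limit;
        }

        boolean matches(long numDocs) {
            return numDocs >= limit;
        }
    }

    /** Overload taking the stats object: selects the primaries count and delegates. */
    static Map<Long, Boolean> evaluateConditions(List<MaxDocsSketch> conditions, StatsSketch stats) {
        return evaluateConditions(conditions, stats.primaryDocs);
    }

    /** Core overload: works on a plain doc count, so it is cheap to unit test. */
    static Map<Long, Boolean> evaluateConditions(List<MaxDocsSketch> conditions, long numDocs) {
        Map<Long, Boolean> results = new LinkedHashMap<>();
        for (MaxDocsSketch condition : conditions) {
            results.put(condition.limit, condition.matches(numDocs));
        }
        return results;
    }

    public static void main(String[] args) {
        // One doc on the primary plus one replica copy: total is 2, primaries is 1.
        StatsSketch stats = new StatsSketch(1, 2);
        List<MaxDocsSketch> conditions = Arrays.asList(new MaxDocsSketch(1), new MaxDocsSketch(2));
        System.out.println(evaluateConditions(conditions, stats));
    }
}
```

With one primary doc and one replica copy, the limit of 2 is not reached in this sketch, because the replica copy is ignored.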

Contributor Author

Thanks for the suggestion. I'll create a unit test. Should I keep the yaml test?

Contributor

yeah please keep it

Contributor Author

@bleskes I added 2 more tests to TransportRolloverActionTests

fred84 commented May 31, 2017

@karmi I used an incorrect email in the commit. Should I create a new PR?

s1monw commented May 31, 2017

@karmi I used an incorrect email in the commit. Should I create a new PR?

@fred84 you can change your git settings, then rebase the PR and force push to your branch. That should fix it.

@clintongormley clintongormley changed the title 24217 rollover max docs Rollover max docs should only count primaries May 31, 2017
@clintongormley clintongormley added :Data Management/Indices APIs APIs to create and manage indices and templates >bug labels May 31, 2017
@bleskes bleskes left a comment

Thank you @fred84! I think we are moving in the right direction. I would like to suggest another simplification. We already have a fairly extensive test for the condition logic (testEvaluateConditions). We don't really need to duplicate that testing logic. I think the only thing we need to test here is the conversion between IndicesStatsResponse and the DocStats used for the conditions. We can do so, using the new method you added, by adding a single test called testDocStatsSelection (or something like that) that has a single custom condition which asserts that the numbers it got are what you expect. You can then randomize the stats in an IndicesStatsResponse and use the custom condition to assert that the right ones (i.e., the primaries) have been passed into it.
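
The idea of a single custom condition that checks which numbers reach it can be sketched as follows. This is a hedged illustration, not the test that ended up in TransportRolloverActionTests: `StatsSketch`, `evaluate`, `docsSeenByCondition` and `checkRandomized` are hypothetical names, and the stats object is a simplified stand-in for `IndicesStatsResponse`. The recording condition captures the doc count it is given, and the randomized loop keeps primaries and total different so that passing the wrong one would be detected.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongPredicate;

// Hypothetical stand-ins for illustration only; NOT the real Elasticsearch test code.
final class DocStatsSelectionSketch {

    /** Simplified stats holder: doc counts for primaries only and for all shard copies. */
    static final class StatsSketch {
        final long primaryDocs;
        final long totalDocs;

        StatsSketch(long primaryDocs, long totalDocs) {
            this.primaryDocs = primaryDocs;
            this.totalDocs = totalDocs;
        }
    }

    /** Stand-in for the method under test: conditions must receive the primaries count. */
    static boolean evaluate(LongPredicate condition, StatsSketch stats) {
        return condition.test(stats.primaryDocs);
    }

    /** Runs a recording condition through evaluate and returns the count it was given. */
    static long docsSeenByCondition(StatsSketch stats) {
        AtomicLong seen = new AtomicLong(-1);
        LongPredicate recording = docs -> {
            seen.set(docs);
            return false;
        };
        evaluate(recording, stats);
        return seen.get();
    }

    /** Randomized variant: primaries and total always differ, so a mix-up is detected. */
    static void checkRandomized(long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < 100; i++) {
            long primaries = 1 + random.nextInt(1000);
            int replicas = 1 + random.nextInt(3);
            StatsSketch stats = new StatsSketch(primaries, primaries * (1 + replicas));
            long seen = docsSeenByCondition(stats);
            if (seen != stats.primaryDocs) {
                throw new AssertionError("expected " + stats.primaryDocs + " but condition saw " + seen);
            }
        }
    }

    public static void main(String[] args) {
        checkRandomized(42L);
        System.out.println("condition always received the primaries count");
    }
}
```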

fred84 commented Jun 7, 2017

Thanks for the suggestions, @bleskes. I updated the PR. The only thing I disagree with is that we need to randomize the stats in the test. "testDocStatsSelectionFromPrimariesOnly" checks only that the right value is passed from IndicesStatsResponse to the Condition. There is no manipulation of this value inside "evaluateConditions".

s1monw commented Jun 7, 2017

@elasticmachine ok to test

karmi commented Jun 7, 2017

@fred84, sorry, I missed your notification. I can see the CLA check is green now; @s1monw's advice was spot on. Thanks for updating the e-mail!

s1monw commented Jun 7, 2017

I see this in the build logs:

{p0=indices.rollover/20_max_doc_condition/Rollover conditions matched with replica node}]: after test
  1> [2017-06-07T03:35:53,692][INFO ][o.e.t.r.IntegTestZipClientYamlTestSuiteIT] Stash dump on failure [{
  1>   "stash" : {
  1>     "body" : null
  1>   }
  1> }]
ERROR   30.0s | IntegTestZipClientYamlTestSuiteIT.test {p0=indices.rollover/20_max_doc_condition/Rollover conditions matched with replica node} <<< FAILURES!
   > Throwable #1: java.lang.RuntimeException: Failure at [indices.rollover/20_max_doc_condition:4]: listener timeout after waiting for [30000] ms
   > 	at __randomizedtesting.SeedInfo.seed([856786C5C842552:800247B6F27848AA]:0)
   > 	at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:346)
   > 	at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:328)
   > 	at java.lang.Thread.run(Thread.java:748)
   > Caused by: java.io.IOException: listener timeout after waiting for [30000] ms
   > 	at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:660)
   > 	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:219)
   > 	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:191)
   > 	at org.elasticsearch.test.rest.yaml.ClientYamlTestClient.callApi(ClientYamlTestClient.java:169)
   > 	at org.elasticsearch.test.rest.yaml.ClientYamlTestExecutionContext.callApiInternal(ClientYamlTestExecutionContext.java:157)
   > 	at org.elasticsearch.test.rest.yaml.ClientYamlTestExecutionContext.callApi(ClientYamlTestExecutionContext.java:89)
   > 	at org.elasticsearch.test.rest.yaml.section.DoSection.execute(DoSection.java:221)
   > 	at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:344)
   > 	... 37 more

@fred84 can you take a look at it?

bleskes commented Jun 7, 2017

The REST tests are (sometimes) run against a 1-node cluster. This means that replicas won't always be assigned. The wait_for_active_shards: all condition in the REST test makes it fail in that case. I suggest just removing it. We also probably don't need to explicitly set replicas to 1. Better to leave it at the default, imo.

fred84 commented Jun 7, 2017

@bleskes @s1monw This yml test makes no sense when running without a replica (and will fail anyway with a different number of shards/replicas). Is it possible to run it with a specific number of shards/replicas? Or is it better to remove this test?

s1monw commented Jun 8, 2017

@bleskes @s1monw This yml test makes no sense when running without a replica (and will fail anyway with a different number of shards/replicas). Is it possible to run it with a specific number of shards/replicas? Or is it better to remove this test?

We run tests with one node and with multiple nodes. I think your test will be fine; it should just use the defaults and remove the wait_for_active_shards: all condition to make sure it also passes in a single-node environment.

bleskes commented Jun 8, 2017

This yml test makes no sense when running without a replica

I agree it would be great if we could pin down the test to always fail without your fix. Sadly this is not possible today. Doing it like we suggested means it will fail only sometimes (if the replicas were fast enough to allocate). That's better than nothing.

(and will fail anyway with different number of shards/replicas)

I'm not sure I follow this one. can you clarify?

fred84 commented Jun 8, 2017

@bleskes

(and will fail anyway with different number of shards/replicas)

I'm not sure I follow this one. can you clarify?

The test had checks for the doc count in both "total" and "primaries":

- match:    { _all.primaries.docs.count: 1 }
- match:    { _all.total.docs.count: 2 }

This was done to explicitly demonstrate that the condition is applied only after the primaries reach max_docs. So the test was expected to fail with a replica count other than 1.
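
The arithmetic behind those two checks can be sketched as below. This is only an illustration under assumptions: the helper names are hypothetical, every assigned replica is assumed to hold a refreshed copy of every doc, the max_docs threshold of 2 is made up for the example, and the condition is assumed to match once the count reaches the threshold. It shows why a total of 2 is only seen with exactly one assigned replica, and why counting the total would trigger a rollover that counting primaries does not.

```java
// Hypothetical helper for illustration only; not part of Elasticsearch.
final class ReplicaDocCountSketch {

    /** Total doc count over all shard copies, assuming every assigned replica holds every doc. */
    static long totalDocs(long primaryDocs, int assignedReplicas) {
        return primaryDocs * (1L + assignedReplicas);
    }

    /** max_docs-style check; assumed to match once the count reaches the limit. */
    static boolean maxDocsReached(long docs, long maxDocs) {
        return docs >= maxDocs;
    }

    public static void main(String[] args) {
        long primaryDocs = 1;
        long maxDocs = 2; // hypothetical threshold for this example
        for (int replicas = 0; replicas <= 2; replicas++) {
            long total = totalDocs(primaryDocs, replicas);
            System.out.println("replicas=" + replicas
                + " primaries=" + primaryDocs
                + " total=" + total
                + " rolloverOnTotal=" + maxDocsReached(total, maxDocs)
                + " rolloverOnPrimaries=" + maxDocsReached(primaryDocs, maxDocs));
        }
    }
}
```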

The REST tests are (sometimes) run against a 1-node cluster. This means that replicas won't always be assigned. The wait_for_active_shards: all condition in the REST test makes it fail in that case. I suggest just removing it. We also probably don't need to explicitly set replicas to 1. Better to leave it at the default, imo.

We run tests with one node and with multiple nodes. I think your test will be fine; it should just use the defaults and remove the wait_for_active_shards: all condition to make sure it also passes in a single-node environment.

@s1monw @bleskes Thanks for suggestions!

fred84 commented Jun 9, 2017

":distribution:integ-test-zip:integTest" now works fine, but ":qa:mixed-cluster:integTest" fails most of the time locally. I will try to figure it out.

fred84 commented Jun 11, 2017

@bleskes For now I added "skip before 5.6.1 version" to the yml to pass the mixed cluster test, but I'm not sure that it is the proper version.

bleskes commented Jun 13, 2017

@fred84 that makes sense for now. I'll fix it later based on how we decide to backport this. Thanks for all the iterations.

@bleskes bleskes merged commit 1c95cbc into elastic:master Jun 13, 2017
bleskes pushed a commit that referenced this pull request Jun 13, 2017
max_doc condition for index rollover should use document count only from primary shards 

Fixes #24217
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 14, 2017
* master: (27 commits)
  Refactor TransportShardBulkAction.executeUpdateRequest and add tests
  Make sure range queries are correctly profiled. (elastic#25108)
  Test: allow setting socket timeout for rest client (elastic#25221)
  Migration docs for elastic#25080 (elastic#25218)
  Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins elastic#25080
  When stopping via systemd only kill the JVM, not its control group (elastic#25195)
  Remove PrefixAnalyzer, because it is no longer used.
  Internal: Remove Strings.cleanPath (elastic#25209)
  Docs: Add note about which secure settings are valid (elastic#25212)
  Indices.rollover/10_basic should refresh to make the doc visible in lucene stats
  Port support for commercial GeoIP2 databases from Logstash. (elastic#24889)
  [DOCS] Add ML node to node.asciidoc (elastic#24495)
  expose simple pattern tokenizers (elastic#25159)
  Test: add setting to change request timeout for rest client (elastic#25201)
  Fix secure repository-hdfs tests on JDK 9
  Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (elastic#24133)
  Add Cross Cluster Search support for scroll searches (elastic#25094)
  Adapt skip version in rest-api-spec/test/indices.rollover/20_max_doc_condition.yml
  Rollover max docs should only count primaries (elastic#24977)
  Add remote cluster infrastructure to fetch discovery nodes. (elastic#25123)
  ...
@fred84 fred84 deleted the 24217_rollover_max_docs branch June 22, 2017 19:10