Add reindex operation #92

henningandersen · 2019-11-25T13:02:42Z

Add reindex operation to http_logs track in order to
verify reindex performance before and after resilient
reindex implementation as well as keep an eye on it
for the future.

Relates elastic/elasticsearch#42612

Add reindex operation to http_logs track in order to verify reindex performance before and after resilient reindex implementation as well as keep an eye on it for the future. Relates elastic/elasticsearch#42612

dliappis

Thanks for the PR. I left some comments, mostly in the form of questions.

dliappis · 2019-11-27T12:05:13Z

http_logs/challenges/default.json

+          "clients": 1
+        },
+        {
+          "name": "reindex",


If you don't want to use a custom runner, you use the raw-request operation.

It would then look like:

{ "name": "reindex", "operation": { "operation-type": "raw-request", "method": "POST", "path": "/_reindex", "body": { "source": { "index": "logs-*" }, "dest": { "index": "reindexed-logs" } }, "request-params": { "request_timeout": 7200 } }, "clients": 1 }

The primary purpose of the custom runner is to ensure that the request_timeout is passed to the python "driver" since it uses a low timeout by default. This was based on elastic/rally#567 and elastic/rally#669. I think that if I add it into request-params, it will simply be sent to the elasticsearch server as a http request-parameter?

I see this can be confusing.

For raw-request, any request_params specified in the operation get passed as params -- by Rally -- to the es.transport.perform_request method. In turn this method when it detects the request_timeout key in the params it passes it as timeout in the connection method.

So to summarize either for specific Rally operations that directly support setting request_timeout as added in elastic/rally#567 / elastic/rally#669 / your custom runner or defining it as a key in the request-params dict of the raw-request operation, ultimately it will be converted to a connection timeout by the Elasticsearch Python client.

dliappis · 2019-11-27T12:06:33Z

http_logs/challenges/default.json

+                "index": "reindexed-logs"
+              }
+            },
+            "request_timeout": 7200


Did you intend to make this operation blocking too? In other words, do you want to add wait_for_completion=true so that there is a record of how long this operation takes?

One FYI thing to be generally aware of is that if there are proxies involved in the path between Rally and Elasticsearch the proxy may timeout the request abruptly (if wait_for_completion=true) and then the Elasticsearch Python client will retry.

Reindex defaults to wait_for_completion=true, so I think that part is OK?

I am not sure there is anything to do about the proxy issue? Is there maybe a way to instruct the python client to never retry and instead fail (unless we have scenarios where a retry is expected)? That would be a general rally change I think?

Ah great, I missed that true is the default for reindex! I don't think there's anything to be done, I just raised it as a FYI in case you want to run it against an external environment and get surprising results.

dliappis · 2019-11-27T12:07:16Z

http_logs/track.py

@@ -0,0 +1,5 @@
+def reindex(es, params):


See the earlier comment about the use of raw-request if you'd like to avoid adding a custom runner.

dliappis · 2019-11-27T12:10:16Z

http_logs/challenges/default.json

@@ -121,6 +121,22 @@
          "warmup-iterations": 200,
          "iterations": 100,
          "target-throughput": 0.5
+        },
+        {


What is the intention of adding reindex to the default append-no-conflicts challenge? Do you want to get some sort of graph displayed in https://elasticsearch-benchmarks.elastic.co/#tracks/http-logs/nightly/default/30d and if yes what metrics?

Good question. I think it could be either docs/s or total run-time (docs/s is probably better, but harder to find)? What will I need to do in order to get it to show up?

To get metrics (depending on what the _reindex API returns) you should stick with the runner that you've created and then return a tuple or dict as described here -- under This function can return: -- where you can define weight/unit. You can additionally specify more KV pairs that will be appended in the meta section of the metric records.

I removed the addition to the default challenge and added docs/s metric for reindex in c86dcc2

No longer run reindex as part of the default challenge, since it takes too long. Changed reindex to report back docs/s instead of ops/s

…dd_reindex_operation

henningandersen · 2020-01-06T15:28:50Z

This should be ready for another round, sorry about the delay on this one.

dliappis

LGTM.

@henningandersen how far back do you want to backport this?

Add new challenge with reindex operation to http_logs track in order to verify reindex performance before and after resilient reindex implementation. Relates elastic/elasticsearch#42612

Add reindex operations

f91561b

Add reindex operation to http_logs track in order to verify reindex performance before and after resilient reindex implementation as well as keep an eye on it for the future. Relates elastic/elasticsearch#42612

henningandersen added the enhancement label Nov 25, 2019

dliappis reviewed Nov 27, 2019

View reviewed changes

henningandersen added 2 commits January 6, 2020 15:54

Only add explicit challenge

c86dcc2

No longer run reindex as part of the default challenge, since it takes too long. Changed reindex to report back docs/s instead of ops/s

Merge remote-tracking branch 'origin/master' into enhance_http_logs_a…

36717a9

…dd_reindex_operation

henningandersen requested a review from dliappis January 6, 2020 15:28

dliappis approved these changes Jan 9, 2020

View reviewed changes

henningandersen merged commit 84281be into elastic:master Jan 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add reindex operation #92

Add reindex operation #92

henningandersen commented Nov 25, 2019

dliappis left a comment

dliappis Nov 27, 2019

henningandersen Nov 27, 2019

dliappis Nov 27, 2019

dliappis Nov 27, 2019

dliappis Nov 27, 2019

henningandersen Nov 27, 2019

dliappis Nov 27, 2019

dliappis Nov 27, 2019

dliappis Nov 27, 2019

henningandersen Nov 27, 2019

dliappis Nov 27, 2019

henningandersen Jan 6, 2020

henningandersen commented Jan 6, 2020

dliappis left a comment

Add reindex operation #92

Add reindex operation #92

Conversation

henningandersen commented Nov 25, 2019

dliappis left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

henningandersen commented Jan 6, 2020

dliappis left a comment

Choose a reason for hiding this comment