[FEATURE] Performance Testing for AD Extension #725

owaiskazi19 · 2023-05-03T23:07:05Z

Is your feature request related to a problem?

AD Extension is under active development on the feature/extensions branch, but its benchmark numbers are still unknown. This issue proposes using opensearch-benchmark to collect performance numbers for the extension.
An OpenSearch cluster can be deployed using opensearch-cluster-cdk.

Get performance numbers for the APIs on a single-node cluster, comparing the extension with the plugin.

What solution would you like?

Deploy the cluster using the CDK mentioned above. Use the macrobenchmarking tool opensearch-benchmark to run performance tests against the deployed cluster.

Steps to set up performance testing:

Set up the OpenSearch cluster
Run the AD extension
Set up opensearch-benchmark on the instance
Benchmark the cluster running with the AD extension
Benchmark the cluster running with the AD plugin
After benchmarking the cluster, explore how to make an apples-to-apples comparison between running an API on the extension vs. the plugin
Write JavaScript code to performance-test specific APIs and measure the total execution time
Run the JS code for extensions and plugins
Plot the data in a graph
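
The timing step above can be sketched roughly as follows. The issue calls for JavaScript; this is a Python stand-in for illustration only, not the script that was actually used. `time_calls` and `fake_api_call` are hypothetical names, and the stand-in sleeps instead of sending a real REST request, so the sketch runs without a cluster.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(call: Callable[[], object], runs: int = 10) -> Dict[str, float]:
    """Invoke `call` `runs` times and summarize wall-clock time in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples_ms: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": float(runs),
        "total_ms": sum(samples_ms),
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


def fake_api_call() -> None:
    # Hypothetical stand-in for one REST request (for example, a get-detector call).
    time.sleep(0.001)


if __name__ == "__main__":
    print(time_calls(fake_api_call, runs=5))
```

To compare the two deployments, the same function would be run once with a callable that hits the extension endpoint and once with one that hits the plugin endpoint, and the two summaries plotted side by side.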

APIs

vibrantvarun · 2023-06-01T17:40:53Z

Benchmarking of an OpenSearch cluster running with the AD extension.

Metric	Task	Value	Unit
Cumulative indexing time of primary shards		0.0112	min
Min cumulative indexing time across primary shards		0.0112	min
Median cumulative indexing time across primary shards		0.0112	min
Max cumulative indexing time across primary shards		0.0112	min
Cumulative indexing throttle time of primary shards		0	min
Min cumulative indexing throttle time across primary shards		0	min
Median cumulative indexing throttle time across primary shards		0	min
Max cumulative indexing throttle time across primary shards		0	min
Cumulative merge time of primary shards		0	min
Cumulative merge count of primary shards		0
Min cumulative merge time across primary shards		0	min
Median cumulative merge time across primary shards		0	min
Max cumulative merge time across primary shards		0	min
Cumulative merge throttle time of primary shards		0	min
Min cumulative merge throttle time across primary shards		0	min
Median cumulative merge throttle time across primary shards		0	min
Max cumulative merge throttle time across primary shards		0	min
Cumulative refresh time of primary shards		0.0022	min
Cumulative refresh count of primary shards		5
Min cumulative refresh time across primary shards		0.0022	min
Median cumulative refresh time across primary shards		0.0022	min
Max cumulative refresh time across primary shards		0.0022	min
Cumulative flush time of primary shards		0	min
Cumulative flush count of primary shards		0
Min cumulative flush time across primary shards		0	min
Median cumulative flush time across primary shards		0	min
Max cumulative flush time across primary shards		0	min
Total Young Gen GC time		0	s
Total Young Gen GC count		0
Total Old Gen GC time		0	s
Total Old Gen GC count		0
Store size		0.000253449	GB
Translog size		5.12227e-08	GB
Heap used for segments		0	MB
Heap used for doc values		0	MB
Heap used for terms		0	MB
Heap used for norms		0	MB
Heap used for points		0	MB
Heap used for stored fields		0	MB
Segment count		8
Min Throughput	index	5585.73	docs/s
Mean Throughput	index	5585.73	docs/s
Median Throughput	index	5585.73	docs/s
Max Throughput	index	5585.73	docs/s
50th percentile latency	index	166.95	ms
100th percentile latency	index	173.231	ms
50th percentile service time	index	166.95	ms
100th percentile service time	index	173.231	ms
error rate	index	0	%
Min Throughput	wait-until-merges-finish	96.03	ops/s
Mean Throughput	wait-until-merges-finish	96.03	ops/s
Median Throughput	wait-until-merges-finish	96.03	ops/s
Max Throughput	wait-until-merges-finish	96.03	ops/s
100th percentile latency	wait-until-merges-finish	10.1884	ms
100th percentile service time	wait-until-merges-finish	10.1884	ms
error rate	wait-until-merges-finish	0	%
Min Throughput	default	19.63	ops/s
Mean Throughput	default	19.63	ops/s
Median Throughput	default	19.63	ops/s
Max Throughput	default	19.63	ops/s
100th percentile latency	default	55.6047	ms
100th percentile service time	default	4.49441	ms
error rate	default	0	%
Min Throughput	range	60.82	ops/s
Mean Throughput	range	60.82	ops/s
Median Throughput	range	60.82	ops/s
Max Throughput	range	60.82	ops/s
100th percentile latency	range	21.4162	ms
100th percentile service time	range	4.84291	ms
error rate	range	0	%
Min Throughput	distance_amount_agg	36.31	ops/s
Mean Throughput	distance_amount_agg	36.31	ops/s
Median Throughput	distance_amount_agg	36.31	ops/s
Max Throughput	distance_amount_agg	36.31	ops/s
100th percentile latency	distance_amount_agg	31.2152	ms
100th percentile service time	distance_amount_agg	3.54171	ms
error rate	distance_amount_agg	0	%
Min Throughput	autohisto_agg	49.36	ops/s
Mean Throughput	autohisto_agg	49.36	ops/s
Median Throughput	autohisto_agg	49.36	ops/s
Max Throughput	autohisto_agg	49.36	ops/s
100th percentile latency	autohisto_agg	25.6165	ms
100th percentile service time	autohisto_agg	5.19717	ms
error rate	autohisto_agg	0	%
Min Throughput	date_histogram_agg	104.86	ops/s
Mean Throughput	date_histogram_agg	104.86	ops/s
Median Throughput	date_histogram_agg	104.86	ops/s
Max Throughput	date_histogram_agg	104.86	ops/s
100th percentile latency	date_histogram_agg	14.149	ms
100th percentile service time	date_histogram_agg	4.4783	ms
error rate	date_histogram_agg	0	%

[INFO] SUCCESS (took 13 seconds)
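
One way to read the summary above is to compare latency with service time per task. In opensearch-benchmark, service time covers the request itself, while latency also includes time the request waited before being sent, so a large gap points at scheduling rather than the cluster. The sketch below is an assumption-laden illustration: `Row`, `parse_rows`, and `latency_gap_ms` are hypothetical names, the input is assumed to be tab-separated as in the table, and `SAMPLE` copies a handful of rows from the output above.

```python
from typing import Dict, List, NamedTuple, Optional


class Row(NamedTuple):
    metric: str
    task: str
    value: float
    unit: Optional[str]


def parse_rows(text: str) -> List[Row]:
    """Parse tab-separated 'Metric, Task, Value, Unit' lines; Task and Unit may be empty."""
    rows: List[Row] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if fields[0] == "Metric":  # skip the header row
            continue
        fields += [""] * (4 - len(fields))  # count rows carry no unit
        metric, task, value, unit = fields[:4]
        rows.append(Row(metric, task, float(value), unit or None))
    return rows


def latency_gap_ms(rows: List[Row]) -> Dict[str, float]:
    """Per task, subtract the 100th percentile service time from the latency."""
    latency = {r.task: r.value for r in rows if r.metric == "100th percentile latency"}
    service = {r.task: r.value for r in rows if r.metric == "100th percentile service time"}
    return {task: latency[task] - service[task] for task in latency if task in service}


# Rows copied from the benchmark output above.
SAMPLE = (
    "Metric\tTask\tValue\tUnit\n"
    "Segment count\t\t8\n"
    "100th percentile latency\tdefault\t55.6047\tms\n"
    "100th percentile service time\tdefault\t4.49441\tms\n"
    "100th percentile latency\trange\t21.4162\tms\n"
    "100th percentile service time\trange\t4.84291\tms\n"
)

if __name__ == "__main__":
    for task, gap in latency_gap_ms(parse_rows(SAMPLE)).items():
        print(f"{task}: {gap:.2f} ms")
```

For the `default` task, for instance, most of the 55.6 ms latency is time outside the roughly 4.5 ms the request itself took.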

vibrantvarun · 2023-06-19T23:29:59Z

minalsha · 2023-06-20T17:05:33Z

Thanks @vibrantvarun for updating the issue with benchmarking data. Is there anything else pending for this issue? If not, can we close it? Thanks. cc @dbwiddis

vibrantvarun · 2023-06-20T17:12:22Z

@minalsha There is nothing pending from my side. I have covered the following tasks from the issue:

Benchmarking of the OpenSearch cluster running extensions
Performance testing of extension APIs vs. plugin APIs

@dbwiddis waiting for your insights on it.

Thanks

minalsha · 2023-06-20T18:22:33Z

Thanks @vibrantvarun. Curious why the Create/Get/Search Detector graphs are not consistent? Please share the findings. Thanks

dbwiddis · 2023-06-21T19:08:51Z

Looks great, thanks @vibrantvarun !

Some context: the human brain can't really detect differences less than 250 ms, so that is somewhat of a good threshold for performance differences.

Some thoughts:

As expected, there is some latency in the transport communications. Search Detector and Get Detector are probably the best ones for evaluating that impact, as each is mostly a single REST call, so there are 4 transport hops. The network latency looks like only about 25-50 ms total for those. Visually there appears to be a lot of variation, but it amounts to a small number of milliseconds that is probably within measurement tolerances.
Create Detector also has a bit of variance, but there are a lot of back-and-forth network hops for that (checking non-existence of indices, creating them, checking existence). This call used to take almost 13 seconds, so it's awesome that @joshpalis's work to rewrite cluster state calls brought it down to a few hundred milliseconds.
Preview and Profile probably shift some work from the cluster to the extension node. Not quite sure how to explain the length of time and variance, but it could be related to different JVM settings on the extension node vs. the OpenSearch cluster nodes. If we do further fine-tuning work, we should probably focus on these to identify where the bottlenecks are and whether there's any way to work around them.
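
As a rough back-of-the-envelope check on the numbers above, the 25-50 ms of added latency can be split across the 4 transport hops and compared with the 250 ms threshold. This is a sketch only: it assumes each hop contributes equally, which the measurements don't establish, and `per_hop_ms` and `summarize` are hypothetical helper names.

```python
from typing import Dict

TRANSPORT_HOPS = 4                 # hops for a single REST call, per the comment above
OVERHEAD_RANGE_MS = (25.0, 50.0)   # observed added latency range, per the comment above
PERCEPTION_THRESHOLD_MS = 250.0    # threshold suggested in the comment above


def per_hop_ms(total_ms: float, hops: int) -> float:
    """Split a total overhead evenly across hops (assumes equal cost per hop)."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return total_ms / hops


def summarize() -> Dict[str, float]:
    low_total, high_total = OVERHEAD_RANGE_MS
    return {
        "per_hop_low_ms": per_hop_ms(low_total, TRANSPORT_HOPS),
        "per_hop_high_ms": per_hop_ms(high_total, TRANSPORT_HOPS),
        # Fraction of the threshold consumed by the worst-case overhead.
        "worst_case_share": high_total / PERCEPTION_THRESHOLD_MS,
    }


if __name__ == "__main__":
    print(summarize())
```

Under that equal-split assumption each hop costs about 6.25 to 12.5 ms, and even the upper end of the range uses a fifth of the 250 ms budget.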

minalsha · 2023-07-12T17:06:20Z

Closing this issue since we completed AD Extension API performance testing.

owaiskazi19 added the enhancement and untriaged labels May 3, 2023

owaiskazi19 assigned vibrantvarun May 3, 2023

dbwiddis mentioned this issue May 17, 2023

[BUG] Single-node OpenSearch crashes when connecting to an extension on a remote node #761

Closed

minalsha closed this as completed Jul 12, 2023

minalsha removed the untriaged label Jul 12, 2023
