Add a track for Metricbeat data #56

Merged
merged 13 commits into from
Mar 18, 2019
68 changes: 68 additions & 0 deletions metricbeat/challenges/default.json
@@ -0,0 +1,68 @@
{
"name": "append-no-conflicts",
"description": "Indexes the whole document corpus using Elasticsearch default settings. We only adjust the number of replicas as we benchmark a single node cluster and Rally will only start the benchmark if the cluster turns green. Document ids are unique so all index operations are append only. After that a couple of queries are run.",
"default": true,
"schedule": [
{
"operation": "delete-index"
},
{
"operation": {
"operation-type": "create-index",
"settings": {{index_settings | default({}) | tojson}}
}
},
{
"name": "check-cluster-health",
"operation": {
"operation-type": "cluster-health",
"index": "metricbeat",
"request-params": {
"wait_for_status": "{{cluster_health | default('green')}}",
"wait_for_no_relocating_shards": "true"
}
}
},
{
"operation": "index-append",
"warmup-time-period": 0,
Member

I guess bulk-indexing the document corpus is considered more of a setup task here (i.e. you're not interested in bulk-indexing throughput)? In that case setting the warmup time-period to zero is fine.

"clients": {{bulk_indexing_clients | default(8)}}
},
{
"name": "refresh-after-index",
"operation": "refresh",
"clients": 1
},
{
"operation": "force-merge",
"clients": 1
},
{
"name": "refresh-after-force-merge",
"operation": "refresh",
"clients": 1
},
{
"operation": "default",
Member

Is it intentional that the aggregation queries are not referenced here?

"clients": 1,
"warmup-iterations": 0,
"iterations": 1000,
"target-throughput": 50
},
{
"operation": "autohisto_agg",
"clients": 1,
"warmup-iterations": 50,
"iterations": 100,
"target-throughput": 2
},
{
"operation": "date_histogram_agg",
"clients": 1,
"warmup-iterations": 50,
"iterations": 100,
"target-throughput": 2
}
]
}
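
As a rough aid for reading the throttled query tasks in this schedule, the following sketch estimates the minimum wall-clock time each one takes. It assumes that Rally issues requests at the configured `target-throughput` (operations per second) and that the cluster keeps up with that schedule; the helper name `scheduled_seconds` is made up for illustration and is not part of Rally.

```python
# Sketch only: lower bound on task duration under the assumption that
# requests are issued at exactly `target-throughput` operations per second.
QUERY_TASKS = [
    {"operation": "default", "warmup-iterations": 0,
     "iterations": 1000, "target-throughput": 50},
    {"operation": "autohisto_agg", "warmup-iterations": 50,
     "iterations": 100, "target-throughput": 2},
    {"operation": "date_histogram_agg", "warmup-iterations": 50,
     "iterations": 100, "target-throughput": 2},
]


def scheduled_seconds(task: dict) -> float:
    """Return (warmup + measured iterations) divided by the target rate."""
    total_iterations = task["warmup-iterations"] + task["iterations"]
    return total_iterations / task["target-throughput"]


for task in QUERY_TASKS:
    print(f"{task['operation']}: at least {scheduled_seconds(task):.0f} s")
```

Under these assumptions the `default` query needs about 20 seconds and each aggregation about 75 seconds, of which the first 25 seconds are warmup.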

1 change: 1 addition & 0 deletions metricbeat/files.txt
@@ -0,0 +1 @@
documents.json
Member

This file is needed for the script that prepares the corpus for offline usage (see https://github.com/elastic/rally-tracks/blob/master/download.sh). I think when you're done this should only contain the compressed corpus file (+ a smaller file that contains the test corpus, see our docs).

Member

I think this should contain:

documents.json.bz2
documents-1k.json.bz2

For details please see my comment above.
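
One possible way to produce the two files listed above from an uncompressed `documents.json` is sketched below. It assumes the corpus holds one JSON document per line and that the `-1k` test corpus is simply the first 1,000 lines of the full corpus; the function names are hypothetical and this is not the script used by the rally-tracks repository.

```python
import bz2
import shutil
from itertools import islice
from pathlib import Path

# Assumption: the "-1k" test corpus contains the first 1,000 documents.
TEST_CORPUS_LINES = 1000


def compress_corpus(source: Path) -> Path:
    """Compress the full corpus, e.g. documents.json -> documents.json.bz2."""
    target = source.with_name(source.name + ".bz2")
    with source.open("rb") as src, bz2.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def write_test_corpus(source: Path) -> Path:
    """Write the first 1,000 lines, e.g. to documents-1k.json.bz2."""
    target = source.with_name(f"{source.stem}-1k{source.suffix}.bz2")
    with source.open("rb") as src, bz2.open(target, "wb") as dst:
        # islice stops early, so the full corpus is never loaded into memory.
        for line in islice(src, TEST_CORPUS_LINES):
            dst.write(line)
    return target
```

Calling both functions with `Path("documents.json")` would leave `documents.json.bz2` and `documents-1k.json.bz2` next to the source file, matching the names proposed for `files.txt`.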
