Adds a geoshapes track #61
Conversation
Adds a track from OSM-derived geoshape data. Closes elastic#60
Woohoo! I'm not familiar enough with Rally to do a proper review though so let's wait for Daniel to review. :)
geoshape/files.txt
Outdated
@@ -0,0 +1,2 @@
documents.json.bz2
documents-1k.json.bz2
is the latter still needed?
Yes, it is needed in order to support test-mode.
@imotov I'm curious whether you already ran this benchmark with both legacy quad-tree-based geoshapes and the new BKD-backed geo shapes?
@jpountz I tried to run it on 6.5.4 with default settings, but it died after going through 3% of the data set. I suspect it ran out of memory, but I don't know for sure. The result for master can be found here. There are some failures that still need to be investigated. I am pretty sure the indexing issues are related to elastic/elasticsearch#26286, but the search issues are somewhat puzzling. I am planning to look into them next.
Thanks for the PR. I left a few comments. I also tried to run it against the latest master of ES (revision 046f86f) with:
esrally --track=geoshape --on-error=abort
However, indexing failed immediately with:
[ERROR] Cannot race. Error in load generator [3]
('Request returned an error. Error type: bulk, Description: HTTP status: 400, message: failed to parse field [shape] of type [geo_shape]', None)
The mapping it created looks fine to me:
{
"osmgeoshapes" : {
"mappings" : {
"_doc" : {
"dynamic" : "strict",
"properties" : {
"shape" : {
"type" : "geo_shape"
}
}
}
}
}
}
This is probably related to what you mention in #61 (comment).
I also checked your gist quickly, and the large variation in indexing throughput suggests to me that the system is not properly warmed up. Thus I suggest you increase the warmup time period.
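For reference, a longer warmup could be expressed in a challenge's schedule roughly like this (a sketch only; the exact task layout and the 1800-second value are illustrative, not taken from this track):

```json
{
  "operation": "index-append",
  "warmup-time-period": 1800,
  "clients": 8
}
```

With a warmup time period like this, Rally excludes the ramp-up phase from the reported metrics, so the measured throughput better reflects steady-state behaviour.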
geoshape/README.md
Outdated
* `bulk_size` (default: 5000)
* `bulk_indexing_clients` (default: 8): Number of clients that issue bulk indexing requests.
* `ingest_percentage` (default: 100): A number between 0 and 100 that defines how much of the document corpus should be ingested.
* `conflict_probability` (default: 25): A number between 0 and 100 that defines the probability of id conflicts. This requires running the respective challenge.
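As a usage illustration (the parameter values below are arbitrary, not recommendations), these parameters can be overridden on the command line:

```shell
esrally --track=geoshape \
  --track-params="bulk_size:2000,bulk_indexing_clients:4,ingest_percentage:50"
```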
Are we interested in update performance? If not, I suggest removing this parameter.
geoshape/README.md
Outdated
* `bulk_indexing_clients` (default: 8): Number of clients that issue bulk indexing requests.
* `ingest_percentage` (default: 100): A number between 0 and 100 that defines how much of the document corpus should be ingested.
* `conflict_probability` (default: 25): A number between 0 and 100 that defines the probability of id conflicts. This requires running the respective challenge.
* `on_conflict` (default: "index"): Whether to use an "index" or an "update" action when simulating an id conflict.
Are we interested in update performance? If not, I suggest removing this parameter.
geoshape/README.md
Outdated
* `ingest_percentage` (default: 100): A number between 0 and 100 that defines how much of the document corpus should be ingested.
* `conflict_probability` (default: 25): A number between 0 and 100 that defines the probability of id conflicts. This requires running the respective challenge.
* `on_conflict` (default: "index"): Whether to use an "index" or an "update" action when simulating an id conflict.
* `recency` (default: 0): A number between 0 and 1 that defines whether to bias towards more recent ids when simulating conflicts. See the [Rally docs](http://esrally.readthedocs.io/en/latest/track.html#bulk) for the full definition of this parameter. This requires running the respective challenge.
Are we interested in update performance? If not, I suggest removing this parameter.
### Example Document

```json
{
Can you please provide the script that you have used to generate the corpus? Otherwise we are unable to update it in case this is necessary.
Sure. The script basically takes the file that @nknize provided in #60 (comment) and converts it into single-element JSON, so we are unlikely to ever run it again. But I will add it just in case.
geoshape/challenges/default.json
Outdated
]
},
{
"name": "append-no-conflicts-index-only",
I think we should remove this challenge. geopoints still has it, but only for backwards-compatibility reasons, and we will also remove it there at some point.
geoshape/challenges/default.json
Outdated
]
},
{
"name": "append-fast-with-conflicts",
I think we should remove this challenge unless we are interested in benchmarking update performance with this specific corpus.
geoshape/index.json
Outdated
@@ -0,0 +1,19 @@
{
"settings": {
"index.number_of_shards": {{number_of_shards | default(5)}},
Unless you want to compare geoshape with geopoint, we should probably go with the new default of Elasticsearch (which is one shard) for new tracks. Wdyt?
Makes sense.
geoshape/operations/default.json
Outdated
"ingest-percentage": {{ingest_percentage | default(100)}}
},
{
"name": "index-update",
This is only necessary if we are interested in update benchmarks.
+1 to skip update benchmarks. Updates make some sense because they trigger increased merging activity, and multidimensional points are slower at merging than other Lucene data structures when the number of dimensions is greater than 1. That said, I could easily live without it.
Co-Authored-By: imotov <igor@motovs.org>
I pushed some changes.
I spent some time analyzing the failures. It looks like we have 3 types of failures there. As I mentioned before, some of them seem to be related to elastic/elasticsearch#26286. However, I also found some instances of elastic/elasticsearch#36883 and a few instances of
The test was running for several hours and I noticed a significant slowdown towards the end of the test. So, I think it might be something else.
I reran the benchmark once more on the same environment and let it continue when errors occurred. This is the error rate (on a per-document basis, not per bulk request) over time: (chart omitted)

I also created a chart for the indexing throughput over time: (chart omitted)

In both charts, the warmup is the grey part and the actual measurement phase is the green part. Leaving the errors aside, I think it's problematic that the benchmark never reaches a steady state. While at the beginning of the measurement phase we reach roughly 45,500 docs/s, throughput declines over time and reaches 10,900 docs/s at the end of the benchmark. I think it is dangerous to report a median throughput in that case because it suggests that a higher throughput can be reached over a longer period of time when in fact it is heavily dependent on how long we execute the benchmark. One thing I am wondering about is why we never reach steady state. Maybe @nknize has more thoughts on this?
I suspect it's related to the fact that merging becomes increasingly costly as the index size grows with multi-dimensional points, since the whole BKD tree needs to be rebuilt from scratch. Geo points and range fields have this issue as well, but geo shapes make it worse by using 4 indexed dimensions where ranges and geo points only use 2.
geoshape/challenges/default.json
Outdated
"clients": 1,
"warmup-iterations": 200,
"iterations": 100,
"target-throughput": 2
The empirical data that I have gathered on hardware that is similar to our nightly environment suggests that this throughput is too high. With service times around 2.5 seconds I suggest a target throughput of 0.3 (meaning that a query will be scheduled roughly every 3 seconds)
geoshape/challenges/default.json
Outdated
"clients": 1,
"warmup-iterations": 200,
"iterations": 100,
"target-throughput": 2
The empirical data that I have gathered on hardware that is similar to our nightly environment suggests that this throughput is too high. With service times around 3 to 3.5 seconds I suggest a target throughput of 0.25 (meaning that a query will be scheduled every 4 seconds)
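The suggested lower target throughput could look roughly like this in the challenge definition (a sketch only; the operation name here is a hypothetical placeholder, and the other values are taken from the quoted snippet):

```json
{
  "operation": "geo-shape-query",
  "clients": 1,
  "warmup-iterations": 200,
  "iterations": 100,
  "target-throughput": 0.25
}
```

A target throughput of 0.25 ops/s means Rally schedules one query every 4 seconds, which leaves headroom above the observed 3 to 3.5 second service times.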
That sounds reasonable. If I understand you correctly, this means we will never reach steady state though. I wonder how we can make the benchmark meaningful w.r.t. throughput metrics. FWIW the complete benchmark has finished by now and I did not see any errors for any of the queries (in Igor's benchmark results we see a positive error rate for
Maybe we can create a new index every 5 million records or so? It is strange though; I was kind of expecting that the limit on the maximum segment size would take care of that.
Yeah, my machine was unresponsive at the end of the indexing run. I am not sure if it was thermal throttling or something else.
This is a good point, the index is much larger than 5GB in the end so merging activity should reach a steady state at some point.
Not sure if relevant, but I would not expect a constant indexing throughput because of the nature of the data. Indexing a document with a point (one triangle) is not the same as indexing a document with a polygon of a thousand points (~a thousand triangles plus tessellator overhead). I think the batch size might be too big, as some of the documents can be quite big (polygons with thousands of points). In my experience I had to limit the batch size to 2500. We should expect 8751 documents to fail.
Indeed, the test file starts with a whole bunch of LINESTRINGS followed by MULTILINES and then POLYGONS. However, I thought we randomized the records that we index. If I am mistaken, then this might indeed be an issue.
As per our discussion, I am going to split this PR into 3 separate tracks for LINESTRINGS, MULTILINES and POLYGONS and will remove all shapes that elasticsearch cannot process at the moment.
Thanks for iterating @imotov. Just a suggestion: You can keep everything in one track but you could create three different corpora and three challenges (one for each corpus). That makes it a bit easier to maintain because you can keep everything in the same track. I'm happy to assist if you have more specific questions.
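The one-track/three-corpora suggestion could be sketched roughly like this in track.json (an abridged pseudo-JSON outline; the corpus and challenge names are illustrative, not necessarily the ones finally used):

```json
{
  "corpora": [
    { "name": "linestrings",      "documents": ["..."] },
    { "name": "multilinestrings", "documents": ["..."] },
    { "name": "polygons",         "documents": ["..."] }
  ],
  "challenges": [
    { "name": "append-linestrings",      "schedule": ["..."] },
    { "name": "append-multilinestrings", "schedule": ["..."] },
    { "name": "append-polygons",         "schedule": ["..."] }
  ]
}
```

Each challenge ingests only its own corpus, so the three shape types can be measured independently while staying in a single track directory.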
- splits shapes into 3 separate corpora
- removes failed polygons
- reduces default bulk size
Thanks for rerunning it. Could this be explained by #61 (comment)?
Btw, I think it is great that we have isolated the three different types now and can see how they behave.
I am not sure. The index is much bigger than the maximum segment size. I don't understand the mechanism that prevents it from stabilizing. We shouldn't be modifying the older segments, so I am not sure what exactly kicks in and slows down the overall performance. The change is quite drastic though, especially at the beginning where we index linestrings for 1 hour and the performance steadily drops from 24,000 to 8,000. That's a 3x difference. At the end of the test our throughput is 2,000, which is 10x slower than when we started.
@danielmitterdorfer I wonder if it would make sense to also index them in 3 different indices instead of a single index as I do at the moment. Any other ideas on how to make this test more meaningful? My main concern is that if we settle on using this overall test running time as "a canary in a coal mine", whatever is causing the slowdown on the large index will overshadow the smaller regressions and improvements we introduce until we figure out and fix the source of this slowdown.
You could try rerunning with a profiler (
That would make sense to me, yes.
Switch to using different indices for different shape types.
I think the answer is actually yes. My assumption was wrong. With a final index size of 71.6gb, here are the 10 largest segments at the end of the linestrings run, out of 42 segments in total. So, it looks like there is still plenty of merging going on there. I was watching this test run for a while, and while the biggest 2.7GB segment was formed quite early in the run, the second and third showed up only closer to the end of it.
I also realized that I need to wait for the merging of linestrings to finish before I start indexing multilinestrings, since it looks like it still continues for a while and interferes with multilinestrings indexing.
@danielmitterdorfer I would say #62 is a parallel effort. Knowing how geoshapes work for points is very useful for the effort to consolidate the geo_point and geo_shape data types, but it is not sufficient as a geo_shape metric. We still need a full-blown geo_shape test with different geo_shape types. I think I have addressed your comments from this PR as much as I could. If you see any additional issues I can address, please let me know. The slowdown throughout the test is still present, but I think it's more a reflection of reality than a flaw in the test. Hopefully, @iverase's merge optimization will reduce this issue.
Thanks for the update @imotov! I'll do a final pass very soon.
Looks mostly fine but I left a few comments around removal of types and the bulk size.
geoshape/README.md
Outdated
This track allows overriding the following parameters with Rally 0.8.0+ using `--track-params`:

* `bulk_size` (default: 500)
This differs from the value in the track (100). Also, I am not sure whether the same bulk size for all corpora is right. I wonder whether we need to introduce dedicated parameters per corpus?
geoshape/index.json
Outdated
"index.number_of_replicas": {{number_of_replicas | default(0)}}
},
"mappings": {
"_doc": {
This works for 6.x but not on master anymore, as types have been removed now.
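For comparison, a typeless variant of index.json (as required once mapping types are removed; shard and replica counts here follow the defaults discussed earlier in this thread) would look roughly like:

```json
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "shape": { "type": "geo_shape" }
    }
  }
}
```

The only structural change is dropping the `_doc` wrapper object inside `mappings`.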
geoshape/operations/default.json
Outdated
{
"name": "index-append-linestrings",
"operation-type": "bulk",
"bulk-size": {{bulk_size | default(100)}},
As I've mentioned in my feedback for the README, I'm not sure whether we should have a different bulk size per corpus?
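One way to give each corpus its own bulk size would be dedicated template parameters per shape type, e.g. (the parameter name `bulk_size_linestrings` is a hypothetical illustration, not necessarily what the track adopted):

```json
{
  "name": "index-append-linestrings",
  "operation-type": "bulk",
  "bulk-size": {{bulk_size_linestrings | default(100)}}
}
```

The multilinestrings and polygons operations would then reference their own parameters with defaults tuned to their typical document sizes.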
geoshape/track.json
Outdated
{
"name": "osmlinestrings",
"body": "index.json",
"types": ["_doc"]
Support for types has been removed on master, so this works against 6.x but not against master.
geoshape/track.json
Outdated
"name": "linestrings",
"base-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geoshape",
"target-index": "osmlinestrings",
"target-type": "_doc",
Support for types has been removed on master, so this works against 6.x but not against master.
Adds different bulk size settings for different shape types and removes doc type.
Thanks for iterating @imotov. LGTM. Feel free to merge at any time.
Adds a track from OSM-derived geoshape data.
Closes #60