LUCENE-9651: Make benchmarks run again, correct javadocs #71
Thanks Robert. I'll go through these benchmark files and correct them so that they work. It is a bit worrying that nobody noticed they're broken. :) Anybody using these at all?
I've not used this mechanism of the benchmark package for any performance benchmarking. It seems most performance benchmarking by contributors/committers uses https://github.com/mikemccand/luceneutil, or ad-hoc benchmarks. Personally, I use this benchmarking package, but via QualityRun's main method, to measure relevance. I always write my own parser (because every TREC-like dataset differs oh-so-slightly and the generic TREC parser we supply never works), and I use it in a minimal way (generate submission.txt, then run trec_eval etc. from the command line myself). The reason it isn't used might be the dataset: I'm unfamiliar with this Reuters dataset, and maybe it's not big enough for useful benchmarks? I think in general people tend to use these datasets more often for performance benchmarks, often ad-hoc:
Or maybe it's just because perf issues are usually complicated? For example, to reproduce LUCENE-9827 I downloaded geonames and wrote a simple standalone .java Indexer (attached to the issue) that essentially changes IW's config (flush every doc, SerialMergeScheduler, LZ4 and DEFLATE codec compression) and, to keep it simple, measures using only a single thread. It ran so slowly that I had to limit the number of docs to the first N as well.
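For readers who want to try something similar, here is a minimal sketch of the "first N docs, single thread" part of such an ad-hoc harness. It is not the Indexer attached to the issue. The class name `FirstNDocsTimer`, the `Result` type, and the `indexOneDoc` callback are all hypothetical, and the Lucene-specific work (building a Document and calling IndexWriter.addDocument on a writer configured with SerialMergeScheduler) is left to the callback so the sketch only needs the JDK.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: feeds the first N lines of a dataset to a callback on the calling thread. */
final class FirstNDocsTimer {

  /** Outcome of one run: how many lines were handed to the callback, and the elapsed time. */
  static final class Result {
    final int docs;
    final long elapsedNanos;

    Result(int docs, long elapsedNanos) {
      this.docs = docs;
      this.elapsedNanos = elapsedNanos;
    }
  }

  /**
   * Hands at most maxDocs lines to indexOneDoc, one at a time, with no extra threads.
   * Assumption: in a real run the callback would wrap IndexWriter.addDocument.
   */
  static Result run(Iterator<String> lines, int maxDocs, Consumer<String> indexOneDoc) {
    int docs = 0;
    long start = System.nanoTime();
    while (docs < maxDocs && lines.hasNext()) {
      indexOneDoc.accept(lines.next());
      docs++;
    }
    return new Result(docs, System.nanoTime() - start);
  }

  public static void main(String[] args) {
    // Placeholder lines stand in for dataset rows; they are not real geonames data.
    List<String> lines = List.of("doc-1", "doc-2", "doc-3");
    Result r = run(lines.iterator(), 2, line -> { /* index the line here */ });
    System.out.println(r.docs + " docs in " + r.elapsedNanos + " ns");
  }
}
```

For a real file, `java.nio.file.Files.lines(path).iterator()` could supply the lines. Keeping the indexing step behind a callback means the same loop can time different writer configurations without changing the harness.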
mikemccand left a comment
Thank you for fixing this @dweiss! Alas, these benchmarks indeed do not get much love/attention.
… updates are distributed (apache#71) Fixes PerReplicaStatesIntegrationTest.testRestart()