Low-overhead profiling #77
Comments
Linux perf usage becomes a lot easier with the JDK 16 early-access build. We don't need to compile or install any Java agent and can just pass JVM flags such as |
LOL |
I chatted w/ @rmuir and he suggested we start simpler here with Java Flight Recorder (JFR). We can later upgrade to Some notes from our chat:
|
I have a start at this! Still iterating on it ... it produces output like this, after all 20 JVM iterations of a benchmark run, when running on
Look at that 2.72% CPU in |
…rks and in nightly search benchmarks
I pushed JFR for searching (still need to enable for indexing too), and it ran last night for the first time: https://home.apache.org/~mikemccand/lucenebench/2021.01.12.00.03.16.html The results are ... curious. E.g. 39% of HEAP samples were for 6.4% of HEAP was ~2% of CPU is spent in I will increase reported stack depth so we can get more details :) |
OK I re-ran the
|
And CPU:
|
… 1 for interactive benchmarks); also print the full command-line executed to aggregate all search JFR results
The VectorDictionary parsing should be a one-time startup cost that is part
of the luceneutil test harness - probably it's not worth optimizing?
Although we could certainly produce a more compact easy-to-read dictionary
representation if we want to.
…On Tue, Jan 12, 2021 at 8:57 AM Michael McCandless ***@***.***> wrote:
And CPU:
PERCENT CPU SAMPLES STACK
3.79% 232348 org.apache.lucene.util.packed.DirectMonotonicReader#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
3.55% 217460 org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
2.70% 165557 org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
at org.apache.lucene.util.packed.DirectMonotonicReader#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
2.13% 130744 org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
1.95% 119568 org.apache.lucene.search.grouping.TermGroupSelector#advanceTo()
at org.apache.lucene.search.grouping.SecondPassGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
1.74% 106828 org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$BlockImpactsPostingsEnum#advance()
at org.apache.lucene.search.ConjunctionDISI#doNext()
at org.apache.lucene.search.ConjunctionDISI#nextDoc()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
1.54% 94213 java.nio.ByteBuffer#get()
at java.nio.DirectByteBuffer#get()
at org.apache.lucene.store.ByteBufferGuard#getBytes()
at org.apache.lucene.store.ByteBufferIndexInput#readBytes()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
1.03% 63224 org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countAll()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
1.02% 62342 org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
at org.apache.lucene.queries.intervals.IntervalFilter#nextInterval()
at org.apache.lucene.queries.intervals.IntervalScorer$1#matches()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
1.02% 62230 org.apache.lucene.search.PhraseScorer$1#matches()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
1.01% 61871 org.apache.lucene.index.SingletonSortedSetDocValues#nextDoc()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countAll()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.93% 57187 org.apache.lucene.search.ConjunctionDISI#doNext()
at org.apache.lucene.search.ConjunctionDISI#nextDoc()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
at org.apache.lucene.search.IndexSearcher#search()
0.82% 50023 org.apache.lucene.search.grouping.TermGroupSelector#advanceTo()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.63% 38777 org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.62% 37725 jdk.internal.misc.Unsafe#convEndian()
at jdk.internal.misc.Unsafe#getShortUnaligned()
at java.nio.DirectByteBuffer#getShort()
at java.nio.DirectByteBuffer#getShort()
at org.apache.lucene.store.ByteBufferGuard#getShort()
at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#readShort()
at org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
at org.apache.lucene.util.packed.DirectMonotonicReader#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
0.59% 36204 org.apache.lucene.search.TermScorer#score()
at org.apache.lucene.search.ScoreCachingWrappingScorer#score()
at org.apache.lucene.search.FieldComparator$RelevanceComparator#compareBottom()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#isCompetitive()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
0.56% 34042 org.apache.lucene.codecs.lucene84.Lucene84PostingsReader#findFirstGreater()
at org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$BlockImpactsPostingsEnum#advance()
at org.apache.lucene.search.ConjunctionDISI#doNext()
at org.apache.lucene.search.ConjunctionDISI#nextDoc()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
0.55% 33870 org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
at org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.54% 33294 org.apache.lucene.store.ByteBufferGuard#ensureValid()
at org.apache.lucene.store.ByteBufferGuard#getShort()
at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#readShort()
at org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
at org.apache.lucene.util.packed.DirectMonotonicReader#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.53% 32211 org.apache.lucene.search.ScoreCachingWrappingScorer#score()
at org.apache.lucene.search.FieldComparator$RelevanceComparator#compareBottom()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#isCompetitive()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
0.50% 30812 org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$EverythingEnum#skipPositions()
at org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$EverythingEnum#nextPosition()
at org.apache.lucene.queries.intervals.TermIntervalsSource$1#nextInterval()
at org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
at org.apache.lucene.queries.intervals.IntervalFilter#nextInterval()
at org.apache.lucene.queries.intervals.IntervalScorer$1#matches()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
0.50% 30709 org.apache.lucene.search.grouping.BlockGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.49% 30235 org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
at org.apache.lucene.search.LeafSimScorer#score()
at org.apache.lucene.search.TermScorer#score()
at org.apache.lucene.search.ScoreCachingWrappingScorer#score()
at org.apache.lucene.search.FieldComparator$RelevanceComparator#compareBottom()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#isCompetitive()
at org.apache.lucene.search.grouping.FirstPassGroupingCollector#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
0.48% 29554 org.apache.lucene.search.SloppyPhraseMatcher#maxFreq()
at org.apache.lucene.search.PhraseScorer$1#matches()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
at org.apache.lucene.search.IndexSearcher#search()
0.48% 29119 jdk.internal.misc.Unsafe#convEndian()
at jdk.internal.misc.Unsafe#getIntUnaligned()
at java.nio.DirectByteBuffer#getInt()
at java.nio.DirectByteBuffer#getInt()
at org.apache.lucene.store.ByteBufferGuard#getInt()
at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#readInt()
at org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$21#ordValue()
at org.apache.lucene.search.grouping.TermGroupSelector#advanceTo()
at org.apache.lucene.search.grouping.SecondPassGroupingCollector#collect()
0.44% 26883 org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$21#ordValue()
at org.apache.lucene.index.SingletonSortedSetDocValues#nextDoc()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countAll()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.44% 26770 java.nio.Buffer#position()
at java.nio.ByteBuffer#position()
at java.nio.MappedByteBuffer#position()
at java.nio.MappedByteBuffer#position()
at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl#seek()
at org.apache.lucene.codecs.lucene80.Lucene80DocValuesProducer$15#binaryValue()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.43% 26243 org.apache.lucene.codecs.lucene80.Lucene80NormsProducer$3#longValue()
at org.apache.lucene.search.LeafSimScorer#getNormValue()
at org.apache.lucene.search.LeafSimScorer#score()
at org.apache.lucene.search.TermScorer#score()
at org.apache.lucene.search.BlockMaxConjunctionScorer#score()
at org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1#collect()
at org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
0.42% 25962 org.apache.lucene.index.SingletonSortedSetDocValues#nextOrd()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countAll()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#<init>()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
0.41% 25051 org.apache.lucene.search.Weight$DefaultBulkScorer#scoreAll()
at org.apache.lucene.search.Weight$DefaultBulkScorer#score()
at org.apache.lucene.search.BulkScorer#score()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#search()
at org.apache.lucene.search.IndexSearcher#searchAfter()
at org.apache.lucene.search.IndexSearcher#search()
at perf.SearchTask#go()
at perf.TaskThreads$TaskThread#run()
|
Good point -- that is a one-time cost, amortized over the life of a service that then handles query traffic. Can I somehow instruct JFR to NOT profile at startup, and then trigger (from within my Java program, not via external JMC or so) for JFR to start? |
Mike, there are various ways to do this. For this case, maybe simply filter it out in the analysis? That way you have precise control. You can look at method / stacktrace / whatever to determine whatever logic you want, e.g. ignoring anything coming from Similar stuff is already done in unit tests for the gradle epoll thread: https://github.com/apache/lucene-solr/blob/master/buildSrc/src/main/java/org/apache/lucene/gradle/ProfileResults.java#L101 |
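A minimal sketch of the "filter it out in the analysis" idea, assuming the recording has already been written to a `.jfr` file. It reads `jdk.ExecutionSample` events with the standard `jdk.jfr.consumer` API, drops any sample whose stack passes through a marker substring, and counts the remaining samples by top frame. The class name `FilteredCpuHistogram`, the helper names, and the marker argument are all hypothetical and are not taken from luceneutil or from `ProfileResults.java`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper, not part of luceneutil.
final class FilteredCpuHistogram {

  /** Returns true if any frame in the stack contains the marker substring. */
  static boolean containsFrame(List<String> frames, String marker) {
    for (String frame : frames) {
      if (frame.contains(marker)) {
        return true;
      }
    }
    return false;
  }

  /** Counts samples by top frame, skipping stacks that pass through the marker. */
  static Map<String, Long> countTopFrames(List<List<String>> stacks, String excludeMarker) {
    Map<String, Long> counts = new HashMap<>();
    for (List<String> frames : stacks) {
      if (frames.isEmpty() || containsFrame(frames, excludeMarker)) {
        continue;
      }
      counts.merge(frames.get(0), 1L, Long::sum);
    }
    return counts;
  }

  /** Reads CPU samples from a .jfr file as lists of "Class#method()" strings, top frame first. */
  static List<List<String>> readCpuStacks(Path jfrFile) throws IOException {
    List<List<String>> stacks = new ArrayList<>();
    try (RecordingFile recording = new RecordingFile(jfrFile)) {
      while (recording.hasMoreEvents()) {
        RecordedEvent event = recording.readEvent();
        if (!event.getEventType().getName().equals("jdk.ExecutionSample")) {
          continue;
        }
        RecordedStackTrace trace = event.getStackTrace();
        if (trace == null) {
          continue;
        }
        List<String> frames = new ArrayList<>();
        for (RecordedFrame frame : trace.getFrames()) {
          frames.add(frame.getMethod().getType().getName()
              + "#" + frame.getMethod().getName() + "()");
        }
        stacks.add(frames);
      }
    }
    return stacks;
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 2) {
      System.err.println("usage: FilteredCpuHistogram <file.jfr> <exclude-marker>");
      return;
    }
    Map<String, Long> counts = countTopFrames(readCpuStacks(Path.of(args[0])), args[1]);
    counts.entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
        .limit(30)
        .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
  }
}
```

Filtering after the fact keeps the recording itself complete, so the same `.jfr` file can be re-analyzed later with a different exclusion rule.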
Another alternative (untested, based on my understanding): If you can start the thing programmatically in the spawned jvms, then don't pass all the options as In order to "start it" you then need to issue a diagnostic command. Usually done with |
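For the question of starting JFR from inside the Java program, one option is the `jdk.jfr.Recording` API, which is a different mechanism from the diagnostic command described above. The sketch below is untested against luceneutil and makes several assumptions: the class name `DelayedJfr`, the `warmup` and `measured` callbacks, and the use of the JDK's bundled `default` configuration are all illustrative choices.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper, not part of luceneutil.
final class DelayedJfr {

  /** Runs warmup without profiling, then records only the measured phase into outputFile. */
  static void profileAfterWarmup(Runnable warmup, Runnable measured, Path outputFile)
      throws IOException, ParseException {
    // One-time startup work happens before any recording exists.
    warmup.run();

    // "default" is one of the configurations shipped with the JDK.
    Configuration config = Configuration.getConfiguration("default");
    try (Recording recording = new Recording(config)) {
      recording.start();
      measured.run();
      recording.stop();
      recording.dump(outputFile);
    }
  }

  public static void main(String[] args) throws Exception {
    Path out = Files.createTempFile("search-only", ".jfr");
    profileAfterWarmup(
        () -> System.out.println("startup work, not recorded"),
        () -> System.out.println("query traffic, recorded"),
        out);
    System.out.println("wrote " + out);
  }
}
```

With this shape the JVM is launched without a recording started on the command line, so startup costs never enter the recording in the first place.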
…r results into visible nightly logs
Last night's benchmarks ran with indexer profiling too:
and:
It's curious that building FSTs is the top allocator (by sampling)! And, that I will fix nightly benchmark to include these profiler outputs in the accessible nightly log. |
Hi, I just came across this issue, and as you were discussing JFR, I thought the JfrUnit project might be useful here. It lets you run assertions against JFR events, allowing you to identify regressions e.g. in terms of object allocations or I/O triggered by specific operations. Of course, you could also use it to assert custom JFR events (which in turn you could inject via the JMC Agent project, in case there's no suitable events matching your needs). |
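As a rough illustration of what a custom JFR event looks like, here is a sketch that defines the event directly in source with the `jdk.jfr.Event` API, rather than injecting it through the JMC Agent. The event name `bench.SearchTask`, its fields, and the `runTask` helper are hypothetical and do not exist in luceneutil.

```java
import java.util.function.LongSupplier;
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

// Hypothetical example, not part of luceneutil.
final class CustomEventSketch {

  /** One event per benchmark task; names and fields are illustrative only. */
  @Name("bench.SearchTask")
  @Label("Search Task")
  @Category("Benchmark")
  static class SearchTaskEvent extends Event {
    @Label("Task Category")
    String taskCategory;

    @Label("Hit Count")
    long hitCount;
  }

  /** Runs the task and, if a recording has the event enabled, commits one timed event. */
  static long runTask(String category, LongSupplier task) {
    SearchTaskEvent event = new SearchTaskEvent();
    event.begin();
    long hits = task.getAsLong();
    event.end();
    if (event.shouldCommit()) {
      event.taskCategory = category;
      event.hitCount = hits;
      event.commit();
    }
    return hits;
  }

  public static void main(String[] args) {
    long hits = runTask("Term", () -> 42L);
    System.out.println("hits=" + hits);
  }
}
```

When no recording is active, `shouldCommit()` returns false and nothing is written, so the instrumentation can stay in place during runs that are not being profiled.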
I think the benchmark results would be more useful if we could see some basic analysis of the code under load. We could imagine summary tables, graphs, flame charts, whatever is needed to knock out the hotspots and keep them under control.
Instead of seeing just that performance dropped or improved, we might be able to reason about why: it could make things better for developers. And maybe, just maybe, we have some giant frogs that have been boiling for years without anybody noticing: you know, easy wins to improve perf that went undetected.
As a start I imagine breaking the problem down into two profiling mechanisms: