Highlighting the actual state observed in LUCENE-9328 #1625

mkhludnev · 2020-06-28T10:38:50Z

Hi, @ctargett. I'm not sure if it's valid to strike through, or if it's better to just drop this word?

ctargett · 2020-06-29T13:45:53Z

solr/solr-ref-guide/src/docvalues.adoc

@@ -24,7 +24,7 @@ The standard way that Solr builds the index is with an _inverted index_. This st

 For other features that we now commonly associate with search, such as sorting, faceting, and highlighting, this approach is not very efficient. The faceting engine, for example, must look up each term that appears in each document that will make up the result set and pull the document IDs in order to build the facet list. In Solr, this is maintained in memory, and can be slow to load (depending on the number of documents, terms, etc.).

-In Lucene 4.0, a new approach was introduced. DocValue fields are now column-oriented fields with a document-to-value mapping built at index time. This approach promises to relieve some of the memory requirements of the fieldCache and make lookups for faceting, sorting, and grouping much faster.
+In Lucene 4.0, a new approach was introduced. DocValue fields are now column-oriented fields with a document-to-value mapping built at index time. This approach promises to relieve some of the memory requirements of the fieldCache and make lookups for faceting, sorting much faster.


If you're going to take a word out of the list and leave only 2 items, you should remove the comma and add "and" instead: ...make lookups for faceting and sorting much faster.

ctargett · 2020-06-29T13:56:43Z

I commented with a specific recommendation on the change, but in general I wonder whether it's worth it. LUCENE-9328 is marked as an Improvement, which implies to me that we never should have expected grouping to be faster with docValues. If that's the case, then the docs have been incorrect and this change makes sense.

However, if grouping was correctly documented as something that should be faster but isn't today because of a regression, then the Jira should be a Bug and this change makes less sense. We don't document every single bug in Solr or Lucene in the Ref Guide: we'd have hundreds of tiny edits as things break and get fixed, and we'd miss a lot of them. There are obviously differences of degree here. If SSL totally broke for an entire release, we'd probably want to document that, but grouping being slower for a release or two (if that's the case; I didn't study the Jiras, so maybe it's been longer) is a less pressing edit, IMO.

A related point: if this has been wrong all along but is now going to be supported in the upcoming release (I'm not sure of the timing of LUCENE-9328), making this change now means you'll have to add the word back in another commit before release, and it wouldn't be worth removing it now.

mkhludnev · 2020-08-31T14:51:53Z

it wouldn't be worth removing it now.

OK, that's fair. Thanks, @ctargett.

mkhludnev added 2 commits June 28, 2020 13:37

Highlighting the actual state observed in LUCENE-9328

a123a62

Update docvalues.adoc

0598235

ctargett reviewed Jun 29, 2020

mkhludnev closed this Aug 31, 2020

asfimport mentioned this pull request Sep 5, 2020

Sorting by DocValues while grouping is slower than old good FieldCache [LUCENE-9328] apache/lucene#10368

Open
