Cascaded grouping [LUCENE-3122] #4195

Open
asfimport opened this issue May 19, 2011 · 5 comments

asfimport commented May 19, 2011

Similar to SOLR-2526, in that you are grouping on two separate fields, but instead of treating those fields as a single grouping by a compound key, this change would let you first group on key1 for the primary groups and then group on key2 within each primary group.

I.e., the result you get back would have groups A, B, C (grouped by key1), but the documents within each group (say, group A) would then be grouped by key2.

I think this will be important for apps whose documents are the product of denormalizing, i.e. where the Lucene document is really a sub-document of a different identifier field. Borrowing an example from #4170: you have doctors, but each doctor may have multiple offices (addresses) where they practice, so you index doctor X address as your Lucene documents. In this case, your "identifier" field (the one that "counts" for facets and should be "grouped" for presentation) is doctorid. When you offer users search over this index, you'd likely want to 1) group by distance (i.e., < 0.1 miles, < 0.2 miles, etc., as a function query), but 2) also group by doctorid within each distance group, i.e. cascaded grouping.
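
To make the doctor example concrete, here is a minimal sketch of the result shape, in plain Python rather than against the Lucene grouping API. The field names (`doctorid`, `address`, `distance`), the bucket thresholds, and the helpers `distance_bucket` and `cascaded_group` are all hypothetical, chosen only for illustration, and the documents are assumed to arrive already in sort order.

```python
def distance_bucket(doc, thresholds=(0.1, 0.2, 0.5)):
    """Label a doc with the first distance threshold it falls under (hypothetical buckets)."""
    for t in thresholds:
        if doc["distance"] < t:
            return f"< {t} miles"
    return f">= {thresholds[-1]} miles"


def cascaded_group(docs, key1, key2):
    """Group docs by key1, then group the docs inside each primary group by key2."""
    groups = {}  # dicts keep insertion order, so groups appear in first-seen order
    for doc in docs:
        inner = groups.setdefault(key1(doc), {})
        inner.setdefault(key2(doc), []).append(doc)
    return groups


# One "document" per doctor X address pair, as in the denormalized index.
docs = [
    {"doctorid": "X", "address": "1 Main St", "distance": 0.05},
    {"doctorid": "Y", "address": "9 Oak Ave", "distance": 0.08},
    {"doctorid": "X", "address": "22 Elm Rd", "distance": 0.15},
    {"doctorid": "X", "address": "3 Pine Ct", "distance": 0.07},
]

result = cascaded_group(docs, distance_bucket, lambda d: d["doctorid"])

for bucket, by_doctor in result.items():
    print(bucket)
    for doctorid, offices in by_doctor.items():
        print("  ", doctorid, [o["address"] for o in offices])
```

Here doctor X shows up once under the nearest bucket with two offices and once under the next bucket with one, rather than as three separate hits.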

I suspect this would be easier to implement than it sounds: the per-group collector used by the second-pass grouping collector for key1's grouping just needs to be another grouping collector. Spookily, though, that collection would also have to be 2-pass, so it could get tricky since grouping is sort of recursing on itself... Once we have #4185, though, that should enable efficient single-pass grouping by the identifier (doctorid).
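
The "per-group collector is just another grouping collector" idea can be sketched roughly as below. This is a simplified, single-pass illustration in plain Python, not Lucene's collector API: `ListCollector` and `GroupingCollector` are hypothetical names, and the sketch deliberately ignores the two-pass aspect (choosing the top groups first, then collecting documents for them) that makes the real thing tricky.

```python
class ListCollector:
    """Leaf collector: simply keeps the docs it is given."""

    def __init__(self):
        self.docs = []

    def collect(self, doc):
        self.docs.append(doc)

    def result(self):
        return self.docs


class GroupingCollector:
    """Routes each doc to a child collector chosen by a grouping key.

    The child can be any collector, including another GroupingCollector,
    which is what makes the grouping cascade.
    """

    def __init__(self, key, make_child):
        self.key = key                # function: doc -> group value
        self.make_child = make_child  # factory for the per-group collector
        self.children = {}

    def collect(self, doc):
        group = self.key(doc)
        child = self.children.get(group)
        if child is None:
            child = self.children[group] = self.make_child()
        child.collect(doc)

    def result(self):
        return {group: child.result() for group, child in self.children.items()}


# Primary grouping on key1; each primary group collects with a key2 grouping.
cascade = GroupingCollector(
    lambda d: d["key1"],
    lambda: GroupingCollector(lambda d: d["key2"], ListCollector),
)

for doc in [
    {"id": 1, "key1": "A", "key2": "x"},
    {"id": 2, "key1": "B", "key2": "x"},
    {"id": 3, "key1": "A", "key2": "y"},
    {"id": 4, "key1": "A", "key2": "x"},
]:
    cascade.collect(doc)

nested = cascade.result()
```

Because the child is supplied through a factory, adding a third level would only mean nesting one more `GroupingCollector`; the open question in this issue is how that composes with two-pass collection.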


Migrated from LUCENE-3122 by Michael McCandless (@mikemccand), updated May 09 2016

Robert Muir (@rmuir) (migrated from JIRA)

bulk move 3.2 -> 3.3

Steven Rowe (@sarowe) (migrated from JIRA)

Bulk move 4.4 issues to 4.5 and 5.0

Furkan Kamaci (migrated from JIRA)

@mikemccand Could you explain this issue a bit more?

Otis Gospodnetic (@otisg) (migrated from JIRA)

Is this about grouping or about faceting (à la Solr pivot faceting)?

Uwe Schindler (@uschindler) (migrated from JIRA)

Move issue to Lucene 4.9.
