DuplicateFilter in _search #1405

Hi,

I have indexed documents in which several documents can share the same value for a given field (duplicates). At search time I want to return only the highest-scoring document from each duplicate class (the group of documents that have the same field value).
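To make the intended behaviour concrete, here is a rough Python sketch of the deduplication done client-side on a list of hits. It is only an illustration of the semantics, not a proposed implementation: the function name `dedupe_hits` and the sample hits are made up, and the hit layout (`_id`, `_score`, `_source`) is assumed to mirror a search response.

```python
def dedupe_hits(hits, field):
    """Keep only the highest-scoring hit for each distinct value of `field`.

    Hypothetical helper: `hits` is assumed to be a list of dicts shaped like
    search response hits, with "_score" and "_source" keys.
    """
    best = {}
    for hit in hits:
        # Hits missing the field all share the key None, so they are
        # treated as one duplicate class in this sketch.
        key = hit["_source"].get(field)
        current = best.get(key)
        # On equal scores, the hit seen first is kept.
        if current is None or hit["_score"] > current["_score"]:
            best[key] = hit
    # Return the survivors ordered by descending score.
    return sorted(best.values(), key=lambda h: h["_score"], reverse=True)


# Made-up sample data: two documents share name "alexis".
hits = [
    {"_id": "1", "_score": 1.2, "_source": {"name": "alexis"}},
    {"_id": "2", "_score": 2.5, "_source": {"name": "alexis"}},
    {"_id": "3", "_score": 0.7, "_source": {"name": "bob"}},
]
deduped = dedupe_hits(hits, "name")
```

Doing this on the client breaks paging and total counts, which is why I would like it to happen inside the search itself.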

There has been a discussion about deduplication on the Lucene mailing list:
http://www.mail-archive.com/java-user@lucene.apache.org/msg21437.html

Would it be possible to expose the DuplicateFilter from Lucene contrib as a search filter? See:
https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/sandbox/src/test/org/apache/lucene/sandbox/queries/DuplicateFilterTest.java

The search request would then look like this:

```
{
    "query": {
        "term": { "name": "alexis" }
    },
    "filter": {
        "duplicate": { "field": "name" }
    }
}
```

Let me know if that makes sense...