Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scroll search with ASC sort order on missing fields does not terminate #9136

Closed
gokl opened this issue Jan 5, 2015 · 4 comments
Closed

Scroll search with ASC sort order on missing fields does not terminate #9136

gokl opened this issue Jan 5, 2015 · 4 comments
Assignees
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v1.4.3

Comments

@gokl
Copy link

gokl commented Jan 5, 2015

Hi,

doing an ASC sorted scroll search on a field that only exists in the mapping but not in the document yields to a non terminating scroll. It always returns the documents that don't have the field on which we sorted.

POST test/test/1
{
    "data":"test"
}

PUT test/test/_mapping
{
   "test": {
      "properties": {
         "data": {
            "type": "string"
         },
         "bad": {
            "type": "string"
         }
      }
   }
}

POST test/test/_search?scroll=1m
{
    "sort": [
       {
          "bad": {
             "missing" : "_last",
             "order": "asc"
          }
       }
    ]
}

GET _search/scroll?scroll=1m&scroll_id=FILL_IN_SCROLL_ID
GET _search/scroll?scroll=1m&scroll_id=FILL_IN_SCROLL_ID

Both _search/scroll requests should not return documents because the initial search already returned the document with id 1, but they do return the document.

However, using either a DESC sort order or "missing":"_first" makes the scroll work/terminate as expected.

@gokl gokl changed the title Scroll search with ASC sort order on non existing fields does not terminate Scroll search with ASC sort order on non missing fields does not terminate Jan 5, 2015
@clintongormley
Copy link

Nice catch! thanks @gokl

@clintongormley clintongormley added >bug :Search/Search Search-related issues that do not fall into other categories labels Jan 5, 2015
@clintongormley
Copy link

@martijnvg could you have a look at this please?

@gokl gokl changed the title Scroll search with ASC sort order on non missing fields does not terminate Scroll search with ASC sort order on missing fields does not terminate Jan 5, 2015
@jpountz
Copy link
Contributor

jpountz commented Jan 6, 2015

This is related to #9155. When sorting missing values last on a string field, we artificially replace null with an hypothetic maximum term so that the coordinating node knows how to compare this document to documents coming from other shards. But we forgot to do the opposite operation in setTopValue which is used for paging. So the comparator is feeded with this maximum term, which compares lower than a document without a value given that this maximum term IS a value (we should provide the comparator with null). Here is an untested patch that should fix the issue:

diff --git a/src/main/java/org/elasticsearch/index/fielddata/fieldcomparator/BytesRefFieldComparatorSource.java b/src/main/java/org/elasticsearch/index/fielddata/fieldcomparator/BytesRefFieldComparatorSource.java
index 7e64720..66fa306 100644
--- a/src/main/java/org/elasticsearch/index/fielddata/fieldcomparator/BytesRefFieldComparatorSource.java
+++ b/src/main/java/org/elasticsearch/index/fielddata/fieldcomparator/BytesRefFieldComparatorSource.java
@@ -97,11 +97,21 @@ public class BytesRefFieldComparatorSource extends IndexFieldData.XFieldComparat
                     // TopDocs.merge deal with it (it knows how to)
                     BytesRef value = super.value(slot);
                     if (value == null) {
+                        assert sortMissingFirst(missingValue) || sortMissingLast(missingValue);
                         value = missingBytes;
                     }
                     return value;
                 }

+                public void setTopValue(BytesRef topValue) {
+                    // symetric of value(int): if we need to feed the comparator with <tt>null</tt>
+                    // if we overrode the value with MAX_TERM in value(int)
+                    if (topValue == missingBytes && (sortMissingFirst(missingValue) || sortMissingLast(missingValue))) {
+                        topValue = null;
+                    }
+                    super.setTopValue(topValue);
+                }
+                
             };
         }

But I hope we can come with a cleaner solution on 2.x by doing #9155

@martijnvg
Copy link
Member

@jpountz This proposed fix fixes the reported bug, so +1 for adding this initially to master, 1.x and 1.4 branches and then work on a better fix in master.

jpountz added a commit to jpountz/elasticsearch that referenced this issue Jan 6, 2015
For the comparator to work correctly, we need to give it the same value in
`setTopValue` as the value that it gave back in `value`.

Close elastic#9136
@jpountz jpountz closed this as completed in 4cb23a0 Jan 6, 2015
jpountz added a commit that referenced this issue Jan 6, 2015
For the comparator to work correctly, we need to give it the same value in
`setTopValue` as the value that it gave back in `value`.

Close #9136
jpountz added a commit that referenced this issue Jan 6, 2015
For the comparator to work correctly, we need to give it the same value in
`setTopValue` as the value that it gave back in `value`.

Close #9136
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
For the comparator to work correctly, we need to give it the same value in
`setTopValue` as the value that it gave back in `value`.

Close elastic#9136
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v1.4.3
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants