
size:1 on a query applied to _all indexes only returns from the newest index, size:2 returns from all. #6614

Closed
weberr13 opened this issue Jun 25, 2014 · 24 comments

@weberr13

Our system is searching for the oldest record that hits a criteria. We attempted to accomplish this with a size limited ascending range query with bool filters.

On ES 0.90 and 1.1.2 the following queries were run:

curl -XGET http://localhost:9200/_all/meta/_search -d '{ "sort" : [ { "TimeUpdated" : { "order": "asc", "ignore_unmapped" : true } } ], query : {filtered : { filter : { bool : { must : [ { term : {"Written" : true }} , { term : { "LatestUpdate" : true } } , { range : { "TimeUpdated" : { gte : "2013/05/29 20:51:00" } } } ] } } } , _cache : false , from : 0,size : 1 , "fields" : [ "Session", "TimeUpdated", "TimeStart" ] } }'

{"took":616,"timed_out":false,"_shards":{"total":65,"successful":65,"failed":0},"hits":{"total":42959661,"max_score":null,"hits":[{"_index":"network_2014_06_25","_type":"meta","_id":"f3bb5501-ce17-4f13-a262-48b89e175201_2","_score":null,"fields":{"TimeStart":["2014/06/25 03:03:49"],"TimeUpdated":["2014/06/25 03:25:49"],"Session":["f3bb5501-ce17-4f13-a262-48b89e175201"]},"sort":[1403666749000]}]}}

The indices are built by date; this one, "network_2014_06_25", is the newest. However, adding one to the size gives a radically different result:

curl -XGET http://localhost:9200/_all/meta/_search -d '{ "sort" : [ { "TimeUpdated" : { "order": "asc", "ignore_unmapped" : true } } ], query : {filtered : { filter : { bool : { t : [ { term : {"Written" : true }} , { term : { "LatestUpdate" : true } } , { range : { "TimeUpdated" : { gte : "2013/05/29 20:51:00" } } } ] } } } , _cache : false , from : 0,size : 2 , "fields" : [ "Session", "TimeUpdated", "TimeStart" ] } }'

{"took":770,"timed_out":false,"_shards":{"total":65,"successful":65,"failed":0},"hits":{"total":42973532,"max_score":null,"hits":[{"_index":"network_2014_05_25","_type":"meta","_id":"b97c12c0-b863-49d2-9355-483aee55de16_1","_score":null,"fields":{"TimeStart":["2014/05/25 01:13:05"],"TimeUpdated":["2014/05/25 01:14:04"],"Session":["b97c12c0-b863-49d2-9355-483aee55de16"]},"sort":[1400980444000]},{"_index":"network_2014_06_25","_type":"meta","_id":"bdc9d4b3-4e60-4667-b66a-db831cf2f59b_2","_score":null,"fields":{"TimeStart":["2014/06/25 03:03:49"],"TimeUpdated":["2014/06/25 03:25:49"],"Session":["bdc9d4b3-4e60-4667-b66a-db831cf2f59b"]},"sort":[1403666749000]}]}}

Further increasing the size keeps returning the correct "b97c12c0-b863-49d2-9355-483aee55de16_1" record, rather than the "bdc9d4b3-4e60-4667-b66a-db831cf2f59b_2" record, which is certainly not the oldest.

@areek areek assigned rjernst and unassigned rjernst Jun 29, 2014
@clintongormley

Hi @weberr13

Your request is nested incorrectly:

GET /_all/meta/_search
{
  "sort": [
    {
      "TimeUpdated": {
        "order": "asc",
        "ignore_unmapped": true
      }
    }
  ],
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "Written": true
              }
            },
            {
              "term": {
                "LatestUpdate": true
              }
            },
            {
              "range": {
                "TimeUpdated": {
                  "gte": "2013/05/29 20:51:00"
                }
              }
            }
          ]
        }
      }
    },
    "_cache": false,
    "from": 0,
    "size": 1,
    "fields": [
      "Session",
      "TimeUpdated",
      "TimeStart"
    ]
  }
}

_cache, from and size are at the wrong level completely.

Please could you fix the nesting and see if that resolves the issue? If it doesn't, add a comment with the correct query and I'll take another look.

@weberr13 (author) commented Jul 1, 2014

I don't understand the scoping problem. If the size is outside the "query" it is ignored; if it is inside the "filtered" it doesn't function. The size applies to the query, and it is scoped as such. If "size" were at the wrong level, wouldn't it be ignored rather than returning the wrong single result? And wouldn't the query for 2 then behave the same way (returning only from the latest index, instead of one hit that is the oldest and one from the latest)? All logic seems to point to an "off by one" error in how the size parameter interacts with sorted results...

@clintongormley

from, size and fields should all be at the same level as query. _cache should be inside the bool filter.

It is quite possible, especially on older versions, that Elasticsearch hits an unknown key and simply stops parsing rather than throwing an error.
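
For illustration, a minimal request with every key at the level described above might look like the following. (This is just a sketch: the bool is trimmed to a single term clause, not your full filter.)

curl -XGET "http://localhost:9200/_all/meta/_search" -d'
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [ { "term": { "Written": true } } ],
          "_cache": false
        }
      }
    }
  },
  "fields": [ "Session", "TimeUpdated", "TimeStart" ],
  "from": 0,
  "size": 1
}'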

@weberr13 (author) commented Jul 1, 2014

When I move size up to the same level as query, it is ignored, as I said. The only way to get one result is to write the query as above. It isn't about unknown keys either: all the results are valid; it is the sorting that is off. Basically, this is what is occurring:

Index 2 days ago: 2 records (ids a, b)
Index 1 day ago: 2 records (ids c, d)
Index today: 2 records (ids e, f)

When I query and limit the size to 10 (default) I get

a, b, c, d, e, f

When I query and limit the size to 5 I get

a, b, c, d, e

When I limit the size to 4 I get

a, b, c, e

3:

a, b, e

2:

a, e

1:

e

Do you see the issue?

@clintongormley

@weberr13 Yes I see the issue. What I don't see is a query that actually runs. The query that you sent me is invalid and throws an error on 1.2.1: ElasticsearchParseException[Expected field name but got VALUE_BOOLEAN "_cache"]

You can see from the documentation that from and size should be at the same level as query: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-from-size.html

Also, that _cache should be applied to a particular filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filters.html#caching such as the bool filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html#_caching_2

So either you sent me a different query than the one you posted originally, or you are running a broken query.

I'm not saying there isn't an issue. All I'm saying is that you haven't given me the information that I need to figure it out. What would be awesome is if you could give me a short recreation of the problem, with curl statements which

  • create the indices, plus mapping
  • insert documents
  • search, demonstrating the problem

With that in hand, it becomes a lot easier to figure out if there is a real problem or not.

@weberr13 (author) commented Jul 1, 2014

I'm running on 1.1.2 (as I said in the previous post). If you want to test on your version, go ahead and pull the size up to the same level as query, but for my install that doesn't work (it ignores the size parameter altogether).

As for a recreation, any data that can be sorted ascending on date would work.
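
For example, a rough generator along these lines (names and field layout are only modelled on this thread, not on our real mappings) would spread sortable timestamps across three daily indices:

#!/bin/bash
# Create three daily indices, each with documents whose TimeUpdated
# values sort ascending across the indices.
ES=http://localhost:9200
for day in 01 02 03; do
  for n in $(seq 1 1000); do
    printf -v hh '%02d' $(( n % 24 ))
    curl -s -XPUT "$ES/index_2014-07-$day/meta/${day}_${n}" -d "{
      \"Written\": true,
      \"LatestUpdate\": true,
      \"TimeUpdated\": \"2014/07/$day $hh:00:00\"
    }" > /dev/null
  done
done
curl -s -XPOST "$ES/_refresh"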

@clintongormley

@weberr13 the size parameter has only ever been accepted at the same level as the query, right since the first release. The only thing that has changed is whether an error is thrown or not.

Also, I've just looked at your second query which has another error: your bool query has a t key which is meaningless.

Please send me an actual working recreation. For instance, from the data you've provided, I've written the following, which works exactly as it is supposed to work:

DELETE /_all

PUT /index_2014-07-01/meta/1
{
  "foo": "bar",
  "timeUpdated": "2014/07/01 01:00:00"
}
PUT /index_2014-07-01/meta/2
{
  "foo": "bar",
  "timeUpdated": "2014/07/01 02:00:00"
}
PUT /index_2014-07-02/meta/3
{
  "foo": "bar",
  "timeUpdated": "2014/07/02 01:00:00"
}
PUT /index_2014-07-02/meta/4
{
  "foo": "bar",
  "timeUpdated": "2014/07/02 02:00:00"
}
PUT /index_2014-07-03/meta/5
{
  "foo": "bar",
  "timeUpdated": "2014/07/03 01:00:00"
}
PUT /index_2014-07-03/meta/6
{
  "foo": "bar",
  "timeUpdated": "2014/07/03 02:00:00"
}

GET /_all/meta/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "foo": "bar"
              }
            },
            {
              "range": {
                "timeUpdated": {
                  "gt": "2014/06/01"
                }
              }
            }
          ],
          "_cache": false
        }
      }
    }
  },
  "from": 0,
  "size": 1,
  "sort": [
    {
      "timeUpdated": {
        "order": "asc",
        "ignore_unmapped": true
      }
    }
  ]
}

So I am currently unable to recreate your issue. If you are able to recreate it and can send me that recreation, that would be very useful.

@weberr13 (author) commented Jul 1, 2014

I literally cut and pasted from my terminal when I submitted the bug. I'm sorry that you cannot believe me.

@clintongormley

@weberr13 It's not that I don't believe what you're seeing, but I'm telling you that what you copied and pasted is invalid. So before we can decide if there is a bug or not, we need a valid recreation.

I demonstrated a recreation above that shows that (for this case) things work as expected. Of course there may be some other issue that does indeed have a bug, but unless you give me that information I am unable to figure out what causes it.

Have you tried my recreation? Does it work for you? Do you not believe me?

@weberr13 (author) commented Jul 1, 2014

Here is what happens:

First, the query as I originally posted it:

[weberr@probe Upgrade]$ curl -XGET http://localhost:9200/_all/meta/_search -d '{
"sort" : [ { "TimeUpdated" : { "order": "asc", "ignore_unmapped" : true } } ],
query : {
filtered : {
filter : {
bool : {
must : [ { term : {"Written" : true }} , { term : { "LatestUpdate" : true } } , { range : { "TimeUpdated" : { gte : "2013/05/29 20:51:00" } } } ]
}
}
} , _cache : false , from : 0,size : 1 , "fields" : [ "Session", "TimeUpdated", "TimeStart" ]
}
}'
{"took":54,"timed_out":false,"_shards":{"total":65,"successful":65,"failed":0},"hits":{"total":1994381,"max_score":null,"hits":[{"_index":"network_2014_06_26","_type":"meta","_id":"af210e47-0bd6-4609-8f0d-aaa41f1e8aae_1","_score":null,"fields":{"TimeStart":["2014/06/26 01:05:21"],"TimeUpdated":["2014/06/26 01:06:03"],"Session":["af210e47-0bd6-4609-8f0d-aaa41f1e8aae"]},"sort":[1403744763000]}]}}

The query as you describe (where I moved _cache, from and size outside of the query block):

curl -XGET http://localhost:9200/_all/meta/_search -d '{
"sort" : [ { "TimeUpdated" : { "order": "asc", "ignore_unmapped" : true } } ],
query : {
filtered : {
filter : {
bool : {
must : [ { term : {"Written" : true }} , { term : { "LatestUpdate" : true } } , { range : { "TimeUpdated"
: { gte : "2013/05/29 20:51:00" } } } ]
}
}
} , "fields" : [ "Session", "TimeUpdated", "TimeStart" ]
} , _cache : false , from : 0,size : 1
}'

I get:

{"took":50,
"timed_out":false,
"_shards":{"total":65,
"successful":65,
"failed":0},
"hits":{"total":1994381,
"max_score":null,
"hits":[{"_index":"network_2014_06_20",
"_type":"meta",
"_id":"355a05fc-27dc-4bed-beb3-2df23afeebdb_1",
"_score":null,
"_source" : {"TimePreviousRaw":1403253269,
"TimePrevious":"2014/06/20 08:34:29",
"Host":["usbo1pdc02.schq.secious.com"],
"QDCount":1,
"DestIP":"10.128.64.252",
"ServerAddr":"10.128.64.252",
"TTL":3600,
"TransactionID":15468,
"DestBytesDelta":670,
"SrcBytesDelta":374,
"FlowID":2529124,
"Session":"355a05fc-27dc-4bed-beb3-2df23afeebdb",
"SrcIP":"10.1.4.184",
"TimeUpdated":"2014/06/20 08:35:28",
"Protocol":17,
"SrcMAC":"00:02:55:4f:da:3a",
"ThreadID":6,
"Written":true,
"FlowSessionCount":0,
"ApplicationID":794,
"SrcPort":46025,
"StoragePath":"/pcap0/9041/355a05fc-27dc-4bed-beb3-2df23afeebdb",
"DestPort":53,
"Application":"dns",
"CnxDuration":"1.236548",
"ANCount":0,
"TimeStartRaw":1403253269,
"FamilyEnd":["Network Service"],
"MessageSize":990544,
"Query":["qa-aix61.schq.secious.com",
"qa-aix61.schq.secious.com.schq.secious.com"],
"ApplicationPath":"/udp/dns",
"TimeStart":"2014/06/20 08:34:29",
"Dev":["bond0"],
"TotalBytes":1044,
"ARCount":0,
"Duration":59,
"StartTime":"2014/06/20 08:34:29.728553",
"ChildFlowNumber":1,
"FieldCount":10,
"Captured":true,
"TotalBytesDelta":1044,
"ResponseTime":"1970/01/01 00:00:00.000138",
"PacketsDelta":8,
"LatestUpdate":true,
"TimeDelta":59,
"NSCount":1,
"DestBytes":670,
"SrcBytes":374,
"Name":["schq.secious.com"],
"TimeUpdatedRaw":1403253328,
"CreateTime":"2014/06/20 08:34:29.728553",
"ClientAddr":"10.1.4.184",
"FlowCompleted":true,
"Family":["Network Service"],
"DestMAC":"f0:f7:55:dc:a8:7f",
"SessionPktCounter":4,
"TotalPackets":8},
"sort":[1403253328000]},
{"_index":"network_2014_06_20",
"_type":"meta",
"_id":"bfefefa3-1c0e-4b97-ac16-a54c711be208_1",
"_score":null,
"_source" : {"TimePreviousRaw":1403253084,
"TimePrevious":"2014/06/20 08:31:24",
"DestIP":"74.125.227.134",
"ServerName":["safebrowsing-cache.google.com"],
"ServerAddr":"74.125.227.134",
"SeqNum":2080575399,
"DestBytesDelta":18751,
"SrcBytesDelta":6388,
"RetransmissionBytes":412,
"FlowID":2520449,
"Session":"bfefefa3-1c0e-4b97-ac16-a54c711be208",
"SrcIP":"10.1.20.62",
... continues for 600 lines...

Where do you suggest I put the size parameter in order for the query to work? Even though it returns the wrong entry, the original at least works. What's more, by moving it, the fields block suddenly stopped working as well. I believe you when you say you can't reproduce it, but I need something from you that I can run.

@clintongormley

Try this:

  1. Move the fields parameter to the same level as query, from and size
  2. Remove the _cache parameter as it is completely out of place there

The query:

curl -XGET "http://localhost:9200/_all/meta/_search" -d'
{
  "sort": [
    {
      "TimeUpdated": {
        "order": "asc",
        "ignore_unmapped": true
      }
    }
  ],
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "Written": true
              }
            },
            {
              "term": {
                "LatestUpdate": true
              }
            },
            {
              "range": {
                "TimeUpdated": {
                  "gte": "2013/05/29 20:51:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "fields": [
    "Session",
    "TimeUpdated",
    "TimeStart"
  ],
  "from": 0,
  "size": 1
}'

What does this give you?

@weberr13 (author) commented Jul 1, 2014

The above reproduces the issue:

{"took":8512,"timed_out":false,"_shards":{"total":65,"successful":2,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}[weberr@probe Upgrade]$ sh foo
{"took":293,"timed_out":false,"_shards":{"total":65,"successful":65,"failed":0},"hits":{"total":1994381,"max_score":null,"hits":[{"_index":"network_2014_06_26","_type":"meta","_id":"af210e47-0bd6-4609-8f0d-aaa41f1e8aae_1","_score":null,"fields":{"TimeStart":["2014/06/26 01:05:21"],"TimeUpdated":["2014/06/26 01:06:03"],"Session":["af210e47-0bd6-4609-8f0d-aaa41f1e8aae"]},"sort":[1403744763000]}]}}

with size 2:
{"took":62,"timed_out":false,"_shards":{"total":65,"successful":65,"failed":0},"hits":{"total":1994381,"max_score":null,"hits":[{"_index":"network_2014_06_20","_type":"meta","_id":"355a05fc-27dc-4bed-beb3-2df23afeebdb_1","_score":null,"fields":{"TimeStart":["2014/06/20 08:34:29"],"TimeUpdated":["2014/06/20 08:35:28"],"Session":["355a05fc-27dc-4bed-beb3-2df23afeebdb"]},"sort":[1403253328000]},{"_index":"network_2014_06_26","_type":"meta","_id":"250a9777-ff40-46f4-acdf-c9d8ac341d8d_1","_score":null,"fields":{"TimeStart":["2014/06/26 01:05:17"],"TimeUpdated":["2014/06/26 01:06:04"],"Session":["250a9777-ff40-46f4-acdf-c9d8ac341d8d"]},"sort":[1403744764000]}]}}

@clintongormley

OK great - now we're getting somewhere. Please could you run the same query a few times in a row and make sure that you get back the same results every time. I'm wondering if your primary and replica shards have diverged.
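
One way to compare copies (a sketch, with the filter trimmed to a single term clause) is to pin the search to primary shards via the standard preference parameter and compare that against a default run:

curl -XGET "http://localhost:9200/_all/meta/_search?preference=_primary" -d'
{
  "sort": [ { "TimeUpdated": { "order": "asc", "ignore_unmapped": true } } ],
  "query": { "filtered": { "filter": { "term": { "Written": true } } } },
  "fields": [ "Session", "TimeUpdated" ],
  "from": 0,
  "size": 1
}'

If the _primary run and the default (round-robin) run disagree, the primary and replica copies have diverged.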

@clintongormley clintongormley reopened this Jul 1, 2014
@clintongormley clintongormley self-assigned this Jul 1, 2014
@weberr13 (author) commented Jul 1, 2014

I re-ran it several times with the same results. We've actually seen this on no fewer than 8 test machines over the last month or more (I'm not sure when our tester first saw it).

@clintongormley

hi @weberr13

We've tried to replicate this on 1.1.2 with lots of randomized testing, and so far have been unable to do so. Are you able to recreate a small test case which (when run from scratch) replicates the problem? Alternatively, would you be able to upload all of your data somewhere so that we can replicate the problem locally?

You can send a link to the data to me personally if you'd prefer: clinton.gormley at elasticsearch.com

@clintongormley

You can see the test we're running here: 79309dc

@weberr13 (author) commented Jul 2, 2014

I'm confused: does that test successfully recreate the issue, or do you still need data from me?

@s1monw (contributor) commented Jul 2, 2014

This test passes every time; I ran it for over 10k iterations with random numbers of shards, nodes, etc., so it does NOT recreate the issue.

@clintongormley

Hi @weberr13

Have you managed to put together a test case for this yet? We are unable to investigate further until we can reproduce the problem. If the only way you can do this is to send me your indices, I'd be happy to take a look at them too.

@weberr13 (author) commented Jul 8, 2014

I am still figuring out a way to do this. I'd like to send you our indexes, but they contain Deep Packet Inspection metadata for our engineering department and are not suitable for public consumption. If there were a way to produce an index that only had the

"fields": [
"Session",
"TimeUpdated",
"TimeStart"
]

data I could send that.
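
Something along these lines might work for that (just a sketch: scan/scroll with _source filtering into a new index via the bulk API; assumes jq is installed, and network_sanitized is a made-up destination name):

#!/bin/bash
ES=http://localhost:9200
SRC=network_2014_06_25
DEST=network_sanitized

# Open a scan/scroll over the source index, keeping only the three fields.
SCROLL_ID=$(curl -s "$ES/$SRC/meta/_search?search_type=scan&scroll=5m&size=500" -d '{
  "_source": [ "Session", "TimeUpdated", "TimeStart" ],
  "query": { "match_all": {} }
}' | jq -r '._scroll_id')

while true; do
  RESP=$(curl -s "$ES/_search/scroll?scroll=5m" -d "$SCROLL_ID")
  [ "$(echo "$RESP" | jq '.hits.hits | length')" -eq 0 ] && break
  # Emit one bulk action line plus one trimmed document per hit.
  echo "$RESP" | jq -c --arg dest "$DEST" \
      '.hits.hits[] | {index:{_index:$dest,_type:._type,_id:._id}}, ._source' \
    | curl -s "$ES/_bulk" --data-binary @- > /dev/null
  SCROLL_ID=$(echo "$RESP" | jq -r '._scroll_id')
done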

@clintongormley

@weberr13 Understood - reindexing would be the only way, but that may change something else which "fixes" the issue. I'm happy to receive your indexes privately - just email me a link (but I understand if you can't share them with me).

@clintongormley

@weberr13 Any chance of that recreation?

@weberr13 (author)

At this point I guess we cannot go further. We have a workaround (we don't query for 1 anymore; we always ask for 2 and discard one of them) and no way to share our indexes with you. If I had the free cycles I would do something, but I don't.
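
Concretely, the workaround looks like this (filter trimmed for brevity): the client always asks for two hits and reads only hits.hits[0], discarding the second:

curl -XGET "http://localhost:9200/_all/meta/_search" -d'
{
  "sort": [ { "TimeUpdated": { "order": "asc", "ignore_unmapped": true } } ],
  "query": { "filtered": { "filter": { "term": { "LatestUpdate": true } } } },
  "fields": [ "Session", "TimeUpdated", "TimeStart" ],
  "from": 0,
  "size": 2
}'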

@s1monw (contributor) commented Oct 28, 2014

FYI - I think you are hitting #8226, which is now fixed in the 1.3 branch and above. I would love it if you could confirm that once it's released.
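
(For what it's worth, the version a node is running is reported on the root endpoint, so a quick check after upgrading would be:

curl -s http://localhost:9200/

and look for "version": {"number": "1.3.x", ...} in the response.)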
