Elastic search doesn't seem to work properly with "scroll" (BNPP Hackathon issue) #585

Closed
bzdyelnik opened this issue Jun 7, 2017 · 20 comments

@bzdyelnik

The query https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse/q=_index:20170531-declared-products&scroll=1m&size=10000 successfully returns a valid response containing a _scroll_id. However, passing that _scroll_id value in a follow-up query fails (here <_scroll_id_value> stands for the _scroll_id value from the first query):

https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse/q=_index:20170531-declared-products&scroll=1m&size=10000&scroll_id=<_scroll_id_value>

The error message is:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/20170531-declared-products,20170531-revenues,20170531-taxes,20170531-markets,20170531-charges,20170531-contracts,20170531-transactions,20170531-individual-clients,20170531-assets,20170531-sme-transactions,20170531-sme-contracts,20170531-sme-client/_search/scroll] contains unrecognized parameters: [scroll], [scroll_id]"}],"type":"illegal_argument_exception","reason":"request [/20170531-declared-products,20170531-revenues,20170531-taxes,20170531-markets,20170531-charges,20170531-contracts,20170531-transactions,20170531-individual-clients,20170531-assets,20170531-sme-transactions,20170531-sme-contracts,20170531-sme-client/_search/scroll] contains unrecognized parameters: [scroll], [scroll_id]"},"status":400}

@simonredfern
Member

We're looking into this.

@bzdyelnik
Author

Sorry to be impatient - are there any workarounds?

@bzdyelnik
Author

Again, sorry to bug you guys - is there any progress on this issue? The Hackathon begins today.

@simonredfern
Member

Hi - We're just deploying a new version. We'll update you within the next hour.

@constantine2nd
Collaborator

We have been preparing a new version that takes a slightly different approach. It will work like this:

POST URL: https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse
POST JSON:
{
  "es_uri_part": "/sports/athlete/_search",
  "es_body_part": {
    "size": 0,
    "aggregations": {
      "baseball_player_ring": {
        "geo_distance": {
          "field": "location",
          "origin": "46.12,-68.55",
          "unit": "mi",
          "ranges": [
            {
              "from": 0,
              "to": 20
            }
          ]
        }
      }
    }
  }
} 

is translated to:

curl -XPOST "http://OUR_ES_SERVER/sports/athlete/_search" -d'
{
   "size": 0,
   "aggregations": {
      "baseball_player_ring": {
         "geo_distance": {
            "field": "location",
            "origin": "46.12,-68.55",
            "unit": "mi",
            "ranges": [
               {
                  "from": 0,
                  "to": 20
               }
            ]
         }
      }
   }
}'
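
To make the mapping above concrete, here is a minimal Python sketch of how such a payload could be turned into the Elasticsearch request it describes. This is not the OBP-API implementation: the function name `translate` is hypothetical, and `http://OUR_ES_SERVER` is simply the placeholder from the curl example.

```python
import json
from typing import Tuple


def translate(payload: dict, es_server: str) -> Tuple[str, str]:
    """Map a warehouse POST body onto the Elasticsearch request it describes."""
    # es_uri_part is appended to the server address to form the target URL.
    url = es_server.rstrip("/") + payload["es_uri_part"]
    # es_body_part is forwarded as the JSON request body.
    body = json.dumps(payload["es_body_part"])
    return url, body


payload = {
    "es_uri_part": "/sports/athlete/_search",
    "es_body_part": {"size": 0},
}
url, body = translate(payload, "http://OUR_ES_SERVER")
print(url)
print(body)
```

The URL printed here corresponds to the target of the curl command above, and the body to its `-d` argument (abbreviated to `size` only).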

@bzdyelnik
Author

Good news! Looking forward to it.

Will this be valid to get the first set of results and scroll_id?

POST URL: https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse
POST JSON:
{
  "es_uri_part": "/20170531-sme-client",
  "es_body_part": {
    "size": "10000",
    "scroll": "1m"
  }
}

And then will this be valid to get the subsequent results?

POST URL: https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse
POST JSON:
{
  "es_uri_part": "/20170531-sme-client",
  "es_body_part": {
    "size": "10000",
    "scroll": "1m",
    "scroll_id": the_scroll_id_from_the_prior_query"
  }
}

@bzdyelnik
Author

Pardon the missing " character before the_scroll_id_from_the_prior_query

@simonredfern
Member

We're just testing that now. It's live, so you can check. Here's an example for v3.0.0: https://github.com/OpenBankProject/OBP-API/wiki/BNP-Paribas-IRB-OBP-API-Sandbox#individuals-using-obp-v300

@constantine2nd
Collaborator

This will work for the first part because you used the URI Search approach:

{
  "es_uri_part": "/_search?q=_index:20170531-declared-products&scroll=1m&size=10000",
  "es_body_part": {}
}

@constantine2nd
Collaborator

This will work for the second part because you used the URI Search approach:

{
  "es_uri_part": "/_search?q=_index:20170531-declared-products&scroll=1m&size=10000&scroll_id=the_scroll_id_from_the_prior_query",
  "es_body_part": {}
}

@bzdyelnik
Author

It works, except it seems like the scroll_id returned from each query is too long for each subsequent query.

Here's the error response:

{"$outer":{},"json":{"error":{"root_cause":[{"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."}],"type":"too_long_frame_exception","reason":"An HTTP line is larger than 4096 bytes."},"status":400},"headers":[{"_1":"Access-Control-Allow-Origin","_2":"*"}],"cookies":[],"code":400}

And here's my query info (yeah, that scroll_id is really long):

https://bnpparibas-api.openbankproject.com/obp/v3.0.0/search/warehouse {"es_uri_part":"/_search?q=_index:20170531-sme-transactions&scroll=1m&size=10000&scroll_id=DnF1ZXJ5VGhlbkZldGNoeAAAAAAAAJZGFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWARZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAli0WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYUFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWAhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlgMWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYEFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWBRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlgYWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYHFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWCBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlgkWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYKFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWCxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlgwWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYNFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWDhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlg8WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYQFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWERZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlhIWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYTFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWFRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlhYWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYXFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWGBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlhkWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYaFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWGxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlhwWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYdFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWHhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlh8WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYgFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWIRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAliIWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYjFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWJBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAliUWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYmFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWJxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAligWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYpFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWKhZxbmxOYU90NFNZV1lDeUhvMXJFaV
Z3AAAAAAAAlisWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYsFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWLhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAli8WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYwFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWMRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAljIWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJYzFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWNBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAljUWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJY2FnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWNxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAljgWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJY5FnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWOhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAljsWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJY8FnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWPRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlj4WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJY_FnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWQBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlkEWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZCFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWQxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlkQWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZFFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWRxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlkgWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZJFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWShZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlksWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZMFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWTRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlk4WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZPFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWUBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAllEWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZSFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWUxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAllQWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZVFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWVhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAllcWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZYFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWWRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlloWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZbFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWXBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAll0WcW
5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZeFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWXxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlmAWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZhFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWYhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlmMWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZkFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWZRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlmYWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZnFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWaBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlmkWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZqFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWaxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlmwWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZtFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWbhZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlm8WcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZwFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWcRZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlnIWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZzFnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWdBZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlnUWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdwAAAAAAAJZ2FnFubE5hT3Q0U1lXWUN5SG8xckVpVncAAAAAAACWdxZxbmxOYU90NFNZV1lDeUhvMXJFaVZ3AAAAAAAAlngWcW5sTmFPdDRTWVdZQ3lIbzFyRWlWdw==","es_body_part":{}} DirectLogin token="eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyIiOiIifQ.5R0dP42k_h0jl1cv4YFtR-SwXwHKGxOmSJsyhqdhpOg"
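
The error above suggests that the scroll_id is too large to travel in the request line. A rough way to check a URI before sending it is sketched below in Python. It assumes the 4096-byte limit quoted in the error applies to the whole request line and that the request is forwarded as a POST over HTTP/1.1; both are assumptions, and `request_line_too_long` is a hypothetical helper, not part of any library.

```python
MAX_HTTP_LINE_BYTES = 4096  # taken from the error message; the server's real setting may differ


def request_line_too_long(es_uri_part, method="POST", limit=MAX_HTTP_LINE_BYTES):
    """Return True if the request line built from this URI would exceed the limit."""
    # Assumed shape of an HTTP/1.1 request line: METHOD SP URI SP VERSION
    request_line = f"{method} {es_uri_part} HTTP/1.1"
    return len(request_line.encode("utf-8")) > limit


# A made-up scroll_id of 5000 characters stands in for the real one.
long_uri = "/_search?scroll=1m&scroll_id=" + "A" * 5000
short_uri = "/_search/scroll"

print(request_line_too_long(long_uri))   # True
print(request_line_too_long(short_uri))  # False
```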

@constantine2nd
Collaborator

constantine2nd commented Jun 9, 2017

Please try this JSON

{
  "es_uri_part": "/_search?q=_index:20170531-declared-products&scroll=10m&size=100",
  "es_body_part": {}
}

Then, once you have a scroll_id, send this one, which passes the scroll_id in the request body (the Request Body Search method) instead of the URI:

{
  "es_uri_part": "/_search/scroll",
  "es_body_part": {
    "scroll_id": "OBTAINED_SCROLL_ID_VALUE"
  }
}
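
For anyone scripting this, the two payloads above could be built roughly as follows. This is only a sketch mirroring the JSON in this comment; the helper names `first_page_payload` and `next_page_payload` are hypothetical. Later comments in this thread report that the follow-up body also needed a `scroll` value to keep working beyond a few requests.

```python
from urllib.parse import urlencode


def first_page_payload(index, keep_alive="10m", size=100):
    """Build the initial URI Search payload that opens the scroll."""
    # safe=":" keeps the colon in "_index:<name>" unescaped, as in the example above.
    query = urlencode(
        {"q": f"_index:{index}", "scroll": keep_alive, "size": size}, safe=":"
    )
    return {"es_uri_part": f"/_search?{query}", "es_body_part": {}}


def next_page_payload(scroll_id):
    """Build the follow-up payload, carrying the scroll_id in the body."""
    return {"es_uri_part": "/_search/scroll", "es_body_part": {"scroll_id": scroll_id}}


print(first_page_payload("20170531-declared-products"))
print(next_page_payload("OBTAINED_SCROLL_ID_VALUE"))
```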

@bzdyelnik
Author

It works! The only problem is that after about 5 scroll requests (with any specified size), I get the following:

{"$outer":{},"json":{"took":4,"timed_out":false,"_shards":{"total":120,"successful":115,"failed":5,"failures":[{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [115094]"}},{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [115118]"}},{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [115142]"}},{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [115168]"}},{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [115192]"}}]},"hits":{"total":0,"max_score":null,"hits":[]}},"headers":[{"_1":"Access-Control-Allow-Origin","_2":"*"}],"cookies":[],"code":200}
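
Note that the response above carries "code":200 even though five shards failed and no hits came back, so a client cannot rely on the status code alone. A small Python sketch of a check follows, assuming the response shape shown above (the Elasticsearch body nested under a `json` key); `missing_scroll_contexts` is a hypothetical helper name.

```python
def missing_scroll_contexts(response):
    """Return the reasons of all shard failures caused by a lost search context."""
    # Accept either the wrapped response ({"json": {...}}) or the bare Elasticsearch body.
    es = response.get("json", response)
    failures = es.get("_shards", {}).get("failures", [])
    return [
        failure["reason"]["reason"]
        for failure in failures
        if failure.get("reason", {}).get("type") == "search_context_missing_exception"
    ]


# Abbreviated version of the response above, keeping a single failure.
sample = {
    "json": {
        "_shards": {
            "total": 120,
            "successful": 115,
            "failed": 5,
            "failures": [
                {
                    "shard": -1,
                    "index": None,
                    "reason": {
                        "type": "search_context_missing_exception",
                        "reason": "No search context found for id [115094]",
                    },
                }
            ],
        },
        "hits": {"total": 0, "max_score": None, "hits": []},
    },
    "code": 200,
}

print(missing_scroll_contexts(sample))
```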

@sebtesobe
Contributor

sebtesobe commented Jun 10, 2017

@bzdyelnik Maybe your search context has been invalidated because it was configured to just last for 1m? Have you tried a value like 10m for the scroll parameter?

@bzdyelnik
Author

I just tried that - unfortunately it still fails at the same point - every 6th request with the scroll_id fails.

I'm not sure if "timed_out":false in the error message confirms this as well, but maybe.

@sebtesobe
Contributor

@bzdyelnik Have you included the scroll parameter in subsequent scroll requests, alongside scroll_id?

@bzdyelnik
Author

@sebtesobe I haven't, as it wasn't in the example above - should I do that? I'll try.

@bzdyelnik
Author

Yes, that worked! Thanks!
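
Putting the thread's outcome together, a scroll loop might look like the Python sketch below. It sends `scroll` alongside `scroll_id` on every follow-up request, as suggested above. The HTTP call is left to a caller-supplied `post` function, and the response shape (Elasticsearch body under a `json` key, with `_scroll_id` inside it) is an assumption based on the responses quoted earlier; `scroll_all` and `fake_post` are hypothetical names.

```python
def scroll_all(post, index, keep_alive="10m", size=100):
    """Yield batches of hits until the scroll returns an empty page."""
    # First request: URI Search opens the scroll.
    payload = {
        "es_uri_part": f"/_search?q=_index:{index}&scroll={keep_alive}&size={size}",
        "es_body_part": {},
    }
    while True:
        es = post(payload)["json"]
        hits = es["hits"]["hits"]
        if not hits:
            break
        yield hits
        # Follow-up requests: pass scroll AND scroll_id in the body,
        # always using the scroll_id from the most recent response.
        payload = {
            "es_uri_part": "/_search/scroll",
            "es_body_part": {"scroll": keep_alive, "scroll_id": es["_scroll_id"]},
        }


# Stand-in for the real authenticated POST to /obp/v3.0.0/search/warehouse.
pages = [[{"_id": "a"}, {"_id": "b"}], [{"_id": "c"}], []]
sent = []


def fake_post(payload):
    sent.append(payload)
    page = pages[len(sent) - 1]
    return {"json": {"_scroll_id": "SCROLL_ID", "hits": {"hits": page}}, "code": 200}


batches = list(scroll_all(fake_post, "20170531-sme-client"))
print(len(batches), "batches,", len(sent), "requests sent")
```

In real use, `post` would wrap the authenticated request to the warehouse endpoint; injecting it keeps the loop itself easy to test offline.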

@sebtesobe
Contributor

@bzdyelnik Do you reckon we can close this issue?

@sebtesobe
Contributor

shota also reported that it now works for him.
