version_conflict_engine_exception with bulk update #17165

Closed
atm028 opened this issue Mar 17, 2016 · 6 comments

atm028 commented Mar 17, 2016

Elasticsearch version:

"version" : {
    "number" : "2.1.1",
    "build_hash" : "40e2c53a6b6c2972b3d13846e450e66f4375bd71",
    "build_timestamp" : "2015-12-15T13:05:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  }

JVM version:

"jvm":{"pid":15324,"version":"1.7.0_07","vm_name":"Java HotSpot(TM) Client VM","vm_version":"23.3-b01","vm_vendor":"Oracle Corporation","start_time_in_millis":1458163388025,"mem":{"heap_init_in_bytes":268435456,"heap_max_in_bytes":1037959168,"non_heap_init_in_bytes":12746752,"non_heap_max_in_bytes":100663296,"direct_max_in_bytes":1037959168}

OS version:

"os":{"refresh_interval_in_millis":1000,"name":"Windows Server 2008 R2","arch":"x86","version":"6.1","available_processors":4,"allocated_processors":4},"process":{"refresh_interval_in_millis":1000,"id":15324,"mlockall":false},

Description of the problem including expected versus actual behavior:
I'm updating a document with two bulk requests. The first request contains three updates and the second contains just one.
The response to the first bulk request reports complete success, but the response to the second reports a version conflict.
The first request contains three updates to the same document:

16:27:34.325 {ElasticSearch} 
HTTP Path: /_bulk 
HTTP POST Request: {
....
{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state1":{"info": "some state info"}}}}

{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state2":{"info": "some state info"}}}}


{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state3":{"info": "some state info"}}}}
....

Then the second request, which contains just one update:

16:27:34.334 {ElasticSearch} 
HTTP Path: /_bulk 
HTTP POST Request: 
{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state4":{"info": "some state info"}}}}

Then the response to the first request, where all statuses are 200:

16:27:34.391 {ElasticSearch} Response from ElasticSearch localhost:9200: 
("took"=63,"errors"="false","items"=(

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=6,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200)),

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=7,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200)),

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=8,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200))))

And the response to the second request, with status 409:

16:27:34.391 {ElasticSearch} Response from ElasticSearch localhost:9200: 
("took"=25,"errors"="true","items"=(
...
"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","status"=409,"error"=("type"="version_conflict_engine_exception","reason"="[session][3]: version conflict, current [6], provided [5]","index"="session-2016.03.14","shard"="1"))),
....

Steps to reproduce:
There are no special steps to reproduce; I've observed it just once.

Additional info:

"gc_collectors":["Copy","MarkSweepCompact"],"memory_pools":["Code Cache","Eden Space","Survivor Space","Tenured Gen","Perm Gen"]},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s","queue_size":-1},"index":{"type":"fixed","min":4,"max":4,"queue_size":200},"fetch_shard_store":{"type":"scaling","min":1,"max":8,"keep_alive":"5m","queue_size":-1},"get":{"type":"fixed","min":4,"max":4,"queue_size":1000},"snapshot":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"force_merge":{"type":"fixed","min":1,"max":1,"queue_size":-1},"suggest":{"type":"fixed","min":4,"max":4,"queue_size":1000},"bulk":{"type":"fixed","min":4,"max":4,"queue_size":50},"warmer":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"flush":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"search":{"type":"fixed","min":7,"max":7,"queue_size":1000},"fetch_shard_started":{"type":"scaling","min":1,"max":8,"keep_alive":"5m","queue_size":-1},"listener":{"type":"fixed","min":2,"max":2,"queue_size":-1},"percolate":{"type":"fixed","min":4,"max":4,"queue_size":1000},"refresh":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m","queue_size":-1}},..."max_content_length_in_bytes":104857600},"plugins":[]}}}
@clintongormley

@atm028 Your second update request happened at the same time as another request, so between fetching the document, updating it, and reindexing it, another request made an update.

See the retry_on_conflict parameter in the docs: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/docs-update.html#_parameters_3
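
For a bulk update, the retry count is set on each action line. Below is a minimal Python sketch of building such a body. It assumes the 2.x bulk metadata key is `_retry_on_conflict` (later Elasticsearch versions spell it `retry_on_conflict`, so check the docs for the version you run). The helper name `build_bulk_updates` and the retry count of 3 are illustrative and do not come from this thread.

```python
import json


def build_bulk_updates(index, doc_type, doc_id, partial_docs, retries=3):
    """Return an NDJSON bulk body with one update action per partial doc.

    Assumption: the per-action retry key is `_retry_on_conflict`, as in the
    Elasticsearch 2.x bulk API; newer versions use `retry_on_conflict`.
    """
    lines = []
    for partial in partial_docs:
        action = {
            "update": {
                "_index": index,
                "_type": doc_type,
                "_id": doc_id,
                "_retry_on_conflict": retries,
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps({"doc": partial}))
    # The bulk endpoint expects a newline after the last line as well.
    return "\n".join(lines) + "\n"


body = build_bulk_updates(
    "session-2016.03.14",
    "session",
    "3",
    [{"states": {"state4": {"info": "some state info"}}}],
)
```

The resulting `body` string would be sent as the payload of a POST to `/_bulk`.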

atm028 commented Mar 18, 2016

@clintongormley But a single client and a single Elasticsearch node were used, and the client sent both requests over a single connection (HTTP 1.1 with keep-alive). Where does the other process come from? Or does it mean that each request is handled in its own thread, even on the same connection?

@clintongormley

If you send a request and wait for the response before sending the next request, then they will be executed serially. But I think you've sent more requests than you realise, e.g. looking at the error message:

version conflict, current [6], provided [5]

...you've made more than one update to that document
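
The error message can be reproduced with a toy model of the fetch, update, reindex cycle. The Python sketch below is not Elasticsearch's implementation; `VersionedStore`, `update`, and the `before_write` hook are hypothetical names used only to interleave two updates in a deterministic way. One update reads version 5, a competing update bumps the document to version 6, and the first write is then rejected.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Toy in-memory stand-in for one document with a version counter."""

    def __init__(self, doc, version):
        self.doc = dict(doc)
        self.version = version

    def get(self):
        return dict(self.doc), self.version

    def index(self, doc, expected_version):
        # The write succeeds only if nobody changed the document since the read.
        if expected_version != self.version:
            raise VersionConflict(
                "version conflict, current [%d], provided [%d]"
                % (self.version, expected_version)
            )
        self.doc = dict(doc)
        self.version += 1


def update(store, field, value, retry_on_conflict=0, before_write=None):
    """Fetch, merge, and write back; repeat the whole cycle on conflict."""
    attempts = retry_on_conflict + 1
    for attempt in range(attempts):
        doc, version = store.get()
        doc[field] = value
        if before_write is not None and attempt == 0:
            before_write()  # lets a competing update slip in before the write
        try:
            store.index(doc, version)
            return store.version
        except VersionConflict:
            if attempt == attempts - 1:
                raise


def competing():
    # Stands in for an update from the other bulk request.
    update(store, "state1", "info")


# No retries: the interleaved update causes a conflict.
store = VersionedStore({}, version=5)
try:
    update(store, "state4", "info", before_write=competing)
except VersionConflict as exc:
    message = str(exc)

# One retry: the second attempt re-reads the newer version and succeeds.
store = VersionedStore({}, version=5)
final_version = update(
    store, "state4", "info", retry_on_conflict=1, before_write=competing
)
```

With a retry, the second attempt starts from the document the competing update wrote, so both fields survive.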

atm028 commented Mar 18, 2016

That's true, the second update request was sent before the first one had completed. But if the requests are sent over a single connection, then the updates to the document should be applied sequentially, and the two responses sent to the client afterwards. Of course, that assumes they are handled in a single thread, since it is a single connection. At least in the code, the same thread context is used for dispatching requests, isn't it?

@clintongormley

No. Requests are handled asynchronously.
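
Since requests on one connection can still interleave, a client that sends bulk requests without waiting for earlier responses may want to scan each response for conflicts and resend those actions. The following is a rough Python sketch. It assumes the response body has been parsed into a dict in the usual bulk response shape (the log earlier in this issue shows a client-specific rendering of it), and `conflicted_items` is a hypothetical helper name.

```python
def conflicted_items(bulk_response):
    """Return (position, doc id) for each bulk item rejected with status 409."""
    conflicts = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key dict such as {"update": {...}}.
        for result in item.values():
            if result.get("status") == 409:
                conflicts.append((position, result.get("_id")))
    return conflicts


# Sample response modelled on the 409 item reported in this issue.
response = {
    "took": 25,
    "errors": True,
    "items": [
        {
            "update": {
                "_index": "session-2016.03.14",
                "_type": "session",
                "_id": "3",
                "status": 409,
                "error": {"type": "version_conflict_engine_exception"},
            }
        }
    ],
}
to_retry = conflicted_items(response)
```

The positions in `to_retry` index into the original list of actions, so the matching action and document lines can be resent.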

atm028 commented Mar 18, 2016

@clintongormley ok, thank you, now the reason is clear
