VersionConflictEngineException with script update in cluster #13619

Closed
falkorichter opened this issue Sep 16, 2015 · 2 comments
Comments

@falkorichter

We're having problems with VersionConflictEngineExceptions all the time. Each update runs as a script that increments a numeric value (see the sample document below).

We're running a cluster of two Elasticsearch instances, and I can only imagine that synchronization between the nodes is causing the version conflict on one of them. I would not expect an update to throw this kind of exception in a cluster, since each update is atomic. We are running four application servers that execute this code, and the exceptions are thrown randomly on all of them.

stacktrace:

Caused by: org.elasticsearch.index.engine.VersionConflictEngineException: [kpi][4] [opportunity][1442415600000]: version conflict, current [5933], provided [5932]
        at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:582) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:522) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:425) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:193) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.doStart(TransportShardReplicationOperationAction.java:426) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.start(TransportShardReplicationOperationAction.java:342) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction.doExecute(TransportShardReplicationOperationAction.java:97) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.index.TransportIndexAction.innerExecute(TransportIndexAction.java:134) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:112) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:60) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:217) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:170) [elasticsearch-1.4.4.jar:]
        at org.elasticsearch.action.support.single.instance.TransportInstanceSingleOperationAction$AsyncSingleAction$1.run(TransportInstanceSingleOperationAction.java:187) [elasticsearch-1.4.4.jar:]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_20]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_20]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_20]

sample document:

{
  "_index": "kpi",
  "_type": "opportunity",
  "_id": "1442412000000",
  "_version": 14742,
  "found": true,
  "_source": {
    "timestamp": "2015-09-16T14:00:00.249+0000",
    "own": 224,
    "shared": 2,
    "network": 3941,
    "unknown": 10575
  }
}

each update script looks like one of the following lines (only one increment per script); a sketch of the full request is shown after the script lines:

ctx._source.own+=1;
ctx._source.shared+=1;
ctx._source.network+=1;
ctx._source.unknown+=1;
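
For reference, each counter update is sent roughly like this (a sketch of the equivalent REST update call; the host, port, and document id are placeholders, and inline scripts like this require dynamic scripting to be enabled):

curl -XPOST 'http://localhost:9200/kpi/opportunity/1442415600000/_update' -d '
{
  "script": "ctx._source.own += 1"
}'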
@jasontedor
Member

Can you confirm that you are not setting the retry_on_conflict parameter? This parameter is zero by default and is designed exactly for use cases like yours, where the order of updates (say, incrementing a counter) isn't important.

If that is the case, this behavior is expected when you have multiple writers attempting to update the same document. You can address it by using the retry_on_conflict parameter to retry when a version conflict occurs. You can read more about this in the documentation on partial updates, including the specific section on conflicts.
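
For example, something along these lines should work (a sketch; the retry count of 5, the host, and the document id are placeholders):

curl -XPOST 'http://localhost:9200/kpi/opportunity/1442415600000/_update?retry_on_conflict=5' -d '
{
  "script": "ctx._source.own += 1"
}'

With the Java client, the equivalent is setRetryOnConflict(int) on the UpdateRequestBuilder.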

@falkorichter
Author

I can confirm I'm not setting the retry_on_conflict parameter, but it sounds exactly like the parameter I want to use. I deployed with the parameter set and the exceptions seem to be gone.
