Error executing tasks defined via index filter when the task alters the result of the search #1232

Closed
luis100 opened this issue May 17, 2018 · 0 comments

luis100 commented May 17, 2018

Example when removing an AIP recursively (internal roda-impl/roda-dglab#60):

2018-05-07 16:52:02,162 [http-nio-8080-exec-13] INFO  o.r.c.p.o.AkkaEmbeddedPluginOrchestrator - Success adding job 'Delete AIP' (d9800ed6-292a-4a29-88b4-98440ce5bd08) to be executed
2018-05-07 16:52:02,165 [JobsSystem-io-1-dispatcher-6434] INFO  o.r.c.p.o.AkkaEmbeddedPluginOrchestrator - Starting Delete RODA entities (which will be done asynchronously)
2018-05-07 16:52:06,088 [JobsSystem-io-2-dispatcher-6487] WARN  o.r.c.i.utils.IterableIndexResult - Error iterating through index result
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:653)
        at java.util.ArrayList.get(ArrayList.java:429)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:51)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:42)
        at java.lang.Iterable.forEach(Iterable.java:74)
        at org.roda.core.index.utils.SolrUtils.execute(SolrUtils.java:2941)
        at org.roda.core.index.IndexService.execute(IndexService.java:533)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.processAIP(DeleteRODAObjectPlugin.java:144)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.access$000(DeleteRODAObjectPlugin.java:62)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin$1.process(DeleteRODAObjectPlugin.java:123)
        at org.roda.core.plugins.plugins.PluginHelper.processObjects(PluginHelper.java:170)
        at org.roda.core.plugins.plugins.PluginHelper.processObjects(PluginHelper.java:209)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.execute(DeleteRODAObjectPlugin.java:118)
        at org.roda.core.plugins.orchestrate.akka.AkkaWorkerActor.handlePluginExecuteIsReady(AkkaWorkerActor.java:54)
        at org.roda.core.plugins.orchestrate.akka.AkkaWorkerActor.onReceive(AkkaWorkerActor.java:39)
        at akka.actor.UntypedAbstractActor$$anonfun$receive$1.applyOrElse(AbstractActor.scala:258)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:147)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590)
        at akka.actor.ActorCell.invoke(ActorCell.scala:559)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
2018-05-07 16:52:06,088 [JobsSystem-io-2-dispatcher-6487] ERROR o.r.c.plugins.plugins.PluginHelper - Unexpected exception during 'perObjectLogic' execution
java.util.NoSuchElementException: Index: 0, Size: 0
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:68)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:42)
        at java.lang.Iterable.forEach(Iterable.java:74)
        at org.roda.core.index.utils.SolrUtils.execute(SolrUtils.java:2941)
        at org.roda.core.index.IndexService.execute(IndexService.java:533)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.processAIP(DeleteRODAObjectPlugin.java:144)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.access$000(DeleteRODAObjectPlugin.java:62)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin$1.process(DeleteRODAObjectPlugin.java:123)
        at org.roda.core.plugins.plugins.PluginHelper.processObjects(PluginHelper.java:170)
        at org.roda.core.plugins.plugins.PluginHelper.processObjects(PluginHelper.java:209)
        at org.roda.core.plugins.plugins.internal.DeleteRODAObjectPlugin.execute(DeleteRODAObjectPlugin.java:118)
        at org.roda.core.plugins.orchestrate.akka.AkkaWorkerActor.handlePluginExecuteIsReady(AkkaWorkerActor.java:54)
        at org.roda.core.plugins.orchestrate.akka.AkkaWorkerActor.onReceive(AkkaWorkerActor.java:39)
        at akka.actor.UntypedAbstractActor$$anonfun$receive$1.applyOrElse(AbstractActor.scala:258)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:147)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590)
        at akka.actor.ActorCell.invoke(ActorCell.scala:559)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

Example when executing a PREMIS generation job with a filter for files that do not have the hash field defined (from the logs):

2018-05-11 21:51:29,605 [JobsSystem-io-1-dispatcher-1358] WARN  o.r.c.i.utils.IterableIndexResult - Error iterating through index result
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:51)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:42)
        at org.roda.core.plugins.orchestrate.AkkaEmbeddedPluginOrchestrator.runPluginFromIndex(AkkaEmbeddedPluginOrchestrator.java:191)
        at org.roda.core.plugins.orchestrate.akka.AkkaJobActor.runFromFilter(AkkaJobActor.java:128)
        at org.roda.core.plugins.orchestrate.akka.AkkaJobActor.onReceive(AkkaJobActor.java:83)
        at akka.actor.UntypedAbstractActor$$anonfun$receive$1.applyOrElse(AbstractActor.scala:258)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:147)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590)
        at akka.actor.ActorCell.invoke(ActorCell.scala:559)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2018-05-11 21:51:29,605 [JobsSystem-io-1-dispatcher-1358] ERROR o.r.c.p.o.AkkaEmbeddedPluginOrchestrator - Error running plugin from index
java.util.NoSuchElementException: Index: 0, Size: 0
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:68)
        at org.roda.core.index.utils.IterableIndexResult$IteratorIndexResult.next(IterableIndexResult.java:42)
        at org.roda.core.plugins.orchestrate.AkkaEmbeddedPluginOrchestrator.runPluginFromIndex(AkkaEmbeddedPluginOrchestrator.java:191)
        at org.roda.core.plugins.orchestrate.akka.AkkaJobActor.runFromFilter(AkkaJobActor.java:128)
        at org.roda.core.plugins.orchestrate.akka.AkkaJobActor.onReceive(AkkaJobActor.java:83)
        at akka.actor.UntypedAbstractActor$$anonfun$receive$1.applyOrElse(AbstractActor.scala:258)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:147)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590)
        at akka.actor.ActorCell.invoke(ActorCell.scala:559)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

While the job iterates over the search results, the task it executes removes hits from those results. Specifically, the job used a search filter for files where the hash field was not present, and the PREMIS generation plugin it ran was filling in that field.

The IterableIndexResult iterates through index results in pages. Because the task modifies the search hits so that they no longer match the filter, the iteration skips whole pages (apart from hits that are still returned because of soft commit). The total count is not updated, so the real end of the results is reached much sooner than expected; the next page then comes back as an empty list, and reading from it throws the error shown above.
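The skipping effect can be reproduced outside RODA with a minimal sketch. Everything here is hypothetical and only models the behaviour described above: `findPage` stands in for an offset-based index query, the `#hash` suffix stands in for the hash field being filled in, and the empty-page guard marks the point where, judging by the stack traces, the real iterator reads index 0 of an empty list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ShrinkingPageDemo {

  /** Hypothetical stand-in for an index query: the page of matching items starting at offset. */
  static List<String> findPage(List<String> index, Predicate<String> filter, int offset, int pageSize) {
    List<String> matches = new ArrayList<>();
    for (String item : index) {
      if (filter.test(item)) {
        matches.add(item);
      }
    }
    int from = Math.min(offset, matches.size());
    int to = Math.min(from + pageSize, matches.size());
    return new ArrayList<>(matches.subList(from, to));
  }

  /** Pages forward by offset while each processed hit stops matching; returns the number processed. */
  static int processWithOffsetPaging(List<String> index, int pageSize) {
    Predicate<String> missingHash = s -> !s.endsWith("#hash");
    // Total is counted once up front and never refreshed.
    int total = findPage(index, missingHash, 0, index.size()).size();
    int processed = 0;
    for (int offset = 0; offset < total; offset += pageSize) {
      List<String> page = findPage(index, missingHash, offset, pageSize);
      if (page.isEmpty()) {
        // The result set ran out before the stale total was reached.
        break;
      }
      for (String hit : page) {
        // The task "fills in the hash", so the hit drops out of the filter.
        index.set(index.indexOf(hit), hit + "#hash");
        processed++;
      }
    }
    return processed;
  }
}
```

With ten matching items and a page size of two, this processes only six of them: each processed page shifts the remaining hits forward, so every other page is skipped and the last requested page is empty.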

We could iterate in reverse order (from the end to the beginning) to cope better with removal operations, but this would still be vulnerable to changes in the index.
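A rough sketch of the reverse-order idea, under the assumption that results are paged by offset and that the task only removes hits it has just processed. The class, the `matching` helper and the `#hash` marker are hypothetical and are not RODA code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ReversePagingDemo {

  /** Hypothetical stand-in for running the filter against the index. */
  static List<String> matching(List<String> index, Predicate<String> filter) {
    List<String> matches = new ArrayList<>();
    for (String item : index) {
      if (filter.test(item)) {
        matches.add(item);
      }
    }
    return matches;
  }

  /** Walks the pages from last to first, so removals never shift pages not yet visited. */
  static int processBackwards(List<String> index, int pageSize) {
    Predicate<String> missingHash = s -> !s.endsWith("#hash");
    int total = matching(index, missingHash).size();
    int lastPageStart = total == 0 ? 0 : ((total - 1) / pageSize) * pageSize;
    int processed = 0;
    for (int offset = lastPageStart; offset >= 0; offset -= pageSize) {
      List<String> current = matching(index, missingHash);
      int from = Math.min(offset, current.size());
      int to = Math.min(from + pageSize, current.size());
      List<String> page = new ArrayList<>(current.subList(from, to));
      for (String hit : page) {
        // Processing removes the hit from the result set, but only at or after this offset.
        index.set(index.indexOf(hit), hit + "#hash");
        processed++;
      }
    }
    return processed;
  }
}
```

In this model every item is visited because only the tail of the result set shrinks. It offers no protection if something else adds or removes hits ahead of the current offset, which is the remaining vulnerability mentioned above.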

We could also use the export result handler to compute the complete list of action targets before executing, but this would require changes to the index (see github.com/keeps/roda#1222).
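The "compute the targets first" idea can be sketched as follows. This is only an illustration under simplifying assumptions: the index is a plain in-memory list, the `#hash` suffix stands in for the hash field, and the export result handler itself is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SnapshotTargetsDemo {

  /** Resolves every target before running the task, then works from that fixed list. */
  static int processFromSnapshot(List<String> index) {
    Predicate<String> missingHash = s -> !s.endsWith("#hash");

    // Step 1: take a snapshot of all matching items before anything is modified.
    List<String> targets = new ArrayList<>();
    for (String item : index) {
      if (missingHash.test(item)) {
        targets.add(item);
      }
    }

    // Step 2: run the task over the snapshot; changes to the index no longer affect iteration.
    int processed = 0;
    for (String target : targets) {
      index.set(index.indexOf(target), target + "#hash");
      processed++;
    }
    return processed;
  }
}
```

The trade-off is that the whole target list has to be materialised up front, which may be costly for large result sets.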

@luis100 luis100 added the bug label May 17, 2018
@luis100 luis100 added this to the 2.2.4 milestone May 17, 2018
nunovieira220 added a commit that referenced this issue May 18, 2018
…ask alters the result of the search issue
luis100 pushed a commit that referenced this issue May 18, 2018