NPE in retry after a partitioned chunk #69

bflorat · 2016-01-03T14:45:54Z

(targets jberet 1.2.0.Final)

We have a partitioned chunk step (say, step2) that follows another partitioned chunk step, step1. Both use the same mapper, though I don't think that is relevant. The mapper runs with override = false.

When a step1 partition fails and the job is restarted, step1 finishes (its only remaining partition completes), but when step2 begins we get the NPE shown in [1].

In StepExecutionRunner.beginPartition(), the decision to re-read the step's previous context is made by: final boolean isRestartNotOverride = isRestart && !isOverride;
In our case, however, step2 has never been launched, so its previous context is null.

A workaround we use successfully (in the partition mapper):

StepExecutionImpl originalStepExec = ((StepContextImpl) stepContext)
                .getOriginalStepExecution();
plan.setPartitionsOverride(originalStepExec == null);
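The effect of that one-line workaround can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins written for this illustration (they are not jberet or JSR-352 types); only the rule itself, "override when there is no original step execution", comes from the snippet above.

```java
// Hypothetical stand-ins for illustration only; these are not jberet classes.
final class PartitionOverrideSketch {

    /** Minimal plan that holds only the override flag. */
    static final class Plan {
        private boolean partitionsOverride;

        void setPartitionsOverride(boolean override) {
            this.partitionsOverride = override;
        }

        boolean getPartitionsOverride() {
            return partitionsOverride;
        }
    }

    /**
     * Builds a plan the way the workaround does: override is enabled exactly
     * when there is no earlier execution of this step to resume from.
     * The parameter is typed as Object because only its null-ness matters here.
     */
    static Plan mapPartitions(Object originalStepExecution) {
        Plan plan = new Plan();
        plan.setPartitionsOverride(originalStepExecution == null);
        return plan;
    }

    public static void main(String[] args) {
        // step1 already ran before the restart: keep override = false
        System.out.println(mapPartitions("step1 execution").getPartitionsOverride());
        // step2 never started: force override = true so no old context is read
        System.out.println(mapPartitions(null).getPartitionsOverride());
    }
}
```

With override forced to true for a step that never ran, the isRestart && !isOverride condition is false, so the runner does not try to read a previous context that does not exist.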

Thanks, KUTGW
[1]

15:32:30,049 WARN  [org.jberet] (Batch Thread - 3) JBERET000018: Could not find the original step execution to restart.  Current step execution id: 0, step name: ajouter_metadonnees
15:32:30,051 ERROR [org.jberet] (Batch Thread - 3) JBERET000007: Failed to run job collecterAtlas, ajouter_metadonnees, org.jberet.job.model.Step@164a6d81: java.lang.NullPointerException
    at org.jberet.runtime.runner.StepExecutionRunner.beginPartition(StepExecutionRunner.java:256) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.runBatchletOrChunk(StepExecutionRunner.java:216) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:140) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runJobElement(CompositeExecutionRunner.java:128) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:196) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runJobElement(CompositeExecutionRunner.java:128) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:196) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runFromHeadOrRestartPoint(CompositeExecutionRunner.java:88) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.JobExecutionRunner.run(JobExecutionRunner.java:59) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.wildfly.jberet.services.BatchEnvironmentService$WildFlyBatchEnvironment$1.run(BatchEnvironmentService.java:193) [eap6-jberet-1.0.3-SNAPSHOT.jar:1.0.3-SNAPSHOT]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_80]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_80]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_80]
    at org.jboss.threads.JBossThread.run(JBossThread.java:122)

chengfang · 2016-01-05T16:41:53Z

We currently determine isRestart via isRestart = jobContext.isRestart();. Since this check applies to the step, we should instead check whether the current step has been executed before:

isRestart = originalStepExecution != null;
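To make the difference concrete, here is a simplified model of the two checks. It is a sketch, not the jberet source: the class and method names are invented for illustration, and the original step execution is represented as a plain Object because only whether it is null matters.

```java
// Simplified model of the restart decision; names are hypothetical, not jberet code.
final class RestartCheckSketch {

    /** Job-level check: true for every step of a restarted job, even ones that never ran. */
    static boolean readsOriginalContextJobLevel(boolean jobIsRestart, boolean isOverride) {
        return jobIsRestart && !isOverride;
    }

    /** Step-level check: true only if this particular step has an earlier execution. */
    static boolean readsOriginalContextStepLevel(Object originalStepExecution, boolean isOverride) {
        boolean isRestart = originalStepExecution != null;
        return isRestart && !isOverride;
    }

    public static void main(String[] args) {
        boolean isOverride = false;          // mapper in override = false mode
        Object step2Original = null;         // step2 never ran before the restart

        // Job-level check says "read the previous context" although there is none.
        System.out.println(readsOriginalContextJobLevel(true, isOverride));

        // Step-level check correctly treats step2 as a fresh start.
        System.out.println(readsOriginalContextStepLevel(step2Original, isOverride));
    }
}
```

In this model the job-level check yields true for step2 and the step-level check yields false, which matches the scenario in the report: the job is a restart, but step2 itself is running for the first time.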

bflorat · 2016-01-06T16:35:32Z

Thanks for the fix.
Do you plan a 1.2.1 maintenance release that includes it, or will it only be available in the 1.3.x branch?

(The workaround we provided works, but in some mappers it requires many extra database accesses to get the previous start status, which I guess will hurt performance.)

chengfang · 2016-01-06T16:42:52Z

Thanks for reporting this issue. The fix will be in the next release (1.3.0.Beta1).

chengfang · 2016-06-02T23:32:20Z

Created a shadow JIRA issue for EAP 7.0.2, which will include jberet-core 1.2.1.

https://issues.jboss.org/browse/JBEAP-4847

chengfang self-assigned this Jan 3, 2016

chengfang added a commit that referenced this issue Jan 5, 2016

add tests to reproduce issues in github issue #69: NPE in retry after a partitioned chunk.

eb67fb7

chengfang added a commit that referenced this issue Jan 6, 2016

github issue #69: NPE in retry after a partitioned chunk (check if the current step is restart instead of checking if the current job execution is restart).

8eb81dd

chengfang added a commit that referenced this issue Jan 6, 2016

add tests related to github issue #69: NPE in retry after a partitioned chunk; these tests verify partition mapper override true and restart.

a1817a5

chengfang closed this as completed Jan 6, 2016

spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016

add tests to reproduce issues in github issue jberet#69: NPE in retry after a partitioned chunk.

53d56af

spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016

github issue jberet#69: NPE in retry after a partitioned chunk (check if the current step is restart instead of checking if the current job execution is restart).

21a2b9c

spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016

add tests related to github issue jberet#69: NPE in retry after a partitioned chunk; these tests verify partition mapper override true and restart.

95b0aaa

spyrkob mentioned this issue Jun 3, 2016

[JBEAP-4847] NPE in retry after a partitioned chunk #74

Merged
