NPE in retry after a partitioned chunk #69

Closed
bflorat opened this issue Jan 3, 2016 · 4 comments
bflorat commented Jan 3, 2016

(targets jberet 1.2.0.final)

We have a partitioned chunk step (call it step2) that follows another partitioned chunk step, step1. Both use the same mapper, though I don't think this is relevant. The mapper runs in override = false mode.

When a step1 partition fails and the job is restarted, step1 finishes (its only remaining partition completes), but when step2 begins we get the NPE shown in [1].

In StepExecutionRunner.beginPartition(), the decision to reload the step's previous context is made by: final boolean isRestartNotOverride = isRestart && !isOverride;
but in our case step2 has no previous context (it is null), because the step has never been launched.

A workaround that we use successfully (in the partition mapper):

StepExecutionImpl originalStepExec = ((StepContextImpl) stepContext)
                .getOriginalStepExecution();
plan.setPartitionsOverride(originalStepExec == null);

Thanks, KUTGW
[1]

15:32:30,049 WARN  [org.jberet] (Batch Thread - 3) JBERET000018: Could not find the original step execution to restart.  Current step execution id: 0, step name: ajouter_metadonnees
15:32:30,051 ERROR [org.jberet] (Batch Thread - 3) JBERET000007: Failed to run job collecterAtlas, ajouter_metadonnees, org.jberet.job.model.Step@164a6d81: java.lang.NullPointerException
    at org.jberet.runtime.runner.StepExecutionRunner.beginPartition(StepExecutionRunner.java:256) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.runBatchletOrChunk(StepExecutionRunner.java:216) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:140) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runJobElement(CompositeExecutionRunner.java:128) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:196) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runJobElement(CompositeExecutionRunner.java:128) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.StepExecutionRunner.run(StepExecutionRunner.java:196) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runStep(CompositeExecutionRunner.java:164) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.CompositeExecutionRunner.runFromHeadOrRestartPoint(CompositeExecutionRunner.java:88) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.jberet.runtime.runner.JobExecutionRunner.run(JobExecutionRunner.java:59) [jberet-core-1.2.0.Final.jar:1.2.0.Final]
    at org.wildfly.jberet.services.BatchEnvironmentService$WildFlyBatchEnvironment$1.run(BatchEnvironmentService.java:193) [eap6-jberet-1.0.3-SNAPSHOT.jar:1.0.3-SNAPSHOT]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_80]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_80]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_80]
    at org.jboss.threads.JBossThread.run(JBossThread.java:122)
@chengfang chengfang self-assigned this Jan 3, 2016
chengfang added a commit that referenced this issue Jan 5, 2016
@chengfang
Contributor

We currently determine isRestart via isRestart = jobContext.isRestart();. Since this decision concerns the step, we should instead check whether the current step has been executed before:

isRestart = originalStepExecution != null;
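
To make the difference between the two checks concrete, here is a minimal, self-contained sketch. It is not JBeret code: the class RestartDecision, the PreviousStepExecution type, and all method names are hypothetical stand-ins, and the sketch only assumes that the runner reads saved partition state from the original step execution whenever isRestart && !isOverride holds.

```java
public class RestartDecision {

    /** Hypothetical stand-in for an earlier execution of a step and its saved partition state. */
    static final class PreviousStepExecution {
        final int savedPartitionCount;

        PreviousStepExecution(int savedPartitionCount) {
            this.savedPartitionCount = savedPartitionCount;
        }
    }

    /** Job-level check: true for every step of a restarted job, even one that never ran. */
    static boolean reloadUsingJobFlag(boolean jobIsRestart, boolean isOverride) {
        return jobIsRestart && !isOverride;
    }

    /** Step-level check: true only if this particular step has an earlier execution. */
    static boolean reloadUsingStepFlag(PreviousStepExecution original, boolean isOverride) {
        boolean isRestart = original != null;
        return isRestart && !isOverride;
    }

    /** Reads the saved state when reload is requested; otherwise uses the freshly planned count. */
    static int partitionsToRun(boolean reload, PreviousStepExecution original, int plannedPartitions) {
        return reload ? original.savedPartitionCount : plannedPartitions;
    }

    public static void main(String[] args) {
        PreviousStepExecution step2Original = null; // step2 never started before the restart
        boolean jobIsRestart = true;                // the job as a whole is being restarted
        boolean isOverride = false;                 // mapper runs with override = false

        // Step-level check: no earlier execution, so the planned partitions are used.
        boolean viaStep = reloadUsingStepFlag(step2Original, isOverride);
        System.out.println("step-level check, reload = " + viaStep);
        System.out.println("partitions = " + partitionsToRun(viaStep, step2Original, 4));

        // Job-level check: asks for a reload and then dereferences the missing execution.
        boolean viaJob = reloadUsingJobFlag(jobIsRestart, isOverride);
        try {
            partitionsToRun(viaJob, step2Original, 4);
        } catch (NullPointerException e) {
            System.out.println("job-level check hit a NullPointerException");
        }
    }
}
```

In this model, a step that never ran in the earlier job execution is simply treated as a fresh start, which is the behavior the reported scenario needs.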

chengfang added a commit that referenced this issue Jan 6, 2016
…e current step is restart instead of checking if the current job execution is restart).
chengfang added a commit that referenced this issue Jan 6, 2016
…ed chunk; these tests verify partition mapper override true and restart.
@bflorat
Author

bflorat commented Jan 6, 2016

Thanks for the fix.
Do you plan a 1.2.1 maintenance release that includes it, or will it only be available in the 1.3.x branch?

(The workaround we provided works, but in some mappers it requires many extra database accesses to get the previous start status, which I expect will hurt performance.)

@chengfang
Contributor

Thanks for reporting this issue. The fix will be in the next release (1.3.0.Beta1).

@chengfang
Contributor

Created a shadow JIRA issue for EAP 7.0.2, which will include jberet-core 1.2.1.

https://issues.jboss.org/browse/JBEAP-4847

spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016
spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016
… if the current step is restart instead of checking if the current job execution is restart).
spyrkob pushed a commit to spyrkob/jsr352 that referenced this issue Jun 3, 2016
…titioned chunk; these tests verify partition mapper override true and restart.