[BUG] SR1 1.14.0 - Some jobs failed with the error "CFI Orbit interpolation failed." #1029
Comments
Here are 3 sample jobs for failed processing: job created by Job 13341, job created by Job 13342, job created by: |
Republished from a previous message by @w-jka (deleted by mistake): From the provided logs and AppDataJob extracts I could not find any problems on our side. As Florian is on vacation this week, I do not have access to the processor documentation, so I cannot check whether the processor ICD contains any additional information regarding exit code 136. According to the logs, the provided orbit files are fine and, while not first in priority, are listed by the new tasktable. The processor itself states that the files are good to go before running into an error. |
A PSC issue is opened => https://esa-csc-gs.atlassian.net/browse/PSC-63 |
Werum_CCB_2023_w28 : Moved into "Refused Werum" to place it into "On hold" pipeline in CCB Board, waiting for ESA answer |
@Woljtek : I agree with this approach. As @w-jka pointed out, it is unlikely to be an issue within our software: there was no change on our side, and the kind of error looks more like an issue within the IPF itself. In C/C++ programs, exit code 136 usually corresponds to SIGFPE (128 + signal number 8), which can be caused by a floating-point exception or an integer overflow. This is very likely an issue within the IPF, as the code is executed as a black box on our side. |
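For reference, the mapping from exit code 136 to SIGFPE follows the usual POSIX shell convention of reporting 128 + signal number when a process is killed by a signal. The snippet below is only a minimal sketch to double-check that arithmetic; the helper name `describe_exit_code` is hypothetical and is not part of the IPF or of our software.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit code (hypothetical helper for illustration).

    Codes above 128 are treated as 128 + signal number, which is the
    convention used by POSIX shells for processes terminated by a signal.
    """
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return signal.Signals(signum).name
        except ValueError:
            return f"unknown signal {signum}"
    return "normal exit" if code == 0 else f"exit status {code}"


print(describe_exit_code(136))  # 136 - 128 = 8, i.e. SIGFPE
```

This only confirms which signal the exit code encodes; it does not say which computation inside the IPF raised it.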
IVV_CCB_2023_w29 : moved to accepted OPS . |
My understanding of the issue is that in degraded cases (missing ROE_AX and DO_0_NAV) the CFI uses TM_0_NAT, and in this case the initialisation of the orbit fails. This point is linked to the change of the version of EO CFI inside the CFI. |
System_CCB_2023-w30 : The issue is on CFI side for a degraded case. Priority reduced to major. |
4 new occurrences on SR1-NRT
Note : CAMS Ticket on this issue : 4118 |
4 new occurrences on SR1-NRT
with the following logs:
|
Environment:
Traceability:
Current Behavior:
During the NON-REGRESSION test with PREINT, we observed that several executions ended with the following error:
Expected Behavior:
The addon shall be able to compute all products (it worked with 1.13.x)
Steps To Reproduce:
Run the PREINT procedure with the dataset
s3://ops-rs-preint/s3/NRT/S3-SR1/input-data/
Test execution artefacts (i.e. logs, screenshots…)
Each failed job is restarted 3 times before being discarded (8 jobs in error).
An example NOK Job: (from s3://ops-rs-failed-workdir/s3-sr1-nrt-preint-part2-execution-worker-v18-74b88978fc-g249k_S3B_SR_0_SRA____20230409T214430_20230409T215430_20230410T001919_0599_078_129______LN3_D_NR_002.SEN3_bf0677b7-e91e-46d2-9076-c7dea83f05ee_0/)
Full logs of EW:
https://app.zenhub.com/files/398313496/deccd093-7653-4a8b-96b0-d818d54122ba/download
Bug Generic Definition of Ready (DoR)
Bug Generic Definition of Done (DoD)