[BUG] [OPS] S3 PUG NTC Execution failed on too short product #1050

Open
12 tasks
suberti-ads opened this issue Jul 27, 2023 · 7 comments
Labels

  • bug: Something isn't working
  • CCB: Issue for CCB
  • ops: Ticket from ADS operation team
  • priority:blocking: Set the priority to blocking because the production is blocked
  • S3: Relative to Sentinel-3 RS Addons
  • to_be_fixed_phase1: Issue to be fixed for RS phase 1
  • WERUM dev: Ticket dedicated to WERUM development

Comments


suberti-ads commented Jul 27, 2023

Environment:

  • Delivery tag:
  • Platform: OPS Orange Cloud
  • Configuration:
    rs-addon-s3-pug-ntc: 1.14.1-rc1

Traceability:

Current Behavior:
Some executions fail with the following message:

[code 290] [exitCode 255] [msg Task /usr/local/components/PUG-3.48/bin/PUGCoreProcessor failed]

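The requested sensing duration appears to be far too short for these products: for job 120263, the processor reports a duration of 0.002618 s against a minimum of 0.599972 s.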

Expected Behavior:

Steps To Reproduce:
Start the 3% production or the 24h00 test.

Test execution artefacts (i.e. logs, screenshots…)
Execution logs:
PUG-NTC-joborder-120263.txt
JobOrder:
Job120263.txt
Preparation log:
s3-pug-ntc-part1-preparation-worker-v2-84f98487d-9hph5.log

Whenever possible, first analysis of the root cause
Sample for job 120263.
Product which triggered the production: S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3

First error seen in logs:

2023-07-26T12:02:32+00:00	2023-07-26T12:02:32.005784 s3-pug-ntc-part1-execution-worker-v2-698877f8f7-vscsv PUG_SL_1_RBT 03.48 [0000000132]: [I] PUGCoreProcessor: Exiting with EXIT CODE: 255
2023-07-26T12:02:32+00:00		FATAL: All the product data unit generations exited in error!
2023-07-26T12:02:32+00:00	2023-07-26T12:02:32.005710 s3-pug-ntc-part1-execution-worker-v2-698877f8f7-vscsv PUG_SL_1_RBT 03.48 [0000000132]: [E] PUGCoreProcessor: [PUGCoreProcessor.C: execute:(359)] Unable to generate the required PDUs! --- acs::exCriticalException in PDUGenerator.C(270) from void acs::PDUGenerator::createPDUs() thread "" [140201793349824]
2023-07-26T12:02:32+00:00			Duration from JO: 0.002618 [s]  Minimum duration: 2 lines, i.e. 0.599972 [s]
2023-07-26T12:02:32+00:00		COULD NOT TRIGGER THE GENERATION OF THIS PDU [2023-04-27T23:40:45.143984, 2023-04-27T23:40:45.146602]:
2023-07-26T12:02:32+00:00		acs::StripeGeneratorThread::exStripeGeneratorThreadException in StripeGeneratorThread.C(410) from void acs::StripeGeneratorThread::createPDU() thread "unnamedThread" [140201343260416]
2023-07-26T12:02:32+00:00		caused by:
2023-07-26T12:02:32+00:00		Problem found during product data unit generation -> skipping to the next, if any ...
2023-07-26T12:02:32+00:00	acs::PDUGenerator::exPDUGeneratorException in (0) from  thread "" [140201793349824]

So it tries to generate a very short product:

2023-07-26T12:02:32+00:00		COULD NOT TRIGGER THE GENERATION OF THIS PDU [2023-04-27T23:40:45.143984, 2023-04-27T23:40:45.146602]:

This matches the values found in the preparation job:

	"taskTableName" : "TaskTable.PUG_SL_1_RBT.03.xml",
	"startTime" : "2023-04-27T23:40:45.143984Z",
	"stopTime" : "2023-04-27T23:40:45.146602Z",
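As a quick sanity check (a sketch only, not part of the processor), the duration implied by these two timestamps can be recomputed and compared with the minimum reported in the PUG log. The variable names are illustrative; the values are copied from the job and the log above.

```python
from datetime import datetime

# Values copied from the preparation job and the PUG execution log.
start_time = datetime.fromisoformat("2023-04-27T23:40:45.143984")
stop_time = datetime.fromisoformat("2023-04-27T23:40:45.146602")
min_duration_s = 0.599972  # "Minimum duration: 2 lines" reported by PUG

# Duration requested by the job order, in seconds.
duration_s = (stop_time - start_time).total_seconds()
too_short = duration_s < min_duration_s

print(f"{duration_s:.6f} s, too short: {too_short}")
```

The result agrees with the "Duration from JO: 0.002618 [s]" line in the execution log.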

So it seems to be a configuration issue.
Our preparation configuration is as follows:

app.preparation-worker.pdu.config.SL_1_RBT___.type=FRAME
app.preparation-worker.pdu.config.SL_1_RBT___.length-in-s=180
app.preparation-worker.pdu.config.SL_1_RBT___.gap-threshhold-in-s=3
app.preparation-worker.pdu.config.SL_1_RBT___.dyn-proc-params.facilityName=LN3
app.preparation-worker.pdu.config.SL_1_RBT___.dyn-proc-params.hardwareName=O
[...]
app.housekeep.pdu.config.SL_1_RBT___.type=FRAME
app.housekeep.pdu.config.SL_1_RBT___.length-in-s=180
app.housekeep.pdu.config.SL_1_RBT___.gap-threshhold-in-s=0.2
app.housekeep.pdu.config.SL_1_RBT___.dyn-proc-params.facilityName=LN3
app.housekeep.pdu.config.SL_1_RBT___.dyn-proc-params.hardwareName=O

Preparation log entries seen for the received product:

2023-07-25T03:28:23+00:00	{"header":{"type":"LOG","timestamp":"2023-07-25T03:28:23.511838Z","level":"INFO","line":260,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-pug-ntc-part1.message-filter', partitions=30, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"First Product of Orbit query for product type 'SL_1_RBT___' and orbit '26068' returned S3Metadata [absoluteStartOrbit=26068, anxTime=2023-04-27T21:59:45.974883Z, anx1Time=2023-04-27T23:40:45.146602Z, creationTime=2023-06-30T21:50:52.000000Z, granuleNumber=17, granulePosition=NONE, insertionTime=2023-06-30T22:28:15.970452Z, productName=S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3, productType=SL_1_RBT___, keyObjectStorage=S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3, validityStart=2023-04-27T23:40:45.143984Z, validityStop=2023-04-27T23:53:19.714455Z, missionId=null, satelliteId=B, stationCode=null]"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
2023-07-25T03:28:23+00:00	{"header":{"type":"LOG","timestamp":"2023-07-25T03:28:23.506518Z","level":"INFO","line":225,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-pug-ntc-part1.message-filter', partitions=30, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"S3Metadata query for family 'S3_L1_NTC' and product name 'S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3' returned S3Metadata [absoluteStartOrbit=26068, anxTime=2023-04-27T21:59:45.974883Z, anx1Time=2023-04-27T23:40:45.146602Z, creationTime=2023-06-30T21:50:52.000000Z, granuleNumber=17, granulePosition=NONE, insertionTime=2023-06-30T22:28:15.970452Z, productName=S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3, productType=SL_1_RBT___, keyObjectStorage=S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3, validityStart=2023-04-27T23:40:45.143984Z, validityStop=2023-04-27T23:53:19.714455Z, missionId=null, satelliteId=B, stationCode=null]"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
2023-07-25T03:28:23+00:00	{"header":{"type":"LOG","timestamp":"2023-07-25T03:28:23.501050Z","level":"INFO","line":134,"file":"TaskTableMapperService.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-pug-ntc-part1.message-filter', partitions=30, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Created IpfPreparationJobs for product S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.service.TaskTableMapperService"}}
2023-07-25T03:28:23+00:00	{"header":{"type":"REPORT","timestamp":"2023-07-25T03:28:23.500000Z","level":"INFO","mission":"S3","workflow":"NOMINAL","rs_chain_name":"S3-PUG-NTC","rs_chain_version":"1.14.0-rc1"},"message":{"content":"End associating TaskTables to CatalogEvent S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"task":{"uid":"797eea65-0859-4d82-8914-e29f93976429","name":"TaskTableLookup","event":"END","status":"OK","output":{},"input":{"filename_string":"S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"quality":{},"error_code":0,"duration_in_seconds":0.0,"missing_output":[]}}
2023-07-25T03:28:23+00:00	{"header":{"type":"REPORT","timestamp":"2023-07-25T03:28:23.500000Z","level":"INFO","mission":"S3","workflow":"NOMINAL","rs_chain_name":"S3-PUG-NTC","rs_chain_version":"1.14.0-rc1"},"message":{"content":"Start associating TaskTables to CatalogEvent S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"task":{"uid":"797eea65-0859-4d82-8914-e29f93976429","name":"TaskTableLookup","event":"BEGIN","input":{"filename_string":"S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"child_of_task":"caa156d4-769f-44d1-b65c-0460d68d3635"}}
2023-07-25T03:28:23+00:00	{"header":{"type":"LOG","timestamp":"2023-07-25T03:28:23.500415Z","level":"INFO","line":76,"file":"RoutingBasedTasktableMapper.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-pug-ntc-part1.message-filter', partitions=30, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Got tasktable [TaskTable.PUG_SL_1_RBT.03.xml] for SL_1_RBT____B"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.tasktable.mapper.RoutingBasedTasktableMapper"}}
2023-07-25T03:28:23+00:00	{"header":{"type":"REPORT","timestamp":"2023-07-25T03:28:23.499000Z","level":"INFO","mission":"S3","workflow":"NOMINAL","rs_chain_name":"S3-PUG-NTC","rs_chain_version":"1.14.0-rc1"},"message":{"content":"Received CatalogEvent for S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"task":{"uid":"caa156d4-769f-44d1-b65c-0460d68d3635","name":"ProductionTrigger","event":"BEGIN","input":{"filename_string":"S3B_SL_1_RBT____20230427T234045_20230427T235320_20230630T215052_0754_079_001______LN3_D_NT_002.SEN3"},"follows_from_task":"b4c8a3b3-44a1-4fe2-b68a-cd6636608c85"}}
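One detail in these entries may be relevant: the anx1Time of the product equals the stopTime of the failing job, and the validityStart equals its startTime. The sketch below only performs that comparison on the values copied from the log and the preparation job; the helper and variable names are illustrative. That the FRAME slicing cuts the product at the ANX crossing is an assumption drawn from this coincidence, not something the logs state.

```python
from datetime import datetime

def parse(ts):
    # Accept the trailing "Z" used in the logs on any Python 3.7+.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# From the S3Metadata log entry.
validity_start = parse("2023-04-27T23:40:45.143984Z")
validity_stop = parse("2023-04-27T23:53:19.714455Z")
anx1_time = parse("2023-04-27T23:40:45.146602Z")

# From the preparation job.
job_start = parse("2023-04-27T23:40:45.143984Z")
job_stop = parse("2023-04-27T23:40:45.146602Z")

starts_at_validity_start = job_start == validity_start
stops_at_anx1 = job_stop == anx1_time

# Length of the product before and after the ANX crossing, in seconds.
head_s = (anx1_time - validity_start).total_seconds()
tail_s = (validity_stop - anx1_time).total_seconds()

print(starts_at_validity_start, stops_at_anx1, head_s, tail_s)
```

If the assumption holds, any product whose validityStart falls just before an ANX crossing could produce a similarly short leading interval.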

Bug Generic Definition of Ready (DoR)

  • The affected version in which the bug has been found is mentioned
  • The context and environment of the bug are detailed
  • The description of the bug is clear and unambiguous
  • The procedure (steps) to reproduce the bug is clearly detailed
  • The tested User Story / feature is linked to the bug if available
  • Logs are attached if available
  • A data set is attached if available

Bug Generic Definition of Done (DoD)

  • the modification implemented (the solution to fix the bug) is described in the bug.
  • Unit tests & Continuous integration performed - Test results available - Structural Test coverage reported by SONAR
  • Code committed in GIT with right tag or Analysis/Trade Off documentation up-to-date in reference-system-documentation repository
  • Code is compliant with coding rules (SONAR Report as evidence)
  • Acceptance criteria of the related User story are checked and Passed
@suberti-ads added the bug, CCB, ops, WERUM dev, priority:blocking, S3 and to_be_fixed_phase1 labels Jul 27, 2023

w-jka commented Jul 27, 2023

@suberti-ads
We added a new parameter for this issue:

app.preparation-worker.pdu.config.<product_type>.minPDULengthThreshold=0.0
app.housekeep.pdu.config.<product_type>.minPDULengthThreshold=0.0

This parameter tries to merge (or drop if not possible) too short time intervals in order to prevent this issue. It will be included in the next delivery 1.14.0-rc2.
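To illustrate the "merge, or drop if not possible" behaviour described here, a minimal sketch follows. It is not the actual implementation, which is not shown in this issue: the function name `apply_min_length`, the representation of intervals as (start, stop) tuples in seconds, and the rule "merge only into a directly contiguous neighbour" are all assumptions.

```python
def apply_min_length(intervals, min_length_s):
    """Merge too-short intervals into a contiguous neighbour, else drop them.

    intervals: sorted, non-overlapping list of (start_s, stop_s) tuples.
    Hypothetical helper, not the production-common implementation.
    """
    result = []
    pending = None  # short interval waiting to be merged into the next one
    for start, stop in intervals:
        if pending is not None:
            if pending[1] == start:
                start = pending[0]  # absorb the pending short interval
            # otherwise the pending interval has no neighbour and is dropped
            pending = None
        if stop - start >= min_length_s:
            result.append((start, stop))
        elif result and result[-1][1] == start:
            result[-1] = (result[-1][0], stop)  # merge into the previous one
        else:
            pending = (start, stop)
    return result


# A 2.618 ms head followed by two regular 180 s frames (illustrative values).
frames = [(0.0, 0.002618), (0.002618, 180.0), (180.0, 360.0)]
merged = apply_min_length(frames, 0.6)
unchanged = apply_min_length(frames, 0.0)
```

In this sketch a threshold of 0.0 leaves the intervals untouched, since no interval is shorter than zero, so a non-zero value would have to be configured for it to have any effect.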


vgava-ads commented Aug 1, 2023

System_CCB_2023_w31: Delivered in Processing Sentinel-3 v1.14.0 (refer to https://github.com/COPRS/processing-sentinel-3/releases/tag/1.14.0-rc2), in Processing Sentinel-1 v1.14.0 (refer to https://github.com/COPRS/processing-sentinel-1/releases/tag/1.14.0-rc2) and in Processing Common v1.14.0 (refer to https://github.com/COPRS/production-common/releases/tag/1.14.0-rc2).

To be validated by IVV/OPS team.


LAQU156 commented Aug 2, 2023

System_CCB_2023_w31 : Moved to "Accepted Werum", to be validated. Action done.

@suberti-ads
Author

I can't find this configuration in the pug-ntc content of the last delivery:
v1.14.0-rc2:
https://github.com/COPRS/processing-sentinel-3/blob/1.14.0-rc2/s3-pug-ntc/content/stream-parameters.properties
develop branch:
https://github.com/COPRS/processing-sentinel-3/blob/develop/s3-pug-ntc/content/stream-parameters.properties
So I added the workaround tag.

We will add and test this workaround with 1.14.

@suberti-ads added the workaround (Workaround activated) label Aug 2, 2023

LAQU156 commented Aug 2, 2023

System_CCB_2023_w31 : To be tested with 1.14.0 version


Woljtek commented Aug 11, 2023

There has been no occurrence of this error since the deployment of 1.14.0:

System_CCB_2023_w31 : Fixed, closed

@suberti-ads
Author

I have bad news on this issue:

@suberti-ads reopened this Jul 16, 2024