You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Current Behavior:
On sample processing, we forecast to download 2 sessions S3B
S3B_20240411160247031055 (ingestion end time 16:28)
S3B_20240411142312031054 (ingestion end time 16:02)
All L0 Product have been generated during this period except 4 products generated after more than 15h00:
hereafter we saw for HKTM prouct 2 of these with ingestion_time over 15hours (ingestion_time is difference between start_processing_date and t0_pdgs_date)
monitoring=# SELECT
filename,
(generation_begin_date - product.t0_pdgs_date) As ingestion_duration,
(generation_end_date - generation_begin_date) As generation_duration,
(prip_storage_end_date - generation_end_date) As distribution_duration,
(prip_storage_end_date - product.t0_pdgs_date) As total_duration
FROM product
LEFT JOIN output_list ON output_list.product_id = product.id
LEFT JOIN processing ON processing.id = output_list.processing_id
WHERE
(processing_date - product.t0_pdgs_date) > INTERVAL '30 MINUTES' AND
processing.rs_chain_name='S3-L0P' AND
(processing.processing_date BETWEEN '2024-04-11 14:00:00' AND '2024-04-18 08:00:00') AND
mission = 'S3'AND
(product.filename like '%HK%' OR product.filename like '%NAT%')
ORDER BY processing_date;
filename | ingestion_duration | generation_duration | distribution_duration | total_durat
ion
-----------------------------------------------------------------------------------------------------+--------------------+---------------------+-----------------------+--------------
-------
S3B_TM_0_NAT____20240411T124346_20240411T142313_20240411T162634_5967_091_366______LN3_O_AL_002.SEN3 | 01:55:25.419 | 00:00:14.565 | 4 days 23:08:28.358 | 5 days 01:04:
08.342
S3B_TM_0_NAT____20240411T124346_20240411T142324_20240411T162649_5978_091_366______LN3_O_AL_002.SEN3 | 01:55:40.018 | 00:00:14.122 | 4 days 22:55:40.048 | 5 days 00:51:
34.188
S3B_TM_0_HKM____20240411T124345_20240411T142324_20240411T165101_5979_091_366______LN3_O_NR_002.SEN3 | 02:19:52.622 | 00:00:18.594 | 4 days 22:55:07.814 | 5 days 01:15:
19.03
S3B_TM_0_NAT____20240411T142325_20240411T160242_20240411T165120_5957_091_367______LN3_O_AL_002.SEN3 | 00:40:34.216 | 00:00:14.154 | 4 days 22:20:05.68 | 4 days 23:00:
54.05
S3B_TM_0_HKM2___20240411T124346_20240411T142324_20240411T165208_5978_091_366______LN3_O_NR_002.SEN3 | 02:20:59.16 | 00:00:14.084 | 4 days 22:19:18.061 | 5 days 00:40:
31.305
S3B_TM_0_NAT____20240411T142325_20240411T160259_20240411T165222_5973_091_367______LN3_O_AL_002.SEN3 | 00:41:36.243 | 00:00:13.701 | 4 days 22:42:32.508 | 4 days 23:24:
22.452
S3B_TM_0_HKM2___20240411T142325_20240411T160259_20240412T080211_5973_091_367______LN3_O_NR_002.SEN3 | 15:51:25.854 | 00:00:14.223 | 4 days 07:22:39.31 | 4 days 23:14:
19.387
S3B_TM_0_HKM____20240411T142324_20240411T160259_20240412T080240_5974_091_367______LN3_O_NR_002.SEN3 | 15:51:54.262 | 00:00:19.31 | 4 days 07:03:27.589 | 4 days 22:55:
41.161
(8 rows)
We saw these 4 job have been generated at about 08:00:00 by the preparation_worker
Products and jobs impacted:
JobOrder.149473.xml (MW0)
JobOrder.149509.xml (HKTM2)
JobOrder.149472.xml (DOP)
JobOrder.149494.xml (HKTM)
Expected Behavior:
Job should be successfully generated by preparation once all data needed data were available.
Steps To Reproduce:
This issue has been detected on sample processing S3
Whenever possible, first analysis of the root cause
On preparation log we saw these 4 jobs seems not detected as completed due to missing TM_0_NAT products.
sample for last job created 149494
Detection search start:
{"header":{"type":"LOG","timestamp":"2024-04-11T16:52:31.189641Z","level":"INFO","line":67,"file":"InputSearchService.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Start main input search for AppDataJob 149494"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.service.InputSearchService"}}
Product detected (we saw NAT products)
{"header":{"type":"LOG","timestamp":"2024-04-11T16:52:31.199163Z","level":"INFO","line":653,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Metadata query for family 'S3_L0' and product type 'TM_0_NAT___' returned 2 results"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
{"header":{"type":"LOG","timestamp":"2024-04-11T16:52:31.208419Z","level":"INFO","line":653,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Metadata query for family 'S3_L0' and product type 'TM_0_NAT___' returned 2 results"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
But it seems not enought to generate execution:
{"header":{"type":"LOG","timestamp":"2024-04-11T16:52:31.208688Z","level":"INFO","line":70,"file":"InputSearchService.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Main input search did not complete successfully: [input TM_0_NAT___] [reason Incomplete result set for TM_0_NAT___][msg Missing inputs]"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.service.InputSearchService"}}
{"header":{"type":"LOG","timestamp":"2024-04-11T16:52:31.236382Z","level":"INFO","line":119,"file":"PreparationWorkerService.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"End preparation of new execution jobs"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.service.PreparationWorkerService"}}
Once receive new AUX data the following days at 08:00:11, it seems restart job generation processus whhich raised an timeout which force processing
{"header":{"type":"REPORT","timestamp":"2024-04-12T08:00:11.301000Z","level":"INFO","mission":"S3","workflow":"NOMINAL","rs_chain_name":"S3-L0P","rs_chain_version":"1.14.0"},"message":{"content":"Received CatalogEvent for S3A_AX___FPO_AX_20240412T000000_20240419T000000_20240412T064627___________________EUM_O_AL_001.SEN3"},"task":{"uid":"6e330276-84ab-4261-bae0-2e252d9262db","name":"ProductionTrigger","event":"BEGIN","input":{"filename_string":"S3A_AX___FPO_AX_20240412T000000_20240419T000000_20240412T064627___________________EUM_O_AL_001.SEN3"},"follows_from_task":"49b65571-d83a-4b21-990d-847acc6315d9"}}
[...]
"header":{"type":"LOG","timestamp":"2024-04-12T08:00:11.307335Z","level":"INFO","line":67,"file":"InputSearchService.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Start main input search for AppDataJob 149472"},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.service.InputSearchService"}}
{"header":{"type":"LOG","timestamp":"2024-04-12T08:00:11.318281Z","level":"INFO","line":653,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Metadata query for family 'S3_L0' and product type 'TM_0_NAT___' returned 2 results"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
{"header":{"type":"LOG","timestamp":"2024-04-12T08:00:11.344545Z","level":"INFO","line":653,"file":"MetadataClient.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Metadata query for family 'S3_L0' and product type 'TM_0_NAT___' returned 2 results"},"custom":{"logger_string":"esa.s1pdgs.cpoc.metadata.client.MetadataClient"}}
{"header":{"type":"LOG","timestamp":"2024-04-12T08:00:11.344801Z","level":"INFO","line":315,"file":"S3TypeAdapter.java","thread":"KafkaConsumerDestination{consumerDestinationName='s3-l0p-part1.message-filter', partitions=4, dlqName='error-warning'}.container-0-C-1"},"message":{"content":"Timeout reached for job 149472. Continue without missing products..."},"custom":{"logger_string":"esa.s1pdgs.cpoc.preparation.worker.type.s3.S3TypeAdapter"}}
Note 2 NAT product selected ( S3B_TM_0_HKM__G_20240411T142342_20240411T160317_20240411T164017_5975______________SVL_O_NR_OPE.ISIP) for processing
So product stuck because generated NAT files (142325 -->160259) not cover all product duration (142342 --> 160317)
So issue can talk about generation NAT which are shorter
Environment:
rs-addon s3-l0p v1.14
Traceability:
Current Behavior:
On sample processing, we forecast to download 2 sessions S3B
All L0 Product have been generated during this period except 4 products generated after more than 15h00:
hereafter we saw for HKTM prouct 2 of these with ingestion_time over 15hours (ingestion_time is difference between start_processing_date and t0_pdgs_date)
We saw these 4 job have been generated at about 08:00:00 by the preparation_worker
Products and jobs impacted:
JobOrder.149473.xml (MW0)
JobOrder.149509.xml (HKTM2)
JobOrder.149472.xml (DOP)
JobOrder.149494.xml (HKTM)
Expected Behavior:
Job should be successfully generated by preparation once all data needed data were available.
Steps To Reproduce:
This issue has been detected on sample processing S3
Test execution artefacts (i.e. logs, screenshots…)
Hereafter preparation log
s3-l0p-part1-preparation-worker-v52-d9dcb799f-9fzpj.log.gz
sample mongo job 149494:
joborder149494.json
sample mongo job 149509:
joborder149509.json
Whenever possible, first analysis of the root cause
On preparation log we saw these 4 jobs seems not detected as completed due to missing TM_0_NAT products.
sample for last job created 149494
Detection search start:
Product detected (we saw NAT products)
But it seems not enought to generate execution:
Once receive new AUX data the following days at 08:00:11, it seems restart job generation processus whhich raised an timeout which force processing
Note 2 NAT product selected ( S3B_TM_0_HKM__G_20240411T142342_20240411T160317_20240411T164017_5975______________SVL_O_NR_OPE.ISIP) for processing
Following tasktable
Herafter TM_NAT covering needed:
So product stuck because generated NAT files (142325 -->160259) not cover all product duration (142342 --> 160317)
So issue can talk about generation NAT which are shorter
All NAT job for this processing;
So we should explain why NAT generated by job 149479 was shorter
On generation log for S3B_TM_0_NAT____20240411T142325_20240411T160259_20240411T165222_5973_091_367______LN3_O_AL_002.SEN3 No error seen:
S3_L0P_S3B_TM_0_NAT____20240411T142325_20240411T160259_20240411T165222_5973_091_367______LN3_O_AL_002.log
Bug Generic Definition of Ready (DoR)
Bug Generic Definition of Done (DoD)
The text was updated successfully, but these errors were encountered: