Validate and Load all PDS4 MESSENGER data products with Nucleus #54

Open
jordanpadams opened this issue Jun 14, 2023 · 19 comments
@jordanpadams
Member

💡 Description

  • Document benchmark metrics for execution time, cost, ?
@jordanpadams changed the title from "Complete execution of Nucleus on all PDS4 MESSENGER data products" to "Validate and Load all PDS4 MESSENGER data products with Nucleus" on Jun 14, 2023
@tloubrieu-jpl
Member

@ramesh-maddegoda is focusing on MSGRMDS_4001 and MESSDEM_1001, which need to be loaded into the registry first so that they can be used in ticket NASA-PDS/search-api-notebook#24

@tloubrieu-jpl
Member

Blocked because AWS Airflow is unavailable on NGAP

@tloubrieu-jpl
Member

Unblocked now that Ramesh is working on MCP. He is now testing the ECS task called by the Nucleus workflow.

@jordanpadams
Member Author

Status: @ramesh-maddegoda working on improving Terraform deployments

@tloubrieu-jpl
Member

@ramesh-maddegoda is deploying everything needed on MCP, from scratch.

@tloubrieu-jpl
Member

@ramesh-maddegoda will test nucleus to validate its robustness with a bigger dataset.

@ramesh-maddegoda
Contributor

Some of the files in s3://asc-pds-messenger failed to copy to the PDS Nucleus staging bucket because of a permission issue.

aws s3 cp s3://asc-pds-messenger/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG s3://pds-nucleus-staging/messenger-data/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/
copy failed: s3://asc-pds-messenger/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG to s3://pds-nucleus-staging/messenger-data/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG An error occurred (AccessDenied) when calling the GetObjectTagging operation: Access Denied
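
The error above comes from the GetObjectTagging call that an S3-to-S3 copy makes when it also copies object tags. As a hedged sketch of one possible workaround (an assumption, not necessarily what was done for this ticket): AWS CLI v2 has a `--copy-props` option whose `metadata-directive` value copies metadata but not tags. The helper name `build_copy_command` is hypothetical and only assembles the argument list; it does not run the copy.

```python
def build_copy_command(source_uri, dest_uri, copy_props="metadata-directive"):
    """Build an `aws s3 cp` argument list for an S3-to-S3 copy.

    Assumption: AWS CLI v2, where --copy-props accepts `none`,
    `metadata-directive`, or `default`. Only `default` copies tags,
    which is what requires GetObjectTagging on the source object.
    """
    allowed = {"none", "metadata-directive", "default"}
    if copy_props not in allowed:
        raise ValueError(f"copy_props must be one of {sorted(allowed)}")
    return ["aws", "s3", "cp", source_uri, dest_uri, "--copy-props", copy_props]


# Paths taken from the failing command in this comment.
source = ("s3://asc-pds-messenger/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/"
          "MDIS_RTM_N01_006974_4644396_1.IMG")
dest = ("s3://pds-nucleus-staging/messenger-data/MSGRMDS_8001/RTM/"
        "MDIS_RTM_N01/2013_228/")

command = build_copy_command(source, dest)
print(" ".join(command))
```

Skipping tags is only acceptable if nothing downstream relies on them; the alternative is to grant `s3:GetObjectTagging` on the source bucket.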

@tloubrieu-jpl
Member

tloubrieu-jpl commented Jan 23, 2024

A new parameter enables copying all the metadata.

ramesh-maddegoda added a commit that referenced this issue Feb 7, 2024
…e verified files DataSync report instead of the transferred files report

Refer to task #54
ramesh-maddegoda added a commit that referenced this issue Feb 7, 2024
…ion lambda code to make sure both product table product_data_file_mapping table are updated in a consistent way (make sure both tables are updated).

Refer to task #54
ramesh-maddegoda added a commit that referenced this issue Feb 7, 2024
…ion lambda code to make sure both product table product_data_file_mapping table are updated in a consistent way (make sure both tables are updated).

Refer to task #54
@tloubrieu-jpl
Member

@ramesh-maddegoda identified a bug while doing that test. The lambda reading the DataSync report is now taking more than 15 minutes. Now there will be a single lambda call per report.
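
For context, 15 minutes is the maximum timeout of an AWS Lambda function, so a report that takes longer cannot finish in one invocation. The following is a minimal sketch of a generic time-budget pattern, not necessarily the fix adopted here: process entries until the remaining time drops below a safety margin, then return the unprocessed tail for a follow-up invocation. The function name and the `FakeClock` test double are hypothetical; inside a real handler, `remaining_ms` could be `context.get_remaining_time_in_millis`.

```python
def process_with_budget(items, handle, remaining_ms, safety_margin_ms=30_000):
    """Handle items until the time budget is nearly used up.

    `remaining_ms` is a zero-argument callable returning the milliseconds
    left. Returns the unprocessed tail (empty list when everything ran).
    """
    for index, item in enumerate(items):
        if remaining_ms() <= safety_margin_ms:
            return items[index:]
        handle(item)
    return []


class FakeClock:
    """Hypothetical stand-in for the Lambda context, for local testing."""

    def __init__(self, start_ms, cost_ms):
        self.left = start_ms
        self.cost = cost_ms

    def remaining(self):
        return self.left

    def spend(self):
        self.left -= self.cost


clock = FakeClock(start_ms=100_000, cost_ms=20_000)
done = []


def handle(entry):
    # Pretend each report entry costs 20 seconds to process.
    done.append(entry)
    clock.spend()


leftover = process_with_budget(["a", "b", "c", "d", "e", "f"], handle, clock.remaining)
print(done, leftover)
```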

@tloubrieu-jpl
Member

The upgrade worked on a small dataset and @ramesh-maddegoda is now testing on the MESSENGER dataset.

ramesh-maddegoda added a commit that referenced this issue Feb 15, 2024
…ing to process a large amount of files in one lambda

Refer to task #54
ramesh-maddegoda added a commit that referenced this issue Feb 15, 2024
…ing to process a large amount of files in one lambda

Refer to task #54
ramesh-maddegoda added a commit that referenced this issue Feb 15, 2024
…ing to process a large amount of files in one lambda

Refer to task #54
@tloubrieu-jpl
Member

Ramesh is now loading data to the registry on JPL AWS. This is the last step for this task.

@nutjob4life
Member

20,000 processed! 8 directories ran! Found 2 errors:

  • "Capacity not available at the moment": related to ECS, which seems to have run out of capacity. We need "capacity providers", but cost is a concern; see the related bug in Nucleus.
  • Role error: we have seen it before; the error occurred while reading a data file. An error in validate‽ Trying to isolate it with the same Docker image and reproduce it without Nucleus. Is it Airflow or validate?
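
For the first error in the list above, one common mitigation that does not require paying for reserved capacity is to retry the task launch with exponential backoff. This is a minimal sketch under stated assumptions: `CapacityUnavailable` and `run_with_backoff` are hypothetical names, and how the real ECS failure surfaces in Airflow is not shown in this thread.

```python
import random
import time


class CapacityUnavailable(Exception):
    """Hypothetical stand-in for the ECS 'capacity not available' failure."""


def run_with_backoff(launch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                     sleep=time.sleep, rng=random.random):
    """Call `launch` until it succeeds or `max_attempts` is reached.

    Waits base_delay * 2**attempt seconds (capped at max_delay) between
    attempts, scaled by a jitter factor between 0.5 and 1.0.
    """
    for attempt in range(max_attempts):
        try:
            return launch()
        except CapacityUnavailable:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * (0.5 + rng() / 2))


# Local demonstration: fail twice, then succeed, without really sleeping.
attempts = {"count": 0}
waits = []


def flaky_launch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise CapacityUnavailable("Capacity not available at the moment")
    return "task-started"


result = run_with_backoff(flaky_launch, sleep=waits.append, rng=lambda: 1.0)
print(result, waits)
```

Backoff only helps when the shortage is transient; a sustained shortage would still call for capacity providers or a different launch configuration.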

@tloubrieu-jpl
Member

tloubrieu-jpl commented Feb 27, 2024

@tloubrieu-jpl
Member

@ramesh-maddegoda is experimenting with SQS to send new records to the MySQL database and avoid the timeout he was experiencing with direct insertion.
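
As an illustration of the queueing idea, the sketch below groups records into SQS batch entries so that a producer can enqueue them quickly and a separate consumer can insert them into the database at its own pace. It assumes the SQS `SendMessageBatch` limit of 10 entries per call; the helper name and the record field `s3_url_of_product_label` are hypothetical, and the actual send (for example `sqs.send_message_batch(QueueUrl=..., Entries=batch)` with boto3) is left out so the block runs without AWS access.

```python
import json

SQS_MAX_BATCH_ENTRIES = 10  # SendMessageBatch accepts at most 10 entries


def to_sqs_batches(records, batch_size=SQS_MAX_BATCH_ENTRIES):
    """Group records into lists of SQS batch entries.

    Each entry carries an Id that is unique within its batch and a JSON
    message body, which is the shape SendMessageBatch expects.
    """
    if not 1 <= batch_size <= SQS_MAX_BATCH_ENTRIES:
        raise ValueError("batch_size must be between 1 and 10")
    batches = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        batches.append([
            {"Id": str(start + offset),
             "MessageBody": json.dumps(record, sort_keys=True)}
            for offset, record in enumerate(chunk)
        ])
    return batches


# Hypothetical records; the real table columns are not listed in this thread.
records = [{"s3_url_of_product_label": f"s3://bucket/label_{n}.xml"}
           for n in range(23)]
batches = to_sqs_batches(records)
print([len(batch) for batch in batches])
```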

@tloubrieu-jpl
Member

SQS now mostly works, but another lambda timed out.

@tloubrieu-jpl
Member

We now integrate the copy from S3 to EFS as a Nucleus step in the DAGs. We are dropping DataSync, which carries a risk of overlapping copies and complicates removing files from EFS.
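
A copy step inside the DAG can be made safe to re-run by planning it first: map each S3 key to a path under the EFS mount and skip files that are already present with the same size. The following is only a sketch of that idea, not the Nucleus implementation; the function name, the `/mnt/efs` mount point, and the sample keys are assumptions for illustration.

```python
from pathlib import PurePosixPath


def plan_efs_copies(s3_objects, efs_sizes, efs_root="/mnt/efs"):
    """Return (s3_key, efs_path) pairs that still need to be copied.

    s3_objects: dict mapping S3 key -> size in bytes.
    efs_sizes:  dict mapping existing EFS path -> size in bytes.
    A file is skipped when it already exists on EFS with the same size.
    """
    root = PurePosixPath(efs_root)
    plan = []
    for key, size in sorted(s3_objects.items()):
        if key.startswith("/") or ".." in PurePosixPath(key).parts:
            raise ValueError(f"unsafe S3 key: {key!r}")
        destination = str(root.joinpath(*PurePosixPath(key).parts))
        if efs_sizes.get(destination) == size:
            continue  # already copied by an earlier run
        plan.append((key, destination))
    return plan


# Hypothetical listing: the image is already on EFS, the label is not.
s3_objects = {
    "messenger-data/MSGRMDS_8001/a.IMG": 100,
    "messenger-data/MSGRMDS_8001/a.xml": 10,
}
efs_sizes = {"/mnt/efs/messenger-data/MSGRMDS_8001/a.IMG": 100}

plan = plan_efs_copies(s3_objects, efs_sizes)
print(plan)
```

Because the plan is recomputed from the current state on each run, two overlapping runs converge on the same set of files instead of duplicating work, and the same listing can drive cleanup of EFS once a product is loaded.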

@ramesh-maddegoda will also write a note in a wiki for a future design where we don't need to use EFS at all.

@jordanpadams
Member Author

This work has been paused while we focus on Catalina Sky Survey. It will move to the B15.0 release plan for completion.

@jordanpadams
Member Author

📆 05/2024 status: Delayed several sprints due to delays in #93. This is an operations activity. No impact on build.

@jordanpadams
Member Author

📆 06/2024 status: Delayed several sprints due to delays in #93. This is an operations activity. No impact on build.

Projects
Status: Release Backlog
Status: ToDo
Development

No branches or pull requests

4 participants