
Instrument processing command line utility #266

Merged

Conversation

@bourque (Collaborator) commented Oct 17, 2023

Change Summary

Overview

This PR adds a script that can be run on the command line to process a specific instrument and data level. Currently the methods within only print a message; eventually they should contain the code needed to perform each processing step.

Relevant issue: #256

New Files

  • imap_processing/run_processing.py
    • The script that can be run on the command line by passing in parameters for instrument and data level

Updated Files

  • imap_processing/__init__.py
    • Added list of instruments and list of processing_levels, used to validate the passed arguments
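
The argument validation described above might look roughly like the following minimal sketch. The instrument names, level names, and flag spellings here are illustrative assumptions, not the PR's actual code; the point is that `argparse` `choices` can validate against the lists added to `imap_processing/__init__.py`:

```python
import argparse

# Hypothetical lists mirroring those added to imap_processing/__init__.py
instruments = ["codice", "glows", "hit", "mag", "swapi", "swe"]
processing_levels = ["l0", "l1a", "l1b", "l2", "l3"]

def parse_args(argv=None):
    """Parse and validate instrument/level arguments via argparse choices."""
    parser = argparse.ArgumentParser(description="Run instrument processing.")
    parser.add_argument("--instrument", required=True, choices=instruments,
                        help=f"The instrument to process. Acceptable values are: {instruments}")
    parser.add_argument("--level", required=True, choices=processing_levels,
                        help=f"The data level to process. Acceptable values are: {processing_levels}")
    return parser.parse_args(argv)

# Example invocation with explicit argv (avoids reading sys.argv here)
args = parse_args(["--instrument", "mag", "--level", "l2"])
print(args.instrument, args.level)
```

Passing a value outside either list makes `argparse` exit with a usage error, so the validation lists do double duty as documentation in the `--help` output.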

@bourque (Author) commented Oct 17, 2023

@tech3371 @greglucas @laspsandoval Could you provide an initial review of this? I just want to make sure that I am on the right track here and that I have designed this appropriately. Thanks!

@greglucas (Collaborator) left a comment:
👍 I added a few ideas to consider, but overall I think this is the right direction!

@tech3371 (Contributor) left a comment:

Yeah, it looks like you are heading in the right direction.

@bourque bourque marked this pull request as draft October 17, 2023 22:08
@laspsandoval (Contributor) left a comment:

Nice. Just update the arguments.

Inline review context from imap_processing/run_processing.py:

        f"The data level to process. Acceptable values are: {processing_levels}"
    )

    parser = argparse.ArgumentParser(description=description)
@laspsandoval (Contributor) commented:

This looks good. An example of the input that I was planning to use for the batch job:

['CodiceHi', {'l2': ['2023-10-31', '2023-06-12'], 'l3': ['2023-06-01']}, 2]

where 2 is the version number.

And then I was planning on passing in the following environment:

            environment={
                "OUTPUT_PATH": data_bucket.bucket_name,
                "SECRET_NAME": db_secret_name,
            },
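
Purely for illustration, the list/dict descriptor above could be translated into one command-line invocation per (level, date) pair. The `imap-processing` flag names here are assumptions for the sketch, not confirmed interfaces:

```python
# Hypothetical batch-job descriptor in the format shown above:
# [instrument, {level: [dates...]}, version]
job = ["CodiceHi", {"l2": ["2023-10-31", "2023-06-12"], "l3": ["2023-06-01"]}, 2]

def descriptor_to_commands(descriptor):
    """Expand a batch descriptor into one CLI command per (level, date) pair."""
    instrument, levels, version = descriptor
    commands = []
    for level, dates in levels.items():
        for date in dates:
            commands.append(
                ["imap-processing",
                 "--instrument", instrument.lower(),
                 "--level", level,
                 "--date", date,
                 "--version", str(version)]
            )
    return commands

for cmd in descriptor_to_commands(job):
    print(" ".join(cmd))
```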

@greglucas (Collaborator) replied:

@laspsandoval, I don't think imap_processing will want the data bucket information or the DB secret, because this is external to anything in our infrastructure. I think this is supposed to be going through our APIs to get the files instead.

I think the way this is written currently you'd need to kick off the container by overriding the ENTRYPOINT to be something like: imap-processing --instrument codice-hi --level 2 --start-date XYZ --end-date XYZ with command line arguments instead of a list/dict. Is that doable on the infrastructure side?
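
If the job is launched on AWS Batch/ECS, overriding the entrypoint with explicit command-line arguments might look roughly like this sketch. The structure mirrors the ECS-style `containerOverrides` shape; the dates and values are placeholders, not values from this PR:

```python
# Sketch of an ECS/Batch-style container override replacing the default
# command with explicit CLI flags (all values are illustrative placeholders).
container_overrides = {
    "command": [
        "imap-processing",
        "--instrument", "codice-hi",
        "--level", "2",
        "--start-date", "2023-06-01",
        "--end-date", "2023-06-02",
    ],
}
print(" ".join(container_overrides["command"]))
```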

@laspsandoval (Contributor) replied:

@greglucas as I understand it the algorithms will still need to query a database table, for example the universal spin table or a calibration table.

I was thinking that we had to append the s3 bucket information to the lambda url in order to use the upload/download api's. But I could be mistaken. Is that not correct?

@greglucas (Collaborator) replied:

> @greglucas as I understand it the algorithms will still need to query a database table, for example the universal spin table or a calibration table.

I think we should avoid leaking anything database related between the processing and infrastructure. Can we make new endpoints for this instead?
/calibration_lookup, /spin_table_lookup, ... whatever else we need

> I was thinking that we had to append the s3 bucket information to the lambda url in order to use the upload/download api's. But I could be mistaken. Is that not correct?

I think the Lambda should take care of this for us? So a user should only need to know the URL, but the details of where we put that in the backend shouldn't matter to them.

dev.imap-processing.com/upload should map to a different bucket than prod.imap-processing.com/upload, but a user shouldn't need to update both the url and bucket name. I'm hoping we can handle that for them.
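
A sketch of how the processing side might build requests against such endpoints. The endpoint names are the hypothetical ones floated above (`/spin_table_lookup`, etc.), and the base URL is illustrative; the point is that the client only needs a URL, never a bucket name or DB secret:

```python
# Sketch: processing code asks the SDC API for ancillary data instead of
# querying the database or S3 directly. Endpoint names and base URL are
# illustrative assumptions from the discussion, not real interfaces.
from urllib.parse import urlencode

BASE_URL = "https://dev.imap-processing.com"  # illustrative

def lookup_url(endpoint, **params):
    """Build a query URL for a hypothetical ancillary-data endpoint."""
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}/{endpoint}?{query}"

print(lookup_url("spin_table_lookup",
                 start_date="2023-06-01", end_date="2023-06-02"))
```

Swapping `dev` for `prod` in `BASE_URL` is then the only change needed between environments, which matches the goal of not making users track both a URL and a bucket name.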

@laspsandoval (Contributor) replied:

@greglucas we could do as you suggested for the ENTRYPOINT, but to avoid duplicates I think we should do

imap-processing --instrument codice-hi --level 2 --dates [list of dates]

@greglucas (Collaborator) replied:

We should talk about this at your processing-trigger discussion meeting. If I am reprocessing an entire mission, do I want to put 5 × 365 dates on the command line (there may even be a limit on how long the command line can be), versus a start/end date? Could the processing code handle the duplicates? Yet another option is to only allow one date in, so we'd spin up 5 × 365 separate processing jobs, each with a single date input...
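
The last option above, expanding a start/end range into single-date jobs, can be sketched as follows (function and flag names are illustrative, not this repository's API):

```python
# Sketch: expand a start/end date range into one processing job per day,
# as an alternative to passing thousands of dates on the command line.
from datetime import date, timedelta

def daily_jobs(instrument, level, start, end):
    """Yield one single-date job command per day in [start, end] inclusive."""
    current = start
    while current <= end:
        yield ["imap-processing", "--instrument", instrument,
               "--level", level, "--date", current.isoformat()]
        current += timedelta(days=1)

jobs = list(daily_jobs("codice-hi", "l2", date(2023, 6, 1), date(2023, 6, 3)))
print(len(jobs))  # 3 jobs, one per day
```

Each job then has a trivially short command line, at the cost of launching many more containers for a full-mission reprocess.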

@bourque bourque marked this pull request as ready for review December 13, 2023 16:32
@bourque bourque changed the title from "[WIP] Instrument processing command line utility" to "Instrument processing command line utility" on Dec 13, 2023
@bourque bourque merged commit b577fd6 into IMAP-Science-Operations-Center:dev Dec 13, 2023
14 checks passed
@bourque bourque deleted the command-line-utility branch December 13, 2023 17:01
laspsandoval pushed a commit to laspsandoval/imap_processing that referenced this pull request Apr 2, 2024
…mmand-line-utility

Instrument processing command line utility