Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pipelinerefactor daily mode updates #2201

Merged
merged 9 commits into from Mar 23, 2024
Merged

Conversation

sbailey
Copy link
Contributor

@sbailey sbailey commented Mar 22, 2024

This is a "PR on a PR" to document the changes I made to the pipelinerefactor branch in PR #2187 while testing daily mode, and to allow @akremin an opportunity to review them before they are added to that branch. Changes are:

  • BUGFIX: log.log -> log.info typo
  • FEATURE: Added more logging to make it easier to see the boundaries between multiple invocations of desi_proc_night --daily ... when they are all piped to the same log.
  • FEATURE: changed "ok to proceed with submitting cals" criteria to require a 1s flat instead of requiring 3 cte flats.
    • Reasoning: Although we currently take 3 cte flats and thus --daily works for current data, the pipeline only requires/uses the 1s flat so this makes the pipeline job logic closer to what is actually required. This enables testing on older data and gives a little more robustness for the future if we update our CTE flat exposures in the future.
  • FEATURE: adds --still-acquiring option to be able to test --daily mode on nights other than the current night. If not specified, it defaults to previous logic based upon current night and observing hours.
    • Note: this moved the "true_night" and "still_acquiring" logic out of the "if daily:" block since "still_acquiring" has to be correctly defined even in non-daily mode, and the new logic also needs to know "true_night" while also being able to distinguish between "still_acquiring" being auto-derived (from None) vs. being overridden by the user as True/False.
  • TESTING: adds a unit test that only runs at NERSC that mimics daily mode by staging links to real raw data in a separate testing $DESI_SPECTRO_DATA, and runs desi_proc_night --daily --dry-run-level 1 ... multiple times as new data are staged. This test catches the BUGFIX and exercises the new FEATUREs added.
  • FEATURE: added desispec.test.util.link_rawdata to facilitate the new test. That code is also useful for non-unittest interactive testing to stage links (I originally wrote it for interactive testing of desi_daily_proc --daily and then ported the workflow into desispec.test).

@sbailey sbailey requested a review from akremin March 22, 2024 16:14
Copy link
Member

@akremin akremin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for submitting this PR to the PR. It will be nice to have a unit test that covers the daily operating mode, and I know it was difficult to contrive the correct test environment to do that. I have requested some changes before we should merge to simplify things a bit.

For the record, this was all for a specific unit test, rather than features that will ever be needed in real-world usage. The existing code works in all circumstances on historic data whether --daily is set or not, and on new data with --daily set. The new features are only required if trying to trick the code into pretending today is a day in the past and where you want to pass data in slowly and watch what the code does.

Overview of the requested changes: I think the proc_night code should be reverted back to run under if daily and the input be converted to a standard boolean flag. There are a few more checks that could be done in the unit test.

@@ -102,6 +102,10 @@ def parse_args():
help="If set then the code will NOT raise an error "
+ "if asked to link psfnight calibrations "
+ "without fiberflatnight calibrations.")
parser.add_argument("--still-acquiring", action=argparse.BooleanOptionalAction,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be a normal boolean flag, i.e. I don't think we need a None state. If set, then it is true regardless of what the code thinks. If not set, it is False, which is the default assumption in the code unless specific criteria are met.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this here and in desispec.scripts.proc_night.proc_night. I agree that keeps the --daily options override more self contained. I originally went down this path because at the desi_proc(..., still_acquiring) level I wanted to distinguish between "not set" (None) vs. "don't do it" (False), though admittedly I don't have a good reason why one would need to force it off. In the updated logic, still_acquiring=False doen't really mean False, but rather "auto derive what to do". I've clarified that in the docstring, and added a log message when it gets overridden.

@@ -186,30 +190,32 @@ def proc_night(night=None, proc_obstypes=None, z_submit_types=None,
elif dry_run_level > 0:
dry_run = True

## What night are we running on?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be set automatically for --daily mode, otherwise we want the night to be a required variable. This should be moved back to under --daily. I also found the log.info() relevant, so I'd prefer that to be restored as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this back under if daily; see commentary on first desi_proc_night comment.

## Set a flag to determine whether to process the last tile in the exposure table
## or not. This is used in daily mode when processing and exiting mid-night.
still_acquiring = False

if still_acquiring is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change still_acquiring to a simple boolean argument then none of this logic is necessary. The old code would only set still_acquiring to True, it never set it to False (except for the initialization which would be removed since it is now an input)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; see commentary on desi_proc_night comment

py/desispec/scripts/proc_night.py Show resolved Hide resolved
py/desispec/test/test_proc_night.py Show resolved Hide resolved
py/desispec/workflow/processing.py Show resolved Hide resolved
py/desispec/workflow/science_selection.py Show resolved Hide resolved
@@ -53,7 +53,8 @@ def proc_night(night=None, proc_obstypes=None, z_submit_types=None,
path_to_data=None, exp_obstypes=None, camword=None,
badcamword=None, badamps=None, exps_to_ignore=None,
sub_wait_time=0.5, verbose=False, dont_require_cals=False,
psf_linking_without_fflat=False):
psf_linking_without_fflat=False,
still_acquiring=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; see commentary on desi_proc_night comment.

py/desispec/scripts/proc_night.py Show resolved Hide resolved
@@ -102,6 +102,10 @@ def parse_args():
help="If set then the code will NOT raise an error "
+ "if asked to link psfnight calibrations "
+ "without fiberflatnight calibrations.")
parser.add_argument("--still-acquiring", action=argparse.BooleanOptionalAction,
help="in --daily mode, assume more data is still coming even if "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify this flag is for testing purposes only

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@sbailey sbailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. I think I have addressed all comments and made updates.

@@ -102,6 +102,10 @@ def parse_args():
help="If set then the code will NOT raise an error "
+ "if asked to link psfnight calibrations "
+ "without fiberflatnight calibrations.")
parser.add_argument("--still-acquiring", action=argparse.BooleanOptionalAction,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this here and in desispec.scripts.proc_night.proc_night. I agree that keeps the --daily options override more self contained. I originally went down this path because at the desi_proc(..., still_acquiring) level I wanted to distinguish between "not set" (None) vs. "don't do it" (False), though admittedly I don't have a good reason why one would need to force it off. In the updated logic, still_acquiring=False doen't really mean False, but rather "auto derive what to do". I've clarified that in the docstring, and added a log message when it gets overridden.

@@ -102,6 +102,10 @@ def parse_args():
help="If set then the code will NOT raise an error "
+ "if asked to link psfnight calibrations "
+ "without fiberflatnight calibrations.")
parser.add_argument("--still-acquiring", action=argparse.BooleanOptionalAction,
help="in --daily mode, assume more data is still coming even if "
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -53,7 +53,8 @@ def proc_night(night=None, proc_obstypes=None, z_submit_types=None,
path_to_data=None, exp_obstypes=None, camword=None,
badcamword=None, badamps=None, exps_to_ignore=None,
sub_wait_time=0.5, verbose=False, dont_require_cals=False,
psf_linking_without_fflat=False):
psf_linking_without_fflat=False,
still_acquiring=None):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; see commentary on desi_proc_night comment.

py/desispec/scripts/proc_night.py Show resolved Hide resolved
@@ -186,30 +190,32 @@ def proc_night(night=None, proc_obstypes=None, z_submit_types=None,
elif dry_run_level > 0:
dry_run = True

## What night are we running on?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this back under if daily; see commentary on first desi_proc_night comment.



etable = load_table(get_exposure_table_pathname(self.dailynight))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

py/desispec/test/test_proc_night.py Show resolved Hide resolved
## if 1sec flat has arrived, cals should be submitted
# for jobdesc in ('ccdcalib', 'arc', 'psfnight', 'flat', 'nightlyflat'):
# self.assertIn(jobdesc, proctable['JOBDESC'])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This highlighted that I had accidentally left a commented test check that I had intended to actually run. The "linkcal OR other jobs" logic is a bit tricky, depending upon exactly what is being linked. I left this test as-is for a normal night. If we want to test daily ops on a night with an override, we should add a specific different test for that case. Though we might never do that in operations — anytime that we don't have cals and would want to do an override, we'd probably be waiting to see if we got cals in the morning anyway.

@@ -61,7 +61,7 @@ def determine_calibrations_to_proc(etable, do_cte_flats=True,
## return nothing so that we swiftly exit and wait for more data.
if np.sum(exptypes == 'dark') >= 1 and np.sum(exptypes == 'arc') >= 5 \
and np.sum(exptypes == 'flat') >= 12 \
and (np.sum(exptypes == 'cteflat') >= 3 or not do_cte_flats
and ((np.sum(is_cte_1s) >= 1) or not do_cte_flats
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. When we added the additional 3s and 10s flats, we purposefully put those before the 1s flat so that it could remain the "trigger" for launching the pipeline. If we added/removed other flats in the future, we should do the same (put them between the 120s fiberflats and the 1s CTE flat).

py/desispec/workflow/processing.py Show resolved Hide resolved
Copy link
Member

@akremin akremin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the extra pass that no longer needs to be there after uncommenting code in the unit test. There are no further code changes, so I'm approving and merging.

@akremin akremin merged commit d8725ac into pipelinerefactor Mar 23, 2024
24 checks passed
@akremin akremin deleted the pipelinerefactor-daily branch March 23, 2024 00:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants