Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review jwst code for step.skip usage #8498

Closed
stscijgbot-jp opened this issue May 21, 2024 · 8 comments · Fixed by #8600
Closed

Review jwst code for step.skip usage #8498

stscijgbot-jp opened this issue May 21, 2024 · 8 comments · Fixed by #8600

Comments

@stscijgbot-jp
Copy link
Collaborator

stscijgbot-jp commented May 21, 2024

Issue JP-3631 was created on JIRA by Melanie Clarke:

Steps defined in stpipe always have a 'skip' attribute; when this is set to True, the step is skipped.  This is available to users as a top-level parameter for each step.

The skip attribute should rarely be set by code within a pipeline step, since it may have unintended consequences.  For example, currently, when the outlier_detection step is run on NIRSpec MOS data, if skip is set for one slit, the step is then skipped for all subsequent slits.  This fix is addressed in the PR for JP-2922.  A similar fix for the tweakreg step is addressed in PR 8476 (https://github.com/spacetelescope/jwst/pull/8476).

We should review remaining steps for similar issues and make sure that the skip attribute is only set by the user or at the pipeline level, not internal to any step.  In particular, it looks like skip is set to True in the core class JwstStep when record_step_status is called with success=False.  This function should probably either be deprecated or updated to avoid setting skip directly.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Melanie Clarke on JIRA:

This issue was noted by Howard Bushouse in review for tweakreg and outlier_detection updates. Howard, please edit or comment to clarify as needed.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

I'm guessing that when record_step_status was originally written it was assumed that it would only be used once per high-level pipeline processing run, as opposed to the use case where a step gets called repetitively within a single pipeline run. The function itself is probably still useful, for convenience, in some instances, but I agree that we should rethink or eliminate setting self.skip=True.

@jdavies-st
Copy link
Collaborator

I can confirm record_step_status() was written to centralize updating the FITS header on files that had steps skipped in a consistent way. Generally it should not set self.skip, as it is not needed. So line

self.skip = True

can be removed. It is unnecessary, but also it should not have any bad effect, as this function is called at the end of a step, and step instances should not be reused. This method updates the equivalent of the PRIMARY header, so it by definition should only be called once per datamodel (FITS file), though it should be able to be called on N times for steps that output N datamodels. self.skip should only have an effect if the step instance is reused by a pipeline, which is a big no-no for the stpipe Step infrastructure.

@stscijgbot-jp
Copy link
Collaborator Author

stscijgbot-jp commented Jun 25, 2024

Comment by Ned Molter on JIRA:

Noting here the places where I find direct setting of self.skip within a step:

  • cube_build_step.py: 2 instances on lines 399, 405
  • tweakreg_step.py: 6 instances remain, even after merging the PR mentioned in the description. Are these benign?

And here the places where I find record_step_status:

  • master_background_step.py: 3 instances of success=False, one with success=True
  • master_background_mos_step.py: 4 instances of success=False, one with success=True
  • pixel_replace.py: 2 instances, both with success=True, so removing that line will not make any difference here

As a random aside, the record_step_status function seems a bit underutilized, and I'm sure there are instances where this could (should?) be used to e.g. set the metadata attributes for all the slits in a MultiSlitModel.  I guess that change would be low-priority and beyond the scope of this PR, but still

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Ned Molter on JIRA:

In addition, the pipeline codes are checking whether the step.skip attribute was set to True after the step had been run for master_background and cube_build.  With the change that removes setting of that attribute within the step, these checks need to be changed to checks of the meta.step.skipped metadata of the result instead.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Melanie Clarke This ticket is current set to RTT, but I'm unclear what testing can be performed or what if any work remains?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Melanie Clarke on JIRA:

Fixed by #8600

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Melanie Clarke on JIRA:

Thanks for pointing this out.  This was more of an internal refactoring, so no additional specific tests should be needed from INS. 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants