Dark monitor threshold and cadence updates #1259

Merged

bhilbert4

Make some updates to the thresholds used to trigger a run of the dark monitor. Previously the monitor would only run when N files of a given instrument/aperture were found. This PR updates that to be N integrations (because some teams take dark calibration data with multiple integrations).

There are also some updates to the way the files are grouped, to support the case where thresholds have changed enough that the monitor is re-run across data spanning a broad range in time. The dates of the input files are examined first, and the files are split into groups based on the amount of time between consecutive files. A gap larger than N days is interpreted as an epoch boundary, and the monitor will not run on a set of files that spans more than one epoch. The files for each epoch are then further subdivided using the integration threshold described above. Since we cannot split a single file between monitor runs, the threshold value is used as a minimum: files are added to a group until its total number of integrations meets or exceeds the threshold. The exception is when meeting the threshold would require crossing an epoch boundary, in which case the group is closed below the threshold.
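A minimal sketch of the epoch split described above, not the code in this PR. The function name `split_into_epochs` and the parameter `max_gap_days` are hypothetical, and observation times are assumed to be available as `datetime` objects.

```python
from datetime import datetime, timedelta


def split_into_epochs(obs_times, max_gap_days):
    """Split observation times into epochs (hypothetical helper).

    A new epoch starts whenever the gap between two consecutive
    (time-sorted) observations is larger than max_gap_days.
    """
    if not obs_times:
        return []
    ordered = sorted(obs_times)
    max_gap = timedelta(days=max_gap_days)
    epochs = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            # Gap exceeds the limit: treat it as an epoch boundary
            epochs.append([])
        epochs[-1].append(current)
    return epochs


times = [
    datetime(2023, 1, 1),
    datetime(2023, 1, 3),
    datetime(2023, 3, 1),
    datetime(2023, 3, 2),
]
epochs = split_into_epochs(times, max_gap_days=30)
print([len(epoch) for epoch in epochs])  # [2, 2]
```

Each returned epoch would then be subdivided separately, so that no group of files crosses an epoch boundary.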

For example, if the threshold value is 15 integrations, and the new files for one epoch have these numbers of integrations:
[1, 3, 1, 5, 10, 5, 7, 10, 5, 5, 2]
Then the code will divide up the files as such:
[1, 3, 1, 5, 10] = 20 total integrations, [5, 7, 10] = 22 total integrations, [5, 5, 2] = 12 total integrations
Even though the last of the three groups is under the threshold of 15 integrations, the monitor will still run on those 12 integrations, since no more files remain in the epoch.
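The grouping in this example can be reproduced with a simple greedy accumulation. This is an illustrative sketch rather than the implementation in the PR, and the function name `group_by_integration_threshold` is hypothetical.

```python
def group_by_integration_threshold(integration_counts, threshold):
    """Group per-file integration counts within one epoch (hypothetical helper).

    Files are added to the current group until its total number of
    integrations meets or exceeds the threshold. Any files left over
    at the end of the epoch form a final, possibly undersized, group.
    """
    groups = []
    current = []
    total = 0
    for n_ints in integration_counts:
        current.append(n_ints)
        total += n_ints
        if total >= threshold:
            groups.append(current)
            current = []
            total = 0
    if current:
        # Leftover files at the end of the epoch
        groups.append(current)
    return groups


counts = [1, 3, 1, 5, 10, 5, 7, 10, 5, 5, 2]
for group in group_by_integration_threshold(counts, threshold=15):
    print(group, sum(group))
# [1, 3, 1, 5, 10] 20
# [5, 7, 10] 22
# [5, 5, 2] 12
```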

Also, for MIRI, the first M integrations of each file are skipped, as the dark current is not stable in the initial M integrations. At the moment we use M=1 for full frame data.
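As a rough illustration of the MIRI skip, the sketch below returns the integration indices that would remain after dropping the first M. The function name `usable_integration_indices` is hypothetical, and how the skipped integrations count toward the threshold is not specified here.

```python
def usable_integration_indices(n_integrations, n_skip):
    """Return zero-based indices of the integrations kept after
    dropping the first n_skip (hypothetical helper)."""
    first_kept = min(n_skip, n_integrations)
    return list(range(first_kept, n_integrations))


# With M=1, a 5-integration file keeps integrations 1 through 4
print(usable_integration_indices(5, n_skip=1))  # [1, 2, 3, 4]

# A single-integration file would have nothing left after the skip
print(usable_integration_indices(1, n_skip=1))  # []
```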

@bhilbert4 bhilbert4 self-assigned this Apr 25, 2023
@bhilbert4

bhilbert4 commented May 2, 2023

  • For MIRI, we need to ignore the first N integrations in each file

@bhilbert4

Initial test on the dev server looks good. I commented out all database interactions and the actual analysis code. So the test just queried MAST from launch to May 2023, and then divided up the files using the new scheme, and put together what would be the new database entries.

@bhilbert4

Left the monitor running (with no db interactions nor analysis) on the dev server. See log file ending in 2023-05-02-17-02.log

@pep8speaks

pep8speaks commented Dec 19, 2023

Hello @bhilbert4, Thank you for updating !

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated at 2024-02-21 16:50:32 UTC

@bhilbert4

bhilbert4 commented Dec 19, 2023

A few other small tweaks to make:

  • Add a date to the histogram plots indicating when the data used to create them were taken
  • Cut down the list of apertures examined. Keep full frame and just a couple subarrays per instrument?
  • Make sure the dates added to the database are those for the used files, rather than the overall query start and stop
  • Make sure the amp-specific data is being correctly populated/retrieved from the database
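For the database-date item, one way to picture the change is to derive the recorded start and end from the files actually used in a run. This is only a sketch: the function name `entry_date_range` and the example times are hypothetical, and the real database fields are not shown here.

```python
from datetime import datetime


def entry_date_range(used_file_times):
    """Return (start, end) spanning only the files used in a run,
    rather than the overall query window (hypothetical helper)."""
    return min(used_file_times), max(used_file_times)


query_start = datetime(2023, 1, 1)
query_end = datetime(2023, 5, 1)
used = [datetime(2023, 2, 10, 4, 30), datetime(2023, 2, 11, 16, 0)]

start, end = entry_date_range(used)
print(start, end)  # 2023-02-10 04:30:00 2023-02-11 16:00:00
```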

@bhilbert4

What do we do about varying numbers of groups? We probably don't want to be creating a mean dark rate image by averaging 10-group integrations with 150-group integrations. The three options for getting around this problem are:

  1. Filter out and ignore all files with ngroups < some threshold
  2. Keep the integration threshold set to 1, so that each file is run on its own.
  3. Make the sublist generator more complex, such that exposures with very different numbers of groups are kept separate

The third option seems like it would be the most difficult to implement.

The second option makes the most sense to me. We don't want JWQL to filter out too much data, even if that data gives noisy results. I think as long as users can get a good idea of the number of groups/ints associated with each data point, then having all the data present would be ok. Also, for NIRCam at least, dark files are taken very infrequently, meaning that the monitor is going to run on one exposure at a time regardless of the number of groups or integration threshold.

Filtering out all data with not enough groups would be the easiest to implement. And it would make sense in terms of ignoring the early buffer flush observations and any short darks that users take in order to look for persistence, etc. But again, unless we set the limit very high, we could still end up with cases where integrations with very different numbers of groups are averaged together in order to make a mean dark current rate image.

So maybe we should do a combination of the first and second options: filter out exposures with very few groups (e.g. <10), but keep the integration threshold at one, so that each file is run on its own.
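A sketch of that combined approach, under stated assumptions: the per-file dictionaries, the `ngroups` key, the file names, and the function name `select_and_batch` are all hypothetical, and the cutoff of 10 groups is just the example value mentioned above.

```python
MIN_GROUPS = 10  # example cutoff from the discussion; not a confirmed setting


def select_and_batch(files, min_groups=MIN_GROUPS):
    """Drop exposures with too few groups, then put each remaining
    file in its own batch (the effect of an integration threshold of 1)."""
    kept = [f for f in files if f["ngroups"] >= min_groups]
    return [[f] for f in kept]


files = [
    {"name": "dark_a.fits", "ngroups": 5},
    {"name": "dark_b.fits", "ngroups": 150},
    {"name": "dark_c.fits", "ngroups": 10},
]
batches = select_and_batch(files)
print([[f["name"] for f in batch] for batch in batches])
# [['dark_b.fits'], ['dark_c.fits']]
```

Because every batch holds exactly one file, integrations with very different numbers of groups are never averaged together.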

@bhilbert4 bhilbert4 changed the title from "[WIP]: Dark monitor threshold and cadence updates" to "Dark monitor threshold and cadence updates" on Feb 2, 2024

@mfixstsci mfixstsci left a comment

Hey @bhilbert4 thanks for testing this out and contributing all of these updates. Looks like a lot of work went into this. From what I see, everything looks good to me. Thanks for adding documentation and tests!

jwql/tests/test_dark_monitor.py (resolved)
jwql/utils/constants.py (resolved)
@mfixstsci mfixstsci merged commit 7a4d932 into spacetelescope:develop Feb 21, 2024
6 checks passed