Dark monitor threshold and cadence updates #1259
Initial test on the dev server looks good. I commented out all database interactions and the actual analysis code, so the test just queried MAST from launch to May 2023, divided up the files using the new scheme, and put together what would be the new database entries.
Left the monitor running (with neither database interactions nor analysis) on the dev server. See the log file ending in 2023-05-02-17-02.log
Hello @bhilbert4, Thank you for updating! Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated at 2024-02-21 16:50:32 UTC
Two other small tweaks to make:
What do we do about varying numbers of groups? We probably don't want to be creating a mean dark rate image by averaging 10-group integrations with 150-group integrations. The three options for getting around this problem are:
The third option seems like it would be the most difficult to implement.
The second option makes the most sense to me. We don't want JWQL to filter out too much data, even if that data gives noisy results. I think as long as users can get a good idea of the number of groups/ints associated with each data point, then having all the data present would be ok. Also, for NIRCam at least, dark files are taken very infrequently, meaning that the monitor is going to run on one exposure at a time regardless of the number of groups or the integration threshold.
Filtering out all data without enough groups would be the easiest to implement. It would also make sense in terms of ignoring the early buffer flush observations and any short darks that users take in order to look for persistence, etc. But again, unless we set the limit very high, we could still end up with cases where integrations with very different numbers of groups are averaged together to make a mean dark current rate image.
So maybe we should do a combination of the first and second options: filter out exposures with very few groups (e.g. <10), but then keep the integration threshold at one, such that each file is run on its own.
Hey @bhilbert4 thanks for testing this out and contributing all of these updates. Looks like a lot of work went into this. From what I see, everything looks good to me. Thanks for adding documentation and tests!
Make some updates to the thresholds used to trigger a run of the dark monitor. Previously the monitor would only run when N files of a given instrument/aperture were found. This PR updates that to be N integrations (because some teams take dark calibration data with multiple integrations).
There are also some updates to the way the files are grouped, in order to support a case where thresholds have changed enough that the monitor will be re-run across data spanning a broad range in time. The dates of the input files are first examined, and the files are split into groups based on the amount of time between consecutive files. A gap larger than N days is interpreted as an epoch boundary, and the monitor will not run on files that span more than one epoch. The files for each epoch are then further subdivided by the integration threshold described above. Since we cannot split a single file between monitor runs, the threshold value is used as a minimum, except where meeting it would require crossing an epoch boundary.
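The epoch step can be sketched roughly as below. This is a minimal illustration and not the code in this PR: the function name `split_into_epochs`, the `max_gap_days` parameter, and the example dates are all hypothetical, and the sketch assumes each file is represented only by its observation time.

```python
from datetime import datetime, timedelta


def split_into_epochs(file_times, max_gap_days):
    """Group observation times into epochs (hypothetical helper).

    A new epoch starts whenever the gap between two consecutive
    files is larger than max_gap_days.
    """
    if not file_times:
        return []
    ordered = sorted(file_times)
    epochs = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > timedelta(days=max_gap_days):
            # Gap too large: treat it as an epoch boundary
            epochs.append([current])
        else:
            epochs[-1].append(current)
    return epochs


# Made-up dates purely for illustration
example_times = [
    datetime(2023, 1, 1),
    datetime(2023, 1, 2),
    datetime(2023, 3, 1),
    datetime(2023, 3, 3),
]
example_epochs = split_into_epochs(example_times, max_gap_days=10)
```

With these made-up dates and a 10-day gap, the January files and the March files end up in two separate epochs.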
For example, if the threshold value is 15 integrations, and the new files for one epoch have these numbers of integrations:
[1, 3, 1, 5, 10, 5, 7, 10, 5, 5, 2]
Then the code will divide up the files as follows:
[1, 3, 1, 5, 10] = 20 total integrations
[5, 7, 10] = 22 total integrations
[5, 5, 2] = 12 total integrations
Even though the last of the three groups is under the threshold of 15 integrations, the monitor will still run on its 12 integrations, since reaching the threshold would require adding files from another epoch.
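A minimal sketch of the subdivision within one epoch, assuming a simple greedy pass over the files in order. The function name `split_by_integration_threshold` is hypothetical and this is not necessarily how the monitor implements it; it only reproduces the example above.

```python
def split_by_integration_threshold(integration_counts, threshold):
    """Split one epoch's files into batches (hypothetical helper).

    integration_counts holds the number of integrations in each file,
    in time order. Files are never split, so a batch is closed as soon
    as its running total reaches the threshold.
    """
    batches = []
    current = []
    total = 0
    for n_ints in integration_counts:
        current.append(n_ints)
        total += n_ints
        if total >= threshold:
            batches.append(current)
            current = []
            total = 0
    if current:
        # Leftover files at the end of the epoch form a final batch,
        # even though it is below the threshold
        batches.append(current)
    return batches


batches = split_by_integration_threshold(
    [1, 3, 1, 5, 10, 5, 7, 10, 5, 5, 2], threshold=15
)
```

For the example list, `batches` holds three groups with 20, 22, and 12 total integrations.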
Also, for MIRI, the first M integrations of each file are skipped, as the dark current is not stable in the initial M integrations. At the moment we use M=1 for full frame data.
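As a rough illustration of the MIRI case, assuming the per-integration results for one file are held in a plain list: the name `drop_initial_integrations`, the `n_skip` parameter, and the sample values are hypothetical and not taken from the monitor code.

```python
def drop_initial_integrations(per_integration_values, n_skip=1):
    """Discard the first n_skip integrations of a file (hypothetical helper)."""
    return per_integration_values[n_skip:]


# Made-up values: the first integration is treated as unstable and dropped
kept = drop_initial_integrations([0.9, 0.5, 0.5], n_skip=1)
```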