New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update raw data model based on spectrograph tests #73
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments inline. High level issue is that the FV tests are writing a per-spectrograph format which is very similar but slightly different to the final multi-spectrograph format. Let's keep this documenting the final multi-spectrograph format and see if we can get ICS to make the single-spectrograph format a simple subset of that instead of a different format.
Number EXTNAME Type Contents | ||
================= ========= ======== ==================================== | ||
HDU0_ SP0 IMAGE Blank except for header keywords | ||
HDU1_ SPECTCON0 BINTABLE Telemetry data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until the FV tests started, I hadn't realized that Klaus has one format for single spectrograph readouts and a different but closely related format for multi-spectrograph readouts. Let's keep this datamodel file focused on what the final multi-spectrograph format should be, and then work with ICS to see if they would be willing to make the single-spectrograph case a simple subset of that (instead of a slightly different format).
Single spectrograph | Multi-spectrograph | pixsims (so far) |
---|---|---|
HDU0 = SP0 .. SP9 | HDU0 = SPS | HDU0 = (no EXTNAME) |
HDU1 = SPECTCON0 .. SPECTCON9 | last HDU = SPECCONS | not in pixsims |
filename sp{n}-*.fits.fz | filename desi-*.fits.fz | filename desi-*.fits.fz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this data model should be focused on the final multi-spectrograph file. But what I have to work with to get all the details of the header keywords are the single spectrograph files. One does what one must...
DATASUM 0 str data unit checksum updated 2019-02-20T21:27:57 | ||
TELRA 335.03 float Telescope pointing RA [degrees] **MISSING** | ||
TELDEC 19.88 float Telescope pointing dec [degrees] **MISSING** | ||
AIRMASS 1.0 float Airmass at middle of exposure **MISSING** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a data model structure for documenting optional header keywords? e.g. TELRA, TELDEC should be present for on-sky FLAVOR=SCIENCE exposures, but doesn't apply to dome flats or test-slit exposures of single spectrographs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me think about that. I think the current verification code treats all non-trivial keywords as mandatory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not only does the verification code treat all non-trivial keywords as mandatory, the order of the keywords also matters, because the easiest way to compare the file to the model is one-to-one. We're pushing the limits of the .rst
format as it is. Let's punt optional keywords to a separate issue.
* The compressed HDUs in the "sp0" files contain ``ZSIMPLE`` keyword. This would | ||
be appropriate in a compressed *primary* HDU but not in a compressed extension. | ||
Make sure that the images are actually compressed *as extensions*, not as | ||
individual images that are then shoved into an HDU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to have notes here, but we'll likely need a different way to converge with Klaus/ICS about issues like this (i.e. don't expect Klaus to be reading the datamodel and responding...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted.
@sbailey, I'd like to get this finished before too much time passes, however, I've made as much progress as I can using the |
Thanks for remembering this dangling update.
Other examples NIGHT/EXPID combos listed at https://desi.lbl.gov/trac/wiki/SpectrographsFunctionalVerification/SP2/DataSummary . Please update the datamodel to match that file, while still keeping the MISSING tags for keywords we were expecting. For datamodel validation, ok to punt the concept of optional keywords to a future update. We can also punt the order of the keywords issue to a future update (though I really would like to support differently ordered keywords unless there is a compelling data reason why the keywords must be in order, not just validation convenience). |
OK, thanx. When you say that spectrograph HDUs can be in any order, is that going to be a permanent thing, or just for these tests? I anticipate that validating HDUs in an arbitrary order might be somewhat more difficult than validating keywords in arbitrary order within one HDU. |
Spectrograph HDUs in any order is a permanent feature, which is one of the reasons why people should access HDUs by name rather than number. For the validator, it means we need to treat these like comparing two dictionaries (with arbitrary order keys, even if they are the same set) instead of two lists (with a guaranteed order). This is also true for offline pipeline results: we only have one serial writer per file so they happen to always be in the same order, but we don't guarantee the order and reserve the right to change that order or add an extra debugging/optional HDU and it shouldn't break people's code if they are using names instead of numbers to access HDUs. The raw data is a particularly tricky case for an automated validator, since For the purposes of this PR, let's just get the datamodel to match the files as-written by ICS, while flagging any keywords that we were expecting but don't see. Thanks. |
OK. IF we can really enforce that every HDU (OK, not necessarily the first) has an EXTNAME, then that would make the problem simpler to solve. |
See issue #69 for EXTNAME policy. HDU > 1 shall have an EXTNAME, and HDU 0 is a special case depending upon whether it has meaningful data and/or header keywords. Klaus is also a fan of using EXTNAME instead of number, so I think we can rely upon that for both ICS-generated data and pipeline-generated data. |
Yes, I know about the policy, but what progress have we made throughout the entire DESI code base, not just the raw data, on enforcing the policy? |
I think we have nearly 100% compliance for the "shall" clauses, and decent compliance with the "should" clauses for HDU 0. The only exceptions to the "shall" clauses that I know about are some desimodel data files with a single binary table HDU 1 (like footprint/desi-tiles.fits). I.e. I'm fine with the datamodel validator complaining about cases that don't match the policy and then us fixing any cases we find (e.g. adding an EXTNAME when we update the tiles file with Eddie's new dither pattern). In fact I thought the validator was already doing that after PR #39 addressing issue #36. |
Yes, the validator is already doing that, but if we are going to discard the idea that HDUs are in a particular order, then EXTNAME is an absolute necessity. Just like comparing header keywords within an HDU by dictionary keys, we would also need to compare entire HDUs by dictionary keys, and EXTNAME is that key. |
too busy producing data 😊
I just caught the tail end of this so this comment might be no longer relevant but yes – the ICS writes named extensions
Klaus
From: Stephen Bailey <notifications@github.com>
Reply-To: desihub/desidatamodel <reply@reply.github.com>
Date: Wednesday, April 3, 2019 at 10:30 AM
To: desihub/desidatamodel <desidatamodel@noreply.github.com>
Cc: Klaus Honscheid <kh@physics.osu.edu>, Review requested <review_requested@noreply.github.com>
Subject: Re: [desihub/desidatamodel] Update raw data model based on spectrograph tests (#73)
See issue #69<#69> for EXTNAME policy. HDU > 1 shall have an EXTNAME, and HDU 0 is a special case depending upon whether it has meaningful data and/or header keywords. Klaus is also a fan of using EXTNAME instead of number, so I think we can rely upon that for both ICS-generated data and pipeline-generated data.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub<#73 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AGHX-5uomkCqf8cwnoY-FAN-2TaRNwyXks5vdOVBgaJpZM4bSnR7>.
|
@daqiii, I think we're good in terms of the raw data. It's the files used and/or produced in other parts of the pipeline that I'm concerned about. And I think we're mostly OK there as well, but this has to apply to all DESI data files, so I have to ask. |
I have updated the desi-EXPID.rst file as best as I can given |
@sbailey, should I leave this open until all the problems are addressed? |
Rather than try to describe in English my suggested changes to this branch before merging, I opened PR #75 with a specific suggestion to document all cameras. I recognize that this might cause the validator to generate a bunch of warnings about missing HDUs, but ok. The previous example file was We don't need to resolve full validator flexibility before merging this PR, but I would like to have at least:
|
suggested updates for documenting all cameras
This PR still needs some work, and very careful review. Do not merge yet!
In addition to updating the data model for raw data files, this PR:
There are several open questions about the data model that are written directly in the desi-EXPID.rst file.