Refactor of name for container of waveforms #608
In this case should :
In case of 2. I think it is more intuitive to have them named |
event.r0.tel[1].adc_samples --> event.r0.tel[1].samples
All of these are waveforms, so they have shape (n_channels, n_pixels, n_samples). The units depend on the low-level R1 calibration each camera applies to them. I'm not sure whether there is a requirement to have the waveforms in p.e. at the DL0 level (I presume they will at least be proportional to p.e., so that the online analysis can use them). |
Ok I did not get your point first. |
I'm happy with the suggested names, though there will need to be some small change to support the fact that in the final DL1 we have only samples for a subset and images for the rest (so we need both DL0.image and DL0.samples). In fact, I'm working on standardizing the definitions for all of CTA. In my current doc I have "image" = a single image for a full camera, and "traces" = the time-evolution of a subset of pixels, but "samples" is perhaps better (any preference or better name?). |
Yes, I think if there is no sample dimension, it should be "image" (for example, we may have the intensity |
I'm a little confused. I didn't think there is a Correct me if I'm wrong, but it should be changed to:
event.r0.tel[1].samples
event.r1.tel[1].samples
event.dl0.tel[1].samples
event.dl1.tel[1].image |
Did you mean to specify only DL0, referring to the fact that some pixels (the important ones) will keep all their samples, while other pixels will be reduced to single samples as a form of data reduction to save storage space? If so, then your suggestion may be appropriate, but it might be useful to have a proper discussion of how to handle this. Some immediate questions I have:
|
Yes, that's only DL0 - for R0, it's much simpler, just as we have now
I think we can assume yes
I think that is basically a sparse array (which SciPy supports via scipy.sparse; NumPy itself does not), or we could just use a masked array (wasting a bit of memory). We can assume that even if the data are stored in a sparser way, they can be "unpacked" into this sort of in-memory structure, for convenience. So I think to answer the question, we can just assume it looks like it does now, but some elements may not be filled in, or may be all zeros. The way I've defined it so far in the doc for DL0 is a bit different, but can be converted to this way of thinking. It is stored as a K_pix (only for pixels with waveforms) by N_samp array, and there is a third array that maps one to the other: it is basically a mask of which pixels have waveforms. So we could also store it that way, which is more compact. Then you'd have
|
I should note in that previous comment, |
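The compact layout described above (a K_pix by N_samp array plus a mask of which pixels have waveforms) can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the array sizes, the names `has_waveform`, `compact`, and `unpack`, and the choice of a masked array for the unpacked form are all hypothetical and are not taken from the DL0 document or from ctapipe.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_pixels, n_samples = 6, 4

# Boolean mask over the full camera: True where a pixel kept its waveform.
has_waveform = np.array([False, True, False, True, True, False])

# Compact storage: one row per pixel that kept a waveform (K_pix x N_samp).
k_pix = int(has_waveform.sum())
compact = np.arange(k_pix * n_samples, dtype=np.float32).reshape(k_pix, n_samples)


def unpack(compact, has_waveform):
    """Expand the compact array into a full-camera masked array."""
    n_samp = compact.shape[1]
    full = np.zeros((has_waveform.size, n_samp), dtype=compact.dtype)
    # Rows selected by the boolean mask receive the stored waveforms, in order.
    full[has_waveform] = compact
    # Mask every sample of the pixels that carry no waveform.
    mask = np.repeat(~has_waveform[:, None], n_samp, axis=1)
    return np.ma.masked_array(full, mask=mask)


waveform = unpack(compact, has_waveform)
```

The unpacked array has the full (n_pixels, n_samples) shape, so code written for the current layout keeps working, while the compact array plus mask is what would need to be stored.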
Another option would be dl0.tel[x].waveform for waveform data? That is maybe less jargon than "samples" (though of course the waveform is sampled at regular intervals, and technically the integrated pixel is also a sample, just over a larger time-base). There doesn't seem to be a good name for a sequence of images in time (other than "movie", but that doesn't sound very scientific). In any case, since we will only have a subset of the image with waveforms. |
I have no preference among samples/traces/waveform. The only input I have is that with ASTRI, it's a bit of a misnomer to describe their data as a waveform or a trace, whereas it is more accurate to describe it as a sample (because, as you stated, "the integrated pixel is also a sample, just over a larger time-base"). |
Ok, I've updated the DL0 interface doc to use "waveform" everywhere where there are sampled data, and "image" where it is integrated. I like that the best so far. In the case of ASTRI, there is only the image, and no waveform data (it's not really a single sample anyhow, since they use the pulse shape and time-over-threshold to predict the integral) |
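As a rough illustration of the "waveform" versus "image" distinction, the sketch below builds a sampled array with the (n_channels, n_pixels, n_samples) shape mentioned earlier in the thread and collapses the sample axis. The sizes and random values are made up, and the plain sum is only a placeholder for whatever integration a camera or pipeline actually applies.

```python
import numpy as np

# Hypothetical dimensions; real cameras differ.
n_channels, n_pixels, n_samples = 2, 5, 8

rng = np.random.default_rng(0)

# "waveform": sampled data, one value per channel, pixel, and time sample.
waveform = rng.integers(0, 100, size=(n_channels, n_pixels, n_samples))

# "image": integrated data with no sample dimension.
# A plain sum over the last axis stands in for the real charge integration.
image = waveform.sum(axis=-1)
```

Here `waveform` has three dimensions and `image` has two, which is the naming rule proposed above: if there is a sample dimension it is a waveform, otherwise it is an image.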
Could you put a link to that document please? |
For R1, it's nearly identical, except there is no |
No, that document is not yet complete or publicly available - I'm working on it now with a few other experts, and there will be a meeting with the project office to validate it in a month or so. The camera groups will need to review and accept it, in particular, before it is official, so it will take some time. I'm just putting this info here, since I think it will be closer to this format than what we have now, and it helps me if we can prototype it as I write the requirements and specs. If we discover something missing, it helps a lot. However, we can't add much more information to DL0.TEL.EVT info without saturating the off-site link. |
@kosack I know it's personal taste, but I'm not a huge fan of unneeded abbreviations or truncations. Why not Just out of curiosity: why quarter nanoseconds? |
Agreed. In fact, this is even explicitly stated in the coding guidelines document (which I wrote, and apparently ignored :-) I'll remove the abbreviations.
Mainly so it fits in a 32-bit integer. Also, we don't need better precision than that, and 250 ps is close to the precision of the underlying time distribution system, so a smaller unit would be useless and a larger one would throw away some information. |
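The 32-bit argument can be checked with a few lines of integer arithmetic. The sketch below assumes, which the thread does not state explicitly, that the field counts ticks within a single second and is stored as an unsigned 32-bit integer; the helper name `ticks_fit_uint32` is hypothetical.

```python
PS_PER_SECOND = 10**12   # picoseconds in one second
UINT32_MAX = 2**32 - 1   # largest unsigned 32-bit value


def ticks_fit_uint32(tick_ps):
    """Return True if one second counted in ticks of `tick_ps` fits in uint32."""
    ticks_per_second = PS_PER_SECOND // tick_ps
    # The largest count needed within one second is ticks_per_second - 1.
    return ticks_per_second - 1 <= UINT32_MAX


quarter_ns_fits = ticks_fit_uint32(250)   # 4e9 ticks per second
tenth_ns_fits = ticks_fit_uint32(100)     # 1e10 ticks per second

# Total time span (in ps) that a uint32 counter of 250 ps ticks can cover.
span_ps = 2**32 * 250
```

Under that assumption, one second of 250 ps ticks (4 × 10^9) stays below 2^32 ≈ 4.29 × 10^9, while 100 ps ticks would not, and the counter spans about 1.07 s.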
- rename - change run_id -> obs_id
* rename containers and fields according to #608
- rename
- change run_id -> obs_id
- adc_sums -> image
- adc_samples and pe_samples -> waveform everywhere
I would like to propose a refactor of the name for the container of waveforms. Currently we have:
event.r0.tel[1].adc_samples
event.r1.tel[1].pe_samples
event.dl0.tel[1].pe_samples
event.dl1.tel[1].image
It is potentially misleading to call them "adc_samples" and "pe_samples", as these names assume units that might not apply. Therefore I would propose renaming them to "samples". This would also make access to waveforms more consistent across data levels.
This would be a huge refactor that would presumably affect every ctapipe user's code. Therefore it should probably be delayed until we plan a major refactor of the code. This could happen at the same time as we remove the "tel" containers for the R0 -> DL1 containers (I presume this is an intended change once we move to single telescopes per file).
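One way to soften the impact of such a rename on user code is to keep the old name as a deprecated alias for a transition period. The sketch below is a generic Python pattern, not ctapipe's implementation: the class `TelescopeDataContainer` is a hypothetical stand-in, and only the field names `adc_samples` and `samples` come from the proposal above.

```python
import warnings


class TelescopeDataContainer:
    """Hypothetical stand-in for a per-telescope container (not ctapipe's class)."""

    def __init__(self, samples=None):
        # New, unit-agnostic name from the proposal.
        self.samples = samples

    @property
    def adc_samples(self):
        # Old name kept as a read alias that warns callers to migrate.
        warnings.warn(
            "'adc_samples' is deprecated; use 'samples' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.samples

    @adc_samples.setter
    def adc_samples(self, value):
        # Writes through the old name also warn and land in the new field.
        warnings.warn(
            "'adc_samples' is deprecated; use 'samples' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.samples = value


# Usage: old code keeps working but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    container = TelescopeDataContainer(samples=[1, 2, 3])
    old_value = container.adc_samples
```

With an alias like this, existing scripts would continue to run while the warnings point users to the new name, so the rename would not have to wait for a larger breaking release.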