Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default value for not used entries #119

Open
MichaelUM opened this issue Jul 11, 2022 · 6 comments
Open

Default value for not used entries #119

MichaelUM opened this issue Jul 11, 2022 · 6 comments
Assignees

Comments

@MichaelUM
Copy link

Probably this is already mentioned somewhere but I couldn't find it directly in the specs (only in #40, but not integrated).
If required fields are not used they can't be empty and a default / not used value should be set instead.
@dboas @jayd1860 suggested -1 for integers for example for wavelengthIndex=-1 and dataTypeIndex=-1 in preprocessed data. Strings could be handled as empty strings in general.
It would be great if this could be stated in the spec. directly.

@samuelpowell
Copy link
Collaborator

This issue was discussed in a meeting of 17/04/2024 and it was agreed that this is a deficiency in the specification.

Ideally, we would like to specify that in situations where it does not make sense to index into the wavelengths array, that the index field no longer be required. This is likely limited to situations where the data type is one of a chromophore concetration. Implementation TBC.

@samuelpowell
Copy link
Collaborator

I note that when we are in the 'chromophore space' rather than the 'wavelength space', all practical datasets will likely express every channel over the same set of chromophores, and thus a 'chromophore index' (or general spectroscopic index) could be implemented in a future revision which acts in the same way as the wavelength index.

@emiddell
Copy link
Collaborator

Here is the example I brought up in our discussion. It is a file with processed data from Satori. Shown is the measurement list of a data element which contains both wavelength-dependent (RAW, dOD) and chromophore (HbO, HbR, HbT) data. In such a scenario the wavelengthIndex and datatypeIndex fields may be required for some entries but not for all. In the latter case a default / not used value is needed. I guess this is the scenario @MichaelUM had in mind.

In [20]: cedalion.io.snirf.measurement_list_to_dataframe(s.nirs[0].data[0].measurementList, drop_none=True)[::10]
Out[20]: 
     sourceIndex  detectorIndex  wavelengthIndex  dataType dataTypeLabel  dataTypeIndex
0              1              1                1     99999           RAW              1
10             4              2                1     99999           RAW              1
20             6              6                1     99999           RAW              1
30             9             20                1     99999           RAW              1
40            12             14                1     99999           RAW              1
50            15             15                1     99999           RAW              1
60             3              1                2     99999           RAW              1
70             5              7                2     99999           RAW              1
80             8              8                2     99999           RAW              1
90            11             12                2     99999           RAW              1
100           14             13                2     99999           RAW              1
110            1             16                1     99999           dOD             -1
120            4              6                1     99999           dOD             -1
130            7              5                1     99999           dOD             -1
140           10             10                1     99999           dOD             -1
150           13             11                1     99999           dOD             -1
160           16             15                1     99999           dOD             -1
170            3              4                2     99999           dOD             -1
180            6              4                2     99999           dOD             -1
190            9              9                2     99999           dOD             -1
200           12             10                2     99999           dOD             -1
210           14             15                2     99999           dOD             -1
220            2              2               -1     99999           HbO             -1
230            5              3               -1     99999           HbO             -1
240            7              8               -1     99999           HbO             -1
250           11              9               -1     99999           HbO             -1
260           13             22               -1     99999           HbO             -1
270            1              1               -1     99999           HbR             -1
280            4              2               -1     99999           HbR             -1
290            6              6               -1     99999           HbR             -1
300            9             20               -1     99999           HbR             -1
310           12             14               -1     99999           HbR             -1
320           15             15               -1     99999           HbR             -1
330            3              1               -1     99999           HbT             -1
340            5              7               -1     99999           HbT             -1
350            8              8               -1     99999           HbT             -1
360           11             12               -1     99999           HbT             -1
370           14             13               -1     99999           HbT             -1

@MichaelUM
Copy link
Author

Yes @emiddell, I hope this feature was sort of intended in the format. I'm not sure but it allowed us to store all information in one file.

@dboas
Copy link
Collaborator

dboas commented Apr 17, 2024

A solution thus seems to be to just clarify in the spec that a wavelength index is only needed for dataTypes for which a wavelength needs to be specified and that for all other dataTypes, it is not required to specify the wavelength index.

So, we need to update the spec to clarify this... and we need to verify that the validator(s) handle this condition.

@samuelpowell
Copy link
Collaborator

Following meeting of May 24 it was decided that:

  • We should encourage users to use seperate data blocks to store raw, and post-processed data. It was not the intent that these be mixed as individual channels within a certain block.
  • In the future, we will seek to make wavelength required only when it is required by the data type.
  • When a field is present but there is no logical use, e.g., the wavelength index for a chromophore, any value can be present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants