Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with data2bids #2277

Open
vlitvak opened this issue Jul 12, 2023 · 11 comments
Open

Issues with data2bids #2277

vlitvak opened this issue Jul 12, 2023 · 11 comments
Assignees
Labels

Comments

@vlitvak
Copy link
Contributor

vlitvak commented Jul 12, 2023

Datasets generated with data2bids do not pass the BIDS validator https://bids-standard.github.io/bids-validator/

I generated an example from FT CTF dataset that reproduces some of them. There was also an error related to fiducials that is not generated here for some reason:

File Path: Invalid JSON file. The file is not formatted according the schema.

Type: Error
File: sub-PhantomPerceptPC_coordsystem.json
Location: bids/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_coordsystem.json
Reason: Invalid JSON file. The file is not formatted according the schema.
Evidence: .HeadCoilCoordinates['nasion'][0] should be number

I managed to fix it by editing the coordsys file to be like:

"HeadCoilCoordinates": {
"nasion": [7.5093, 0.0, 0.0],
"left_ear": [0.0568, 7.45343, 0.0],
"right_ear": [-0.0568, -7.45343, 0.0]
}

(removed square brackets around each coordinate).

Finally, I suggest amending the example code at https://www.fieldtriptoolbox.org/example/bids_meg/ (and maybe elsewhere) to include the specification of mandatory fields without which the validator will through an error. These are:

cfg.meg.PowerLineFrequency
cfg.meg.DewarPosition
cfg.meg.SoftwareFilters
cfg.meg.DigitizedLandmarks
cfg.meg.DigitizedHeadPoints

Alternatively, the code could have defaults for these fields.

conv2bids_ctf_test.zip

@vlitvak
Copy link
Contributor Author

vlitvak commented Jul 12, 2023

The issue with coordsystem file for some reason doesn't happen on the example dataset but does happen on our CTF datasets. Also with our dataset it puts "Other" in "MEGCoordinateSystem" which leads to a problem later when trying to read the data. For MEGIN data it puts "ElektaNeuromag" in this field which leads to the following error:

Error using ft_affinecoordinates (line 271)
converting from elektaneuromag to ras is not supported

Error in ft_convert_coordsys (line 157)
initial = ft_affinecoordinates(object.coordsys, target);

Error in ft_prepare_layout>sens2lay (line 1335)
sens = ft_convert_coordsys(sens, 'ras', 0);

Error in ft_prepare_layout (line 609)
layout = sens2lay(sens, cfg.rotate, cfg.projection, cfg.style, cfg.overlap, cfg.viewpoint, cfg.boxchannel);

This can be fixed by changing the value to "Neuromag"

@robertoostenveld
Copy link
Contributor

There are multiple issues, I'll go over them one by one.

GeneratedBy should be an array of objects in the JSON, and hence a cell-array of structs in the MATLAB cfg structure. That was not the case for the default; I fixed it now. I also added some on-the-fly conversion in case a single string is given instead of an array-of-strings, and if a single struct is given in case of an array-of-objects (or structs).

@robertoostenveld
Copy link
Contributor

Using the tutorial Subject01.ds dataset I also get an error from the validator when electrodes.tsv is present in the meg directory, this is being discussed in bids-standard/bids-specification#1550 and should be fixed in the validator.

@robertoostenveld
Copy link
Contributor

I cannot reproduce the HeadCoilCoordinates error and reading the code I don't see why the formatting would go wrong. If you say that you had to remove additional [] brackets, it suggests that it is a cell-array of scalars, rather than a 1x3 vector. The code is at

coordsystem_json.HeadCoilCoordinates.(['coil' num2str(i)]) = hdr.orig.dig(idxHPI(i)).r';
, could you have a look at that specific line with your data?

@robertoostenveld
Copy link
Contributor

The mapping between BIDS and FieldTrip coordinate systems was incomplete: the older Elekta/Neuromag was supported but the ElektaNeuromag without slash was not. I fixed that.

@robertoostenveld
Copy link
Contributor

you write

These are:

cfg.meg.PowerLineFrequency
cfg.meg.DewarPosition
cfg.meg.SoftwareFilters
cfg.meg.DigitizedLandmarks
cfg.meg.DigitizedHeadPoints

Alternatively, the code could have defaults for these fields.

How could the data2bids code generate valid defaults for these fields? It could specify NaN, which would translate to `n/a' upon writing to the JSON and the validator would probably not complain any more, but I consider that worse than the current implementation.

robertoostenveld added a commit that referenced this issue Jul 14, 2023
- change default for GeneratedBy
- update cfg fields that should be a list-of-strings
- update cfg fields that should be a list-of-objects
@robertoostenveld
Copy link
Contributor

@vlitvak can you try again?

@robertoostenveld
Copy link
Contributor

I just added a commit that fixes a problem in the merging of the GeneratedBy field in dataset_description.json

@vlitvak
Copy link
Contributor Author

vlitvak commented Jul 17, 2023

CTF runs without errors now. For MEGIN I get:

File:		sub-PhantomPerceptPC_electrodes.tsv
Location:		bids/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_electrodes.tsv
Reason:		Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder.
Evidence:	sub-PhantomPerceptPC_electrodes.tsv

======================================================

File Path: Tabular file contains custom columns not described in a data dictionary

Type:		Warning
File:		sub-PhantomPerceptPC_task-02IPGoffNoMov_run-2_events.tsv
Location:		bids/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_task-02IPGoffNoMov_run-2_events.tsv
Reason:		Tabular file contains custom columns not described in a data dictionary
Evidence:	Columns: sample, type, value not defined, please define in: /events.json, /task-02IPGoffNoMov_events.json,/run-2_events.json,/task-02IPGoffNoMov_run-2_events.json,/sub-PhantomPerceptPC/sub-PhantomPerceptPC_events.json,/sub-PhantomPerceptPC/sub-PhantomPerceptPC_task-02IPGoffNoMov_events.json,/sub-PhantomPerceptPC/sub-PhantomPerceptPC_run-2_events.json,/sub-PhantomPerceptPC/sub-PhantomPerceptPC_task-02IPGoffNoMov_run-2_events.json,/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_events.json,/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_task-02IPGoffNoMov_events.json,/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_run-2_events.json,/sub-PhantomPerceptPC/meg/sub-PhantomPerceptPC_task-02IPGoffNoMov_run-2_events.json

======================================================

@vlitvak
Copy link
Contributor Author

vlitvak commented Jul 17, 2023

Regarding mandatory fields, I suggest that you update the example scripts in the wiki because I used your example as a starting point, but those fields were not specified there and I had to figure out from the errors and the code what the problem is.

@robertoostenveld
Copy link
Contributor

The example script should indeed be updated so that its results pass the validator, I made a separate issue for that.

The electrodes.tsv not being allowed for MEG is a known bug in the bids-validator and being worked on in relation to bids-standard/bids-specification#1550 and bids-standard/bids-specification#1555.

Regarding the events.json file and other json files that are allowed to accompany the tsv files (e.g., participants, channels, electrodes), do you have a suggestion as to how to implement those? In general I would expect all scans in a dataset to have the same columns in the tsv files, so having the json file at the top level would be efficient. Perhaps we should add some code around ft_write_tsv in data2bids that detect which xxx.tsv is written, with which columns, and that it ensures the corresponding xxx.json is present and includes all table columns. Most content would still need to be provided by the user though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants