Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add Glossary of terms/abbreviations used in the specification #152

Merged
merged 10 commits into from
May 23, 2020

Conversation

yarikoptic
Copy link
Collaborator

@yarikoptic yarikoptic commented Feb 15, 2019

Just an idea which came up while fighting over the suffix in PyBIDS: bids-standard/pybids#380

BTW -- does anyone know about a proper way to define glossary and refer to the terms in this markdown/mkdocs, like e.g. :term: in sphinx?

@chrisgorgo
Copy link
Contributor

Thank you for sending this over. How do you see the proposed glossary overlapping with Definitions outlined in Common Principles - https://github.com/bids-standard/bids-specification/blob/master/src/02-common-principles.md#definitions ? Perhaps it's worth extending that section instead of creating a new one. Curious about your thoughts.

@yarikoptic
Copy link
Collaborator Author

Great question, and shame on me that I forgot about them being listed there. I would be happy to contribute to it instead. What is left is file extension, <index>, <label>, suffix. They sound a bit too "down to the earth/detail" in comparison to the other more prominent high level definitions. But I think they would be appropriate since they are used through the spec. Just gimme the blessing Father! ;)

May be that is a part of the problem, that it is somewhat buried within the common principles instead of separate section? I have no strong opinion on this one

@chrisgorgo
Copy link
Contributor

What if the compromise would be to have a dedicated "definitions/glossary" section early on in the spec (instead of the appendix)?

@KirstieJane
Copy link
Member

Little +1 on @chrisfilo’s idea - I think the definitions are really useful and putting them in the appendix makes them harder to find for new folks.

@sappelhoff sappelhoff changed the title ENH: Add Glossary of terms/abbreviations used in the specification [ENH] Add Glossary of terms/abbreviations used in the specification Apr 12, 2019
@yarikoptic
Copy link
Collaborator Author

Mental buzz to myself to get back to it to refurbish to come earlier. Kept running into absence of clear description for the <index> and <label> and common principle is indeed the place them to be listed. If someone beats me to do it -- I would only appreciate

indentation, e.g. it is `01` in `run-01` following `run-<index>` specification
- **<label>**:
an alphanumeric value, possibly prefixed with arbitrary number of 0s for consistent
indentation, e.g. it is `rest` in `task-rest` following `task-<label>` specification
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

me whining: I am so desperately waiting for - or + to become allowed! in case of derivatives _desc it is quite hard to provide it sometimes without any way to punctuate items

@yarikoptic
Copy link
Collaborator Author

ok, pushed (rebased) updates placing <index> and <label> into Definitions and then rewording/shortening section describing loose term "labels". I also added some formatting to make Definitions IMHO easier to parse/link to each other

Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yarikoptic ... just a small comment, otherwise LGTM


1. **File extension** - a portion of the the file name after the left-most
period (`.`) preceded by any other alphanumeric (so `.gitignore` does not have a
suffix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean the right-most period? Also, the .gitignore example is not clear to me: Why is the suffix of relevance when describing what a file extension is?

Copy link
Collaborator

@effigies effigies May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left-most is correct. Following the right-most rule for X.nii.gz would get you .gz, not .nii.gz.

By this definition, .func.gii, .surf.gii and .dtseries.nii are extensions, which I think is appropriate, but we should make sure we're on the same page here, as these are part of derivatives.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re .gitignore -- just an example where there is a left-most period but the gitignore is not an extension because is not preceded by anything. May be I should replace it with .bidsignore, which is also not a part of the spec but more relevant here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@effigies you bring up the point either period itself is a part of file extension, i.e. either it is .nii.gz or nii.gz?!

  • in my wording above it is without ("... after ...") so might need to be fixed.
  • Python seems to also retain period in splitext:
$> python -c 'from os.path import splitext; print(splitext("bla.nii.gz"))'
('bla.nii', '.gz')
  • text in bids-specification ATM seems to list them with period (IMHO makes it easier to parse when period is there)
  • pybids seems to ask for them without period (might be my influence attn: @tyarkoni :
(git)hopa:~/proj/bids/pybids[sqlalchemy]git
$> git grep extension.*nii
bids/layout/README.md:>>> files = layout.get(subject='0[12]', run=1, extension='.nii.gz')
bids/layout/README.md:In the above snippet, we retrieve all files with subject id 1 or 2 and run id 1 (notice that any entity defined in the config file can be used a filtering argument), and with a file extension of .nii.gz. The returned result is a list of named tuples, one per file, allowing direct access to the defined entities as attributes.
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/dwi/sub-{subject}[_ses-{session}][_acq-{acquisition}]_{suffix<dwi>}{extension<bval|bvec|json|nii\\.gz|nii>|nii\\.gz}",
bids/layout/index.py:                    extension=['nii', 'nii.gz'], suffix='bold',
bids/layout/index.py:                    extension=['nii', 'nii.gz'], suffix='dwi',
bids/layout/layout.py:                         extension=['nii.gz', 'nii'])
bids/layout/layout.py:        images = self.get(extension=['nii', 'nii.gz'], scope=scope,
bids/layout/tests/test_layout.py:              'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'desc': 'bleargh', 'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'desc': 'bleargh', 'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:                            acquisition='fullbrain', extension='nii.gz')[0]
bids/reports/parsing.py:            iff_file = [f for f in layout.get(extension='nii.gz') if fn in f.path][0]
bids/reports/parsing.py:                    echos = layout.get_echoes(subject=subj, extension='nii.gz',
bids/reports/parsing.py:                                                     extension='nii.gz',
bids/reports/report.py:            niftis = self.layout.get(subject=subject, extension='nii.gz',
bids/reports/tests/test_parsing.py:    niftis = testlayout.get(subject=subj, extension='nii.gz')
bids/variables/io.py:    images = layout.get(return_type='object', extension='nii.gz',
doc/layout/index.rst:    >>> f = layout.get(task='nback', run=1, extension='nii.gz')[0].filename
doc/layout/index.rst:    >>> f = layout.get(task='nback', run=1, extension='nii.gz')[0].filename
examples/pybids tutorial.ipynb:    "layout.get(subject='01', extension='nii.gz', suffix='bold', return_type='filename')"
examples/pybids tutorial.ipynb:       "{'subject': '01', 'run': 1, 'suffix': 'T2w', 'extension': 'nii.gz'}"
1 11544.....................................:Fri 10 May 2019 08:53:36 AM EDT:.
(git)hopa:~/proj/bids/pybids[sqlalchemy]git
$> git describe
0.7.1-256-g52c6a7f

so what should it be? ;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the example should be "so sub-01_task-resting_bold.nii.gz has the suffix .nii.gz"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suffix should be bold. Extension nii.gz in current wording.

NB it has been awhile for this issue/PR, I might be loosing grip on what should be what now ;)

@effigies
Copy link
Collaborator

@yarikoptic re dots, I don't really care, and I wasn't trying to stake a position here. I can work with either, and I think any good tool will probably have to have a special case to check for missing/spurious initial dots.

@yarikoptic
Copy link
Collaborator Author

@yarikoptic re dots, I don't really care, and I wasn't trying to stake a position here. I can work with either, and I think any good tool will probably have to have a special case to check for missing/spurious initial dots.

yeah, but IMHO consistency between/within BIDS and terms used by "BIDS tools" is important, that is why I wonder myself now what should it be, even though it might sound like a superficial concern. But I am ok to postpone its further discussion until later/separate issue, or even remove "File extension" from this PR. Please advise.

@sappelhoff
Copy link
Member

mmh 😕 ... both with or without a dot sounds fine. Just from my intuition, perhaps we can go with a dot in our definition here? It's probably better to make a strong definition (as opposed to "do whatever you want"), because it will make it easier for downstream tools to work with it.

In the spec, we are using extensions WITH a dot in several places:

@tyarkoni
Copy link

tyarkoni commented May 11, 2019

I'd rather make a decision on this now, because PyBIDS 0.9 (to be released imminently) already changes extensions to extension, so this is the optimal time to introduce a (possible) change to the pattern. My current plan was to switch from including the period to not including it, but maintain backward compatibility for a couple of version releases. But it would be bad to ask people to switch and then revert again later, so we should probably decide sooner rather than later.

Alternatively, I could say nothing about it and handle both scenarios, but that seems suboptimal for the reason @yarikoptic articulated (and even if we continue to handle both cases gracefully, there should probably be an official statement one way or the other that we can appeal to if needed).

@sappelhoff
Copy link
Member

Having talked about definitions of label and index in the past weeks, I think this PR would be nice to get into the spec because it clarifies several parts.

Do you see any blockers @franklin-feingold @effigies?

Especially with regards to pybids?

Copy link
Collaborator

@franklin-feingold franklin-feingold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me! this should help clarify this ambiguity

@yarikoptic
Copy link
Collaborator Author

woohoo -- we got an approval! BTW since the last time we discussed extension, pybids now seems consistently not include that leading .

so current description would stand:
$> git grep extension.*nii
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<anat>|anat}/sub-{subject}[_ses-{session}][_acq-{acquisition}][_ce-{ceagent}][_rec-{reconstruction}]_{suffix<T1w|T2w|T1rho|T1map|T2map|T2star|FLAIR|FLASH|PDmap|PD|PDT2|inplaneT[12]|angio>}.{extension<nii|nii.gz|json>|nii.gz}",
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<anat>|anat}/sub-{subject}[_ses-{session}][_acq-{acquisition}][_ce-{ceagent}][_rec-{reconstruction}][_mod-{modality}]_{suffix<defacemask>}.{extension<nii|nii.gz|json>|nii.gz}",
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<func>|func}/sub-{subject}[_ses-{session}]_task-{task}[_acq-{acquisition}][_ce-{ceagent}][_dir-{direction}][_rec-{reconstruction}][_run-{run}][_echo-{echo}]_{suffix<bold|cbv|phase|sbref>}.{extension<nii|nii.gz|json>|nii.gz}",
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<dwi>|dwi}/sub-{subject}[_ses-{session}][_acq-{acquisition}]_{suffix<dwi>}.{extension<bval|bvec|json|nii.gz|nii>|nii.gz}",
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<fmap>|fmap}/sub-{subject}[_ses-{session}][_acq-{acquisition}][_dir-{direction}][_run-{run}]_{fmap<phasediff|magnitude[12]|phase[12]|fieldmap>}.{extension<nii|nii.gz|json>|nii.gz}",
bids/layout/config/bids.json:        "sub-{subject}[/ses-{session}]/{datatype<fmap>|fmap}/sub-{subject}[_ses-{session}][_acq-{acquisition}][_ce-{ceagent}]_dir-{direction}[_run-{run}]_{fmap<epi>}.{extension<nii|nii.gz|json>|nii.gz}",
bids/layout/index.py:                    extension=['nii', 'nii.gz'], suffix='bold',
bids/layout/index.py:                    extension=['nii', 'nii.gz'], suffix='dwi',
bids/layout/layout.py:                         extension=['nii.gz', 'nii'])
bids/layout/layout.py:        images = self.get(extension=['nii', 'nii.gz'], scope=scope,
bids/layout/tests/test_layout.py:        (False, {'task': 'rest', 'extension': ['nii.gz']}, 3.0),
bids/layout/tests/test_layout.py:        (False, {'task': 'rest', 'extension': 'nii.gz'}, 3.0),
bids/layout/tests/test_layout.py:        (False, {'task': 'rest', 'extension': ['nii.gz', 'json'], 'return_type': 'file'}, 3.0),
bids/layout/tests/test_layout.py:    sample_file = layout.get(task='rest', extension='nii.gz',
bids/layout/tests/test_layout.py:              'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'desc': 'bleargh', 'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:              'desc': 'bleargh', 'extension': 'nii.gz'}
bids/layout/tests/test_layout.py:                            acquisition='fullbrain', extension='nii.gz')[0]
bids/layout/tests/test_layout.py:                    acquisition='fron.al', extension='nii.gz',
bids/layout/tests/test_layout.py:                    acquisition='fron.al', extension='nii.gz',
bids/layout/tests/test_layout.py:                    acquisition='fullbrain', extension='nii.gz',
bids/layout/tests/test_writing.py:        assert _PATTERN_FIND.findall('{extension<nii|nii.gz|json>|nii.gz}') == [
bids/layout/tests/test_writing.py:            ('{extension<nii|nii.gz|json>|nii.gz}', 'extension', 'nii|nii.gz|json', 'nii.gz')
bids/layout/tests/test_writing.py:{extension<nii|nii.gz|json>|nii.gz}"""
bids/layout/tests/test_writing.py:            ('{extension<nii|nii.gz|json>|nii.gz}', 'extension', 'nii|nii.gz|json', 'nii.gz'),
bids/layout/tests/test_writing.py:            assert build_path({'session': 'B', 'run': 3, 'extension': '.nii'},
bids/layout/tests/test_writing.py:        assert build_path({'session': 'B', 'run': 3, 'extension': '.nii'},
bids/layout/tests/test_writing.py:            "[_acq-{acquisition}]_{suffix|dwi}.{extension<bval|bvec|json|nii.gz|nii>|nii.gz}"
bids/layout/tests/test_writing.py:        pats = ['ses-{session<A|B|C>|D}/r-{run}.{extension<json|nii|nii.gz>|nii.gz}']
bids/layout/tests/test_writing.py:                'extension': ['.nii.gz', 'json']
bids/layout/writing.py:    ...     'extension': 'nii',
bids/layout/writing.py:    ...             'inplaneT[12]|angio>}.{extension<nii|nii.gz|json>|nii.gz}',
bids/layout/writing.py:    ...             '{extension<nii|nii.gz|json>|nii.gz}']
bids/layout/writing.py:    ...     "[_acq-{acquisition}]_{suffix|dwi}.{extension<bval|bvec|json|nii.gz|nii>|nii.gz}"
bids/reports/parsing.py:            iff_file = [f for f in layout.get(extension=[".nii", ".nii.gz"]) if fn in f.path][0]
bids/reports/parsing.py:                    echos = layout.get_echoes(subject=subj, extension=[".nii", ".nii.gz"],
bids/reports/parsing.py:                                                     extension=[".nii", ".nii.gz"],
bids/reports/report.py:        >>> files = layout.get(session='01', extension=['nii.gz', 'nii'])
bids/reports/report.py:                subject=subject, extension=[".nii", ".nii.gz"],
bids/reports/tests/test_parsing.py:    niftis = testlayout.get(subject=subj, extension=[".nii", ".nii.gz"])
bids/reports/tests/test_report.py:    files = testlayout.get(extension=['nii.gz', 'nii'])
bids/variables/io.py:    images = layout.get(return_type='object', extension='nii.gz',
doc/layout/index.rst:    >>> f = layout.get(task='nback', run=1, extension='nii.gz')[0].filename
doc/layout/index.rst:    >>> f = layout.get(task='nback', run=1, extension='nii.gz')[0].filename
examples/pybids_tutorial.ipynb:    "layout.get(subject='01', extension='nii.gz', suffix='bold', return_type='filename')"
examples/pybids_tutorial.ipynb:       "{'subject': '01', 'run': 1, 'suffix': 'T2w', 'extension': 'nii.gz'}"

Let me know merge master and resolve the conflicts and push...

* origin/master: (404 commits)
  [DOC] Auto-generate changelog entry for PR bids-standard#460
  Apply suggestions from code review
  label -> index
  drop _part-, introduce _split-
  [DOC] Auto-generate changelog entry for PR bids-standard#459
  [DOC] Auto-generate changelog entry for PR bids-standard#465
  fix table
  Update src/99-appendices/06-meg-file-formats.md
  [DOC] Auto-generate changelog entry for PR bids-standard#441
  inject _part into MEG spec
  update entity table
  FIX: clarify _part
  Apply suggestions from code review
  FIX: clarify participants tsv
  [DOC] Auto-generate changelog entry for PR bids-standard#457
  Update Release_Protocol.md
  add pdf steps for release protocol
  FIX: remove BESA from list of restricted keywords
  Remove trailing space
  Add reference to PDF on front page of specification
  ...

Conflicts:
	src/02-common-principles.md - had to meld with my previous wording etc.
@yarikoptic
Copy link
Collaborator Author

uff... merge conflict was a bit painful, so probably a nice look through the diff would not hurt.

Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some suggestions, but overall I'd be happy to get this in soon. Thank for rebasing @yarikoptic

src/02-common-principles.md Outdated Show resolved Hide resolved
src/02-common-principles.md Outdated Show resolved Hide resolved
src/02-common-principles.md Outdated Show resolved Hide resolved
Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
This would
- match current use of "extension" in BIDS text
- match use within bids-validator messages
- stay inline with filename splitting utilities available in Python and Matlab

Note: it is different from PyBIDS which ATM does not require leading
'.' for `extension` specification in the queries.

Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
@yarikoptic
Copy link
Collaborator Author

and ready for another round of review/recommendations and/or merge ;)

Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/bids-standard/bids-specification/pull/152/files#r429333329 is my last comment, other than that I approve.

But before we can merge, I'd like @effigies to have a look.

@sappelhoff sappelhoff requested a review from effigies May 22, 2020 16:38
…uniqueness

Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
sappelhoff
sappelhoff previously approved these changes May 22, 2020
* origin/master:
  [DOC] Auto-generate changelog entry for PR bids-standard#477
  Also ignore users urls on github
  Quote regexp in command line
  [INFRA] linkchecker - ignore github pull and tree URLs
@yarikoptic
Copy link
Collaborator Author

ok, with a merge of recent master linkchecker is quiet, it is all green and just awaits another review from @effigies ;)

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Thanks for doing this @yarikoptic!

I think the wording around suffix can be made a little clearer (see suggestion).

Also notifying @tyarkoni that here extension is defined to include the initial .. PyBIDS currently transparently parses either, but when queried, returns without the ..

src/02-common-principles.md Outdated Show resolved Hide resolved
Co-authored-by: Chris Markiewicz <effigies@gmail.com>
@yarikoptic yarikoptic dismissed stale reviews from sappelhoff via a360532 May 22, 2020 20:52
@yarikoptic
Copy link
Collaborator Author

Now that accepted suggestion from @effigies , pr needs to be reapproved @sappelhoff

@yarikoptic
Copy link
Collaborator Author

Re pybids, filed bids-standard/pybids#618

@yarikoptic yarikoptic merged commit ed040bc into bids-standard:master May 23, 2020
satra added a commit to satra/bids-specification that referenced this pull request May 23, 2020
* upstream/master: (113 commits)
  [DOC] Auto-generate changelog entry for PR bids-standard#152
  [DOC] Auto-generate changelog entry for PR bids-standard#467
  Specify that suffix must be alphanumeric
  ENH: make NOT RECOMMENDED stronger (SHOULD NOT) for zero padding for uniqueness
  ENH: Include leading . within definition of the file extension
  ENH: provide an example for a suffix based on an _eeg.vhdr filename
  [DOC] Auto-generate changelog entry for PR bids-standard#477
  [DOC] Auto-generate changelog entry for PR bids-standard#460
  Also ignore users urls on github
  Quote regexp in command line
  [INFRA] linkchecker - ignore github pull and tree URLs
  Apply suggestions from code review
  replace purview with scope
  label -> index
  Apply suggestions from code review
  drop _part-, introduce _split-
  Apply SA feedback and amended to purview
  [DOC] Auto-generate changelog entry for PR bids-standard#459
  Add Domain Expert to Maintainers Group
  [DOC] Auto-generate changelog entry for PR bids-standard#465
  ...
@yarikoptic yarikoptic deleted the enh-glossary branch May 27, 2020 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants