Merge branch 'bids-standard:master' into DeIdentificationMethod

bids-standard · Apr 23, 2024 · 4fab2f4
2 parents 1589e22 + 01025da
commit 4fab2f4

Showing 28 changed files with 1,016 additions and 77 deletions.
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -43,6 +43,12 @@ repos:
       - id: prettier
         entry: env PRETTIER_LEGACY_CLI=1 prettier # temporary fix for https://github.com/prettier/prettier/issues/15742
         files: src/schema/.*/.*\.yaml
+  - repo: https://github.com/adrienverge/yamllint
+    rev: v1.35.1
+    hooks:
+      - id: yamllint
+        args: [-f=standard, -c=.yamllint.yml]
+        files: src/schema/.*/.*\.yaml
   - repo: https://github.com/codespell-project/codespell
     rev: v2.2.6
     hooks:
@@ -62,5 +68,6 @@ repos:
           - pytest
           - types-PyYAML
           - types-tabulate
+          - types-jsonschema
         args: ["tools/schemacode/bidsschematools"]
         pass_filenames: false
diff --git a/pdf_build_src/process_markdowns.py b/pdf_build_src/process_markdowns.py
@@ -559,7 +559,7 @@ def correct_tables(root_path, debug=False):
                             for i, new_line in enumerate(content):
                                 if i == start_line:
                                     new_content.pop()
-                                if i >= start_line and i < end_line:
+                                if start_line <= i < end_line:
                                     new_content.append("|".join(table[count]) + " \n")
                                     count += 1
                                 elif i == end_line:
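
Note on the hunk above: the change to `correct_tables` replaces two comparisons joined by `and` with a single chained comparison. The sketch below illustrates that the two forms select the same indices, so the change should be behavior-preserving. The helper names `in_range_old` and `in_range_new` and the sample bounds are hypothetical and are not part of `process_markdowns.py`.

```python
def in_range_old(i, start_line, end_line):
    # Form removed by the hunk: two comparisons joined with `and`.
    return i >= start_line and i < end_line


def in_range_new(i, start_line, end_line):
    # Form added by the hunk: one chained comparison.
    # Python evaluates this as (start_line <= i) and (i < end_line).
    return start_line <= i < end_line


# Hypothetical bounds, chosen only for illustration.
start_line, end_line = 3, 6

# Indices accepted by each form over a small range.
selected_old = [i for i in range(10) if in_range_old(i, start_line, end_line)]
selected_new = [i for i in range(10) if in_range_new(i, start_line, end_line)]

print(selected_old == selected_new)
```

Both forms treat the interval as half-open: `start_line` is included and `end_line` is excluded, which matches the separate `elif i == end_line` branch in the surrounding code.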

diff --git a/src/appendices/entity-table.md b/src/appendices/entity-table.md
@@ -1,3 +1,8 @@
+---
+hide:
+-   toc
+---
+
 # Entity table
 
 This section compiles the entities (key-value pairs within filenames)

diff --git a/src/appendices/qmri.md b/src/appendices/qmri.md
@@ -211,7 +211,7 @@ but also by which metadata fields are provided in accompanying json files.
 |----------------------|------------------------------------------------------------------------------------------------------------------------------|----------------------------|
 | VFA                  | `FlipAngle`, `PulseSequenceType`, `RepetitionTimeExcitation`                                                                 | `SpoilingRFPhaseIncrement` |
 | IRT1                 | `InversionTime`                                                                                                              |                            |
-| MP2RAGE<sup>\*</sup> | `FlipAngle`, `InversionTime`, `RepetitionTimeExcitation`, `RepetitionTimePreperation`, `NumberShots`,`MagneticFieldStrength` | `EchoTime`                 |
+| MP2RAGE<sup>\*</sup> | `FlipAngle`, `InversionTime`, `RepetitionTimeExcitation`, `RepetitionTimePreparation`, `NumberShots`,`MagneticFieldStrength` | `EchoTime`                 |
 | MESE                 | `EchoTime`                                                                                                                   |                            |
 | MEGRE                | `EchoTime`                                                                                                                   |                            |
 | MTR                  | `MTState`                                                                                                                    |                            |

diff --git a/src/common-principles.md b/src/common-principles.md
@@ -309,6 +309,28 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 }
 ```
 
+!!! danger "Caution"
+
+    Sharing source data may help amend errors and missing data discovered
+    only with the reuse of the raw dataset in practice.
+    Therefore, from an Open Science perspective, it is RECOMMENDED to share
+    the source data whenever it is possible.
+
+    However, more stringent sharing limitations may apply to the source data
+    than those applicable to the raw data.
+    For example, human data almost always requires deidentification
+    before they can be redistributed,
+    or the subjects' consent form did not explicitly state that the source files
+    would be shared after deidentification.
+    Further examples in which sharing source data may not be possible
+    include original data formats that are not redistributable
+    as per the acquisition device's license.
+
+    As for raw data, all regulatory, ethical, and legal aspects SHOULD
+    be carefully considered before sharing data
+    through the `sourcedata/` directory mechanism.
+    In the case of source data, these aspects are likely more stringent.
+
 ### Storage of derived datasets
 
 Derivatives can be stored/distributed in two ways:
@@ -563,7 +585,7 @@ like in the example below.
 ### Compressed tabular files
 
 Large tabular information, such as physiological recordings, MUST be stored with
-[compressed tab-delineated (TSV.GZ) files](glossary.md#tsvgz-extensions) when
+[compressed tab-delineated (TSV.GZ) files](glossary.md#tsv_gz-extensions) when
 so established by the specifications.
 Rules for formatting plain-text tabular files apply to TSVGZ files with three exceptions:
 

diff --git a/src/derivatives/common-data-types.md b/src/derivatives/common-data-types.md
@@ -256,24 +256,33 @@ static volume, a `RepetitionTime` property would no longer be relevant).
 
 ## descriptions.tsv
 
+Template:
+
+```Text
+[sub-<label>/]
+    [ses-<label>/]
+        [sub-<label>_][ses-<label>_]descriptions.tsv
+        [sub-<label>_][ses-<label>_]descriptions.json
+```
+
+Optional: Yes
+
 To keep a record of processing steps applied to the data, a `descriptions.tsv` file MAY be used.
-The `descriptions.tsv` file MUST contain at least the following two columns:
+The `descriptions.tsv` file consists of one row for each unique `desc-<label>`
+entity used in the dataset and a set of REQUIRED and OPTIONAL columns:
 
--   `desc_id`
--   `description`
+<!-- This block generates a columns table.
+The definitions of these fields can be found in
+  src/schema/rules/tabular_data/*.yaml
+and a guide for using macros can be found at
+ https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
+-->
+{{ MACROS___make_columns_table("derivatives.common_derivatives.Descriptions") }}
 
 This file MAY be located at the root of the derivative dataset,
 or at the subject or session level
 ([Inheritance Principle](../common-principles.md#the-inheritance-principle)).
 
-The `desc_id` column contains the labels used with the [`desc entity`](../appendices/entities.md#desc),
-within the particular nesting that the `descriptions.tsv` file is placed.
-For example, if the `descriptions.tsv` file is placed at the root of the derivative dataset,
-its `desc_id` column SHOULD contain all labels of the [`desc entity`](../appendices/entities.md#desc)
-used across the entire derivative dataset.
-
-The `description` column contains human-readable descriptions of the processing steps.
-
 The use of `descriptions.tsv` files together with the [`desc entity`](../appendices/entities.md#desc)
 are helpful to document how files are generated, even if their use may not be sufficient
 to provide full computational reproducibility.