Merge pull request #560 from hed-standard/develop
Added descriptions of the tsv files and fixed links
Other corrections and improvements to tests.
VisLab committed Jan 4, 2024
2 parents 17a8a2f + c5c5485 commit 0151c06
Showing 14 changed files with 435 additions and 195 deletions.
2 changes: 1 addition & 1 deletion docs/source/01_Introduction.md
@@ -218,7 +218,7 @@ rules for library schema creation.
[**Appendix A: Schema format**](Appendix_A.md) provides a reference manual for the HED schema format rules, and
[**Appendix B: HED errors**](Appendix_B.md) gives a complete listing of HED error codes and their meanings.
A common set of test cases for these errors is available
in [**error_tests**](https://github.com/hed-standard/hed-specification/tree/master/docs/source/_static/data/error_tests) directory of the
in the [**tests**](https://github.com/hed-standard/hed-specification/tree/master/tests) directory of the
[**hed-specification**](https://github.com/hed-standard/hed-specification) GitHub repository.

Other resources include a comprehensive list of
54 changes: 37 additions & 17 deletions docs/source/03_HED_formats.md
@@ -46,7 +46,7 @@ whose name is the library name.
### 3.1.2. Schema layout overview

Schemas can be specified in either `.mediawiki` or `.xml` format.
[**Online tools**](https://hedtools.ucsd.edu/hed/schema)
The HED schema [**online tools**](https://hedtools.org/hed/schemas)
provide an easy way for users to validate schema and convert between formats.

HED schema developers usually use `.mediawiki` format for more convenient editing,
@@ -967,7 +967,7 @@ as are entries in the corresponding tabular data file that have `n/a` or blank v

See [**3.2.9.4. A sidecar example**](./03_HED_formats.md/#3294-a-sidecar-example)
for an elaborated example of these different types of entries and
[**3.2.10.2 Event-level processing**](./03_HED_formats.md/#32102-event-level-processing)
[**3.2.10.3. Event-level processing**](./03_HED_formats.md/#32103-event-level-processing)
for an example of how the resulting HED annotations are assembled.

#### 3.2.9.2. Sidecar validation
@@ -1023,7 +1023,10 @@ the name of another HED-annotated column within the sidecar.
the curly brace expression including the curly braces as well as any extra parentheses or commas are removed.
4. A sidecar column name cannot both appear in a curly braces and have
an annotation that uses curly braces (to prevent circular references).
5. The curly braces cannot be used within a `Definition`.
5. The curly braces cannot be used within a `Definition`.
6. Curly braces cannot appear in the `HED` column of a tabular file.
7. Curly braces cannot be nested.
8. A pair of curly braces must appear syntactically as a tag and not as the substitution for a placeholder.
``````
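
The requirements that curly braces pair up and are never nested (rule 7 above) can be checked in a single scan of the string. The following is a minimal sketch under stated assumptions, not the HED validator's implementation: the function name `check_curly_braces` and the message wording are hypothetical, and the other rules in the list are not checked.

```python
def check_curly_braces(hed_string: str) -> list[str]:
    """Return a list of problems with curly-brace use in a HED string.

    Illustrative only (hypothetical helper): checks that braces are
    balanced, not nested, and enclose a non-empty column name.
    """
    problems = []
    open_pos = None  # index of the currently open '{', if any
    for pos, char in enumerate(hed_string):
        if char == "{":
            if open_pos is not None:
                problems.append(f"nested '{{' at position {pos}")
            else:
                open_pos = pos
        elif char == "}":
            if open_pos is None:
                problems.append(f"unmatched '}}' at position {pos}")
            else:
                if not hed_string[open_pos + 1:pos].strip():
                    problems.append(f"empty braces at position {open_pos}")
                open_pos = None
    if open_pos is not None:
        problems.append(f"unclosed '{{' at position {open_pos}")
    return problems
```

An empty list means none of these particular problems were found; it does not mean the annotation is valid HED.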

@@ -1132,37 +1135,54 @@ This is permitted.

### 3.2.10. Tabular files

A tabular file is a text file in which each line represents a row in a table.
The column entries in a given row are separated by tabs.
Further, the first line of the file must contain a tab-separated list of
A tabular (`.tsv`) file is a text file in which each line represents a row in a table.
The column entries in a given row are separated by tabs.
The first line of the file must contain a tab-separated list of
column names, which should be unique.
This description of tabular file conforms to that used by [**BIDS**](https://bids.neuroimaging.io/).
This description of tabular files conforms to that used by [**BIDS**](https://bids.neuroimaging.io/).

#### 3.2.10.1. Tabular types

Generally, each row in a tabular file represents an item and the column values provide properties of that item.
The most common HED-annotated tabular file represents event markers in an experiment.
In this case each row in the file represents a time at which something happened.
The most common HED-annotated tabular files represent event markers in an experiment (e.g., BIDS `events.tsv` files).
In this case each row represents a time at which something happened.

Another common HED-annotated tabular file represents experiment participants.
In this case each row in the file represents a participant, and the columns provide
Another common HED-annotated tabular file represents experiment participants
(e.g., BIDS `participants.tsv`).
Each row in the file represents a participant, and the columns provide
characteristics or other information about the participant identified in that row.

The `events.tsv` and the `participants.tsv` are representative of two distinct types
of tabular files: ones representing time markers and those representing other types of information.
To be recognized as having time-markers, the first column of the file must be `onset`.
Non-time marker files cannot use the `Onset`, `Offset`, or `Inset` tags as these
tags are reserved for annotations of time processes.
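
The distinction above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the HED tools' implementation: the helper names `is_timeline_file` and `timeline_tags_used` are hypothetical, and the tag check is naive (short-form tags only, split on commas and parentheses).

```python
import csv
import io
import re

# Tags reserved for annotations of time processes.
TIMELINE_TAGS = ("Onset", "Offset", "Inset")


def is_timeline_file(tsv_text: str) -> bool:
    """Return True if the first column name of the tab-separated text is 'onset'."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])
    return bool(header) and header[0] == "onset"


def timeline_tags_used(annotation: str) -> list[str]:
    """Return the reserved timeline tags that appear in an assembled annotation.

    Naive sketch: assumes short-form tags and ignores tag values.
    """
    tokens = re.split(r"[(),]", annotation)
    names = {token.strip().lower() for token in tokens}
    return [tag for tag in TIMELINE_TAGS if tag.lower() in names]
```

A file for which `is_timeline_file` returns `False` and `timeline_tags_used` returns a non-empty list for some row would violate the rule described above.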

In any case, the general strategy for validation or other processing is:
1. Process the individual components of the HED annotation (tag and string level processing).
2. Assemble the component annotations for a row (event or row level processing).
3. Check consistency and relationships among the row annotations (file-level processing).

#### 3.2.10.1. Tabular annotations
See [**BIDS tabular files**](06_Infrastructure_and_tools.md#631-bids-tabular-files) for
more examples.

#### 3.2.10.2. Tabular annotations

HED annotations in tabular files can occur both in a `HED` column within the file and
in an associated JSON sidecar.

The HED strings that appear in a `HED` column must be valid HED strings.
If the first column is not called `onset`, the assembled annotation for
the tabular file cannot contain any of the tags `Onset`, `Offset`, or `Inset`.

Definitions may not appear in the `HED` column of a tabular file or
in any entry of a JSON sidecar that contains items other than definitions.

See [**DEFINITION_INVALID**](./Appendix_B.md#definition_invalid)
and [**ONSET_OFFSET_INSET_ERROR**](./Appendix_B.md#onset_offset_inset_error) for information.

Definitions may not appear in the `HED` column of a tabular file.
Definitions may not appear in any entry of a JSON sidecar corresponding
to a column of the tabular file.

#### 3.2.10.2. Event-level processing
#### 3.2.10.3. Event-level processing

After individual HED tags and HED strings in the `HED` column of tabular files and
in the associated sidecars are validated or otherwise processed,
@@ -1228,7 +1248,7 @@ has been included. The entries with `n/a` have been ignored.
For more examples of event assembly, see [**How HED works in BIDS**](https://www.hed-resources.org/en/latest/BidsAnnotationQuickstart.html#how-hed-works-in-bids) tutorial.


#### 3.2.10.3 File-level processing
#### 3.2.10.4. File-level processing

HED versions >= 8.0.0 allow annotation of relationships among rows in a tabular file.
Hence, processing generally requires that annotations for all the rows be assembled
74 changes: 57 additions & 17 deletions docs/source/06_Infrastructure_and_tools.md
@@ -114,16 +114,39 @@ if the tool does not support these tags.
widely-adopted specification and supporting tools for organizing and
describing brain imaging and behavioral data.

BIDS dataset events are stored in tab-separated value files whose names end in `events.tsv`.
HED's use of tabular files and sidecars closely aligns with BIDS and its requirements.
HED has been incorporated into the BIDS standard as the mechanism for annotating
tabular files.

### 6.3.1. BIDS tabular files

HED has been incorporated into the BIDS standard as the mechanism for annotating
BIDS tabular files, which are tab-separated value files whose first line has the column names.
A BIDS tabular file has extension `.tsv` and can be optionally accompanied by a
similarly named `.json` file (i.e., a sidecar) containing metadata for the file.

The most common types of BIDS tabular files are shown in the following table.

(tabular-types-anchor)=
| Ends with | Time? | A row represents |
|----------------------|-------|----------------------------------------------------|
| `_events.tsv` | Yes | Markers on the timeline of another file. |
| `_participants.tsv` | No | Metadata about one dataset subject. |
| `_scans.tsv` | No | Metadata about one dataset recording. |
| `_beh.tsv` | No | A measurement not associated with an `onset`. |
| `_samples.tsv`       | No    | Properties of a particular sample (e.g., tissue).  |
| `phenotype/xxx_.tsv` | No | A measurement from a particular participant. |

HED treats tabular files whose first column has the name `onset` (e.g., `_events.tsv` files)
as expressing markers on a timeline.
These timeline files allow `Onset`, `Inset`, and `Offset` tags in their
annotations and receive additional [**file-level processing**](./03_HED_formats.md#32104-file-level-processing) to ensure that these tags are properly
matched.

**Tabular files that do not represent timelines are not permitted to use
`Onset`, `Inset`, and `Offset` tags in their annotations.**
See [**ONSET_OFFSET_INSET_ERROR**](./Appendix_B.md#onset_offset_inset_error)
for additional information.

The following shows an excerpt from a BIDS event file:

````{admonition} **Example:** Excerpt from a BIDS event file.
````{admonition} **Example:** Excerpt from a BIDS _events.tsv file.
```
onset duration trial_type response_time HED
@@ -132,21 +155,39 @@ onset duration trial_type response_time HED
```
````


The first two columns in a BIDS events file are required to be `onset` and `duration`, respectively.
The first two columns in a BIDS event file are required to be `onset` and `duration`, respectively.
The `onset` is the time in seconds of an event marker relative to the start of its corresponding
data recording,
while the `duration` represents the duration in seconds of some aspect of the event.
The remaining columns in this event file are optional.

BIDS reserves an optional column named `HED` to contain HED strings relevant for the event instance.
BIDS reserves an optional column named `HED` to contain HED strings relevant for the instance
represented by that row.
In the above example, the first row `HED` column contains `Label/Starting-point, Quiet`,
while the second row contains `n/a`, indicating that entry should be ignored.
meaning that the event that starts at time 1.2 has these particular annotations
in addition to annotations contributed by any relevant sidecars.
The `HED` column in the second row contains `n/a`, indicating that entry should be ignored.
In this case only annotations from applicable sidecars are applied.

HED annotations can also be associated with entries in other columns of the event file
through an associated JSON sidecar as described in the next section.

### 6.3.2. BIDS sidecars

### 6.3.2. BIDS timeseries

BIDS also includes another type of file with extension `.tsv` that represents
continuous timeseries data.
For example, the motion capture `_motion.tsv` files give samples from
channels of a motion capture apparatus,
while physiological data files
`_physio.tsv` and `_stim.tsv` represent continuous
physiological recordings such as cardiac and respiratory signals.
These tabular files do not have column headers,
but rather use other files to define the column names.
**BIDS currently does not support HED in these types of files.**


### 6.3.3. BIDS sidecars

BIDS also recommends data dictionaries in the form of JSON sidecars to document
the meaning of the data in the event files.
@@ -155,7 +196,7 @@ in compatibly-named sidecars.
See the [**example sidecar**](./03_HED_formats.md#3291-sidecar-entries) in Chapter 3
for an explanation of the different sidecar entries.

### 6.3.3. Annotation assembly
### 6.3.4. Annotation assembly

HED tools are available to assemble the annotations associated with each row in
a tabular file using its `HED` column and the sidecar information associated
@@ -176,7 +217,7 @@ to give the following annotation:
````
The process is to look up the appropriate row annotation for each column in the sidecar and append these with an annotation in the `HED` column if available.
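
The lookup-and-append process just described can be sketched as follows. This is a simplified illustration, not the code in the HED tools: the function name `assemble_row` and the example row and sidecar values are hypothetical, the sidecar is assumed to map each column name to a dictionary with a `"HED"` key (either a per-value dictionary or a single string containing `#`), and curly-brace substitution is not handled.

```python
def assemble_row(row: dict, sidecar: dict) -> str:
    """Assemble the HED annotation for one row of a tabular file.

    Hypothetical sketch: looks up each column value in the sidecar,
    then appends the row's own HED column entry if it has one.
    """
    parts = []
    for column, value in row.items():
        if column == "HED" or value in ("n/a", ""):
            continue  # the HED column is appended last; n/a entries are ignored
        entry = sidecar.get(column, {}).get("HED")
        if isinstance(entry, dict):
            # Categorical column: one annotation per possible value.
            annotation = entry.get(value)
            if annotation:
                parts.append(annotation)
        elif isinstance(entry, str):
            # Value column: substitute the cell value for the '#' placeholder.
            parts.append(entry.replace("#", value))
    hed_value = row.get("HED", "n/a")
    if hed_value not in ("n/a", ""):
        parts.append(hed_value)
    return ", ".join(parts)


# Made-up values for illustration only.
example_sidecar = {
    "trial_type": {"HED": {"go": "Sensory-event, Visual-presentation"}},
    "response_time": {"HED": "(Delay/# ms, Agent-action)"},
}
example_row = {
    "onset": "1.2",
    "duration": "0.6",
    "trial_type": "go",
    "response_time": "1.435",
    "HED": "Label/Starting-point, Quiet",
}
print(assemble_row(example_row, example_sidecar))
```

Columns with no sidecar entry, such as `onset` and `duration` in this example, contribute nothing to the assembled annotation.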

### 6.3.4. HED version in BIDS
### 6.3.5. HED version in BIDS

The HED version is included as the value of the `"HEDVersion"` key in the
`dataset_description.json` metadata file located at the top level in a BIDS dataset.
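
As a rough sketch, the version information can be read with the standard `json` module. The helper name `read_hed_versions` is hypothetical, and the sketch assumes that the `"HEDVersion"` value may be either a single string or a list of strings.

```python
import json


def read_hed_versions(description_text: str) -> list[str]:
    """Return the HED version specifiers found in dataset_description.json text.

    Hypothetical helper: normalizes a single string or a list to a list.
    """
    description = json.loads(description_text)
    value = description.get("HEDVersion")
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)
```

In practice the text would come from reading the `dataset_description.json` file at the top level of the dataset.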
@@ -244,13 +285,13 @@ tags.
````


### 6.3.5. HED in the BIDS validator
### 6.3.6. HED in the BIDS validator

HED provides a JavaScript validator in the [**hed-javascript**](https://github.com/hed-standard/hed-javascript) repository, which is available as an installable package via [**npm**](https://www.npmjs.com/).
The [**BIDS validator**](https://github.com/bids-standard/bids-validator)
incorporates calls to this package to validate HED tags in BIDS datasets.

### 6.3.5. HED python tools
### 6.3.7. HED python tools

The [**hedtools**](https://pypi.org/project/hedtools/) package includes
input functions that use [**Pandas**](https://pandas.pydata.org/) data frames to construct internal
@@ -259,8 +300,7 @@ representations of HED-annotated event files.
HED schema developers generally do initial development of the schema using `.mediawiki` format.
The tools to convert schema between `.mediawiki` and `.xml` format are located
in the `hed.schema` module of the
[**hedtools**](https://github.com/hed-standard/hed-python/tree/master/hedtools)
project of the [**hed-python**](https://github.com/hed-standard/hed-python) GitHub repository.
[**hed-python**](https://github.com/hed-standard/hed-python) GitHub repository.
All conversions are performed by converting the schema to a `HedSchema` object.
Then modules `wiki2xml.py` and `xml2wiki.py` provide top-level functions to perform these
conversions.
2 changes: 1 addition & 1 deletion docs/source/07_Library_schemas.md
@@ -475,7 +475,7 @@ Based on the above description tools will download:
1. The standard HED schema:
[https://raw.githubusercontent.com/hed-standard/hed-schemas/main/standard_schema/hedxml/HED8.1.0.xml](https://raw.githubusercontent.com/hed-standard/hed-schemas/main/standard_schema/hedxml/HED8.1.0.xml).
2. The HED `score` library schema version 1.0.0:
[https://raw.githubusercontent.com/hed-standard/hed-schemas/main/library_schemas/score/hedxml/HED_score_1.0.0.xml](https://raw.githubusercontent.com/hed-standard/hed-schemas/main/library_schemas/score/hedxml/HED_score_0.0.1.xml).
[https://raw.githubusercontent.com/hed-standard/hed-schemas/main/library_schemas/score/hedxml/HED_score_1.0.0.xml](https://raw.githubusercontent.com/hed-standard/hed-schemas/main/library_schemas/score/hedxml/HED_score_1.0.0.xml).
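
The two URLs above follow a regular pattern, which the following sketch reproduces. It is an illustration only: the function name `schema_url` is hypothetical, and the specifier forms it accepts (`8.1.0` for the standard schema, `score_1.0.0` or `sc:score_1.0.0` for a library schema) are assumptions made for the example.

```python
BASE_URL = "https://raw.githubusercontent.com/hed-standard/hed-schemas/main"


def schema_url(version: str) -> str:
    """Build the raw GitHub URL of the XML schema for a version specifier.

    Hypothetical helper: any local prefix such as 'sc:' is dropped, and an
    underscore separates a library name from its version number.
    """
    spec = version.split(":", 1)[-1]  # drop an optional local prefix
    if "_" in spec:
        library, number = spec.split("_", 1)
        return f"{BASE_URL}/library_schemas/{library}/hedxml/HED_{library}_{number}.xml"
    return f"{BASE_URL}/standard_schema/hedxml/HED{spec}.xml"
```

The function only builds the address; it does not check that a schema with that version exists.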

In the dataset annotations for the above example, tags drawn from the score schema would
be prefixed with `sc:`, where `sc` is a local name used to distinguish
4 changes: 2 additions & 2 deletions docs/source/Appendix_A.md
@@ -266,7 +266,7 @@ Only the schema attributes listed in the following table can be handled by curre
* - [`SIUnit`](#a1413-siunit)
- unit
- This unit represents an SI unit and can be modified.
* - [`SIUnitModifier`](#a1414-siunitmodifer)
* - [`SIUnitModifier`](#a1414-siunitmodifier)
- unitModifier
- Modifier applies to base units.
* - [`SIUnitSymbolModifier`](#a1415-siunitsymbolmodifier)
@@ -552,7 +552,7 @@ HED version="8.0.0"

The schema `.mediawiki` file specified in this example is named `HED8.0.0.mediawiki` and can be found in the
[**standard_schema/hedwiki**](https://github.com/hed-standard/hed-schemas/tree/main/standard_schema/hedwiki)
directory of the [**hed-schemas**](https://github.com/hed-standard/hedschemas) GitHub repository.
directory of the [**hed-schemas**](https://github.com/hed-standard/hed-schemas) GitHub repository.

The versions of the schema that use XSD validation to verify the format (versions 8.0.0 and above) have `xmlns:xsi` and `xsi:noNamespaceSchemaLocation` attributes.
The `xsi` attribute is required if `xmlns:xsi` is given.