## Validate HED in a BIDS dataset

Validating HED annotations as you develop them makes the annotation process much easier and
faster to debug. This notebook validates HED in a BIDS dataset.

The tool creates a `BidsDataset` object, which represents the information from a BIDS
dataset that is relevant to HED, including the `dataset_description.json`,
all `events.tsv` files, and all `events.json` sidecar files.
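To get a feel for which files this covers, the following is a minimal stand-alone sketch that collects them using only the standard library. It is not how `BidsDataset` is implemented: the helper name `find_hed_relevant_files` is hypothetical, and the sketch assumes that event files and sidecars follow the usual `*_events.tsv` and `*_events.json` naming. It also searches every subdirectory, which the real tool may not do.

```python
from pathlib import Path


def find_hed_relevant_files(bids_root):
    """Collect the files under bids_root that matter for HED validation.

    Hypothetical helper for illustration only; BidsDataset may select
    files differently (for example, by skipping some directories).
    """
    root = Path(bids_root)
    description = root / "dataset_description.json"
    return {
        # The dataset description holds the HEDVersion field.
        "description": description if description.is_file() else None,
        # Event files, assumed to end in _events.tsv.
        "events_tsv": sorted(root.rglob("*_events.tsv")),
        # Event sidecars, assumed to end in _events.json.
        "events_json": sorted(root.rglob("*_events.json")),
    }
```

Calling `find_hed_relevant_files(bids_root_path)` on a dataset root returns a dictionary of paths that you can inspect before running the validation.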

The `validate` method of `BidsDataset` first validates all of the `events.json` sidecars
and then assembles the relevant sidecars for each `events.tsv` file and validates it.
The validation uses the HED schemas specified in the `HEDVersion` field of the
dataset's `dataset_description.json` file.
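Because validation depends on the `HEDVersion` field, it can help to check what that field contains before validating. The sketch below reads it with the standard `json` module. The helper name `read_hed_versions` is hypothetical, and the sketch assumes the field is either a single string or a list of strings; it does not load or check the schemas themselves.

```python
import json
from pathlib import Path


def read_hed_versions(bids_root):
    """Return the HEDVersion entries of a BIDS dataset as a list of strings.

    Hypothetical helper for illustration; returns an empty list when the
    dataset description has no HEDVersion field.
    """
    path = Path(bids_root) / "dataset_description.json"
    with open(path, encoding="utf-8") as fp:
        description = json.load(fp)
    hed_version = description.get("HEDVersion")
    if hed_version is None:
        return []
    # Assumption: a single schema is given as a string, several as a list.
    if isinstance(hed_version, str):
        return [hed_version]
    return list(hed_version)
```

An empty result suggests that the dataset description does not name a HED schema, which is worth resolving before running the validation.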

The script performs the following steps for each dataset in its list:

1. Set the dataset location (`bids_root_path`) to the absolute path of the root of your BIDS dataset.
2. Indicate whether to check for warnings during validation (`check_for_warnings`).
3. Create a `BidsDataset` for the dataset.
4. Validate the dataset and output the issues.

**Note:** This validation pertains to event files and HED annotation only. It does not do a full BIDS validation.

The example below uses a
[small version](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003654s_hed)
of the Wakeman-Henson face-processing dataset available on OpenNeuro as
[ds003645](https://openneuro.org/datasets/ds003645/versions/2.0.0).

This dataset has no validation errors. With `check_for_warnings` set to `True`,
validation also returns warnings that the `sample` column does not have any metadata.
The code below sets `check_for_warnings` to `False`, so only errors are reported.

For validation of a single `events.json` file during annotation development,
users often find the [online sidecar tools](https://hedtools.ucsd.edu/hed/sidecar)
convenient, but the online tool does not provide complete dataset-level validation.

In [1]:
import os
from hed.errors import get_printable_issue_string
from hed.tools import BidsDataset
from hed import _version as vr
from hedcode._version import get_versions

print(f"Using HEDTOOLS version: {str(vr.get_versions())}")
print(f"HED Examples version: {str(get_versions())}")

## Set the dataset location and the check_for_warnings flag
check_for_warnings = False
datasets_dir = '../../../datasets'
bids_datasets = ['eeg_ds003654s_hed', 'eeg_ds003654s_hed_column',
                 'eeg_ds003654s_hed_inheritance', 'eeg_ds003654s_hed_longform',
                 'eeg_ds003654s_hed_library', 'eeg_ds002893s_hed_attention_shift',
                 'eeg_ds004117s_hed_sternberg'
                 ]

for bids_dataset in bids_datasets:
    bids_root_path = os.path.realpath(os.path.join(datasets_dir, bids_dataset))
    print(f"Validating {bids_dataset}")

    ## Validate the dataset
    bids = BidsDataset(bids_root_path)
    issue_list = bids.validate(check_for_warnings=check_for_warnings)
    if issue_list:
        issue_str = get_printable_issue_string(issue_list, "HED validation errors: ", skip_filename=False)
    else:
        issue_str = "No HED validation errors"
    print(issue_str)

Using HEDTOOLS version: {'date': '2022-07-05T18:08:32-0500', 'dirty': True, 'error': None, 'full-revisionid': '63fc2f3a91c897d6c6d7ad163c33d80145b472cc', 'version': '0.1.0+38.g63fc2f3.dirty'}
HED Examples version: {'version': '0.1.0+0.gf9bf968.dirty', 'full-revisionid': 'f9bf968253e528ef49ad3b066d87f05bfefc8bc8', 'dirty': True, 'error': None, 'date': '2022-06-21T09:41:35-0500'}
BIDS path is: ../../datasets/eeg_ds003654s_hed
No HED validation errors
BIDS path is: ../../datasets/eeg_ds003654s_hed_column
No HED validation errors
BIDS path is: ../../datasets/eeg_ds003654s_hed_inheritance
No HED validation errors
BIDS path is: ../../datasets/eeg_ds003654s_hed_longform
No HED validation errors
BIDS path is: ../../datasets/eeg_ds002893s_hed_attention_shift
No HED validation errors
BIDS path is: ../../datasets/eeg_ds004117s_hed_sternberg
No HED validation errors
