# BIDS dataset to a JSON-LD NIDM-Experiment dataset

(Work inside a conda environment.)
This document walks through an example of converting a BIDS dataset into a JSON-LD NIDM-Experiment dataset.
We take our BIDS dataset from DataLad, where the folder hierarchy follows the BIDS standard. For this example, we import the ABIDE DataLad dataset for the CMU_a site: http://datasets.datalad.org/?dir=/abide/RawDataBIDS/CMU_a

In [None]:
conda install -c conda-forge git-annex
pip install datalad 
datalad install ///abide/RawDataBIDS/CMU_a

We use the BIDSMRI2NIDM.py script from PyNIDM/nidm/experiment/tools.
This program converts a BIDS MRI dataset to a NIDM-Experiment RDF document. It parses phenotype information, stores the variables and values, and links to the associated JSON data dictionary file.
Arguments used:
- -d [root directory of BIDS dataset]
- -jsonld [if this flag is set, the output is JSON-LD rather than Turtle]

In [None]:
python ~/PyNIDM/nidm/experiment/tools/BIDSMRI2NIDM.py -d ~/bids_dataset/CMU_a/ -jsonld

The DataLad-hosted ABIDE dataset has neither participants.tsv files nor a dataset_description.json file, so strictly speaking it is not valid BIDS. The participants.tsv file is optional, but a dataset_description.json file is required.
A basic dataset_description.json file therefore has to be created for each ABIDE and ADHD200 site before bidsmri2nidm can be run on it; these files can then be added to the GitHub repository.
See https://bids-specification.readthedocs.io/en/stable/03-modality-agnostic-files.html for the required fields of dataset_description.json.
Below is our dataset_description.json file:

In [None]:
{
	"Name": "ABIDE dataset CMU_a Site",
	"BIDSVersion": "1.0.1",
	"License": "CC BY-SA 4.0"
	
}
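
Since one such file is needed per site, it can be convenient to generate it from a script. The following is a minimal sketch using only the Python standard library. The helper names `write_dataset_description` and `missing_required_fields` are illustrative and not part of PyNIDM or DataLad; `Name` and `BIDSVersion` are the two fields the BIDS specification marks as required.

```python
import json
import os
import tempfile

# Fields the BIDS specification marks as REQUIRED in dataset_description.json.
REQUIRED_FIELDS = ("Name", "BIDSVersion")


def write_dataset_description(bids_root, name, bids_version="1.0.1", license_name=None):
    """Write a basic dataset_description.json into bids_root and return its path.

    Hypothetical helper for illustration; not part of PyNIDM.
    """
    description = {"Name": name, "BIDSVersion": bids_version}
    if license_name is not None:
        description["License"] = license_name
    path = os.path.join(bids_root, "dataset_description.json")
    with open(path, "w") as handle:
        json.dump(description, handle, indent=4)
    return path


def missing_required_fields(path):
    """Return the required fields that are absent from an existing file."""
    with open(path) as handle:
        description = json.load(handle)
    return [field for field in REQUIRED_FIELDS if field not in description]


# Usage: a temporary directory stands in for the CMU_a dataset root.
with tempfile.TemporaryDirectory() as bids_root:
    path = write_dataset_description(
        bids_root, "ABIDE dataset CMU_a Site", license_name="CC BY-SA 4.0"
    )
    print(missing_required_fields(path))
```

Running the check before the conversion gives an early, readable error instead of a failure inside the converter.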

The other JSON files available at the root of the BIDS directory must also be parsed: the T1w.json and taskxxx.json files.
If a key from the T1w.json or taskxxx.json file is mapped to a term in BIDS_Constants.py, it is added to the NIDM object.
See the function bidsmri2project(directory, args):
- For dataset_description.json file

In [None]:
import json
import logging
import os

def bidsmri2project(directory, args):
    #Parse dataset_description.json file in BIDS directory
    if os.path.isdir(directory):
        try:
            with open(os.path.join(directory,'dataset_description.json')) as data_file:
                dataset = json.load(data_file)
        except OSError:
            logging.critical("Cannot find dataset_description.json file which is required in the BIDS spec")
            exit("-1")
    else:
        logging.critical("Error: BIDS directory %s does not exist!" %os.path.join(directory))
        exit("-1")

    #create project / nidm-exp doc
    project = Project()

    #add various attributes if they exist in BIDS dataset
    for key in dataset:
        #if key from dataset_description file is mapped to term in BIDS_Constants.py then add to NIDM object
        if key in BIDS_Constants.dataset_description:
            if isinstance(dataset[key], list):
                #join list entries with a separator so that individual values stay readable
                project.add_attributes({BIDS_Constants.dataset_description[key]:",".join(map(str,dataset[key]))})
            else:
                project.add_attributes({BIDS_Constants.dataset_description[key]:dataset[key]})
    #add absolute location of BIDS directory on disk (once, outside the loop) for later finding of files which are stored relatively in NIDM document
    project.add_attributes({Constants.PROV['Location']:directory})

    #get BIDS layout
    bids_layout = BIDSLayout(directory) 
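
The mapping step in the loop above can be sketched in isolation. The dictionary `EXAMPLE_KEY_MAP` below is a hypothetical stand-in for `BIDS_Constants.dataset_description` (its target terms are made up for illustration), and a plain dict replaces the NIDM `Project` object.

```python
# Hypothetical stand-in for BIDS_Constants.dataset_description:
# BIDS key -> term used in the NIDM document (values are illustrative only).
EXAMPLE_KEY_MAP = {
    "Name": "example:title",
    "License": "example:license",
    "Authors": "example:authors",
}


def map_known_keys(metadata, key_map):
    """Return {term: value} for every metadata key that has a mapping.

    List values are joined into one comma-separated string; keys without
    a mapping are skipped.
    """
    attributes = {}
    for key, value in metadata.items():
        if key not in key_map:
            continue
        if isinstance(value, list):
            value = ",".join(str(item) for item in value)
        attributes[key_map[key]] = value
    return attributes


dataset = {
    "Name": "ABIDE dataset CMU_a Site",
    "BIDSVersion": "1.0.1",
    "License": "CC BY-SA 4.0",
}
# BIDSVersion has no entry in EXAMPLE_KEY_MAP, so it is dropped.
print(map_known_keys(dataset, EXAMPLE_KEY_MAP))
```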

- For T1w.json file 

In [None]:

            if file_tpl.entities['datatype']=='anat':
                #do something with anatomicals
                acq_obj = MRObject(acq)
                #add image contrast type
                if file_tpl.entities['suffix'] in BIDS_Constants.scans:
                    acq_obj.add_attributes({Constants.NIDM_IMAGE_CONTRAST_TYPE:BIDS_Constants.scans[file_tpl.entities['suffix']]})
                else:
                    logging.info("WARNING: No matching image contrast type found in BIDS_Constants.py for %s" % file_tpl.entities['suffix'])

                #add image usage type
                if file_tpl.entities['datatype'] in BIDS_Constants.scans:
                    acq_obj.add_attributes({Constants.NIDM_IMAGE_USAGE_TYPE:BIDS_Constants.scans[file_tpl.entities['datatype']]})
                else:
                    logging.info("WARNING: No matching image usage type found in BIDS_Constants.py for %s" % file_tpl.entities['datatype'])
                #add file link
                #make relative link to
                acq_obj.add_attributes({Constants.NIDM_FILENAME:getRelPathToBIDS(join(file_tpl.dirname,file_tpl.filename), directory)})

                #add sha512 sum
                if isfile(join(directory,file_tpl.dirname,file_tpl.filename)):
                    acq_obj.add_attributes({Constants.CRYPTO_SHA512:getsha512(join(directory,file_tpl.dirname,file_tpl.filename))})
                else:
                    logging.info("WARNING: file %s doesn't exist! No SHA512 sum stored in NIDM files..." %join(directory,file_tpl.dirname,file_tpl.filename))
                #get associated JSON file if exists
                #There is T1w.json file with information 
                json_data = (bids_layout.get(suffix=file_tpl.entities['suffix'],subject=subject_id))[0].metadata
                if len(json_data.info)>0:
                    #iterate over the keys only: .items() yields (key, value) tuples, which never match json_keys
                    for key in json_data.info:
                        if key in BIDS_Constants.json_keys:
                            if type(json_data.info[key]) is list:
                                acq_obj.add_attributes({BIDS_Constants.json_keys[key.replace(" ", "_")]:','.join(str(e) for e in json_data.info[key])})
                            else:
                                acq_obj.add_attributes({BIDS_Constants.json_keys[key.replace(" ", "_")]:json_data.info[key]})
                   
                #Parse T1w.json file in BIDS directory to add the attributes contained inside
                if (os.path.isdir(os.path.join(directory))):
                    try:
                        with open(os.path.join(directory,'T1w.json')) as data_file:
                            dataset = json.load(data_file)
                    except OSError:
                        logging.critical("Cannot find T1w.json file at the root of the BIDS directory")
                        exit("-1")
                else:
                    logging.critical("Error: BIDS directory %s does not exist!" %os.path.join(directory))
                    exit("-1")

                #add various attributes if they exist in BIDS dataset
                for key in dataset:
                    #if key from T1w.json file is mapped to term in BIDS_Constants.py then add to NIDM object
                    if key in BIDS_Constants.json_keys:
                        if type(dataset[key]) is list:
                            acq_obj.add_attributes({BIDS_Constants.json_keys[key]:",".join(map(str,dataset[key]))})
                        else:
                            acq_obj.add_attributes({BIDS_Constants.json_keys[key]:dataset[key]})     
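
The anatomical branch stores a SHA-512 checksum of each image through the `getsha512` helper, whose body is not shown here. As a rough sketch of what such a helper might do, assuming only the standard `hashlib` module (the name `sha512_of_file` is hypothetical and the real helper may differ):

```python
import hashlib
import os
import tempfile


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks.

    Hypothetical stand-in for getsha512; reading in chunks keeps memory
    use flat for large image files.
    """
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: hash a small temporary file standing in for a NIfTI image.
with tempfile.TemporaryDirectory() as tmp_dir:
    scan_path = os.path.join(tmp_dir, "sub-01_T1w.nii.gz")
    with open(scan_path, "wb") as handle:
        handle.write(b"not a real image")
    checksum = sha512_of_file(scan_path)
    print(len(checksum))  # a SHA-512 hex digest is 128 characters long
```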

- For taskxxx.json file

In [None]:
                                                          
            elif file_tpl.entities['datatype'] == 'func':
                #do something with functionals
                acq_obj = MRObject(acq)
                #add image contrast type
                if file_tpl.entities['suffix'] in BIDS_Constants.scans:
                    acq_obj.add_attributes({Constants.NIDM_IMAGE_CONTRAST_TYPE:BIDS_Constants.scans[file_tpl.entities['suffix']]})
                else:
                    logging.info("WARNING: No matching image contrast type found in BIDS_Constants.py for %s" % file_tpl.entities['suffix'])

                #add image usage type
                if file_tpl.entities['datatype'] in BIDS_Constants.scans:
                    acq_obj.add_attributes({Constants.NIDM_IMAGE_USAGE_TYPE:BIDS_Constants.scans[file_tpl.entities['datatype']]})
                else:
                    logging.info("WARNING: No matching image usage type found in BIDS_Constants.py for %s" % file_tpl.entities['datatype'])
                #make relative link to
                acq_obj.add_attributes({Constants.NIDM_FILENAME:getRelPathToBIDS(join(file_tpl.dirname,file_tpl.filename), directory)})
                #add sha512 sum
                if isfile(join(directory,file_tpl.dirname,file_tpl.filename)):
                    acq_obj.add_attributes({Constants.CRYPTO_SHA512:getsha512(join(directory,file_tpl.dirname,file_tpl.filename))})
                else:
                    logging.info("WARNING: file %s doesn't exist! No SHA512 sum stored in NIDM files..." %join(directory,file_tpl.dirname,file_tpl.filename))

                if 'run' in file_tpl.entities:
                    acq_obj.add_attributes({BIDS_Constants.json_keys["run"]:file_tpl.entities['run']})

                #get associated JSON file if exists
                json_data = (bids_layout.get(suffix=file_tpl.entities['suffix'],subject=subject_id))[0].metadata

                if len(json_data.info)>0:
                    #iterate over the keys only: .items() yields (key, value) tuples, which never match json_keys
                    for key in json_data.info:
                        if key in BIDS_Constants.json_keys:
                            if type(json_data.info[key]) is list:
                                acq_obj.add_attributes({BIDS_Constants.json_keys[key.replace(" ", "_")]:','.join(str(e) for e in json_data.info[key])})
                            else:
                                acq_obj.add_attributes({BIDS_Constants.json_keys[key.replace(" ", "_")]:json_data.info[key]})
                #get associated events TSV file
                if 'run' in file_tpl.entities:
                    events_file = bids_layout.get(subject=subject_id, extensions=['.tsv'],modality=file_tpl.entities['datatype'],task=file_tpl.entities['task'],run=file_tpl.entities['run'])
                else:
                    events_file = bids_layout.get(subject=subject_id, extensions=['.tsv'],modality=file_tpl.entities['datatype'],task=file_tpl.entities['task'])
                #if there is an events file then this is task-based so create an acquisition object for the task file and link
                if events_file:
                    #for now create acquisition object and link it to the associated scan
                    events_obj = AcquisitionObject(acq)
                    #add prov type, task name as prov:label, and link to filename of events file

                    events_obj.add_attributes({PROV_TYPE:Constants.NIDM_MRI_BOLD_EVENTS,BIDS_Constants.json_keys["TaskName"]: json_data["TaskName"], Constants.NIDM_FILENAME:getRelPathToBIDS(events_file[0].filename, directory)})
                    #link it to appropriate MR acquisition entity
                    events_obj.wasAttributedTo(acq_obj)
                    
                #Parse task-rest_bold.json file in BIDS directory to add the attributes contained inside
                if (os.path.isdir(os.path.join(directory))):
                    try:
                        with open(os.path.join(directory,'task-rest_bold.json')) as data_file:
                            dataset = json.load(data_file)
                    except OSError:
                        logging.critical("Cannot find task-rest_bold.json file at the root of the BIDS directory")
                        exit("-1")
                else:
                    logging.critical("Error: BIDS directory %s does not exist!" %os.path.join(directory))
                    exit("-1")

                #add various attributes if they exist in BIDS dataset
                for key in dataset:
                    #if key from task-rest_bold.json file is mapped to term in BIDS_Constants.py then add to NIDM object
                    if key in BIDS_Constants.json_keys:
                        if type(dataset[key]) is list:
                            acq_obj.add_attributes({BIDS_Constants.json_keys[key]:",".join(map(str,dataset[key]))})
                        else:
                            acq_obj.add_attributes({BIDS_Constants.json_keys[key]:dataset[key]}) 
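
Both branches record file locations through `getRelPathToBIDS`, which is meant to express a path relative to the dataset root. Its implementation is not shown here; the following is only a guess at the behaviour, written with `os.path.relpath`, and `rel_path_to_root` is a hypothetical name.

```python
import os


def rel_path_to_root(file_path, bids_root):
    """Return file_path expressed relative to bids_root.

    Hypothetical stand-in for getRelPathToBIDS; the real helper may differ.
    """
    return os.path.relpath(os.path.abspath(file_path), os.path.abspath(bids_root))


# Usage with paths shaped like the ones in this example.
bids_root = os.path.join("bids_dataset", "CMU_a")
scan = os.path.join(bids_root, "sub-0050656", "anat", "sub-0050656_T1w.nii.gz")
print(rel_path_to_root(scan, bids_root))
```

Storing relative paths keeps the NIDM document valid when the dataset directory is moved, as long as the root location is recorded separately.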

The idea is to create an acquisition object for each scan of each subject, using the information available in the BIDS directory.
Below is an extract of the nidme.json document showing the acquisition information created for one subject:

In [None]:
 {
            "@id": "nidm:8edfb514-b761-11e9-8f81-e1efbef5a165",
            "@type": [
                "Entity",
                "AcquisitionObject"
            ],
            "crypto:sha512": "840affc90aeda40601a48f8d5cd3421fb73e0cbecc3111c4a3194aef74a716b732e7649b8b36d55f0b5559a88531ac10acd00bf27a95d52b59da78a88b7eeb8f",
            "dicom:AcquisitionMatrix": "256x256",
            "dicom:EchoTime": 0.00248,
            "dicom:FlipAngle": {
                "@type": "xsd:int",
                "@value": "8"
            },
            "dicom:InversionTime": 1.1,
            "dicom:MagneticFieldStrength": {
                "@type": "xsd:int",
                "@value": "3"
            },
            "dicom:PixelBandwidth": 170.0,
            "dicom:RepetitionTime": 1.87,
            "dicom:ScanningSequence": "MPRAGE",
            "nidm:PhaseEncodingDirection": "j-",
            "hadAcquisitionModality": {
                "@id": "nidm:MagneticResonanceImaging"
            },
            "hadImageContrastType": {
                "@id": "nidm:T1Weighted"
            },
            "HadImageUsageType": {
                "@id": "nidm:Anatomical"
            },
            "filename": "/udd/nperez/sub-0050656/anat/sub-0050656_T1w.nii.gz",
            "prov:wasGeneratedBy": {
                "@id": "nidm:8edfacae-b761-11e9-8f81-e1efbef5a165"
            }
        },
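
To inspect acquisition entities like the one above programmatically, the JSON-LD output can be loaded with the standard `json` module. The sketch below works on a list of node dictionaries; where that list sits inside nidme.json (for example under a `@graph` key) is an assumption to check against the actual file. The first sample node is abridged from the extract above, and the second is a made-up placeholder.

```python
import json

# First node: abridged copy of the extract above.
# Second node: invented placeholder, present only to show the filtering.
SAMPLE_NODES = json.loads("""
[
    {
        "@id": "nidm:8edfb514-b761-11e9-8f81-e1efbef5a165",
        "@type": ["Entity", "AcquisitionObject"],
        "dicom:EchoTime": 0.00248,
        "hadImageContrastType": {"@id": "nidm:T1Weighted"},
        "filename": "/udd/nperez/sub-0050656/anat/sub-0050656_T1w.nii.gz"
    },
    {
        "@id": "nidm:placeholder-node",
        "@type": "Activity"
    }
]
""")


def acquisition_objects(nodes):
    """Return the nodes whose @type includes "AcquisitionObject"."""
    selected = []
    for node in nodes:
        node_type = node.get("@type", [])
        # In JSON-LD, @type may be a single string or a list of strings.
        if isinstance(node_type, str):
            node_type = [node_type]
        if "AcquisitionObject" in node_type:
            selected.append(node)
    return selected


for node in acquisition_objects(SAMPLE_NODES):
    print(node["@id"], node.get("filename"))
```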

However, some limitations remain:
- Only the data that map to terms in BIDS_Constants.py are extracted, and not all BIDS attributes are referenced in that file.
- It is still unclear how bval and bvec files should be handled.
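
One way to gauge the first limitation is to list the sidecar keys that have no mapping and are therefore dropped during conversion. A minimal sketch, in which `EXAMPLE_JSON_KEYS` is a hypothetical stand-in for the keys of `BIDS_Constants.json_keys` and the sidecar content is invented for illustration:

```python
# Hypothetical subset of the keys covered by BIDS_Constants.json_keys.
EXAMPLE_JSON_KEYS = {"EchoTime", "FlipAngle", "RepetitionTime"}


def unmapped_keys(metadata, known_keys):
    """Return the sorted metadata keys that have no entry in known_keys."""
    return sorted(key for key in metadata if key not in known_keys)


# Invented sidecar content standing in for a T1w.json file.
sidecar = {
    "EchoTime": 0.00248,
    "FlipAngle": 8,
    "Manufacturer": "ExampleVendor",
    "SliceThickness": 1.0,
}
print(unmapped_keys(sidecar, EXAMPLE_JSON_KEYS))
# → ['Manufacturer', 'SliceThickness']
```

Running such a report over every sidecar in a dataset would show which terms are worth adding to the mapping first.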