# About dicom2bids
dicom2bids is a Python module for converting DICOM files to NIfTI in a BIDS-compatible file structure. It uses dcm2niix for conversion, which must be installed separately (https://github.com/rordenlab/dcm2niix). It was developed at the University of Oregon and has not been tested on other systems.
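
Because dcm2niix is a separate install, it can be worth confirming that it is on your PATH before converting anything. A minimal sketch using only the standard library; the helper name `dcm2niix_available` is made up for illustration and is not part of dicom2bids:

```python
import shutil

def dcm2niix_available(executable="dcm2niix"):
    """Return True if the named executable can be found on the PATH."""
    return shutil.which(executable) is not None

if not dcm2niix_available():
    print("dcm2niix was not found on PATH; install it before converting.")
```

On a cluster you may need to load a module or activate an environment first so that the executable becomes visible.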

## Dicom folder structure
dicom2bids assumes that dicoms are organized into separate folders by series, with folder names that include the series description. If yours are not already organized this way, the SortDicoms function will sort them. Its arguments are the input directory (path-like object), the output directory (path-like object), and an optional flag for whether to use slurm (default False; requires slurmpy and may not work everywhere).

In [25]:
# dicom sorting example
import pathlib
import dicom2bids
unsorted_dicoms = pathlib.Path('/projects/lcni/jolinda/shared/TalapasClass/unsorted_dicoms/')
dicom2bids.SortDicoms(unsorted_dicoms, 'sorted_dicoms')

One or more files already existing and not moved


In [26]:
!tree sorted_dicoms -d

sorted_dicoms
`-- phantom_20171212_161421
    |-- Series_1_AAHScout
    |-- Series_2_AAHScout_MPR_sag
    |-- Series_3_AAHScout_MPR_cor
    |-- Series_4_AAHScout_MPR_tra
    `-- Series_5_Resting1

6 directories
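
The series folders in the listing above follow a `Series_<number>_<description>` pattern. Here is a minimal sketch of pulling the series number and description back out of such names, assuming that pattern holds for your data. The helper `parse_series_folder` is hypothetical; it is not how dicom2bids itself reads the folders.

```python
import re

# Pattern inferred from the tree output above (an assumption, not a documented format)
SERIES_FOLDER = re.compile(r"Series_(\d+)_(.+)")

def parse_series_folder(name):
    """Return (series_number, series_description), or None if the name doesn't match."""
    match = SERIES_FOLDER.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

folders = [
    "Series_1_AAHScout",
    "Series_2_AAHScout_MPR_sag",
    "Series_3_AAHScout_MPR_cor",
    "Series_4_AAHScout_MPR_tra",
    "Series_5_Resting1",
]
descriptions = {parse_series_folder(folder)[1] for folder in folders}
print(sorted(descriptions))
```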


## Mapping series descriptions to BIDS entities
Any dicoms that are to be converted must have their series descriptions mapped to the appropriate BIDS entities. GetSeriesNames will extract all series descriptions from your sorted input dicom folder to make it easier to define this mapping.

In [7]:
example_dicoms = pathlib.Path('/projects/lcni/dcm/lcni/Burggren/HEAT')
dicom2bids.GetSeriesNames(example_dicoms)

{'AAHScout_32ch-head-coil',
 'AAHScout_32ch-head-coil_MPR_cor',
 'AAHScout_32ch-head-coil_MPR_sag',
 'AAHScout_32ch-head-coil_MPR_tra',
 'Flair_axial.sw',
 'bold_mb3_g2_2mm_te25',
 'mprage_p2',
 'pcasl_hires_0.0',
 'pcasl_hires_0.2',
 'pcasl_hires_0.7',
 'pcasl_hires_1.2',
 'pcasl_hires_1.7',
 'pcasl_hires_2.2',
 'se_epi_mb3_g2_2mm_ap',
 'se_epi_mb3_g2_2mm_pa',
 'siemens_diff_3shell_ap',
 'siemens_diff_3shell_lr',
 'siemens_diff_3shell_pa',
 'siemens_diff_3shell_rl',
 't2_space_sag_p2_iso',
 't2_tse_cor65slice_2avg.+',
 'tof_fl3d_tra_p2_multi-slab',
 'tof_fl3d_tra_p2_multi-slab_MIP_COR',
 'tof_fl3d_tra_p2_multi-slab_MIP_SAG',
 'tof_fl3d_tra_p2_multi-slab_MIP_TRA'}

These should be mapped to output file names using a dictionary. 'datatype' and 'suffix' are required. dicom2bids does NOT check whether all required fields have been defined (e.g., 'task' for bold images); that's up to you. Entries can be defined in any order; dicom2bids will format the output file names correctly.

In [116]:
bd = dicom2bids.bids_dict() # create a bids dictionary

In [117]:
bd.add('mprage_p2', datatype = 'anat', suffix = 'T1w') 

In [118]:
bd.add('bold_mb3_g2_2mm_te25', datatype = 'func', suffix = 'bold', task = 'resting')
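
Since dicom2bids leaves the required-entity check to you, a small check of your own can catch omissions before conversion. This is only a sketch: the `REQUIRED` table is deliberately partial (it lists only the 'task' requirement for bold images mentioned above), and `missing_entities` is a hypothetical helper, not a dicom2bids function.

```python
# Partial, illustrative table: (datatype, suffix) -> entities that must be present
REQUIRED = {
    ("func", "bold"): {"task"},
}

def missing_entities(datatype, suffix, **entities):
    """Return a sorted list of required entities that were not supplied."""
    required = REQUIRED.get((datatype, suffix), set())
    return sorted(required - set(entities))

print(missing_entities("func", "bold"))                  # 'task' is missing
print(missing_entities("func", "bold", task="resting"))  # nothing missing
```

Extend the table from the BIDS specification for the datatypes you actually use.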

Some BIDS entities are also the names of Python built-ins (`dir`, for example). In that case you can't use this function call:
``bd.add('siemens_diff_3shell_ap', datatype = 'fmap', suffix = 'epi', dir = 'ap') ``
Instead, use the "entities" parameter. This parameter takes a dictionary as an argument:
``bd.add('siemens_diff_3shell_ap', datatype = 'fmap', suffix = 'epi', entities = {'dir':'ap'}) ``

In [119]:
bd.add('siemens_diff_3shell_ap', datatype = 'fmap', suffix = 'epi', entities = {'dir':'ap'})
bd.add('siemens_diff_3shell_pa', datatype = 'fmap', suffix = 'epi', entities = {'dir':'pa'})
bd.add('siemens_diff_3shell_rl', datatype = 'fmap', suffix = 'epi', entities = {'dir':'rl'})
bd.add('siemens_diff_3shell_lr', datatype = 'fmap', suffix = 'epi', entities = {'dir':'lr'})

You can convert things that aren't in the BIDS standard by including the "nonstandard = True" argument.

In [120]:
bd.add('pcasl_hires_0.0', datatype = 'perf', suffix = 'asl', acq = '0.0', nonstandard = True)
bd.add('pcasl_hires_0.2', datatype = 'perf', suffix = 'asl', acq = '0.2', nonstandard = True)
bd.add('pcasl_hires_0.7', datatype = 'perf', suffix = 'asl', acq = '0.7', nonstandard = True)
bd.add('pcasl_hires_1.2', datatype = 'perf', suffix = 'asl', acq = '1.2', nonstandard = True)
bd.add('pcasl_hires_1.7', datatype = 'perf', suffix = 'asl', acq = '1.7', nonstandard = True)
bd.add('pcasl_hires_2.2', datatype = 'perf', suffix = 'asl', acq = '2.2', nonstandard = True)
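
When several series differ only in one value, the arguments can be generated rather than typed out. The sketch below builds the six sets of keyword arguments as plain dictionaries, so it runs without dicom2bids installed; the variable names are my own.

```python
acq_values = ["0.0", "0.2", "0.7", "1.2", "1.7", "2.2"]

# series description -> keyword arguments for that series
pcasl_mappings = {
    f"pcasl_hires_{acq}": dict(datatype="perf", suffix="asl", acq=acq, nonstandard=True)
    for acq in acq_values
}

for series, kwargs in pcasl_mappings.items():
    print(series, kwargs)
    # With a bids dictionary in hand, each entry could be passed on as:
    # bd.add(series, **kwargs)
```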

In [121]:
print(bd)

mprage_p2: sub-{}_run-{}_T1w
bold_mb3_g2_2mm_te25: sub-{}_task-resting_run-{}_bold
siemens_diff_3shell_ap: sub-{}_dir-ap_run-{}_epi
siemens_diff_3shell_pa: sub-{}_dir-pa_run-{}_epi
siemens_diff_3shell_rl: sub-{}_dir-rl_run-{}_epi
siemens_diff_3shell_lr: sub-{}_dir-lr_run-{}_epi
pcasl_hires_0.0: sub-{}_acq-0.0_run-{}_asl
pcasl_hires_0.2: sub-{}_acq-0.2_run-{}_asl
pcasl_hires_0.7: sub-{}_acq-0.7_run-{}_asl
pcasl_hires_1.2: sub-{}_acq-1.2_run-{}_asl
pcasl_hires_1.7: sub-{}_acq-1.7_run-{}_asl
pcasl_hires_2.2: sub-{}_acq-2.2_run-{}_asl
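
The templates printed above always list entities in the same order, whatever order they were defined in. A rough sketch of how such ordering can be done is below. It is not the dicom2bids implementation: `name_template` is a hypothetical function, and `ENTITY_ORDER` is only a subset of the entity order given in the BIDS specification.

```python
# Subset of the BIDS entity order (after 'sub'); the full list is in the specification
ENTITY_ORDER = ("ses", "task", "acq", "ce", "rec", "dir", "run", "echo")

def name_template(suffix, entities=None):
    """Build a file name template with '{}' placeholders for subject and run."""
    values = {"run": "{}"}          # run is filled in later
    values.update(entities or {})
    unknown = set(values) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"entities not covered by this sketch: {sorted(unknown)}")
    parts = ["sub-{}"]
    parts += [f"{key}-{values[key]}" for key in ENTITY_ORDER if key in values]
    parts.append(suffix)
    return "_".join(parts)

print(name_template("bold", {"task": "resting"}))
print(name_template("epi", {"dir": "ap"}))
```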



## About the 'run' entity
'run' will be replaced with the series number from the dicom file. This ensures that every run, including duplicates, will be converted, and that the output files are still BIDS compliant. If you want to use 'run' differently, or not use it at all, you'll need to rename your files after conversion.
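
As an illustration, here is how a template with two `{}` placeholders turns into a file name, and one way to strip the run entity afterwards. The two-digit zero padding is inferred from file names such as `sub-HEAT002_run-07_T1w.nii.gz` in this example, and `strip_run` is a hypothetical helper, not part of dicom2bids.

```python
import re

template = "sub-{}_run-{}_T1w"
series_number = 7

# Fill in the subject label and the zero-padded series number
filled = template.format("HEAT002", f"{series_number:02d}")
print(filled)

def strip_run(filename):
    """Remove the first '_run-<digits>' entity from a file name."""
    return re.sub(r"_run-\d+", "", filename, count=1)

print(strip_run("sub-HEAT002_dir-ap_run-17_epi.nii.gz"))
```

Be careful when stripping 'run' from a folder that contains repeated series: two files can end up with the same name, so check for collisions before renaming anything on disk.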

## Conversion
Once you've defined your bids dictionary, call Convert with the input directory, output directory, and bids dictionary. Optionally, you can add "slurm = True" to submit conversion as a job to the cluster (this requires slurmpy and may not work on all or even most clusters, but if it works it will be MUCH faster).

In [131]:
# define the output directory
bidsdir = pathlib.Path.home() / 'lcni' / 'bidsexample'

### A note about duplicate subjects
The current iteration of dicom2bids allows you to submit a directory with multiple subjects, but it assumes that there's only one of each! In my example input I have two sessions for subject 999 and that's a problem:

In [130]:
!tree {str(example_dicoms)} -L 1

/projects/lcni/dcm/lcni/Burggren/HEAT
|-- HEAT002_20200303_102436
|-- HEAT_999_20191211_105824
|-- HEAT_999_20200127_132053
`-- Phantom_20200116_151959

4 directories, 0 files


I'm going to use a bit of Python to select only the 2020 subjects, and convert one directory at a time instead of running `dicom2bids.Convert(example_dicoms, bidsdir, bd)`. If this were a real study, I'd probably add the "ses" keyword to my dictionary, split my list of subject directories to convert into two lists, and have one with ses = 'one' in the bids dictionary and one with ses = 'two'.

In [188]:
for subjectdir in example_dicoms.glob('*_2020*'):
    print(subjectdir)
    dicom2bids.Convert(subjectdir, bidsdir, bd)

/projects/lcni/dcm/lcni/Burggren/HEAT/HEAT_999_20200127_132053
/projects/lcni/dcm/lcni/Burggren/HEAT/Phantom_20200116_151959
/projects/lcni/dcm/lcni/Burggren/HEAT/HEAT002_20200303_102436
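
For the real-study case, splitting the subject directories by session could be sketched as follows. It assumes folder names end in `_<date>_<time>`, as they do in this example, and orders each subject's scans chronologically. The function and variable names are my own, and the sketch works on names only, so it runs without access to the dicom repository.

```python
from collections import defaultdict

def split_folder_name(name):
    """Split '<subject>_<date>_<time>' into (subject, date + time)."""
    subject, date, time = name.rsplit("_", 2)
    return subject, date + time

folders = [
    "HEAT002_20200303_102436",
    "HEAT_999_20191211_105824",
    "HEAT_999_20200127_132053",
    "Phantom_20200116_151959",
]

# Group scan folders by subject
by_subject = defaultdict(list)
for name in folders:
    subject, stamp = split_folder_name(name)
    by_subject[subject].append((stamp, name))

# Session 1 holds each subject's earliest scan, session 2 the next, and so on
sessions = defaultdict(list)
for subject, scans in by_subject.items():
    for index, (_, name) in enumerate(sorted(scans), start=1):
        sessions[index].append(name)

for index, names in sessions.items():
    print(index, names)
```

Each list would then be converted separately, with the matching ses value in the bids dictionary used for that list.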


In [178]:
!tree {str(bidsdir)}

/home/jolinda/lcni/bidsexample
|-- dataset_description.json
|-- participants.json
|-- participants.tsv
|-- sub-HEAT002
|   |-- anat
|   |   |-- sub-HEAT002_run-07_T1w.json
|   |   `-- sub-HEAT002_run-07_T1w.nii.gz
|   |-- fmap
|   |   |-- sub-HEAT002_dir-ap_run-17_epi.bval
|   |   |-- sub-HEAT002_dir-ap_run-17_epi.bvec
|   |   |-- sub-HEAT002_dir-ap_run-17_epi.json
|   |   |-- sub-HEAT002_dir-ap_run-17_epi.nii.gz
|   |   |-- sub-HEAT002_dir-lr_run-20_epi.bval
|   |   |-- sub-HEAT002_dir-lr_run-20_epi.bvec
|   |   |-- sub-HEAT002_dir-lr_run-20_epi.json
|   |   |-- sub-HEAT002_dir-lr_run-20_epi.nii.gz
|   |   |-- sub-HEAT002_dir-pa_run-18_epi.bval
|   |   |-- sub-HEAT002_dir-pa_run-18_epi.bvec
|   |   |-- sub-HEAT002_dir-pa_run-18_epi.json
|   |   |-- sub-HEAT002_dir-pa_run-18_epi.nii.gz
|   |   |-- sub-HEAT002_dir-rl_run-19_epi.bval
|   |   |-- sub-HEAT002_dir-rl_run-19_epi.bvec
|   |   |-- sub-HEAT002_dir-rl_run-19_epi.json
|   |   `-- sub-HEAT002_dir-rl_run-19_epi.nii.gz
|   |-- func


In [179]:
# we automatically created a participants file
with open(bidsdir / 'participants.tsv') as f:
    print(f.read())

participant_id	age	sex
sub-HEAT999	25	M
sub-Phantom	24	O
sub-HEAT002	47	F



In [180]:
import json
with open(bidsdir / 'participants.json') as f:
    print(json.dumps(json.load(f), indent = 4))

{
    "age": {
        "Description": "age of participant",
        "Units": "years"
    },
    "sex": {
        "Description": "sex of participant",
        "Levels": {
            "M": "male",
            "F": "female",
            "O": "other"
        }
    }
}
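
The two participants files are meant to agree with each other: every value in the `sex` column should be one of the `Levels` in the .json file. A quick consistency check, sketched here on inline copies of the contents shown above so that it runs anywhere:

```python
import csv
import io
import json

participants_tsv = (
    "participant_id\tage\tsex\n"
    "sub-HEAT999\t25\tM\n"
    "sub-Phantom\t24\tO\n"
    "sub-HEAT002\t47\tF\n"
)
sidecar = json.loads('{"sex": {"Levels": {"M": "male", "F": "female", "O": "other"}}}')

rows = list(csv.DictReader(io.StringIO(participants_tsv), delimiter="\t"))
allowed = set(sidecar["sex"]["Levels"])

# Participants whose 'sex' value is not described in the sidecar
unexpected = [row["participant_id"] for row in rows if row["sex"] not in allowed]
print(unexpected)
```

To run it on real output, read `participants.tsv` and `participants.json` from the bids directory instead of the inline strings.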


It also creates a dataset_description file for you. On Talapas it will even attempt to fill in the authors with your name and the PI on the project (using the pirg structure in the dicom repository; in this example it gets it wrong, but we can always go back and edit it later). If you don't want either of these files created, specify description_file = False and/or participant_file = False in your call to dicom2bids.Convert().

In [181]:
with open(bidsdir / 'dataset_description.json') as f:
    print(json.dumps(json.load(f), indent = 4))

{
    "Name": "HEAT",
    "BIDSVersion": "1.3.0",
    "Authors": [
        "Fred Sabb",
        "Jolinda Smith"
    ],
    "Acknowledgements": "BIDS conversion was performed using dcm2niix and dicom2bids.",
    "ReferencesAndLinks": [
        "Li X, Morgan PS, Ashburner J, Smith J, Rorden C (2016) The first step for neuroimaging data analysis: DICOM to NIfTI conversion. J Neurosci Methods. 264:47-56. doi: 10.1016/j.jneumeth.2016.03.001."
    ]
}


One more thing -- in our dicom files, certain fields used by dcm2niix to write the .json files are wrong. We can fix this if we know what they are.

In [190]:
with open(next(bidsdir.rglob('sub*.json'))) as f:
    j = json.load(f)
    print(j['InstitutionName'])
    print(j['InstitutionalDepartmentName'])
    print(j['InstitutionAddress'])

Lewis_Building
Department
Franklin_Blvd_1440_Eugene_District_US_97403


We need to make a dictionary object with the correct values; this particular problem is pervasive at LCNI so I've included it in the module.

In [165]:
dicom2bids.lcni_corrections

{'InstitutionName': 'University of Oregon',
 'InstitutionalDepartmentName': 'LCNI',
 'InstitutionAddress': 'Franklin_Blvd_1440_Eugene_Oregon_US_97403'}

We can convert again and pass this dictionary as the "json_mod" argument. Since we only need to change the .json files, we can include the dcm2niix flags '-b o -w 0' to skip converting the .nii.gz files. We don't have to set participant_file = False and description_file = False, since those files will come through unchanged, but I will for illustration purposes. (-w 0 is 'ignore duplicates'. You might think we want -w 1, 'overwrite', but that will delete the existing dicom files.)

In [191]:
for subjectdir in example_dicoms.glob('*_2020*'):
    print(subjectdir)
    dicom2bids.Convert(subjectdir, bidsdir, bd, json_mod = dicom2bids.lcni_corrections, dcm2niix_flags= '-b o -w 0', participant_file = False, description_file = False)

/projects/lcni/dcm/lcni/Burggren/HEAT/HEAT_999_20200127_132053
/projects/lcni/dcm/lcni/Burggren/HEAT/Phantom_20200116_151959
/projects/lcni/dcm/lcni/Burggren/HEAT/HEAT002_20200303_102436


In [195]:
with open(next(bidsdir.rglob('sub*.json'))) as f:
    j = json.load(f)
    print(j['InstitutionName'])
    print(j['InstitutionalDepartmentName'])
    print(j['InstitutionAddress'])

University of Oregon
LCNI
Franklin_Blvd_1440_Eugene_Oregon_US_97403
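
Conceptually, a correction like this amounts to overwriting a few keys in each .json sidecar. The sketch below shows that idea on a throwaway file. It is an assumption about the effect, not the actual json_mod implementation, and `apply_corrections`, the demo file name, and the `RepetitionTime` value are placeholders made up for illustration.

```python
import json
import tempfile
from pathlib import Path

corrections = {
    "InstitutionName": "University of Oregon",
    "InstitutionalDepartmentName": "LCNI",
    "InstitutionAddress": "Franklin_Blvd_1440_Eugene_Oregon_US_97403",
}

def apply_corrections(sidecar_path, corrections):
    """Overwrite the given keys in a JSON sidecar, leaving all other keys alone."""
    path = Path(sidecar_path)
    metadata = json.loads(path.read_text())
    metadata.update(corrections)
    path.write_text(json.dumps(metadata, indent=4))
    return metadata

with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "sub-demo_T1w.json"  # placeholder file
    sidecar.write_text(json.dumps({
        "InstitutionName": "Lewis_Building",
        "InstitutionalDepartmentName": "Department",
        "InstitutionAddress": "Franklin_Blvd_1440_Eugene_District_US_97403",
        "RepetitionTime": 2.5,  # placeholder value, untouched by the corrections
    }))
    fixed = apply_corrections(sidecar, corrections)
    print(fixed["InstitutionName"])
```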
