# Uploading An ASL Acquisition

Typically on Flywheel, we can upload BIDS-ready data using `fw import bids`. Unfortunately, ASL data doesn't yet have an official BIDS specification, which means Flywheel doesn't know how to read it in automatically, and the import fails partway through:

```
(base) ttapera@dopamine:/storage/ttapera/RBC/data$ fw import bids --debug --project RBC_PNC /storage/ttapera/RBC/data/131160/ bbl
DEBUG: CLI Version: 12.1.1
DEBUG: CLI Args: ['/home/ttapera/.cache/flywheel/python-3.6.6/lib/python3.6/site-packages/flywheel_cli/main.pyc', 'import', 'bids', '--debug', '--project', 'RBC_PNC', '131160/', 'bbl']
DEBUG: Platform: Linux-4.4.0-177-generic-x86_64-with-debian-stretch-sid
DEBUG: System Encoding: UTF-8
DEBUG: Python Version: 3.6.6 (default, Jun 27 2018, 22:42:57)
[GCC 6.4.0]
DEBUG: SDK Version: 12.1.0
DEBUG: Flywheel Site URL: https://upenn.flywheel.io:443/api
INFO: Verifying directory exists
INFO: Project (RBC_PNC) was found. Adding data to existing project.
WARNING: Project has enabled rules, these may overwrite BIDS data. Either disable rules or run bids curation gear after data is uploaded.
Continue upload? (yes/no): yes
INFO: Subject (sub-2791617373) was found. Adding data to existing subject.
INFO: Session (ses-PNC1) for subject (sub-2791617373) was found. Adding data to existing session.
INFO: Acquisition (m0scan) not found. Creating new acquisition for session 5f130e858a33e0393ed3495a.
INFO: Acquisition (acq-se_asl) not found. Creating new acquisition for session 5f130e858a33e0393ed3495a.
INFO: Acquisition (acq-gre_asl) not found. Creating new acquisition for session 5f130e858a33e0393ed3495a.
DEBUG: Uncaught Exception
Traceback (most recent call last):
  File "flywheel_cli/main.pyc", line 62, in main
  File "flywheel_cli/commands/import_bids.pyc", line 60, in import_bids
  File "flywheel_bids/upload_bids.pyc", line 1037, in upload_bids
  File "flywheel_bids/upload_bids.pyc", line 698, in upload_bids_dir
  File "flywheel_bids/upload_bids.pyc", line 567, in handle_subject_folder
  File "flywheel_bids/upload_bids.pyc", line 306, in upload_acquisition_file
  File "flywheel_bids/upload_bids.pyc", line 357, in classify_acquisition
AttributeError: 'NoneType' object has no attribute 'get'
Error: 'NoneType' object has no attribute 'get'
Flywheel CLI 12.1.1 build a8ff8ea35efce4642e35a9221891b30c31857b60 on 2020-06-19 19:41
```

In this notebook we demonstrate how to create a custom parser & uploader for ASL data.

The general strategy is as follows:

1. List all the files we want to import
2. Get the target _session_ object to which we want to upload these files
3. Organise the files into _acquisitions_ that we can create objects out of
4. Create _acquisition_ objects in the session
5. Upload the appropriate files to the correct acquisition

First, set up the flywheel client:

In [93]:
import flywheel
import glob
import os
import pathlib
import re
import json

client = flywheel.Client()

Here are some arguments that go into this script; there's no need to worry about these. The only one used below is `args[1]`, the name of the data directory.

In [16]:
args = '131160,2791617373,sub-2791617373,True'.split(',')

In [17]:
args = ['/storage/ttapera/RBC/PennLINC/Transfer/renameDWI.py'] + args

In [18]:
args

['/storage/ttapera/RBC/PennLINC/Transfer/renameDWI.py',
 '131160',
 '2791617373',
 'sub-2791617373',
 'True']

Here are the files we need to import in BIDS; we use `glob` to do an easy listing of files:

In [92]:
path = '/storage/ttapera/RBC/data/{}/*/*/*/*'.format(args[1])

files = glob.glob(path)

In [68]:
files

['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.json',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.json',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.nii.gz',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_aslcontext.tsv',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.json',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.nii.gz',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_aslcontext.tsv',
 '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.nii.gz']
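
The `glob` pattern above matches every file four levels down, in no guaranteed order. As a minimal sketch of a stricter listing, assuming the `<root>/sub-*/ses-*/perf/<file>` layout seen in the output above, something like the following could be used. The helper name `list_perf_files` is hypothetical and not part of the original script:

```python
import pathlib


def list_perf_files(root):
    """Return a sorted list of files under <root>/sub-*/ses-*/perf/."""
    root = pathlib.Path(root)
    # restrict the match to BIDS-style subject/session folders and the perf directory
    matches = root.glob('sub-*/ses-*/perf/*')
    # keep regular files only and sort them for a reproducible order
    return sorted(str(p) for p in matches if p.is_file())
```

Sorting is not required for the upload itself; it only makes repeated runs easier to compare.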

We need to create an `acquisition` object; we look at this folder of files and create a function for extracting the subject and session labels from the filepath using `regex`...

In [77]:
def get_subject_session(f):
    # use regex to look for the subject label and session label

    sub_search = re.search(r'sub-[a-zA-Z0-9]+', f)

    # re.search returns None when nothing matches, so check the match itself
    assert sub_search is not None, 'no subject label found in {}'.format(f)

    subject = sub_search.group(0)

    ses_search = re.search(r'ses-[a-zA-Z0-9]+', f)

    assert ses_search is not None, 'no session label found in {}'.format(f)

    session = ses_search.group(0)

    # return as a tuple
    return (subject, session)

In [70]:
get_subject_session(files[0])

('sub-2791617373', 'ses-PNC1')
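
The same lookup can be done with a single pattern and named groups. This is only a sketch under the assumption that the subject label always appears before the session label in the path; `LABELS` and `parse_labels` are hypothetical names introduced for illustration:

```python
import re

# one pattern with named groups for both labels; assumes the session label
# appears somewhere after the subject label in the path
LABELS = re.compile(r'(?P<subject>sub-[a-zA-Z0-9]+).*?(?P<session>ses-[a-zA-Z0-9]+)')


def parse_labels(path):
    """Return (subject, session), or raise ValueError if either label is missing."""
    match = LABELS.search(path)
    if match is None:
        raise ValueError('no sub-/ses- labels found in {!r}'.format(path))
    return match.group('subject'), match.group('session')
```

Raising a `ValueError` with the offending path makes a badly named file easier to track down than a bare failure would.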

...and a function to extract the acquisition label from a file name. Several files can share one label, since an acquisition can have more than one file in it.

In [60]:
def create_acquisition_label(f):
    
    # get the subject and session labels so that we can insert them into a regex search string
    subject, session = get_subject_session(f)

    # use pathlib to get the stem (filename without slashes and extensions)
    stem = pathlib.Path(f).stem    
    
    # sometimes we need to keep removing extensions, e.g. 'file.nii.gz'; use a while loop
    while '.' in stem:
        stem = pathlib.Path(stem).stem

    # create the regex search string
    regex = '(?<={}_{}_).+'.format(subject, session)

    # run regex
    acquisition_label = re.search(regex, stem).group(0)
    return(acquisition_label)

In [69]:
acquisitions = [create_acquisition_label(x) for x in files]    
acquisitions

['m0scan',
 'acq-se_asl',
 'acq-gre_asl',
 'acq-gre_aslcontext',
 'acq-gre_asl',
 'acq-se_asl',
 'acq-se_aslcontext',
 'm0scan']
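
The `while` loop above peels extensions off one at a time. An alternative sketch, assuming BIDS-style names in which the first dot always starts the extension, splits the name once and drops the `sub-` and `ses-` entities by prefix. Both helper names are hypothetical:

```python
import pathlib


def bids_stem(path):
    """Return the file name with every extension removed, e.g. 'a_asl.nii.gz' -> 'a_asl'."""
    name = pathlib.Path(path).name
    # everything before the first dot; assumes no dots in the label part of the name
    return name.split('.', 1)[0]


def label_after_session(path):
    """Return the part of the stem that follows the sub-* and ses-* entities."""
    parts = bids_stem(path).split('_')
    # drop the subject and session entities, keep the rest in order
    rest = [p for p in parts if not p.startswith(('sub-', 'ses-'))]
    return '_'.join(rest)
```

For the files listed above, this should give the same labels as `create_acquisition_label`, without building a regex from the subject and session strings.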

We loop over the files, creating acquisitions and assigning the files to each:

In [71]:
acquisitions = {}

for f in files:
    
    acq = create_acquisition_label(f)
    
    if acq not in acquisitions.keys():
        acquisitions[acq] = [f]
    else:
        acquisitions[acq].append(f)
    

In [72]:
acquisitions

{'m0scan': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.json',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.nii.gz'],
 'acq-se_asl': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.json',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.nii.gz'],
 'acq-gre_asl': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.nii.gz',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.json'],
 'acq-gre_aslcontext': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_aslcontext.tsv'],
 'acq-se_aslcontext': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_aslcontext.tsv']}
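
The if/else bookkeeping in the loop can also be expressed with `collections.defaultdict`. A minimal sketch follows; `group_by_label` is a hypothetical helper that takes the labelling function as an argument, so it does not depend on any particular version of `create_acquisition_label`:

```python
from collections import defaultdict


def group_by_label(files, label_func):
    """Group file paths by the label that label_func returns for each path."""
    groups = defaultdict(list)
    for f in files:
        # a missing key is created as an empty list on first access
        groups[label_func(f)].append(f)
    # convert back to a plain dict so that later lookups of unknown labels raise KeyError
    return dict(groups)
```

With the names used in this notebook, the call would be `group_by_label(files, create_acquisition_label)`.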

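There's a special case here: each `aslcontext.tsv` file has ended up in an acquisition of its own. We want it grouped with its matching `asl` files instead (the TSV itself is later uploaded to the session object), so we add a special-case adjustment to the function: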

In [78]:
def create_acquisition_label(f):

       
    # get the subject and session labels so that we can insert them into a regex search string
    subject, session = get_subject_session(f)

    # use pathlib to get the stem (filename without slashes and extensions)
    stem = pathlib.Path(f).stem    
    
    # sometimes we need to keep removing extensions, e.g. 'file.nii.gz'; use a while loop
    while '.' in stem:
        stem = pathlib.Path(stem).stem

    # remove the trailing word 'context' so the TSV groups with its asl acquisition
    if stem.endswith('context'):
        stem = stem[:-len('context')]
    
    # create the regex search string
    regex = '(?<={}_{}_).+'.format(subject, session)

    # run regex
    acquisition_label = re.search(regex, stem).group(0)
    return(acquisition_label)

In [79]:
acquisitions = {}

for f in files:
    
    acq = create_acquisition_label(f)
    
    # if the key does not exist, create it and assign the value as a list with this file
    if acq not in acquisitions.keys():
        acquisitions[acq] = [f]
    # otherwise, if the key exists, append the file to that list of files
    else:
        acquisitions[acq].append(f)

In [80]:
acquisitions

{'m0scan': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.json',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.nii.gz'],
 'acq-se_asl': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.json',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.nii.gz',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_aslcontext.tsv'],
 'acq-gre_asl': ['/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.nii.gz',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_aslcontext.tsv',
  '/storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.json']}
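
Before uploading anything, it can help to confirm that every group holds what we expect. The sketch below assumes each acquisition should contain exactly one NIfTI and exactly one JSON sidecar, which is true of the data shown here but is an assumption in general. `check_acquisition_files` is a hypothetical helper:

```python
def check_acquisition_files(acquisitions):
    """Return {label: [problems]}; an empty dict means every group looks complete."""
    problems = {}
    for label, paths in acquisitions.items():
        issues = []
        # count sidecars and images by file extension
        n_json = sum(p.endswith('.json') for p in paths)
        n_nifti = sum(p.endswith(('.nii', '.nii.gz')) for p in paths)
        if n_json != 1:
            issues.append('expected 1 json sidecar, found {}'.format(n_json))
        if n_nifti != 1:
            issues.append('expected 1 nifti, found {}'.format(n_nifti))
        if issues:
            problems[label] = issues
    return problems
```

An empty result from `check_acquisition_files(acquisitions)` suggests the grouping worked as intended.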

Awesome! Now, we need to work on targeting the session object and creating the acquisition object:

In [81]:
# here we use flywheel's finders to target the exact object we're looking for

subject, session = get_subject_session(files[0])

project = client.projects.find_first('label=RBC_PNC')
subject = project.subjects.find_first('label={}'.format(subject))
session = subject.sessions.find_first('label={}'.format(session))
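
If a finder returns `None` when nothing matches (which is our understanding of `find_first`, but treat it as an assumption), a wrong label only surfaces on the following line as an `AttributeError`. A small sketch of a guard is shown below; `require` is a hypothetical helper, not part of the Flywheel SDK:

```python
def require(finder, label, kind):
    """Look up an object by label through a finder-like object, failing loudly if absent."""
    # finder is anything with a find_first(query) method, e.g. client.projects
    found = finder.find_first('label={}'.format(label))
    if found is None:
        raise LookupError('{} with label {!r} was not found'.format(kind, label))
    return found
```

With the objects above this would read, for example, `project = require(client.projects, 'RBC_PNC', 'project')`.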

The session object has the method `add_acquisition`, which creates a labelled acquisition; the acquisition object it returns has an `upload_file` method that we use to upload the data:

In [89]:
for k,v in acquisitions.items():
    print('Processing acquisition:', k)

    new_acquisition = session.add_acquisition(label="{}".format(k))
    
    for file_upload in v:
        
        print('Uploading file ', file_upload)
        new_acquisition.upload_file("{}".format(file_upload))
        new_acquisition = new_acquisition.reload() # update your copy of the object
    print()

Processing acquisition: m0scan
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.json
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.nii.gz

Processing acquisition: acq-se_asl
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.json
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.nii.gz
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_aslcontext.tsv

Processing acquisition: acq-gre_asl
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.nii.gz
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_aslcontext.tsv
Uploading file  /storage/ttapera/RBC/data/131160/sub-27916
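
Running the loop above a second time would call `add_acquisition` again for the same labels, which may leave duplicate acquisitions in the session. A hedged sketch of a get-or-create step follows. It assumes that `session.acquisitions` offers the same `find_first` finder as the project and subject objects used earlier, and `get_or_create_acquisition` is a hypothetical helper:

```python
def get_or_create_acquisition(session, label):
    """Reuse an acquisition with this label if the session has one, otherwise create it."""
    # assumption: session.acquisitions behaves like the finders on project and subject
    existing = session.acquisitions.find_first('label={}'.format(label))
    if existing is not None:
        return existing
    return session.add_acquisition(label=label)
```

In the loop, `session.add_acquisition(label=k)` would then become `get_or_create_acquisition(session, k)`.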

Importantly, though, we don't want to upload the `json` sidecar to the acquisition -- instead, we want to add this `json` as _metadata_ to the nifti file. To do this we'll read in the `json` data and convert it to a dictionary that Flywheel can understand and ingest, including BIDS fields:

In [151]:
def build_metadata(file_list):
    # build a dictionary of metadata to add to the file
    
    # get the json file and nifti file from this acquisition
    json_file = [f for f in file_list if f.endswith('.json')][0]
    nifti_file = [f for f in file_list if '.nii' in f][0]

    # open it
    with open(json_file, 'r') as read_file:
        json_data = json.load(read_file)

    # add important BIDS fields
    bids = {
        'Acq': '',
        'Dir': '',
        'Filename': '',
        'Folder': '',
        'ignore': False,
        'Modality': '',
        'Path': '',
        'Run': ''
    }

    # to fill these BIDS fields, we extract them from the filename:
    def find_value(string, key):

        regex = r'(?<={}-)[a-zA-Z0-9]+'.format(key)

        target_key = re.search(regex, string)

        # re.search returns None when the key is absent from the filename
        if target_key is None:
            return ''
        return target_key.group(0)

    bids['Acq'] = find_value(nifti_file, 'acq')
    bids['Dir'] = find_value(nifti_file, 'dir')
    bids['Run'] = find_value(nifti_file, 'run')

    bids['Filename'] = pathlib.Path(nifti_file).name
    bids['Folder'] = pathlib.Path(nifti_file).parents[0].name
    bids['Path'] = '/'.join(list(pathlib.Path(nifti_file).resolve().parts[-4:-1]))
    mod = create_acquisition_label(pathlib.Path(nifti_file).stem)
    bids['Modality'] = mod[mod.find('_')+1:]
    
    json_data['BIDS'] = bids
    return(json_data)



In [161]:
build_metadata(acquisitions['acq-se_asl'])

{'AcquisitionDateTime': '2012-02-24T18:19:36.037500',
 'AcquisitionDuration': 123,
 'AcquisitionMatrixPE': 96,
 'AcquisitionNumber': 1,
 'AcquisitionTime': '18:19:36.037500',
 'AverageB1LabelingPulses': 0,
 'AverageLabelingGradient': 34,
 'BackgroundSuppression': 'Yes',
 'BandwidthPerPixelPhaseEncode': 30.193,
 'BaseResolution': 96,
 'BolusCutOffDelayTime': 0,
 'BolusCutOffFlag': 'False',
 'BolusCutOffTechnique': 'False',
 'BolusCutOffTimingSequence': 'False',
 'ConversionSoftware': 'dcm2niix',
 'ConversionSoftwareVersion': 'v1.0.20180918  (JP2:OpenJPEG) GCC4.8.4',
 'DeidentificationMethod': 'Penn_BSC_profile_v1.0',
 'DerivedVendorReportedEchoSpacing': 0.000690005,
 'DeviceSerialNumber': '35069',
 'EchoTime': 0.029,
 'EffectiveEchoSpacing': 0.000345003,
 'FlipAngle': 90,
 'ImageOrientationPatientDICOM': [1, 0, 0, 0, 1, 0],
 'ImageType': ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'MOSAIC'],
 'InPlanePhaseEncodingDirectionDICOM': 'COL',
 'InstitutionAddress': 'Spruce_Street_3400_Philadelphia_343
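
The `Acq`, `Dir` and `Run` fields are looked up one key at a time with a regex. As an illustration of the same idea, the sketch below splits a BIDS-style file name into all of its `key-value` entities at once. It assumes names of the form `key-value_key-value_suffix.ext`, and `parse_entities` is a hypothetical helper:

```python
def parse_entities(filename):
    """Return a dict of BIDS key-value entities plus the trailing suffix."""
    # drop any directories, then everything from the first dot onwards
    name = filename.split('/')[-1].split('.', 1)[0]
    parts = name.split('_')
    entities = {}
    for part in parts[:-1]:
        key, sep, value = part.partition('-')
        # only keep chunks that really look like key-value pairs
        if sep:
            entities[key] = value
    # the last chunk is the suffix, e.g. 'asl' or 'm0scan'
    entities['suffix'] = parts[-1]
    return entities
```

A missing entity can then be read with a default, for example `parse_entities(name).get('run', '')`.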

Looks great! Now we can add this to our loop:

In [168]:
for k,v in acquisitions.items():
    print('Processing acquisition:', k)

    new_acquisition = session.add_acquisition(label="{}".format(k))
    
    print('Building acquisition metadata')
    
    meta = build_metadata(v)

    for file_upload in v:
        
        # we no longer need to upload jsons
        if '.json' in file_upload:
            continue
            
        # upload TSVs straight to the session object
        elif 'context' in file_upload:
            print('Updating TSV metadata...')
            # the context file doesn't need as much data, 
            # just the BIDS Filename, Path, and Folder
            # copy the BIDS dict so the rename below doesn't change the nifti's metadata
            subset_meta = dict(meta['BIDS'])
            subset_meta['Filename'] = subset_meta['Filename'].replace('asl.nii.gz', 'aslcontext.tsv')
            
            print('Uploading TSV to session')
            session.upload_file(file_upload)
            session = session.reload()
            session.update_file_info(subset_meta['Filename'], {'BIDS': subset_meta})
            session = session.reload()
        
        # upload the nifti to the new acquisition
        elif '.nii' in file_upload:
            print('Uploading file ', file_upload)
            new_acquisition.upload_file("{}".format(file_upload))
            new_acquisition = new_acquisition.reload() # update your copy of the object
            
            # update the metadata
            print('Updating nifti metadata...')
            new_acquisition.update_file_info(meta['BIDS']['Filename'], meta)
            new_acquisition = new_acquisition.reload() # update your copy of the object

            
    print()

Processing acquisition: m0scan
Building acquisition metadata
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_m0scan.nii.gz
Updating nifti metadata...

Processing acquisition: acq-se_asl
Building acquisition metadata
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-se_asl.nii.gz
Updating nifti metadata...
Updating TSV metadata...
Uploading TSV to session

Processing acquisition: acq-gre_asl
Building acquisition metadata
Uploading file  /storage/ttapera/RBC/data/131160/sub-2791617373/ses-PNC1/perf/sub-2791617373_ses-PNC1_acq-gre_asl.nii.gz
Updating nifti metadata...
Updating TSV metadata...
Uploading TSV to session
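
The TSV branch renames the BIDS `Filename` for the context file. Whether that rename also changes the metadata attached to the NIfTI depends on whether the dictionary is shared or copied. Here is a small Flywheel-free illustration with made-up values:

```python
meta = {'EchoTime': 0.029,
        'BIDS': {'Filename': 'sub-01_ses-A_asl.nii.gz', 'Folder': 'perf'}}

# plain assignment: both names point at the same dictionary
alias = meta['BIDS']

# shallow copy: an independent dictionary with the same keys and values
tsv_bids = dict(meta['BIDS'])

# renaming on the copy leaves meta['BIDS'] untouched
tsv_bids['Filename'] = tsv_bids['Filename'].replace('asl.nii.gz', 'aslcontext.tsv')
```

Had the rename been applied to `alias` instead, `meta['BIDS']['Filename']` would change too, and any later step that reads it would look for the TSV name rather than the NIfTI name.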



Now we wrap all of the above in a function:

In [None]:
def build_and_upload_asl(args):
    
    # step 1: get files
    
    path = '/storage/ttapera/RBC/data/{}/*/*/*/*'.format(args[1])
    files = glob.glob(path)
    
    # step 2: get the asl acquisitions
    acquisitions = {}

    for f in files:

        acq = create_acquisition_label(f)

        # if the key does not exist, create it and assign the value as a list with this file
        if acq not in acquisitions.keys():
            acquisitions[acq] = [f]
        # otherwise, if the key exists, append the file to that list of files
        else:
            acquisitions[acq].append(f)
            
    # step 3: extract subject and session labels; initialise flywheel target object
    
    subject, session = get_subject_session(files[0])

    project = client.projects.find_first('label=RBC_PNC')
    subject = project.subjects.find_first('label={}'.format(subject))
    session = subject.sessions.find_first('label={}'.format(session))
    
    # step 4: upload data
    
    for k,v in acquisitions.items():
        print('Processing acquisition:', k)

        new_acquisition = session.add_acquisition(label="{}".format(k))

        print('Building acquisition metadata')

        meta = build_metadata(v)

        for file_upload in v:

            # we no longer need to upload jsons
            if '.json' in file_upload:
                continue

            # upload TSVs straight to the session object
            elif 'context' in file_upload:
                print('Updating TSV metadata...')
                # the context file doesn't need as much data, 
                # just the BIDS Filename, Path, and Folder
                # copy the BIDS dict so the rename below doesn't change the nifti's metadata
                subset_meta = dict(meta['BIDS'])
                subset_meta['Filename'] = subset_meta['Filename'].replace('asl.nii.gz', 'aslcontext.tsv')

                print('Uploading TSV to session')
                session.upload_file(file_upload)
                session = session.reload()
                session.update_file_info(subset_meta['Filename'], {'BIDS': subset_meta})
                session = session.reload()

            # upload the nifti to the new acquisition
            elif '.nii' in file_upload:
                print('Uploading file ', file_upload)
                new_acquisition.upload_file("{}".format(file_upload))
                new_acquisition = new_acquisition.reload() # update your copy of the object

                # update the metadata
                print('Updating nifti metadata...')
                new_acquisition.update_file_info(meta['BIDS']['Filename'], meta)
                new_acquisition = new_acquisition.reload() # update your copy of the object


        print()
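
To run this over many sessions, one option is a small driver that keeps going when a single upload fails and reports the failures at the end. This is only a sketch: `run_batch` is a hypothetical helper, and the upload function is passed in as an argument so the driver itself needs no Flywheel connection:

```python
def run_batch(arg_rows, upload_func):
    """Call upload_func on each row of arguments, collecting failures instead of stopping."""
    failures = []
    for row in arg_rows:
        try:
            upload_func(row)
        except Exception as err:  # broad on purpose: record the error and move on
            failures.append((row, repr(err)))
    return failures
```

With the function defined above, the call would be `run_batch(list_of_args, build_and_upload_asl)`, where `list_of_args` holds one `args` list per session.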