# NEUFEP-ME study: Data setup workflow

This notebook contains the workflow required to convert all raw data into the formats, types and structures necessary for the main data processing pipeline, i.e. to get the data ready to be processed.

## Study overview

The main purpose of the study is to investigate methods to improve the quality of real-time functional magnetic resonance imaging (fMRI) data. These improvements are for future applications in real-time fMRI neurofeedback, a method where participants are presented with visual feedback of their brain activity while they are inside the MRI scanner, and then asked to regulate the level of feedback. We have developed real-time multi-echo EPI acquisition sequences and processing methods, and this study aims to collect data from volunteers in order to validate these new methods. No neurofeedback is provided.

```
The scan session will consist of a number of scan sequences, some of which will require you to look at pictures or perform a task, and others just to lie still in the scanner. The total time in the scanner will be around 1 hour, with a break in the middle. Volunteers have to be right-handed and healthy, and should have no history of, or be under current treatment for, psychiatric or neurological conditions. If these inclusion criteria do not fit you, you will unfortunately not be able to participate in the study. Other reasons for not being able to participate include being pregnant or having metal implants.
```

## Data

Per subject, all of the following data were collected during one scan session (all functional scans are multi-echo EPI):

| Nr | Name  | Scan Type | Description | Format |
| :--- | :--- | :--- | :--- | :--- |
| 1 | T1w | Anatomical | Standard T1-weighted sequence | NIfTI |
| 2 | run1_BOLD_rest | Functional | Resting state | PAR/REC, XML/REC, DICOM |
| 3 | run1_BOLD_task1 | Functional | Motor - finger tapping | PAR/REC, XML/REC, DICOM |
| 4 | run1_BOLD_task2 | Functional | Emotion - shape/face matching | PAR/REC, XML/REC, DICOM |
| 5 | run2_BOLD_rest | Functional | Resting state | PAR/REC, XML/REC, DICOM |
| 6 | run2_BOLD_task1 | Functional | Motor mental - imagined finger tapping | PAR/REC, XML/REC, DICOM |
| 7 | run2_BOLD_task2 | Functional | Emotion mental - recalling emotional memories | PAR/REC, XML/REC, DICOM |
| 8 | Stimulus timing | Peripheral measure | Stimulus and response timing for all tasks, i.e. x4 | Eprime .dat and .txt |
| 9 | Physiology | Peripheral measure | Cardiac + respiratory traces for all runs, i.e. x6 | Philips "scanphyslog" |

## Data setup goals

For each dataset (i.e. for each subject) we have to:

1. Move all files into a machine readable directory structure
2. Rename all image files in this directory structure such that BIDS tags are findable
3. Convert data to BIDS:
   1. Run `bidsify` to convert the image data to BIDS. This includes conversion of PAR/REC to NIfTI using `dcm2niix`. It should also include anonymization, which currently does not work (cause unknown).
   2. Deface the T1w image using `pydeface`
   3. Run the eprime conversion script to convert stimulus and response timings to BIDS (still need to figure out this format)
   4. Run `scanphyslog2bids` (or a Matlab script if needed) to convert physiology data to BIDS
4. Run the BIDS validator
6. Create summary tables and plots using `pybids`

## Data quality notes

#### Problem subjects: EPI acquisition

- **sub-008** - last scan not finished (recon issue), need to rescan
- **sub-014** - last scan not finished (recon issue), need to rescan

#### Problem subjects: scanphyslog

UPDATE 13 January 2020:
*I incorrectly attributed the emotion1 scan restart problem to sub-015; it was actually sub-016, as confirmed on the RIS system (see photos). This also explains the "extra small file" previously mentioned in the sub-016 dataset.*

- **sub-015** - emotion1 scan was started twice because the heart rate sensor was not recording correctly. The second time was fine. Need to inspect earlier scans for the same issue. Also, it looks like scanphyslog didn't restart and kept logging to the same file, resulting in a >12 MB log file. Need to inspect.
- **sub-016** - All image data fine. Have to look at this data on RIS, since scanphyslog has one extra small file that doesn't seem to fit.

#### Problem subjects: T1w display

- **sub-017** - When displaying in Mango, the top slices of the brain are cut off. Need to check whether this is a viewer issue or an acquisition issue, and check other subjects as well.

#### Steps done for now (8 Jan 2020):

- **sub-008** - deleted from organised data
- **sub-014** - deleted from organised data
- **sub-015** - removed from BIDS data, kept in organised data separate folder, scanphyslog renamed, need to inspect
- **sub-016** - removed from BIDS data, kept in organised data separate folder, scanphyslog NOT renamed, need to inspect

#### Steps to do (UPDATE 13 Jan 2020):

- **sub-015** - move back to BIDS and ensure everything is correctly named and converted
- **sub-016** - inspect everything, ensure this is correctly named and converted (ignore small file).


## Data processing notes

Pybids gave an `unknown locale UTF-8` error. This was fixed by adding values to the path variable upon conda env startup (and removing them upon deactivation); see: https://coderwall.com/p/-k_93g/mac-os-x-valueerror-unknown-locale-utf-8-in-python

Need to build eprime code into a script, currently in a notebook

Remember for BIDS-validator:
- Delete the original T1w file and rename the defaced file to the original T1w name
- Add `events.tsv`
- `SliceTiming` is required for all functionals; need to find a better place to implement this in future. For now, just a script that runs through all files and updates the JSON with precalculated slice times based on known parameters. This is needed since Philips does not supply slice timing information in PAR/REC or DICOM.
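
The slice-timing step described above could be sketched roughly as follows. This is a minimal illustration, not the script actually used: the function names are hypothetical, the `sub-*/func/*_bold.json` layout is an assumed BIDS naming pattern, and the slice times assume ascending sequential acquisition evenly spread over one TR. The defaults (TR of 2 s, 34 slices) are the values used in the validator cell of this notebook.

```python
import json
from pathlib import Path


def compute_slice_timing(tr, n_slices):
    """Return slice onset times in seconds for one volume."""
    # Assumption: slices are acquired one after another in ascending order,
    # evenly spaced over the TR (no multiband, no interleaving).
    return [round(i * tr / n_slices, 4) for i in range(n_slices)]


def add_slice_timing(bids_dir, tr=2.0, n_slices=34):
    """Write a SliceTiming field into every functional JSON sidecar."""
    slice_timing = compute_slice_timing(tr, n_slices)
    updated = []
    # Assumed layout: <bids_dir>/sub-*/func/*_bold.json
    for json_path in sorted(Path(bids_dir).glob('sub-*/func/*_bold.json')):
        metadata = json.loads(json_path.read_text())
        metadata['SliceTiming'] = slice_timing  # existing fields are kept
        json_path.write_text(json.dumps(metadata, indent=4))
        updated.append(json_path)
    return updated
```

Calling `add_slice_timing(bids_data_dir)` would then update all sidecars in place and return the list of files it touched, which is convenient for a quick sanity check before running the validator.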


## Required packages / software

- bidsify: https://github.com/NILAB-UvA/bidsify/tree/master/bidsify
- pydeface: https://github.com/poldracklab/pydeface
- fsl (required for pydeface): https://fsl.fmrib.ox.ac.uk/fsl/fslwiki
- bids_validator: https://github.com/bids-standard/bids-validator; https://bids-standard.github.io/bids-validator/
- convert-eprime: https://github.com/tsalo/convert-eprime
- scanphyslog2bids: https://github.com/lukassnoek/scanphyslog2bids

In [1]:
%matplotlib inline
import os
import shutil
import glob
import matplotlib.pyplot as plt
import numpy as np
import nibabel as nib
import bidsify
import pydeface
# import nipype
from bids_validator import BIDSValidator
from convert_eprime.utils import remove_unicode
from convert_eprime.convert import text_to_csv

## Step 0: initialize variables, directories, filenames, etc


In [2]:
# Dependent on structure and filenames in current data on external drive

data_dir = '/Volumes/Stephan_WD/NEUFEPME_data/'
org_data_dir = '/Volumes/Stephan_WD/NEUFEPME_data_organised/'
bids_data_dir = '/Volumes/Stephan_WD/NEUFEPME_data_BIDS/'

## Step 1+2: Move and rename all files into a machine readable and BIDS-ready directory structure


#### ONCE ONLY: copy all raw data, with exceptions, to new directory (leave raw data untouched!)


In [27]:
# # Ignore files:
# # - with 'FSL', this is a proxy for the FSL '.nii' file
# # - with spaces (this is a proxy for the xml/rec files),
# # - the PAR/RECs of T1w file
# src = data_dir
# dst = org_data_dir
# ignore_pattern = shutil.ignore_patterns('*FSL*',
#                                         '* *',
#                                         '*T1W*.PAR',
#                                         '*T1W*.REC')
# shutil.copytree(src, dst, ignore=ignore_pattern)

#### ONCE ONLY: Rename files such that bidsify can read separate tags and convert correctly to BIDS format
See filenames in: "/Users/jheunis/Documents/PYTHON/rtme-fMRI/bidsify_test_2709/sub-01"


In [27]:
# # This means removing the 'SENSE' text from the T1w file such that files with SENSE are all recognised as BOLD
# os.chdir(org_data_dir)
# t1w = glob.glob('*/*T1W*',recursive=True)
# for fn in t1w:
#     os.rename(fn, fn.replace('_SENSE', ''))

#### ONCE ONLY: Rename scanphyslog files based on time of acquisition and relation to functional scans

In [51]:
# Typically 10 files in each subject's 'scanphyslog' directory. If not, throw error.
# scan_names = ['_emotion2', '_motor2', '_rest2', '_emotion1', '_motor1', '_rest1']
# all_subs = next(os.walk(org_data_dir))[1]
# for sub in all_subs:
#     print(sub)
#     os.chdir(org_data_dir + sub + '/scanphyslog')
#     files = glob.glob('*')
#     files.sort(key=os.path.getmtime, reverse=True)
#     if len(files) != 10:
#         print('ERROR: there are {} files. Not renaming files'.format(len(files)))
#     else:
#         for i, file in enumerate(files):
#             if i < 6:
#                 new_file = os.path.splitext(file)[0] + scan_names[i] + os.path.splitext(file)[1]
#                 print(new_file)
#                 os.rename(file, new_file)
#             else:
#                 print(file)
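
Because the renaming above is a once-only, destructive operation, it may help to compute the rename plan first and inspect it before touching any files. The sketch below follows the same logic as the cell above (newest file first, six suffixes, exactly 10 files expected) but only returns the planned pairs. The function name and the `expected_count` parameter are hypothetical.

```python
from pathlib import Path

# Suffixes listed newest file first, as in the cell above
SCAN_NAMES = ['_emotion2', '_motor2', '_rest2', '_emotion1', '_motor1', '_rest1']


def plan_scanphyslog_renames(log_dir, expected_count=10):
    """Return (old_path, new_path) pairs for the six most recent log files.

    Nothing is renamed here; the caller decides whether to apply the plan.
    """
    # Skip hidden files (e.g. .DS_Store), mirroring glob('*')
    files = [p for p in Path(log_dir).iterdir()
             if p.is_file() and not p.name.startswith('.')]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    if len(files) != expected_count:
        raise ValueError(
            f'{log_dir}: expected {expected_count} files, found {len(files)}')
    # zip stops after the six suffixes, so older files are left untouched
    return [(p, p.with_name(p.stem + suffix + p.suffix))
            for p, suffix in zip(files, SCAN_NAMES)]
```

The plan can be printed and checked against the scan log, and then applied with `old.rename(new)` for each pair. Raising an error instead of printing makes problem subjects such as sub-015 and sub-016 stop the loop explicitly.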

## Step 3.1: Convert image data to BIDS

Using `bidsify`

See `me_workflow.ipynb`


In [2]:
# !cd /Volumes/Stephan_WD/NEUFEPME_data_organised && bidsify -v

## Step 3.2: Deface T1w images

Using `pydeface`


In [3]:
# os.chdir(bids_data_dir)
# all_subs = next(os.walk(org_data_dir))[1]
# i = 0
# anat_dir = os.path.join(bids_data_dir, all_subs[0], 'anat')
# os.chdir(anat_dir)
# t1w = glob.glob('*T1w*')
# t1w_fn = t1w[0]
# !cd $anat_dir && pydeface $t1w_fn


#sub 17 already defaced


# for path, subdirs, files in os.walk(anat_dir):
#     for name in files:
#         print(os.path.join(path, name))



200109-13:54:29,22 nipype.utils INFO:
	 Running nipype version 1.2.3 (latest: 1.4.0)
------------
pydeface 2.0
------------
Temporary files:
  /var/folders/v1/00bvrxc97sndvb45mvrzd_2r0000gn/T/tmpgn7_a64b.mat
  /var/folders/v1/00bvrxc97sndvb45mvrzd_2r0000gn/T/tmpl4u6b6lc.nii.gz
Defacing...
  sub-017_T1w.nii
Defaced image saved as:
  sub-017_T1w_defaced.nii
Cleaning up...
Finished.
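
As the output above shows, `pydeface` saves the defaced image next to the original with a `_defaced` suffix. For the BIDS validator, the original T1w has to be removed and the defaced file given the original name. A minimal sketch of that step, assuming the `_defaced` naming seen above (the function name is hypothetical):

```python
import os
from pathlib import Path


def replace_with_defaced(anat_dir, marker='_defaced'):
    """Overwrite each original T1w image with its defaced counterpart."""
    replaced = []
    for defaced in sorted(Path(anat_dir).glob(f'*_T1w{marker}.nii*')):
        original = defaced.with_name(defaced.name.replace(marker, '', 1))
        # os.replace overwrites the destination, so the non-defaced
        # image is removed in the same step
        os.replace(defaced, original)
        replaced.append(original)
    return replaced
```

This is irreversible for the BIDS copy, so it should only be run once the defaced images have been visually checked and the raw data are safely stored elsewhere.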


## Step 3.3: Convert Eprime data to BIDS

Use `neufepme_read_eprime.ipynb`, which uses `convert-eprime`

Perhaps write this into separate functions?


In [None]:
# see: https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/05-task-events.html
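
For orientation, the target of this step is a tab-separated `events.tsv` file per task run, in which BIDS requires `onset` and `duration` columns (in seconds). The sketch below only covers writing such a file with the standard library. The function name is hypothetical, and how the onsets and durations are extracted from the E-Prime output is not shown, since that format still has to be worked out.

```python
import csv


def write_events_tsv(path, events):
    """Write a list of trial dictionaries to a BIDS-style events file."""
    # BIDS requires 'onset' and 'duration'; 'trial_type' is optional
    fieldnames = ['onset', 'duration', 'trial_type']
    rows = sorted(events, key=lambda e: e['onset'])
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter='\t',
                                restval='n/a', extrasaction='ignore',
                                lineterminator='\n')
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Missing values are written as `n/a`, and any extra keys in the trial dictionaries are ignored rather than raising an error.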

## Step 3.4: Convert physiology data to BIDS

Using `scanphyslog2bids`

See: https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/06-physiological-and-other-continuous-recordings.html


In [5]:
from scanphyslog2bids.core import PhilipsPhysioLog
import nibabel as nib
import numpy as np

main_dir = '/Users/jheunis/Desktop/sample-data/sub-neufepmetest/sub-pilot/raw-data'
log_file = main_dir + '/SCANPHYSLOG20191011115018_motor1.log'
out_dir = main_dir + '/my_bids_data'  # where the BIDSified data should be saved
deriv_dir = main_dir + '/my_bids_data/physio'  # where some QC plots should be saved

# fmri_file is used to extract metadata, such as TR and number of volumes
fmri_file = main_dir + '/sub-pilot_task-motor_run-1_echo-2.nii' 
fmri_img = nib.load(fmri_file)
n_dyns = fmri_img.shape[-1]
tr = np.round(fmri_img.header['pixdim'][4], 3)

# Create PhilipsPhysioLog object with info
phlog = PhilipsPhysioLog(f=log_file, tr=tr, n_dyns=n_dyns, sf=500, manually_stopped=False)

# Load in data, do some preprocessing
phlog.load()

# Try to align physio data with scan data, using a particular method
# (either "vol_markers", "gradient_log", or "interpolation")
# phlog.align(trigger_method='vol_markers')  # load and find vol triggers
phlog.align(trigger_method='interpolation')  # load and find vol triggers

# Write out BIDS files
phlog.to_bids(out_dir)  # writes out .tsv.gz and .json files

# Optional: plot some QC graphs for alignment and actual traces
phlog.plot_alignment(out_dir=deriv_dir)  # plots alignment with gradient
phlog.plot_traces(out_dir=deriv_dir)  # plots cardiac/resp traces

2020-01-09 14:04:33,301 [MainThread] [INFO   ]  Found end marker ('0020') at an offset of 32 samples relative to the end of the file.
2020-01-09 14:04:37,376 [MainThread] [INFO   ]  Trimmed off 59 samples from end of file based on the (absence) of a gradient.


CouldNotFindThresholdError: Could not find threshold!

## Step 4: Run BIDS validator

see: https://bids-standard.github.io/bids-validator/

In [None]:
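# First run scripts to ensure that the BIDS JSON sidecars of all functional
# runs contain the required SliceTiming field (not supplied by Philips).
# This assumes ascending sequential slices, evenly spaced over one TR.
TR = 2
N_slices = 34
t_d = TR/N_slices
# Scale an integer index instead of using np.arange with a float step,
# which can return one element too many because of rounding
slice_timing = (np.arange(N_slices) * t_d).round(4).tolist()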





In [6]:
# Check if possible to run validator on a full directory tree

# fn = '/Users/jheunis/Documents/PYTHON/rtme-fMRI/bidsify_test_2709/sub-01/anat/sub-01_T1w.nii'
# BIDSValidator().is_bids(fn)
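
One way to run the check over a full directory tree is sketched below. As far as I understand the `bids_validator` Python package, `is_bids` expects a path relative to the dataset root with a leading forward slash, and it only checks file names, not file contents or metadata. An absolute path like the one in the commented line above would therefore likely be rejected. The helper names are hypothetical.

```python
import os


def iter_bids_relative_paths(bids_root):
    """Yield every file under bids_root as a '/'-prefixed relative path."""
    for dirpath, _dirnames, filenames in os.walk(bids_root):
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), bids_root)
            yield '/' + rel.replace(os.sep, '/')


def find_non_bids_files(bids_root):
    """Return the relative paths that the filename validator rejects."""
    # Imported lazily so the path helper works without the package installed
    from bids_validator import BIDSValidator
    validator = BIDSValidator()
    return [p for p in iter_bids_relative_paths(bids_root)
            if not validator.is_bids(p)]
```

An empty list from `find_non_bids_files(bids_data_dir)` would mean all file names pass. The full validator linked above is still needed for checks such as required JSON fields.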



## Step 6: Create summary tables and plots using `pybids`

See:
- https://bids-standard.github.io/pybids/
- https://github.com/bids-standard/pybids/blob/master/examples/pybids_tutorial.ipynb
- https://mybinder.org/v2/gh/bids-standard/pybids/master

`pybids` is also useful for deconstructing and constructing filenames, which might already be useful in earlier steps of this workflow.
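
A possible starting point for the summary table is sketched below. It uses `BIDSLayout`, `layout.get` and `layout.parse_file_entities` from `pybids` to parse the entities of all BOLD images, and a small pure-Python helper to count files per subject and task. The helper names are hypothetical, and the `return_type='filename'` argument is assumed to match the installed `pybids` version.

```python
from collections import Counter


def count_files_per_subject_task(entity_dicts):
    """Count files for each (subject, task) pair from parsed BIDS entities."""
    return Counter((e.get('subject'), e.get('task')) for e in entity_dicts)


def load_bold_entities(bids_dir):
    """Parse the entities of all BOLD images in a dataset using pybids."""
    from bids import BIDSLayout  # imported lazily; requires pybids
    layout = BIDSLayout(bids_dir)
    files = layout.get(suffix='bold', return_type='filename')
    # Keep image files only, skipping the JSON sidecars
    return [layout.parse_file_entities(f) for f in files
            if f.endswith(('.nii', '.nii.gz'))]
```

With multi-echo data, `count_files_per_subject_task(load_bold_entities(bids_data_dir))` should give the same count for every subject and task, so deviations point to missing or misnamed runs.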