# DWI preprocessing
## using MRtrix, FSL and ANTs
### by Michael Paquette
#### Notes from Mar 17th

In this series of bash notebooks, I describe a basic DWI data preprocessing pipeline.

I will try to stick to this (loose) general format to describe each step:  

- What is the artefact we are trying to correct or the transformation we are trying to achieve  
- Why does this happen  
- What would happen if we didn't correct/modify it  
- Why is this step here (somewhat arbitrary or specific ordering)  
- Important parameters of the tool (and which ones are likely to need fine-tuning)  
- What to look for when doing QC (quality control)

In [3]:
# Like last time, we set up the basic variables with the folders
ROOTDIR='/data/pt_02586/'
SUBJECTDIR=$ROOTDIR'sub_01/'
# and we move to the preprocessing folder
cd "$SUBJECTDIR"'preprocessing'
pwd

/data/pt_02586/sub_01/preprocessing



### Tractography

Tractography is the process of building paths from the diffusion information.  
In essence, it follows the orientation of the FOD to build "streamlines" (lists of linked 3D points).  
In practice, there are many parameters to tractography, which we can roughly describe as:  
- Where to start the streamline (**seeding**)  
- How to pick a new orientation to follow from some point in the brain (**model**)    
- When to stop (**masks**)  

#### Seeding
Streamlines start from "seeds", i.e. points in the brain.  
- WM seeding: seeding from everywhere inside a white matter mask  
- Interface seeding: seeding from a WM-GM interface mask, or simply from the GM (dilated a bit into the WM)  
- ROI seeding: seeding from a specific mask, typically a ROI  

Once a type of seeding is chosen, you need to define how many seeds to launch.  
- Option 1: a fixed number of seeds per voxel in the seeding mask (careful, the total grows quickly)  
- Option 2: a fixed total number of seeds, randomly placed in the seeding mask  
- Option 3: a flexible number of random seeds for a fixed number of streamlines  

Not all seeds will become valid streamlines (most won't, actually, and the success rate is very region dependent).  
With option 3, you set a required number of streamlines instead of a number of seeds, and the tool launches new random seeds until that number is reached.  
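
To see how quickly per-voxel seeding grows, here is a small back-of-the-envelope sketch. The voxel count is a made-up number for illustration; the function names are hypothetical helpers, not MRtrix commands.

```bash
# Total number of seeds for a fixed number of seeds per voxel.
# usage: total_seeds n_voxels seeds_per_voxel
total_seeds() {
  echo $(( $1 * $2 ))
}

# Same, but for seeds placed on a regular 3D grid inside each voxel:
# a grid size of g gives g*g*g seeds per voxel.
# usage: grid_seeds n_voxels grid_size
grid_seeds() {
  echo $(( $1 * $2 * $2 * $2 ))
}

# Assumed mask size of 400000 voxels (you could get the real count
# with, for instance, "fslstats WM.nii.gz -V").
total_seeds 400000 5    # 2000000
grid_seeds  400000 3    # 10800000
```

Even a modest number of seeds per voxel gives millions of seeds for a whole-brain WM mask, which is worth keeping in mind for runtime and file size.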


#### Stepping
In tractography, you take steps following some local orientations.  
The most common case is to take steps of fixed length, e.g. 0.1 mm.  
To properly capture the geometry, we want the step size to be smaller than the voxel size.  
How much smaller depends on the tractography method.  
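
As a rough illustration of what the step size implies, this sketch counts how many fixed-length steps are needed to cover a streamline of a given length. `n_steps` is a hypothetical helper and the 100 mm length is just an example value.

```bash
# Number of fixed-length steps needed to cover a streamline.
# usage: n_steps length_mm step_mm
n_steps() {
  awk -v len="$1" -v step="$2" 'BEGIN { printf "%d\n", len / step + 0.5 }'
}

n_steps 100 0.1   # 1000
n_steps 100 0.5   # 200
```

Smaller steps mean proportionally more points per streamline, so both runtime and file size go up.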

#### Curvature
Typically, you want to limit the curvature at each step, i.e. limit how sharp a turn the streamline can take.  
This parameter depends on the step size. For example, with a step size of 0.1 mm and a maximum angle of 9 degrees per step, the maximum cumulative turn after 1 mm (10 steps) is 90 degrees. To get a similar limit with 0.5 mm steps (2 steps per mm), you would pick a maximum angle of 45 degrees per step.  
You can think of the maximum angle as defining the opening of the cone that restricts the possible next step. 
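
The arithmetic above can be checked with a one-liner. This is only the linear rule of thumb used in these notes (maximum cumulative turn per mm = angle per step / step size); `turn_per_mm` is a hypothetical helper name.

```bash
# Maximum cumulative turn (in degrees) over 1 mm of streamline.
# usage: turn_per_mm step_mm angle_deg_per_step
turn_per_mm() {
  awk -v step="$1" -v angle="$2" 'BEGIN { printf "%g\n", angle / step }'
}

turn_per_mm 0.1 9    # 90
turn_per_mm 0.5 45   # 90
```

Two parameter pairs that give the same number here impose roughly the same curvature limit.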

#### Deterministic vs Probabilistic
DET and PROB are two different approaches to direction picking in tractography.  
- DET: follows the maximum peak of the FOD (within the "cone" of the maximum curvature)  
- PROB: picks a random direction within the cone, with the amplitude of the FOD weighting the probability of each direction  

For a given seed, DET will always produce the same streamline, while PROB will vary, hence their names.  
DET gives cleaner looking tracts, but they typically have a smaller reach because DET struggles in crossing and fanning areas.  
PROB gives much messier looking tracts and has more "false tracts", but it is a more accurate representation of the possible paths. You can think of it as exploring the uncertainty of the data.    

Connectomics is typically done with probabilistic tractography, but you will probably prefer deterministic for paper figures ;)  
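
To make the difference concrete, here is a toy sketch of the two ways of picking a direction. It is not how tckgen implements anything: the "FOD" is just a short list of made-up amplitudes for candidate directions inside the cone, and `pick_direction` is a hypothetical helper.

```bash
# Toy direction picking among candidate directions inside the cone.
# usage: pick_direction det|prob "amp1 amp2 ..." [random_seed]
# Prints the (1-based) index of the chosen direction.
pick_direction() {
  awk -v mode="$1" -v amps="$2" -v seed="${3:-1}" 'BEGIN {
    n = split(amps, a, " ")
    pick = 1
    if (mode == "det") {
      # DET: always take the direction with the largest amplitude
      for (i = 2; i <= n; i++) if (a[i] + 0 > a[pick] + 0) pick = i
    } else {
      # PROB: draw a direction with probability proportional to amplitude
      srand(seed)
      total = 0
      for (i = 1; i <= n; i++) total += a[i]
      r = rand() * total
      acc = 0
      pick = n
      for (i = 1; i <= n; i++) { acc += a[i]; if (r < acc) { pick = i; break } }
    }
    print pick
  }'
}

pick_direction det  "2 7 1"      # always the 2nd direction (largest amplitude)
pick_direction prob "2 7 1" 42   # depends on the random seed
```

Calling the PROB variant with different random seeds gives different answers, mostly the dominant direction but sometimes a smaller lobe, which is exactly why PROB streamlines from the same seed point differ.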

#### Mask

There are many different masks involved in tractography, depending on what you want to do, how you seed, and what the ROIs are.  
- Mask from FOD computation: This mask isn't explicitly used, but you can't have streamlines beginning or propagating in voxels without any directional information.   
- Seeding mask: Mask specifying where streamlines can start.  
- Tracking mask: Mask specifying where the streamline can propagate. If the next step brings you outside this mask, the propagation of the streamline stops. What happens next depends on the tracking options (keep or reject the streamline).  
- Target mask: Mask specifying region(s) that streamlines NEED to cross to be valid. What happens next depends on the tracking options (stop or continue at the target).  
- Exclusion mask: Mask specifying region(s) that streamlines CANNOT cross to be valid. Streamlines crossing them are rejected.  

It is very important to carefully design the tractography masks.  
For example, if you have a thin WM region and the mask is too "small" / "tight", streamlines could exit the tracking mask even when they are roughly following the desired direction.  
Alternatively, if the mask is too "large" / "dilated", streamlines might wander into regions where the FODs are misestimated because they lie outside the WM.  



### tckgen
The MRtrix tractography tool  
https://mrtrix.readthedocs.io/en/latest/reference/commands/tckgen.html

In [1]:
# we look at the information of the tckgen command
# normally, you would simply do "tckgen -h" 
# but the text is too long for this jupyter notebook
# so I have to do a small hack with the 'cat' command
tckgen -h | cat

MRtrix 3.0.0                         tckgen                          Apr 23 2020

     tckgen: part of the MRtrix3 package

SYNOPSIS

     Perform streamlines tractography

USAGE

     tckgen [ options ] source tracks

        source       The image containing the source data. The type of image
                     data required depends on the algorithm used (see
                     Description section).

        tracks       the output file containing the tracks generated.


DESCRIPTION

     By default, tckgen produces a fixed number of streamlines, by attempting
     to seed from new random locations until the target number of streamlines
     have been selected (in other words, after all inclusion & exclusion
     criteria have been applied), or the maximum number o


Tractography seeding mechanisms; at least one must be provided

  -seed_image image  (multiple uses permitted)
     seed streamlines entirely at random within a mask image 

  -seed_sphere spec  (multiple uses permitted)
     spherical seed as four comma-separated values (XYZ position and radius)

  -seed_random_per_voxel image num_per_voxel  (multiple uses permitted)
     seed a fixed number of streamlines per voxel in a mask image; random
     placement of seeds in each voxel

  -seed_grid_per_voxel image grid_size  (multiple uses permitted)
     seed a fixed number of streamlines per voxel in a mask image; place seeds
     on a 3D mesh grid (grid_size argument is per axis; so a grid_size of 3
     results in 27 seeds per voxel)

  -seed_rejection image  (multiple uses permitted)
     seed from an image using rejection sampling (higher values = more probable
     to seed from)

  -seed_gmwmi image  (multiple uses permitted)
     seed from the grey matter - white matter interface (on

     streamlines tractography by 2nd order integration over fibre orientation
     distributions. Proceedings of the International Society for Magnetic
     Resonance in Medicine, 2010, 1670

     * Nulldist1 / Nulldist2:
     Morris, D. M.; Embleton, K. V. & Parker, G. J. Probabilistic fibre
     tracking: Differentiation of connections from chance events. NeuroImage,
     2008, 42, 1329-1339

     * Tensor_Det:
     Basser, P. J.; Pajevic, S.; Pierpaoli, C.; Duda, J. & Aldroubi, A. In vivo
     fiber tractography using DT-MRI data. Magnetic Resonance in Medicine,
     2000, 44, 625-632

     * Tensor_Prob:
     Jones, D. Tractography Gone Wild: Probabilistic Fibre Tracking Using the
     Wild Bootstrap With Diffusion Tensor MRI. IEEE Transactions on Medical
     Imaging, 2008, 27, 1268-1274

     References based on command-line options:

     * -rk4:
     Basser, P. J.; Pajevic, S.; Pierpaoli, C.; Duda, J. & Aldroubi, A. In vivo
     fiber tractography using DT-MRI data. Magnetic Re


### tckgen
Summary of relevant stuff:  


-algorithm "name"  
SD_STREAM: DET algorithm for FOD type of data  
iFOD2: PROB algorithm for FOD (better than iFOD1)  
I will ignore the others and call them DET and PROB from now on  


-select "number"  
Number of desired streamlines after inclusion/exclusion criteria  
This can be 0 to have a fixed number of seeds instead of streamlines  
This can also be left unspecified when using a specific seeding strategy  


-seeds "number"  
Total number of seeds that can be attempted (to prevent unending runtime)  
If 0, tckgen will run until done.  
If unspecified, the default is 1000 * "select"


-step "number"  
Step size in mm; the "minimum" recommended is 0.5 * vox for PROB and 0.25 * vox for DET.  
Tractography takes longer and the file size is bigger with small steps, but it potentially handles curves better.  
I wouldn't go smaller than 0.1 * vox.  


-angle "number"  
Maximum angle between steps. MRtrix recommends 60 deg for DET and 45 deg for PROB. These values are only valid when using the "default" step size of 0.25 * vox for DET and 0.5 * vox for PROB.  
For example, if using a step size of 0.1 * vox, the corresponding angle would be 24 deg for DET and 9 deg for PROB.  
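
The rescaling in the example above can be scripted. This sketch assumes the simple linear rule used in these notes (angle proportional to step size); `scale_angle` is a hypothetical helper, not an MRtrix tool.

```bash
# Rescale a reference maximum angle to a new step size (linear rule of thumb).
# usage: scale_angle ref_angle_deg ref_step new_step
# Steps can be given in mm or in fractions of the voxel size,
# as long as both use the same unit.
scale_angle() {
  awk -v a="$1" -v s0="$2" -v s1="$3" 'BEGIN { printf "%g\n", a * s1 / s0 }'
}

scale_angle 60 0.25 0.1   # DET:  24
scale_angle 45 0.5  0.1   # PROB: 9
```

The result is what you would pass to `-angle` together with the new `-step` value.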


-minlength "number"  
Minimum length for valid streamlines in mm  
Used to discard failed streamlines that are too short to be anatomically plausible  


-maxlength "number"  
Maximum length for valid streamlines in mm  
Used to discard failed streamlines that are too long to be anatomically plausible (looping on themselves for instance)  


-rk4
Enable 4th-order Runge-Kutta integration; **always** use it with DET, **never** with PROB.  
iFOD2 is already doing something equivalent (that is its difference from iFOD1)  


-stop
Stop propagating a streamline once it has passed through all the include (target) masks; without this flag, propagation continues past them.  



Seeding options (choose only **one** type):  

-seed_image "mask file"  
Provide a seeding mask for random seeding inside the mask (options 2 and 3).  
This flag can be used multiple times.  


-seed_random_per_voxel "mask file" "number"  
Provide a seeding mask and a number of seeds per voxel for option 1.  
This flag can be used multiple times.
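
As a memory aid, here is a sketch mapping the three seeding options from the Seeding section to the tckgen flags summarized above. `seeding_args` is a hypothetical helper that only prints the flags; the mapping follows my reading of the `-select` and `-seeds` descriptions above, so double check it against `tckgen -h`.

```bash
# Print the tckgen flags matching a seeding option.
# usage: seeding_args option mask number
seeding_args() {
  case "$1" in
    1) echo "-seed_random_per_voxel $2 $3" ;;        # fixed number of seeds per voxel
    2) echo "-seed_image $2 -seeds $3 -select 0" ;;  # fixed total number of seeds
    3) echo "-seed_image $2 -select $3" ;;           # fixed number of streamlines
    *) echo "unknown seeding option: $1" >&2; return 1 ;;
  esac
}

seeding_args 3 WM.nii.gz 100000
```

The printed flags can be pasted into a tckgen call, or inserted directly with `$(seeding_args 3 WM.nii.gz 100000)`.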



-downsample "number"  
Downsample the streamlines, i.e. keep fewer 3D points per streamline.  
Sometimes necessary to get a smaller, more manageable file size.  
We can always downsample after tractography if the file turns out too big.  
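
To get a feeling for when downsampling is worth it, here is a rough size estimate. It assumes each point is stored as three 32-bit floats (12 bytes), with one extra triplet separating consecutive streamlines, and it ignores the header; the streamline count and mean length are made-up values, and `tck_size_mb` is a hypothetical helper.

```bash
# Rough .tck size estimate in MB.
# usage: tck_size_mb n_streamlines mean_length_mm step_mm [downsample_factor]
tck_size_mb() {
  awk -v n="$1" -v len="$2" -v step="$3" -v ds="${4:-1}" 'BEGIN {
    pts = len / step / ds + 1                   # approximate points per streamline
    printf "%.0f\n", n * (pts + 1) * 12 / 1e6   # +1 for the separator triplet
  }'
}

tck_size_mb 1000000 100 0.1     # 12024 (about 12 GB)
tck_size_mb 1000000 100 0.1 5   # 2424  (about 2.4 GB)
```

The estimate is only meant to show the order of magnitude; the actual number of stored points depends on the algorithm and its defaults.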



Masking options:  

-include "mask image"  
This is the include mask.  
Streamlines need to pass through **ALL** include masks to be valid.  
The "stop" option decides whether the streamline stops there or not.  
This flag can be used multiple times.

-include_ordered "mask image"  
Same as the include mask, but the order in which the masks are traversed matters.  
This flag can be used multiple times.


-exclude "mask image"  
This is the exclude mask.  
Streamlines crossing **ANY** exclude mask will be rejected.  
This flag can be used multiple times.


-mask "mask image"  
This is the tracking mask.  
The part of a streamline that exits the mask is truncated.  
This flag can be used multiple times.
     
-nthreads "number"  
Parallel computing option, like for most MRtrix commands  

In [None]:
# Reasonable example for PROB with WM seeding without special ROIs
# We place 1 seed per voxel in the WM
# Tractography is limited to the same WM mask.
# In that case, we would filter "bad" streamlines AFTERWARDS
# We have some reasonable curvature and step size and some length limits
tckgen fod.nii.gz tracts.tck \
  -algorithm iFOD2 \
  -step 0.25 \
  -angle 22.5 \
  -minlength 20 \
  -maxlength 300 \
  -seed_random_per_voxel WM.nii.gz 1 \
  -mask WM.nii.gz \
  -nthreads 20

In [None]:
# Reasonable example for DET with interface seeding without special ROIs
# We place 5 seeds per voxel in the interface mask
# Tractography is limited to the WM mask.
# In that case, we would filter "bad" streamlines AFTERWARDS
# We have some reasonable curvature and step size and some length limits
tckgen fod.nii.gz tracts.tck \
  -algorithm SD_STREAM \
  -rk4 \
  -step 0.1 \
  -angle 24 \
  -minlength 20 \
  -maxlength 300 \
  -seed_random_per_voxel INT.nii.gz 5 \
  -mask WM.nii.gz \
  -nthreads 20

### How to visualize tck in mrview
(demo)

### tckinfo
https://mrtrix.readthedocs.io/en/latest/reference/commands/tckinfo.html  

Print information about a track file such as tracts.tck  

All sorts of parameters about the tracking options, subsampling, masks, seeds, etc.  
Also prints the streamline count.

In [5]:
tckinfo -h | cat

MRtrix 3.0.0                         tckinfo                         Apr 23 2020

     tckinfo: part of the MRtrix3 package

SYNOPSIS

     Print out information about a track file

USAGE

     tckinfo [ options ] tracks [ tracks ... ]

        tracks       the input track file.


OPTIONS

  -count
     count number of tracks in file explicitly, ignoring the header

Standard options

  -info
     display information messages.

  -quiet
     do not display information messages or progress status; alternatively,
     this can be achieved by setting the MRTRIX_QUIET environment variable to a
     non-empty string.

  -debug
     display debugging messages.

  -force
     force overwrite of output files (caution: using the same file as input and
     output might cause unexpected behaviour).

  -nthreads number
     use this number of threads in multi-threaded applications (set to 0 to
     disable multi-threading).

  -config key value  (multiple uses permitted)
     temp


### tckresample
https://mrtrix.readthedocs.io/en/latest/reference/commands/tckresample.html  

Resample streamlines.  

In particular, we can downsample by a fixed factor (like the downsample option in the tracking).  
We can also resample to a desired step size or a total number of points per streamline.  

In [6]:
tckresample -h | cat

MRtrix 3.0.0                       tckresample                       Apr 23 2020

     tckresample: part of the MRtrix3 package

SYNOPSIS

     Resample each streamline in a track file to a new set of vertices

USAGE

     tckresample [ options ] in_tracks out_tracks

        in_tracks    the input track file

        out_tracks   the output resampled tracks


DESCRIPTION

     It is necessary to specify precisely ONE of the command-line options for
     controlling how this resampling takes place; this may be either increasing
     or decreasing the number of samples along each streamline, or may involve
     changing the positions of the samples according to some specified
     trajectory.

     Note that because the length of a streamline is calculated based on the
     sums of distances between adjacent vertices, resampling a streamline to a
     new set of vertices will typically change the quantified length of that
     streamline; the magnitude of the differ


In [None]:
# example downsampling
tckresample original.tck resampled.tck \
  -downsample 2

### tckmap
https://mrtrix.readthedocs.io/en/dev/reference/commands/tckmap.html  

Map streamlines back to voxel image.  

In particular, this is used to produce tract density images (TDI) and color-encoded TDI.  
A TDI map counts the density of streamlines in each voxel.  
The colored version includes orientation information in RGB space.  
These maps are useful to judge/analyse the output of a probabilistic tracking.  

In [7]:
tckmap -h | cat

MRtrix 3.0.0                         tckmap                          Apr 23 2020

     tckmap: part of the MRtrix3 package

SYNOPSIS

     Use track data as a form of contrast for producing a high-resolution image

USAGE

     tckmap [ options ] tracks output

        tracks       the input track file.

        output       the output track-weighted image


DESCRIPTION

     Note: if you run into limitations with RAM usage, make sure you output the
     results to a .mif file or .mih / .dat file pair - this will avoid the
     allocation of an additional buffer to store the output for write-out.

Options for the header of the output image

  -template image
     an image file to be used as a template for the output (the output image
     will have the same transform and field of view).

  -vox size
     provide either an isotropic voxel size (in mm), or comma-separated list of
     3 voxel dimensions.

  -datatype spec
     specify output image data type.

Options for t


In [None]:
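# example color-encoded (DEC) map; tckmap needs -template or -vox for the output grid
# -contrast length weights streamlines by length (omit it for a plain TDI)
tckmap track.tck tdi.nii.gz \
  -template WM.nii.gz \
  -dec \
  -contrast length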

### tckedit
https://mrtrix.readthedocs.io/en/dev/reference/commands/tckedit.html  

Manipulate tck files.  

This can be used to:  
- concatenate tck  
- extract streamlines subset randomly (for viz)  
- extract streamlines with criteria (lengths, masks, ...)
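
The three uses above can be sketched as small wrappers. This is a minimal sketch: the function names and the `DRY_RUN` variable are hypothetical conveniences, and only the tckedit options `-number`, `-minlength` and `-maxlength` are real. Note that, as far as I know, `-number` keeps the first streamlines that pass the criteria rather than drawing them at random.

```bash
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Concatenate track files. usage: concat_tck in1.tck in2.tck ... out.tck
concat_tck() { run tckedit "$@"; }

# Keep a reduced number of streamlines (e.g. for visualization).
# usage: subset_tck in.tck out.tck number
subset_tck() { run tckedit "$1" "$2" -number "$3"; }

# Keep streamlines within a length range (in mm).
# usage: filter_len in.tck out.tck min_mm max_mm
filter_len() { run tckedit "$1" "$2" -minlength "$3" -maxlength "$4"; }
```

With `DRY_RUN=1` exported, a call such as `filter_len tracts.tck tracts_filtered.tck 20 300` only prints the tckedit command, which is handy for checking the arguments before launching it on a large file.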

In [8]:
tckedit -h | cat

MRtrix 3.0.0                         tckedit                         Apr 23 2020

     tckedit: part of the MRtrix3 package

SYNOPSIS

     Perform various editing operations on track files

USAGE

     tckedit [ options ] tracks_in [ tracks_in ... ] tracks_out

        tracks_in    the input track file(s)

        tracks_out   the output track file


DESCRIPTION

     This command can be used to perform various types of manipulations on
     track data. A range of such manipulations are demonstrated in the examples
     provided below.

EXAMPLE USAGES

     Concatenate data from multiple track files into one:
       $ tckedit *.tck all_tracks.tck
     Here the wildcard operator is used to select all files in the current
     working directory that have the .tck filetype suffix; but input files can
     equivalently be specified one at a time explicitly.

     Extract a reduced number of streamlines:
       $ tckedit in_many.tck out_few.tck -number 1k -skip 500
     Th


### tckconvert
https://mrtrix.readthedocs.io/en/latest/reference/commands/tckconvert.html  

Convert tck files to other streamline file formats for use with other software  

In [9]:
tckconvert -h | cat

MRtrix 3.0.0                       tckconvert                        Apr 23 2020

     tckconvert: part of the MRtrix3 package

SYNOPSIS

     Convert between different track file formats

USAGE

     tckconvert [ options ] input output

        input        the input track file.

        output       the output track file.


DESCRIPTION

     The program currently supports MRtrix .tck files (input/output), ascii
     text files (input/output), VTK polydata files (input/output), and
     RenderMan RIB (export only).

     Note that ascii files will be stored with one streamline per numbered
     file. To support this, the command will use the multi-file numbering
     syntax, where square brackets denote the position of the numbering for the
     files, for example:

     $ tckconvert input.tck output-'[]'.txt

     will produce files named output-0000.txt, output-0001.txt,
     output-0002.txt, ...

OPTIONS

  -scanner2voxel reference
     if specified, the propert


### tcksample
https://mrtrix.readthedocs.io/en/latest/reference/commands/tcksample.html  

Sample voxel maps along streamlines.  

For example, project an FA map onto streamlines.  
You can also compute some statistics of the sampled values along each track.  

In [10]:
tcksample -h | cat

MRtrix 3.0.0                        tcksample                        Apr 23 2020

     tcksample: part of the MRtrix3 package

SYNOPSIS

     Sample values of an associated image along tracks

USAGE

     tcksample [ options ] tracks image values

        tracks       the input track file

        image        the image to be sampled

        values       the output sampled values


DESCRIPTION

     By default, the value of the underlying image at each point along the
     track is written to either an ASCII file (with all values for each track
     on the same line), or a track scalar file (.tsf). Alternatively, some
     statistic can be taken from the values along each streamline and written
     to a vector file.

OPTIONS

  -stat_tck statistic
     compute some statistic from the values along each streamline (options are:
     mean,median,min,max)

  -nointerp
     do not use trilinear interpolation when sampling image values

  -precise
     use the precise me


In [None]:
# example sampling FA along a bundle
tcksample bundle.tck FA.nii.gz values.txt
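
Once the values are in a text file, a quick summary per streamline is easy to get with awk. This sketch relies on the ASCII layout described in the help text (all values of one track on the same line) and assumes that any line starting with `#` is a header comment; `per_streamline_mean` is a hypothetical helper. The `-stat_tck mean` option of tcksample computes the same thing directly.

```bash
# Mean of the sampled values for each streamline (one streamline per line).
# usage: per_streamline_mean values.txt
per_streamline_mean() {
  awk '!/^#/ && NF > 0 {
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i   # add up all samples of this streamline
    printf "%.4f\n", sum / NF
  }' "$1"
}
```

Running `per_streamline_mean values.txt` prints one mean per streamline, which can then be plotted or averaged over the whole bundle.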

### tck2connectome
https://mrtrix.readthedocs.io/en/latest/reference/commands/tck2connectome.html  
https://mrtrix.readthedocs.io/en/dev/quantitative_structural_connectivity/structural_connectome.html  
https://mrtrix.readthedocs.io/en/dev/quantitative_structural_connectivity/labelconvert_tutorial.html#labelconvert-tutorial  

Tool for building a connectome from streamlines and a label map.  

The links give information on how to build the label map more easily from other types of data, such as a FreeSurfer cortex parcellation.  

In [11]:
tck2connectome -h | cat

MRtrix 3.0.0                     tck2connectome                      Apr 23 2020

     tck2connectome: part of the MRtrix3 package

SYNOPSIS

     Generate a connectome matrix from a streamlines file and a node
     parcellation image

USAGE

     tck2connectome [ options ] tracks_in nodes_in connectome_out

        tracks_in    the input track file

        nodes_in     the input node parcellation image

        connectome_out  the output .csv file containing edge weights


EXAMPLE USAGES

     Default usage:
       $ tck2connectome tracks.tck nodes.mif connectome.csv -tck_weights_in weights.csv -out_assignments assignments.txt
     By default, the metric of connectivity quantified in the connectome matrix
     is the number of streamlines; or, if tcksift2 is used, the sum of
     streamline weights via the -tck_weights_in option. Use of the
     -out_assignments option is recommended as this enables subsequent use of
     the connectome2tck command.

     Gene



