Parameter File Setup

Mike Jones edited this page Apr 25, 2022 · 3 revisions

The aa parameter file

Automatic Analysis must be configured before its first use. This is done using a parameter file. The parameter file serves as a description of the computational environment, including directories for data and results, processing options and so on. You may view these as a collection of default settings which may be customized in a particular analysis as needed. Alternatively, each project (or even each analysis or individual user) can create a unique parameter file to use. You can find example parameter files in the /aa_parametersets folder in the aa code repository.

A minimal parameter file can be created using the create_default_parameterfile utility. It prompts you for some required information and writes a minimal parameter file to your $HOME/.aa directory. For general aa use you will probably need to add to this file eventually. What you add will depend on the types of analyses you plan to implement, the data you intend to use, and even your programming style. As such, here we review the overall organization of the parameter file and identify settings that are commonly customized.

An aa parameter file is formatted using standard XML syntax. It contains five main sections: 1) directory conventions, 2) options, 3) acquisition details, 4) GUI controls, and 5) timeouts. The overall organization is as follows:

<?xml version="1.0" encoding="utf-8"?>
<aap xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="/path/to/.../aap_parameters_defaults.xml" parse="xml"/>
<local>

	<directory_conventions>
		directory naming
		third-party software setup	
		Matlab toolbox setup
	</directory_conventions>

	<options>
		NIFTI4D
		autoidentify
		multicore and cluster setup
	</options>

	<acq_details>
		root directory for analysis
		subject and session selection
		numdummies
	</acq_details>

	<gui_controls>
		text message colors
	</gui_controls>

	<timeouts>
		scheduler settings
	</timeouts>

</local>
</aap>

The xi:include specifies a base parameter file (usually a default parameter file included in the aa distribution) followed by a local block where customization appears. Generic entries are included here that characterize each section. Parameters in the directory conventions block generally relate to file system settings, the options block defines runtime settings, acquisition details include data-related settings, settings in the GUI controls block affect the appearance of text messages, and the timeouts block includes parameters that control scheduler behavior. You may customize entries in none, some, or all of these. It is common practice to change settings affecting all of your analyses in the parameter file but include any analysis-specific settings in the userscript.

Even though over 200 parameters are defined, you will generally only need to customize a handful. The most common are presented below.

directory conventions

Many entries in the directory_conventions block are used to set up third-party software. For example, consider the addition of FSL. There are four FSL-related parameters that can be customized:

<fsldir>/usr/local/fsl</fsldir>
<fslshell>bash</fslshell>
<fsloutputtype>NIFTI</fsloutputtype>
<fslsetup></fslsetup>

These specify the FSL install directory, the shell that FSL will run in, the type of output files FSL should generate, and an FSL setup script. For fsldir, specify the top-level FSL install directory, not the directory where the executable lives. That is, if the shell command which fsl returns /usr/local/fsl/bin/fsl, then fsldir should be /usr/local/fsl. If this setting is incorrect, aa will throw an error the first time it attempts to use an FSL function or template file in your analysis. Note that NIFTI is uppercase here even though, officially, the acronym is not (NIfTI). The fslsetup parameter is the full pathname of a shell script that runs before aa executes any FSL command and performs additional FSL initialization. If none is required, the field is left empty as shown here.

FYI: XML tag descriptors ("desc=") and typedefs ("ui=") are omitted here for clarity. At minimum, you must include a typedef for a tag and it's good practice to also include a descriptor. For example, the fsldir entry shown above is properly entered as:

<fsldir desc='Path to FSL' ui='dir'>/usr/local/fsl</fsldir>

You should retain these modifiers in the entries appearing in your custom parameter file. See any sample parameter file provided in the aa distribution (directory: aa_parametersets) for more examples.
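
Putting these pieces together, a small custom parameter file that overrides only the FSL install directory might look like the sketch below. It reuses the skeleton and the fsldir entry shown above. The include path is the same placeholder and must be replaced with the location of the defaults file in your own aa installation, and /usr/local/fsl is only an example location.

```xml
<?xml version="1.0" encoding="utf-8"?>
<aap xmlns:xi="http://www.w3.org/2001/XInclude">
    <!-- Base parameter file: replace the placeholder path with the one in your aa installation -->
    <xi:include href="/path/to/.../aap_parameters_defaults.xml" parse="xml"/>
    <local>
        <directory_conventions>
            <!-- Top-level FSL install directory (example location, adjust for your system) -->
            <fsldir desc='Path to FSL' ui='dir'>/usr/local/fsl</fsldir>
        </directory_conventions>
    </local>
</aap>
```

Only the entries you want to customize need to appear in the local block.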

Matlab Toolboxes

Setup for other third-party software is similar to the FSL example shown above. However, Matlab toolboxes use a different specification based on a Matlab class interface. After downloading the toolbox you wish to use, add a toolbox entry to the parameter file that includes the name and location of the toolbox. Here is an example for the BrainWavelet toolbox:

<toolbox ui='custom'>
	<name ui='text'>bwt</name>
	<dir ui='dir'>/path/to/BrainWavelet/toplevel/directory</dir>
</toolbox>

Here, name is the name of the class interface for the toolbox and dir is the location of the downloaded toolbox code. A collection of class interfaces is provided in aa_tools/toolboxes. These have the format xxxClass.m, where xxx is a three-letter designation chosen for the toolbox (bwt, in this example). Currently about a dozen toolbox classes are defined. If a toolbox of interest is not listed, you will need to create a class interface for it (or request one from the aa dev team).
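
As an illustration of registering more than one toolbox, the sketch below repeats the toolbox element inside directory_conventions, once for BrainWavelet and once for SPM (toolbox name spm). Both dir values are placeholders, and the ui typedefs on the SPM entry are assumed to match those of the BrainWavelet example.

```xml
<directory_conventions>
    <!-- BrainWavelet, via the bwt class interface -->
    <toolbox ui='custom'>
        <name ui='text'>bwt</name>
        <dir ui='dir'>/path/to/BrainWavelet/toplevel/directory</dir>
    </toolbox>
    <!-- SPM; placeholder path, typedefs assumed to mirror the entry above -->
    <toolbox ui='custom'>
        <name ui='text'>spm</name>
        <dir ui='dir'>/path/to/spm</dir>
    </toolbox>
</directory_conventions>
```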

Another category of entries in the directory conventions block is parameters used for file organization. The data and results directories are specified using the rawdatadir and analysisid parameters. These are often set in the userscript. The results directories can also be tagged using the analysisid_suffix parameter. This appends the specified suffix to the name of the results directory, which is sometimes done in a branched analysis to more easily identify results from a given branch. There are a number of parameters of the form dirname and subject_directory; these appear to be older options that have been largely superseded by BIDS input. The protocol_* entries define scanner protocols used in the automatic identification of DICOM files.

Other settings you may need to customize in the directory conventions section include linux_shell, which defines the shell aa should use when running Unix commands (options: csh, tcsh, bash, or zsh); reportname, which defines the name of the aa report file (default: report.html); and T1template, which identifies the location of the SPM T1 template (modules that require this information will also look in a default location if the template cannot be located from this setting). A number of parameters (e.g., poolprofile, qsub, condorwrapper) are related to multicore and cluster processing.
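
A sketch of how two of these settings might appear in a custom parameter file is shown below. The desc text and the ui='text' typedefs are assumptions made for illustration; copy the actual attributes from the corresponding entries in the defaults file shipped with aa.

```xml
<directory_conventions>
    <!-- Shell aa uses for Unix commands; desc and ui values are assumed, not taken from aa -->
    <linux_shell desc='Shell for Unix commands' ui='text'>bash</linux_shell>
    <!-- Name of the aa report file (report.html is the default) -->
    <reportname desc='Name of the report file' ui='text'>report.html</reportname>
</directory_conventions>
```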

Entries in this section you probably will never need to change include parallel_dependencies (used by the scheduler) and remote file system setup if you are not using a remote file system. The remaining entries in the directory conventions block include a number of parameters defined for a specific analysis, project, or installation. These may be removed in a future aa release.

options

Many options block parameters relate to data formatting or file system setup. The NIFTI4D parameter indicates that all NIfTI files should be written as a single 4D file. The alternative is a collection of 3D files, each containing a single brain volume. The autoidentify* settings are used for automatic type identification when processing DICOM input. The two choose_* parameters specify which T1 and/or T2 images to use if more than one is present. If the hardlinks parameter is set, aa will use hard links instead of copying files between module directories (i.e., when making the output from one module available as the input to another). This will result in substantial disk savings, but hard links are not available on some systems.
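
For example, a system that does not support hard links could keep 4D output but turn hard links off, roughly as sketched below. The 1/0 value convention, the desc text, and the ui='yesno' typedef are assumptions here; check the matching entries in the defaults file before copying.

```xml
<options>
    <!-- 1 = write a single 4D file, 0 = a collection of 3D files (assumed value convention) -->
    <NIFTI4D desc='Write 4D NIfTI files' ui='yesno'>1</NIFTI4D>
    <!-- 0 = copy files between module directories instead of hard-linking them -->
    <hardlinks desc='Use hard links between modules' ui='yesno'>0</hardlinks>
</options>
```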

Another group of options parameters controls the setup of multicore and cluster programming. If the Matlab Parallel Computing Toolbox is installed on your machine, it is straightforward to set up aa to use the PCT. Setting up aa on a computing cluster is more involved. See your site administrator for assistance.

If you wish to receive an email when an analysis finishes (or doesn't finish), you can enter an email address in the email parameter. The verbose parameter controls the amount of information aa writes to the Matlab command window during an analysis.

Finally, there are parameters in the options block you will probably never need to change. These include the aaworker* parameters, which control scheduler behavior, parameters defining required version numbers, and settings related to specialized applications such as searchlight and realtime analysis.

acquisition details

Several settings in the acq_details section typically appear in the userscript. These include the root directory for results and the fields in the subjects and sessions subsections, which are usually set by aas_processBIDS (if using BIDS data) or another aa data utility.

Other settings here allow you to modify the processing of functional data. A nonzero numdummies specifies the number of initial volumes to be discarded when an EPI file is read. Similarly, topscannernumber specifies the maximum number of volumes that will be read. If specifying numdummies, you may also need to set the boolean correctEVfordummies so that aa will adjust specified event onset times in the model.
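
As a sketch, discarding the first four volumes and adjusting event onsets accordingly might be written as follows. The value 4 is arbitrary, and the desc text and ui typedefs ('int', 'yesno') are assumptions to be checked against the defaults file.

```xml
<acq_details>
    <!-- Discard the first 4 volumes of each EPI file (4 is an arbitrary example) -->
    <numdummies desc='Number of dummy scans' ui='int'>4</numdummies>
    <!-- Ask aa to adjust event onset times in the model for the discarded volumes -->
    <correctEVfordummies desc='Correct event onsets for dummies' ui='yesno'>1</correctEVfordummies>
</acq_details>
```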

Two settings allow you to modify the session definition. The boolean combinemultiple will combine multiple "visits" into multiple sessions of a single visit (so aa will combine all data into a single design matrix with multiple sessions rather than creating multiple models each containing one session).

Finally, the selected_sessions parameter allows you to include only a subset of sessions in the model. The data is entered as an integer array (e.g., [2 4 6]) and will select the sessions in the order added (if using aas_processBIDS, sessions are ordered alphabetically by name).

GUI controls and timeouts

Customization is usually not required for parameters in the gui_controls or timeouts sections.

Settings used in specific applications

Some settings may require customization depending on the application. Here we summarize some of the most common.

    <directory_conventions>
      <rawdatadir> <!-- Colon-separated list of directories to find raw MRI data. You MUST have an entry ending with "aa_demo" for the examples -->
      <rawmeegdatadir> <!-- Colon-separated list of directories to find raw M/EEG data. You MUST have an entry ending with "aa_demo" for the examples -->
      <subjectoutputformat> <!-- `sprintf` formatting string to get subject directory as stored in rawdatadir -->
      <meegsubjectoutputformat> <!-- `sprintf` formatting string to get subject directory as stored in rawmeegdatadir -->
      <seriesoutputformat> <!-- `sprintf` formatting string to get series directory as stored in subject directory -->
      <protocol_structural> <!-- For automatic identification of structural/anatomical data, you must specify the name of the structural/anatomical protocol as stored in the DICOM header -->
      <dicomfilter> <!-- Directory listing filter to find DICOM files -->
      <toolbox> <!-- Settings for SPM. N.B.: You should not modify SPM version in your user script but rather in your parameterset. -->
        <name>spm</name> 
        <dir> <!-- path to SPM -->
  • Parallel execution requires access to a cluster with one of the supported job schedulers (i.e. Torque, SLURM, LSF, SoGE), a valid corresponding Cluster Profile, and its specification in the parameterset. You can also specify a required queue or a resource other than memory and walltime by adding the corresponding submit argument (i.e. as it would be specified on the system's command line) after the Cluster Profile, separated by a colon. If this setting is empty, parallel execution will not be available. You can also use the pre-defined 'local' profile.
    <directory_conventions>
      <poolprofile> <!-- Cluster Profile and (optional) submit argument separated with colon. -->
  • Email notification requires specification of an email account (email address) and password (unencrypted text!) separated by a colon.
    <directory_conventions>
      <mailerserver> <!-- E-mail address and password (colon-separated). -->
  • Distortion correction using fieldmaps
    <directory_conventions>
      <protocol_fieldmap> <!-- For automatic identification of fieldmap, you must specify the name of the fieldmap protocol as stored in the DICOM header -->
  • Multichannel segmentation using T2-weighted images
    <directory_conventions>
      <protocol_t2> <!-- For automatic identification of T2-weighted images, you must specify the name of the T2-weighted protocol as stored in the DICOM header -->
  • Some software is integrated using a specific interface found in the <aa path>/aa_tools/toolboxes folder.
    <directory_conventions>
      <toolbox> <!-- Settings for EEGLAB -->
        <name>eeglab</name> 
        <dir> <!-- path to EEGLAB -->
        <extraparameters>
          <requiredPlugins> <!-- colon-separated list of plugins to be used -->

      <toolbox> <!-- Settings for FieldTrip -->
        <name>fieldtrip</name> 
        <dir> <!-- path to FieldTrip -->

      <toolbox> <!-- Settings for Human Connectome Project Workbench -->
        <name>hcpwb</name> 
        <dir> <!-- path to Human Connectome Project Workbench (used by M/EEG source reconstruction based on cortical sheet) -->
        <extraparameters>
          <templateDir> <!-- Path to the folder created as specified in [FieldTrip path]/bin/ft_postfreesurferscript.sh -->

      <toolbox> <!-- Settings for MVPA-Light -->
        <name>mvpalight</name> 
        <dir> <!-- path to MVPA-Light -->
  • FSL and FreeSurfer require customization of the following parameters:

    • FSL
        <directory_conventions>
          <fsldir> <!-- Path to FSL -->
          <fslshell> <!-- Shell used to run FSL -->
          <fslsetup> <!-- Path to a setup script to be executed before any FSL command. This script usually invokes [FSL path]/etc/fslconf/fsl.sh -->
          <fsloutputtype> <!-- Type of images generated by FSL. You can read more about it at https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FslEnvironmentVariables. -->
    • FreeSurfer
        <directory_conventions>
          <freesurferdir> <!-- Path to FreeSurfer -->
          <freesurfershell> <!-- Shell used to run FreeSurfer -->
          <freesurfersetup> <!-- (optional) Path to a setup script to be executed before any FreeSurfer command. -->
          <freesurferenvironment> <!-- Path to the FreeSurfer environmental setup script (ended with a semicolon). It usually points to [FreeSurfer path]/FreeSurferEnv.sh -->
  • Certain toolboxes have dedicated fields within the parameterset.

    <directory_conventions>
      <ANTSdir> <!-- Path to Advanced Normalisation Tools -->
      <BrainWaveletdir> <!-- Path to BrainWavelet -->
      <DCMTKdir> <!-- Path to DICOM Toolkit -->
      <FaceMaskingdir> <!-- Path to FaceMasking -->
      <GIFTdir> <!-- Path to Group ICA Of fMRI Toolbox -->
  • Folders containing further MATLAB-based toolboxes and codes you want to add to the path can be added as a list of paths. These folders will be added to the MATLAB path without further processing.
    <directory_conventions>
      <matlabtoolsdir> <!-- Colon-separated list of paths -->
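
To make the colon-separated conventions above concrete, the sketch below selects the pre-defined 'local' Cluster Profile without an extra submit argument and adds two folders to the MATLAB path. The folder names are hypothetical, and the desc text and ui typedefs are assumptions; take the real attributes from the defaults file in your aa installation.

```xml
<directory_conventions>
    <!-- Pre-defined 'local' Cluster Profile, no submit argument appended -->
    <poolprofile desc='Cluster Profile and optional submit argument' ui='text'>local</poolprofile>
    <!-- Two hypothetical folders separated by a colon; both are added to the MATLAB path -->
    <matlabtoolsdir desc='Additional MATLAB paths' ui='dir_list'>/path/to/mytools:/path/to/othertools</matlabtoolsdir>
</directory_conventions>
```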