# Week 3: Unix for Neuroimagers

### September 25-29, 2023
### From [Andy's Brain Book](https://andysbrainbook.readthedocs.io/en/latest/unix/Unix_Intro.html)

## 1. Navigating the directory tree

### Commands
* **pwd**: **P**rints **w**orking **d**irectory
* **cd**: Lets us **c**hange **d**irectory using format `cd targetFolder`
    * Can use `cd ..` to change to outer folder
    * `cd ~` goes to your home directory (`cd /` goes to the root, the outermost folder)
* **ls**: Prints contents of current folder (**l**i**s**t contents of directory)
    * Can take a folder as an argument to show the contents of a folder other than the current directory
* **mkdir**: **M**a**k**es a new **dir**ectory (folder)

### Additional Information
* Root symbolized by `/` and is highest level of directory tree 
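
A minimal sketch of the navigation commands above, run inside a throwaway directory so nothing real is touched. The names `demo_root` and `Flanker_demo` are made up for this example.

```bash
# Work in a temporary directory (demo_root is a hypothetical name)
demo_root=$(mktemp -d)
cd "$demo_root"

mkdir Flanker_demo          # make a new directory
cd Flanker_demo             # move into it
mkdir sub-01 sub-02         # mkdir accepts several names at once
here=$(basename "$(pwd)")   # pwd prints the absolute path; keep only the last part
listing=$(ls)               # ls with no argument lists the current directory
echo "Now in: $here"
echo "Contents: $listing"

cd ..                       # go up one level, back to demo_root
cd /                        # jump to the root of the directory tree
root=$(pwd)
echo "Root is: $root"
```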

## 2. Copying and Removing Files

### Commands
* **cp**: **C**o**p**ying a file 
* **mv**: **M**o**v**ing a file to a different location OR renaming file
* **rm**: **R**e**m**oving a file (deleting it)
* **touch**: Creates a new file using `touch newFile.txt`
* **open**: Opens the file named after the `open` command in its default application (macOS)
* **rmdir**: Removes directory listed as argument (only if empty directory)
* **man**: Shows the manual page for a command, including its options, e.g. `man ls`

### Using `cp`, `rm`, and `mv`
* Using `mv` with two file name arguments lets you rename a file (argument 1, source file) with a new file name (argument 2, target file)
* `cp` also requires two arguments: one is the file to copy, one is the desired file name for the copy
* Using `mv` with any number of arguments, where the LAST argument is a directory, means that the other arguments (file names) will be moved to that directory
    * Directories can be moved into other directories e.g. `mv myFolder/ containerFolder/`
    * `mv containerFolder/myFolder/ . ` can move folder back out
* To remove files, `cd` into the directory and then type `rm` followed by names of as many files as you want to remove
    * Note that these files won't be in the "trash bin" but actually deleted permanently

### Options 
* `cp` and `rm` can be used with something called **options**, a way of offering more flexibility with commands 
* E.g., copy a directory: can't be done with basic cp command, but `cp -R myFolder/ myNewFolder/` means to recursively copy `myFolder` directory and all contents including subdirectories to `myNewFolder`
* `rmdir` only removes empty directories; to remove a folder and everything in it, use `rm -R` followed by the folder name
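
The copy, move, and remove rules above can be sketched as follows. This is an illustration in a temporary directory; the file and folder names are placeholders rather than part of any dataset.

```bash
demo=$(mktemp -d)
cd "$demo"

touch notes.txt                          # create an empty file
cp notes.txt notes_copy.txt              # copy: source first, then the new name
mv notes_copy.txt notes_backup.txt       # two file names: rename

mkdir myFolder containerFolder
mv notes.txt notes_backup.txt myFolder/  # last argument is a directory: move the files into it
cp -R myFolder myNewFolder               # -R copies the directory and everything inside it
mv myFolder containerFolder/             # move a directory into another directory

rm myNewFolder/notes_backup.txt          # permanently delete one file
rm -R containerFolder                    # permanently delete a directory and its contents

# rmdir only works on empty directories, so this attempt is refused
rmdir myNewFolder 2>/dev/null || echo "rmdir refused: myNewFolder is not empty"
ls -R
```

Because `rm` skips the trash, trying commands like these on scratch copies first is a cheap safety habit.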

## 3. Reading Text Files

### Commands
* **cat**: Displays contents of file (con**cat**enate); prints the file to the terminal through standard output (stdout)
* **head**: Prints first ten lines by default, can use `head -7 myFile.txt` to print 7 lines (or whatever is desired)
* **tail**: `head` in reverse! Prints last 10 lines 
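* **wc**: Prints the line, word, and byte counts of the file provided (**w**ord **c**ount)
* **echo**: Prints its arguments to stdout; combined with redirection, it can add text to a file or create the file if it doesn't exist
* **less**: Pages through a file one screen at a time; can be used to find strings in files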

### Displaying text from files
* Use `cat myFile.txt` to view file, or `head myFile.txt` or `tail myFile.txt` to see just 10 lines (by default)


### Redirection
* Output sent to stdout can be redirected into a file, either overwriting the file or appending to it (redirection)
* E.g., `echo sixteen > myFile.txt` will create or overwrite `myFile.txt` with sixteen as contents
* To **append** and NOT overwrite, use: `echo seventeen >> myFile.txt`
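
A short sketch of the difference between `>` and `>>`, using a temporary file (`out`, a name chosen for this example) in place of `myFile.txt`:

```bash
out=$(mktemp)                 # temporary file standing in for myFile.txt
echo sixteen > "$out"         # > creates the file or overwrites it
echo seventeen >> "$out"      # >> appends a new line
echo eighteen > "$out"        # > again: the two earlier lines are gone
echo nineteen >> "$out"
cat "$out"                    # shows what survived
```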

### Streams
* Information that goes in is called stdin 
* Information coming out is stdout
* Stderr is error text when commands are used improperly
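
To see the three streams separately, the sketch below asks `ls` for one file that exists and one that does not. The `2>` operator, which redirects stderr, is not covered in the notes above and is introduced here only for illustration; the file names are made up.

```bash
workdir=$(mktemp -d)
cd "$workdir"
touch realFile.txt

# ls writes the name it found to stdout and its complaint about the
# missing file to stderr; > captures the first, 2> captures the second
ls realFile.txt missingFile.txt > stdout.txt 2> stderr.txt
status=$?

echo "exit status: $status"
echo "stdout held: $(cat stdout.txt)"
echo "stderr held: $(cat stderr.txt)"

# stdin: wc reads whatever is fed to it with <
wc -w < stdout.txt
```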

### Searching files for strings 
* Use `less myFile.txt`, then type a forward slash followed by the string to search for and press Enter: this will highlight all instances of the string in the text file
    * Press n to jump to next instance of string 
    * Shift-n will jump to the previous instance
* Press q to exit paging window 
* What if you want to search help command, let's say for 3dDeconvolve package? 
    * Typing `3dDeconvolve` will print the whole help text, but this is too long to easily read
    * `less 3dDeconvolve` doesn't work, because `less` expects a file name rather than a command
    * Redirecting the help output into `less` command through a pipe is a way to get around this: `3dDeconvolve | less` 
        * Page up and down using arrow keys
        * Scroll down one page with the spacebar and up one page with "b"; "d" and "u" scroll half a page down and up; press "q" to exit 
        * Text is processed by `less` command instead of being dumped to stdout
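
The same pipe idea works with any command that reads stdin. In this sketch `seq 1 500` stands in for a command that prints a lot of text, and `wc` and `head` take the place of `less` so the example runs without a keyboard:

```bash
# seq stands in for a command with very long output
total=$(seq 1 500 | wc -l)     # count the lines instead of reading them all
top=$(seq 1 500 | head -3)     # keep only the first three lines

echo "lines produced: $total"
echo "first three:"
echo "$top"

# Interactive equivalent (uncomment in a terminal): seq 1 500 | less
```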

## 4. Shells and Path Variables

### Commands
* **set**: Used in tcsh to set variables (assign value to a string)
* **setenv**: Sets an environment variable in tcsh (`setenv x 3`), so `x` keeps its value in any subshell started from that terminal
* **export**: Does the same in bash (`export x=3`), so `x` keeps its value in any subshell started from that terminal
* **tcsh**: Create tcsh subshell
* **bash**: Create bash subshell


### Shells
* A shell is an environment in which you can type Unix commands: the interpreter that turns typed commands into operations performed by the computer
* Two families of shells: **Bourne shell** (bash/Bourne-again is a widely-used version) and **C-shells** (t-shell/tcsh is a popular variation)
* Most commands we've used so far are built-in commands, but shell-specific differences arise when doing things like setting a variable
    * In bash, type `x=3` (no spaces around the `=`) and then `echo $x` to print 3
    * Type `tcsh` to switch to that shell, then `set x = 3`, then `echo $x`
    * To confirm which shell you're in, use `echo $0`
    * Switching to tcsh shell means we're technically in a subshell: each new terminal is an environment, so we can type exit to leave the subshell
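
A bash-only sketch of why `export` matters: a plain variable is invisible to a subshell, while an exported one is inherited. The `${x:-unset}` form is an addition for this example; it prints the word `unset` when `x` has no value.

```bash
unset x                   # start clean in case x already exists
x=3                       # bash: no spaces around the equals sign
echo "x is $x"

# A plain variable is not passed on to a new bash process
plain=$(bash -c 'echo "${x:-unset}"')

export x                  # mark x as an environment variable
exported=$(bash -c 'echo "${x:-unset}"')

echo "before export the subshell saw: $plain"
echo "after export the subshell saw:  $exported"
```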
    
### FSL Setup and Paths 
* Code in the `.bashrc` (bash run commands) file is run whenever a new interactive bash shell starts, which makes it a common place to update something called the path variable
* **Path variable**: List of directories searched anytime you run a command
    * Can see this list by typing `echo $PATH` 
    * See several absolute paths pointing to different directories, with a colon separating paths
    * When you enter a command, the shell looks for that command within each directory in your path, and then returns error if not found
* Paths allow you to use FSL commands from anywhere in the terminal 
    * Like all packages, FSL has a library containing all functions needed for FSL 
    * To run commands, we need to either be in the directory or specify the absolute path of command we want to run
    * To let us run FSL commands anywhere, we can set path variable to indicate where the FSL library is 
    * Run `ls $FSLDIR/bin` to see all available commands (or binaries, hence name) in FSL library 
    * Should be able to run these from anywhere in directory structure
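
The path mechanism can be tried without FSL installed. In this sketch a temporary directory containing a made-up command, `mytool`, plays the role of `$FSLDIR/bin`; `command -v` (not mentioned above) reports where the shell finds a command.

```bash
# Print each directory in the path on its own line
echo "$PATH" | tr ':' '\n'

# A hypothetical tool directory standing in for something like $FSLDIR/bin
tooldir=$(mktemp -d)
printf '#!/bin/bash\necho "hello from mytool"\n' > "$tooldir/mytool"
chmod +x "$tooldir/mytool"

before=$(command -v mytool || echo "not found")   # not on the path yet
export PATH="$tooldir:$PATH"                      # prepend the directory to the path
after=$(command -v mytool)                        # now the shell can find it
result=$(mytool)                                  # runs from any directory

echo "before: $before"
echo "after:  $after"
echo "$result"
```

Putting the `export PATH=...` line in `.bashrc` is what makes the change apply to every new bash shell.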

## 5. For Loops

### Commands
* **for**: Runs a for loop in bash 
* **seq**: Prints every number in a specified range, e.g. `seq 1 15` prints 1 to 15


### Overview
* Neuroimaging analysis often requires running many commands but with small alterations, like for multiple subjects in a row
* For loops help save time 
* Basic for loop: `for i in 1 2 3; do echo $i; done` prints 1, 2, and 3, one per line
    * First section is called **declaration** where you assign the first item after `in` to variable `i`
        * Numbers after `in` are part of the "list" 
    * Next section: **body**: runs after command `do`
        * Replaces variable with whichever value is currently assigned to it, and then goes back to declaration before repeating the process 
        * Can have more commands in body, separated by semicolon 
    * Lastly we have the **end**, which is just the word `done` and indicates to exit the loop after all items in the list have been run through the body

### Looping over Subjects
* For multiple subjects, we could type each subject name in the list out, or use `seq` 
    * `for i in sub-01 sub-02 … sub-26; do echo $i; done` compared to ``for i in `seq 1 26`; do echo "sub-$i"; done``
* Command within backticks is executed first and expanded, so putting `seq 1 26` in backticks means that code will be executed first 
* Lastly we want to set all subject names to have two digits, ensuring the same length and keeping them in order when listed with `ls` command
    * We can do this with the `-w` option in `seq`
    * This is called zero padding

```bash
# Multi-line
for i in 1 2 3
do echo $i
done

# Single line
for i in 1 2 3; do echo $i; done

# Adding commands to the body section
for i in 1 2 3; do echo $i; echo "Printed number $i"; done

# seq command for many subjects
for i in `seq 1 7`; do echo "sub-$i"; done

# Implementing zero padding using the -w option in seq
for i in `seq -w 1 7`; do echo "sub-$i"; done
```

## 6. Conditional Statements 

### Writing conditional statements for analysis
* Allows us to execute code only if certain conditions are met 
* IF statement is true, THEN do something, ELSE if condition is not met, do something else 
* E.g., check anatomical directory to see if an image has been skull-stripped 
    * If it hasn't been skull-stripped, do skull-stripping
    * If skull-stripped already, do nothing 
```bash
if [[ -e sub-01_T1w_brain_f02.nii.gz ]]; then
    echo "Skull-stripped brain exists" 
fi 
```

### `if` statements 
* Three distinct sections to an if statement:
    * First begins with "if" and evaluates/checks whether condition in brackets is true 
        * Within the brackets, the -e checks if the file exists 
    * Body of conditional: runs if the if statement evaluates to true
        * As many lines as desired
    * Last line is "fi" (if backwards) and ends the conditional
* Formatting has to be precise: for example, there must be a space between the opening brackets and the -e that follows, and another before the closing brackets


### Adding an `else` section 
* Runs code if statement evaluates to false
```bash
if [[ -e sub-01_T1w_brain_f02.nii.gz ]]; then
        echo "Skull-stripped brain exists"
else
        echo "Skull-stripped brain does not exist"
fi
```
* Can use multiple conditionals to get more flexibility using `elif`
```bash
if [[ -e sub-01_T1w_brain_f02.nii.gz ]]; then
        echo "Skull-stripped brain exists"
elif [[ -e sub-01_T1w.nii.gz ]]; then
        echo "Original anatomical brain exists"
else
        echo "Neither the skull-stripped nor the original brain exists"
fi
```

### Logical operators in Unix
* Use `&&` within brackets for AND (check if both files exist)
```bash
if [[ -e sub-01_T1w.nii.gz && -e sub-01_T1w_f02_brain.nii.gz ]]; then
        echo "Both files exist"
fi
```
* Use `||` (two vertical pipes) within brackets for OR (to check if one or the other exists)
```bash
if [[ -e sub-01_T1w.nii.gz || -e sub-01_T1w_f02_brain.nii.gz ]]; then
        echo "At least one of the files exists"
fi
```
* Add a `!` before the `-e` option to check if a file does NOT exist
```bash
if [[ ! -e sub-01_T1w_f02_brain.nii.gz ]]; then
        echo "The skull-stripped brain doesn't exist"
fi
```

```bash
# Linking for loops and conditionals

for i in sub-01 sub-02 sub-03; do
        cd ${i}/anat   # Navigate into the subject's anat directory
        if [[ ! -e ${i}_T1w_f02_brain.nii.gz ]]; then  # Skull-stripped image is missing
                echo "Skull-stripped brain doesn't exist; stripping brain"
                bet2 ${i}_T1w.nii.gz ${i}_T1w_f02_brain.nii.gz -f 0.2
        else
                echo "Skull-stripped brain already exists; doing nothing"
        fi
        cd ../..  # Move up two levels, back to the starting directory
done
```

## 7. Scripting

### Combining Commands
* We can put a series of commands into a script (file containing code)
* This makes it easier to move code between directories, and makes debugging easier

### Writing Scripts 
* Begin each script with a **shebang** indicating it should be interpreted with bash shell: `#!/bin/bash` 
* When writing scripts, it's good practice to use indentation and other visual cues to help with readability 
* Save scripts in code editor using the .sh extension (shell) 

### Running Scripts 
* Run script using `bash fileName.sh` from the folder 
* Could also run `./fileName.sh`, which requires the script to be executable first (`chmod +x fileName.sh`) 
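
A small sketch of both ways to run a script. The script name `greet.sh` and its contents are invented for this example, and it is written into a temporary directory.

```bash
scriptdir=$(mktemp -d)
cd "$scriptdir"

# Write a two-line script: a shebang followed by one command
printf '#!/bin/bash\necho "Hello from greet.sh"\n' > greet.sh

via_bash=$(bash greet.sh)   # works without execute permission
chmod +x greet.sh           # needed once before ./greet.sh will run
via_dot=$(./greet.sh)       # the shebang tells the system to use bash

echo "$via_bash"
echo "$via_dot"
```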

### Wildcards 
* Two types of wildcards:
    * Asterisk matches zero or more characters. 
        * For example, navigate to the Flanker directory and type `mkdir sub-100`. 
        * If you type `ls -d sub-*`, it will return every directory that starts with sub-, whether it is sub-01 or sub-100
        * The asterisk wildcard doesn't care about the length of the name; it will match and return all of them, as long as they start with sub- 
    * Question mark matches exactly one character (any character). 
        * If you type `ls -d sub-??`, it will only return directories with exactly two characters after the dash
        * In other words, it will return sub-01 through sub-26, but not sub-100
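
The two wildcards can be compared side by side in a scratch directory. Here `flanker` is a temporary stand-in for the Flanker directory, with only a few subject folders created, plus a folder named just `sub-` to show that the asterisk also matches zero characters.

```bash
flanker=$(mktemp -d)      # stand-in for the Flanker directory
cd "$flanker"
mkdir sub-01 sub-02 sub-26 sub-100 sub-

star=$(ls -d sub-*)       # zero or more characters after "sub-"
two=$(ls -d sub-??)       # exactly two characters after "sub-"

echo "sub-* matches:";  echo "$star"
echo "sub-?? matches:"; echo "$two"
```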


### Text Manipulation with Awk 
* Awk is a text-processing command that, among other things, prints selected columns from a text file 
* `cat` will print all text in a file, but we can send the output of that command to `awk` as input using a vertical pipe (`|`)
* Use conditional statements in `awk` to select, for example, only certain rows, e.g., print onset times for specific experimental conditions, and redirect the output into a text file 
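
As a sketch of that pattern, the example below builds a tiny made-up timing file (columns: onset, duration, condition) and uses `awk` to pull out columns and filter rows. The values and condition names are invented for illustration and are not taken from the Flanker dataset.

```bash
events=$(mktemp)          # hypothetical timing file: onset, duration, condition
printf '0.0 2.0 congruent\n10.0 2.0 incongruent\n20.0 2.0 congruent\n' > "$events"

# Print only the first column (onset times)
cat "$events" | awk '{print $1}'

# Print onset and duration, but only for rows whose third column is "incongruent",
# and redirect the result into a new text file
onsets=$(mktemp)
cat "$events" | awk '$3 == "incongruent" {print $1, $2}' > "$onsets"
cat "$onsets"
```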

## 8. The Sed Command 

### Overview
* `sed`: **S**tream **ed**itor; takes a stream of text as input and outputs it with one string replaced by another 
* Benefit is that `sed` can change only certain words in a file and leave the rest of the text intact, optionally overwriting the file 
* Example script to edit:
```bash
#!/bin/bash
echo "Hello, world!" 
```

* Running `sed` on this: `sed "s|world|Earth|g" Hello.sh`
* Three parts: the `sed` command itself; the quoted pattern to match and its replacement; and the file to be read into `sed`
    * In the pattern section, "s" means "substitute" one string for another, and "g" means global, i.e. replace every instance on a line rather than only the first
* To redirect the output into a new file: `sed "s|world|Earth|g" Hello.sh > Hello_Earth.sh`
* Editing files in place using the `-i` and `-e` options: `sed -i -e "s|Andy|Bill|g" Hello.sh`
    * `-i` stands for in place: the text file is overwritten once the words have been swapped
    * `-e` lets `-i` work on Macs: macOS `sed` expects a backup suffix after `-i`, so the `-e` is consumed as that suffix and a backup copy named `Hello.sh-e` is left behind
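
The effect of the `g` flag is easier to see on a line that contains the word twice. This sketch writes a variant of `Hello.sh` (with an extra "Goodbye, world!" added for the demonstration) into a temporary directory and avoids `-i` so it behaves the same on Linux and macOS.

```bash
seddir=$(mktemp -d)
cd "$seddir"
printf '#!/bin/bash\necho "Hello, world! Goodbye, world!"\n' > Hello.sh

first_only=$(sed "s|world|Earth|" Hello.sh)       # without g: first match on each line
every=$(sed "s|world|Earth|g" Hello.sh)           # with g: every match on each line
sed "s|world|Earth|g" Hello.sh > Hello_Earth.sh   # Hello.sh itself is left untouched

echo "$first_only"
echo "$every"
bash Hello_Earth.sh
```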


### Using `sed` with for loops 
* `sed` can be combined with for-loops and conditionals to write more sophisticated code
* E.g., creating several copies of a template file and changing only one word over a list of names
* Example:
    * File called Names.sh
    ```bash
    #!/bin/bash
    echo "Hi, my name is CHANGENAME."
    ```
    * Iterate through: 
    ```bash
    for name in Andy John Bill; do
        # No -i here: the template stays unchanged and each edited copy goes to its own file
        sed "s|CHANGENAME|${name}|g" Names.sh > "${name}_Names.sh"
    done
    ```

## 9. Automating the Analysis

### Overview
* Combining conditionals, for-loops, and `sed` turns separate lines of code into a useful script 
* We will run a script from the fMRI short course that is analyzing the Flanker dataset 


### Analyzing the Script
* Initializing the for loop
    * Begins with a shebang and some comments describing what exactly the script does. Backticks are then used to expand `seq -w 1 26` in order to create a loop that will run the body of the code over all of the subjects
    * This will expand to 01, 02, 03 ... 26 and update the number that is assigned to the variable id on each iteration of the loop
    * For example, the first loop of this code will assign the string sub-01 to the variable subj, then echo “===> Starting processing of sub-01”. It will then navigate into the sub-01 directory
* Conditionals to check for skull stripped anatomical 
    * The script then uses a conditional to check whether the skull-stripped anatomical exists, and if it doesn’t, the skull-stripped image is generated
* Editing and running the template files 
    * The template `design_run1.fsf` and `design_run2.fsf` files, which are located in the main Flanker directory, are copied into the current subject's directory
    * `sed` then replaces the string sub-08 with the current value of `subj` that has been assigned in the loop
    * The `*.fsf` files are run with the command `feat`, which is like running the FEAT GUI from the command line
    * Echo commands are used throughout the script to let the user know when a new step is being run, including which run is being analyzed
    * You can run the script by simply typing `bash run_1stLevel_Analysis.sh`
        * The echo commands will print text to the Terminal when a new step is run, and HTML pages will track the progress of the preprocessing and statistics

```bash
#!/bin/bash

# Generate the subject list to make modifying this script
# to run just a subset of subjects easier.

for id in `seq -w 1 26` ; do
    subj="sub-$id"
    echo "===> Starting processing of $subj"
    echo
    cd $subj

        # If the skull-stripped brain doesn't exist, create it
        if [ ! -f anat/${subj}_T1w_brain_f02.nii.gz ]; then
            echo "Skull-stripped brain not found, using bet with a fractional intensity threshold of 0.2"
            bet2 anat/${subj}_T1w.nii.gz \
                anat/${subj}_T1w_brain_f02.nii.gz -f 0.2
        fi

        # Copy the design files into the subject directory, and then
        # change "sub-08" to the current subject number
        cp ../design_run1.fsf .
        cp ../design_run2.fsf .

        # Note that we are using the | character to delimit the patterns
        # instead of the usual / character because there are / characters
        # in the pattern.
        # -i '' is the macOS (BSD) form; GNU sed on Linux takes -i with no argument.
        sed -i '' "s|sub-08|${subj}|g" \
            design_run1.fsf
        sed -i '' "s|sub-08|${subj}|g" \
            design_run2.fsf

        # Now everything is set up to run feat
        echo "===> Starting feat for run 1"
        feat design_run1.fsf
        echo "===> Starting feat for run 2"
        feat design_run2.fsf
        echo

    # Go back to the directory containing all of the subjects, and repeat the loop
    cd ..
done

echo
```