**Loops** are key to productivity improvements through automation as they allow us to execute
commands repetitively. Similar to wildcards and tab completion, using loops also reduces the
amount of typing (and typing mistakes).
Suppose we have several hundred snapshot data files named `snap001.txt`, `snap002.txt`, and so on.
In this example,
we'll use the `high_equilibrium/snapshots` directory, which has only one hundred example files,
but the principles can be applied to many more files at once.

In [1]:
cd high_equilibrium/snapshots/

In [2]:
ls

02-loop.Rmd  snap017.txt  snap034.txt  snap051.txt  snap068.txt  snap085.txt
snap001.txt  snap018.txt  snap035.txt  snap052.txt  snap069.txt  snap086.txt
snap002.txt  snap019.txt  snap036.txt  snap053.txt  snap070.txt  snap087.txt
snap003.txt  snap020.txt  snap037.txt  snap054.txt  snap071.txt  snap088.txt
snap004.txt  snap021.txt  snap038.txt  snap055.txt  snap072.txt  snap089.txt
snap005.txt  snap022.txt  snap039.txt  snap056.txt  snap073.txt  snap090.txt
snap006.txt  snap023.txt  snap040.txt  snap057.txt  snap074.txt  snap091.txt
snap007.txt  snap024.txt  snap041.txt  snap058.txt  snap075.txt  snap092.txt
snap008.txt  snap025.txt  snap042.txt  snap059.txt  snap076.txt  snap093.txt
snap009.txt  snap026.txt  snap043.txt  snap060.txt  snap077.txt  snap094.txt
snap010.txt  snap027.txt  snap044.txt  snap061.txt  snap078.txt  snap095.txt
snap011.txt  snap028.txt  snap045.txt  snap062.txt  snap079.txt  snap096.txt
snap012.txt  snap029.txt  snap046.txt  snap063.txt  snap080.txt  snap097.txt

We would like to modify these files, but also save a version of the original files, naming the copies
`original-snap001.txt`, `original-snap002.txt`, and so on.
We can't use:

```
cp *.txt original-*.txt
```

because that would expand to:

```
cp snap001.txt snap002.txt ... original-*.txt
```

This wouldn't back up our files; instead, we get an error:

In [3]:
cp *.txt original-*.txt

cp: target 'original-*.txt' is not a directory



This problem arises when `cp` receives more than two inputs. When this happens, it
expects the last input to be a directory where it can copy all the files it was passed.
Since there is no directory named `original-*.txt` in the `high_equilibrium/snapshots` directory we get an
error.
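
For contrast, here is a minimal sketch of the case where `cp` does accept more than two inputs. It runs in a throwaway directory, and the `backup` directory and the two small files are made up for illustration:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)
cd "$scratch"

# Two stand-in data files and a destination directory (hypothetical names).
echo "first" > snap001.txt
echo "second" > snap002.txt
mkdir backup

# Three inputs: the last one is an existing directory,
# so cp copies both files into it.
cp snap001.txt snap002.txt backup

ls backup
```

Because `backup` exists and is a directory, `cp` has somewhere to put both files, and `ls backup` lists the two copies.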

Instead, we can use a **loop**
to do some operation once for each thing in a list.
Here's a simple example that displays the first four lines of each file in turn:

In [4]:
for filename in snap001.txt snap002.txt
do
  head -n 4 $filename
done

Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.15	0.6	1.3	-0.01	0.89	1.12
Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.14	0.68	1.41	-0.19	0.75	1.19


Note that it is common practice to indent the line(s) of code within a for loop.
The only purpose is to make the code easier to read -- it is not required for the loop to run.

When the shell sees the keyword `for`,
it knows to repeat a command (or group of commands) once for each item in a list.
Each time the loop runs (called an iteration), an item in the list is assigned in sequence to
the **variable**, and the commands inside the loop are executed, before moving on to 
the next item in the list.
Inside the loop,
we call for the variable's value by putting `$` in front of it.
The `$` tells the shell interpreter to treat
the **variable** as a variable name and substitute its value in its place,
rather than treat it as text or an external command. 

In this example, the list is two filenames: `snap001.txt` and `snap002.txt`.
Each time the loop iterates, it will assign a file name to the variable `filename`
and run the `head` command.
The first time through the loop,
`$filename` is `snap001.txt`. 
The interpreter runs the command `head` on `snap001.txt`
and prints the
first four lines of `snap001.txt`.
For the second iteration, `$filename` becomes
`snap002.txt`. This time, the shell runs `head` on `snap002.txt`
and prints the first four lines of `snap002.txt`.
Since the list contains only two items, the shell then exits the `for` loop.

When using variables it is also
possible to put the names into curly braces to clearly delimit the variable
name: `$filename` is equivalent to `${filename}`, but is different from
`${file}name`. You may find this notation in other people's programs.
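
A short sketch of the difference. The second variable, `file`, is defined here only for illustration:

```bash
filename=snap001.txt
file=snap

# Both forms expand the variable called "filename".
echo "$filename"
echo "${filename}"

# The braces end the name after "file", so the shell expands the
# variable "file" and then appends the literal text "name".
echo "${file}name"

# Braces are needed when text follows the name directly: without them
# the shell would look for a variable called "filename_copy".
echo "${filename}_copy"
```

The first two commands print `snap001.txt`, the third prints `snapname`, and the last prints `snap001.txt_copy`.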

Returning to our example in the `high_equilibrium/snapshots/` directory,
we have called the variable in this loop `filename`
in order to make its purpose clearer to human readers.
The shell itself doesn't care what the variable is called;
if we wrote this loop as:

In [5]:
for x in snap001.txt snap002.txt
do
    head -n 4 $x
done

Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.15	0.6	1.3	-0.01	0.89	1.12
Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.14	0.68	1.41	-0.19	0.75	1.19


or

In [6]:
for temperature in snap001.txt snap002.txt
do
    head -n 4 $temperature
done

Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.15	0.6	1.3	-0.01	0.89	1.12
Mass	x	y	z	vx	vy	vz
M_sol	Parsecs	Parsecs	Parsecs	km/s	km/s	km/s

0.11	1.14	0.68	1.41	-0.19	0.75	1.19


it would work exactly the same way.
*Don't do this.*
Programs are only useful if people can understand them,
so meaningless names (like `x`) or misleading names (like `temperature`)
increase the odds that the program won't do what its readers think it does.

Let's continue with our example in the `high_equilibrium/snapshots/` directory.
Here's a slightly more complicated loop:

In [7]:
for filename in snap001.txt snap002.txt
do
    echo $filename
    head -n 4 $filename | tail -n 1
done

snap001.txt
0.11	1.15	0.6	1.3	-0.01	0.89	1.12
snap002.txt
0.11	1.14	0.68	1.41	-0.19	0.75	1.19


The **loop body**
executes two commands for each of the files in the list.
The first, `echo`, just prints its command-line arguments to standard output.

In this case,
since the shell expands `$filename` to be the name of a file,
`echo $filename` just prints the name of the file.
Note that we can't write this as:

In [8]:
for filename in snap001.txt snap002.txt
do
    $filename
    head -n 4 $filename | tail -n 1
done

bash: snap001.txt: command not found
0.11	1.15	0.6	1.3	-0.01	0.89	1.12
bash: snap002.txt: command not found
0.11	1.14	0.68	1.41	-0.19	0.75	1.19


because then the first time through the loop,
when `$filename` expanded to `snap001.txt`, the shell would try to run `snap001.txt` as a program.
Finally,
the `head` and `tail` combination selects line 4
from whatever file is being processed
(assuming the file has at least 4 lines).
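
The following sketch illustrates the `head` and `tail` combination on two small stand-in files created in a temporary directory (the file names and contents are made up), including what happens when a file has fewer than 4 lines:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)

# A five-line file: head keeps lines 1 to 4, tail keeps the last of those.
printf 'line1\nline2\nline3\nline4\nline5\n' > "$scratch/demo.txt"
head -n 4 "$scratch/demo.txt" | tail -n 1

# A two-line file: head passes both lines through,
# so tail returns the file's last line rather than line 4.
printf 'line1\nline2\n' > "$scratch/short.txt"
head -n 4 "$scratch/short.txt" | tail -n 1
```

The first pipeline prints `line4`. The second prints `line2`, which is why the assumption about the file length matters.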

Going back to our original file copying problem, we can solve it using this loop:

In [9]:
for filename in *.txt
do
    cp $filename original-$filename
done

This loop runs the `cp` command once for each filename.

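Since the `cp` command does not normally produce any output, it's hard to check that the loop is doing the correct thing. Prefixing a command with `echo` shows each command as it would be executed, without running it. Here we instead print a message with `echo` before and after each copy, so that we can follow the loop's progress.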

In [10]:
for filename in *.txt
do
    echo "Copying $filename ..." && cp $filename original-$filename && echo "$filename copied successfully"
done

Copying original-snap001.txt ...
original-snap001.txt copied successfully
Copying original-snap002.txt ...
original-snap002.txt copied successfully
Copying original-snap003.txt ...
original-snap003.txt copied successfully
Copying original-snap004.txt ...
original-snap004.txt copied successfully
Copying original-snap005.txt ...
original-snap005.txt copied successfully
Copying original-snap006.txt ...
original-snap006.txt copied successfully
Copying original-snap007.txt ...
original-snap007.txt copied successfully
Copying original-snap008.txt ...
original-snap008.txt copied successfully
Copying original-snap009.txt ...
original-snap009.txt copied successfully
Copying original-snap010.txt ...
original-snap010.txt copied successfully
Copying original-snap011.txt ...
original-snap011.txt copied successfully
Copying original-snap012.txt ...
original-snap012.txt copied successfully
Copying original-snap013.txt ...
original-snap013.txt copied successfully
Copying original-snap014.txt ...
original-snap014.t

snap017.txt copied successfully
Copying snap018.txt ...
snap018.txt copied successfully
Copying snap019.txt ...
snap019.txt copied successfully
Copying snap020.txt ...
snap020.txt copied successfully
Copying snap021.txt ...
snap021.txt copied successfully
Copying snap022.txt ...
snap022.txt copied successfully
Copying snap023.txt ...
snap023.txt copied successfully
Copying snap024.txt ...
snap024.txt copied successfully
Copying snap025.txt ...
snap025.txt copied successfully
Copying snap026.txt ...
snap026.txt copied successfully
Copying snap027.txt ...
snap027.txt copied successfully
Copying snap028.txt ...
snap028.txt copied successfully
Copying snap029.txt ...
snap029.txt copied successfully
Copying snap030.txt ...
snap030.txt copied successfully
Copying snap031.txt ...
snap031.txt copied successfully
Copying snap032.txt ...
snap032.txt copied successfully
Copying snap033.txt ...
snap033.txt copied successfully
Copying snap034.txt ...
snap034.txt copied successfully
Copying snap035.txt ...
snap035.tx

The `&&` operator separates items in a list of commands. The command after `&&` is executed if, and only if, the command before it returns an exit status of zero. This ensures that if `cp` fails for any reason, the success message is not printed and the failure is visible in our log. To run a command only when the preceding one returns a non-zero exit status, use the `||` operator instead.
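
As a small illustration of both operators, the sketch below uses the standard commands `true` (always exits with status zero) and `false` (always exits with a non-zero status). The file `missing.txt` is a made-up name that is assumed not to exist:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)

true && echo "printed: the command before && succeeded"
false && echo "not printed: the command before && failed"
false || echo "printed: the command before || failed"

# cp fails because the source file does not exist, so the && branch is
# skipped and the || branch runs. 2>/dev/null hides cp's own error message.
cp "$scratch/missing.txt" "$scratch/copy.txt" 2>/dev/null && echo "copied" || echo "copy failed"
```

The last line prints `copy failed`. Note that in a chain of the form `A && B || C`, the command `C` also runs if `B` itself fails, so this pattern is best kept for simple messages.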

In our example, we named the new files `original-$filename`. If we ran the same loop, `for filename in *.txt`, again, we would end up with files `snap001.txt`, `original-snap001.txt`, `original-original-snap001.txt`, `snap002.txt`, `original-snap002.txt`, `original-original-snap002.txt`, and so on. We can use shell parameter expansion to avoid this. We will look at only three expansions here; see the [manual](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html#Shell-Parameter-Expansion) for more.

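To extract part of a parameter's value (a substring), we can use `${parameter:offset:length}`. Offsets start at zero. A negative offset counts from the end of the value and must be preceded by a space, and a negative length is interpreted as a position counted from the end rather than as a number of characters.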

In [11]:
string=01234567890abcdefgh
echo ${string:7}

7890abcdefgh


In [12]:
echo ${string:7:0}




In [13]:
echo ${string:7:2}

78


In [14]:
echo ${string:7:-2}

7890abcdef


In [15]:
echo ${string: -7}

bcdefgh


In [16]:
echo ${string: -7:0}




In [17]:
echo ${string: -7:2}

bc


In [18]:
echo ${string: -7:-2}

bcdef
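
Applied to our file names, a possible use of this expansion is to pull out the snapshot number. This is only a sketch, and the variable names `number` and `extension` are chosen here for illustration:

```bash
filename=snap001.txt

# Skip the 4 characters of "snap", then take the next 3 characters.
number=${filename:4:3}
echo "$number"

# A negative offset (note the space before the minus sign) counts from
# the end: the last 3 characters are the extension without the dot.
extension=${filename: -3}
echo "Snapshot $number has extension $extension"
```

This prints `001` followed by `Snapshot 001 has extension txt`.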


To replace part of the parameter, we can use `${parameter/pattern/string}`.

In [19]:
filename=snap001.txt

In [20]:
echo ${filename/snap/shot}

shot001.txt


In [21]:
echo ${filename/.txt/.backup}

snap001.backup


Replacing part of the parameter is very handy when using commands that require the names of both the input and the output file, and the two names have different extensions.
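
For instance, the sketch below sorts each input file into an output file with a different extension. The `.sorted` extension, the use of `sort`, and the file contents are assumptions made for this example, and everything runs in a temporary directory:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)
cd "$scratch"

# Two small stand-in input files.
printf 'b\na\nc\n' > snap001.txt
printf 'z\ny\n' > snap002.txt

for filename in snap*.txt
do
    # Replace the first ".txt" in the name with ".sorted" to build the output name.
    output=${filename/.txt/.sorted}
    sort "$filename" > "$output"
    echo "$filename -> $output"
done
```

Each input file is left untouched, and a sorted copy is written next to it as `snap001.sorted`, `snap002.sorted`, and so on.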

The third and last expansion that we are going to see is called command substitution. Command substitution allows the output of a command to replace the command itself. Command substitution occurs when a command is enclosed as follows:

```
$(command)
```

Bash performs the expansion by executing `command` in a subshell environment and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Suppose that you keep a file called `snap2run.list` with a list of the files that you want to process.

In [22]:
cat snap2run.list

snap001.txt
snap002.txt


You can use command substitution to get the names from that file.

In [23]:
for filename in $(cat snap2run.list)
do
    echo "Copying $filename ..." && cp $filename original-$filename && echo "$filename copied successfully"
done

Copying snap001.txt ...
snap001.txt copied successfully
Copying snap002.txt ...
snap002.txt copied successfully


In the previous example, we used `cat`, but we could have used any other command.
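
As one possible alternative to `cat`, the sketch below generates the file names instead of reading them from a list. It assumes the `seq` utility is available (it is on most Linux and macOS systems) and uses `printf` to pad each number with zeros:

```bash
# seq 1 3 prints the numbers 1, 2 and 3, one per line.
for number in $(seq 1 3)
do
    # %03d pads the number to three digits, matching the snapshot names.
    filename=$(printf 'snap%03d.txt' "$number")
    echo "$filename"
done
```

This prints `snap001.txt`, `snap002.txt` and `snap003.txt`, one per line. The names are only printed here, so the sketch does not depend on the files existing.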

To decide whether or not to process a given file, we can use Bash's built-in conditional expressions.

Expressions may be unary or binary. Unary expressions are often used to examine the status of a file. There are string operators and numeric comparison operators as well. The full list of expressions is below.

- `-a file`: True if file exists.
- `-b file`: True if file exists and is a block special file.
- `-c file`: True if file exists and is a character special file.
- `-d file`: True if file exists and is a directory.
- `-e file`: True if file exists.
- `-f file`: True if file exists and is a regular file.
- `-g file`: True if file exists and its set-group-id bit is set.
- `-h file`: True if file exists and is a symbolic link.
- `-k file`: True if file exists and its "sticky" bit is set.
- `-p file`: True if file exists and is a named pipe (FIFO).
- `-r file`: True if file exists and is readable.
- `-s file`: True if file exists and has a size greater than zero.
- `-t fd`: True if file descriptor fd is open and refers to a terminal.
- `-u file`: True if file exists and its set-user-id bit is set.
- `-w file`: True if file exists and is writable.
- `-x file`: True if file exists and is executable.
- `-G file`: True if file exists and is owned by the effective group id.
- `-L file`: True if file exists and is a symbolic link.
- `-N file`: True if file exists and has been modified since it was last read.
- `-O file`: True if file exists and is owned by the effective user id.
- `-S file`: True if file exists and is a socket.
- `file1 -ef file2`: True if file1 and file2 refer to the same device and inode numbers.
- `file1 -nt file2`: True if file1 is newer (according to modification date) than file2, or if file1 exists and file2 does not.
- `file1 -ot file2`: True if file1 is older than file2, or if file2 exists and file1 does not.
- `-o optname`: True if the shell option optname is enabled. The list of options appears in the description of the `-o` option to the `set` builtin (see "The Set Builtin" in the Bash manual).
- `-v varname`: True if the shell variable varname is set (has been assigned a value).
- `-R varname`: True if the shell variable varname is set and is a name reference.
- `-z string`: True if the length of string is zero.
- `-n string`: True if the length of string is non-zero.
- `string1 == string2`: True if the strings are equal.
- `string1 != string2`: True if the strings are not equal.
- `string1 < string2`: True if string1 sorts before string2 lexicographically.
- `string1 > string2`: True if string1 sorts after string2 lexicographically.
- `arg1 OP arg2`: `OP` is one of `-eq`, `-ne`, `-lt`, `-le`, `-gt`, or `-ge`. These arithmetic binary operators return true if `arg1` is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to `arg2`, respectively. `arg1` and `arg2` may be positive or negative integers.
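
To give a feel for the file tests, here is a minimal sketch that uses `-e` and `-s` inside `[[ ... ]]`. The three file names are made up, and the files are created in a temporary directory:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)
cd "$scratch"

echo "data" > snap001.txt   # exists and has content
touch empty.txt             # exists but has size zero
                            # missing.txt is deliberately not created

for filename in snap001.txt empty.txt missing.txt
do
    if [[ ! -e $filename ]]
    then
        echo "$filename does not exist"
    elif [[ -s $filename ]]
    then
        echo "$filename exists and is not empty"
    else
        echo "$filename exists but is empty"
    fi
done
```

The `!` negates a test, so `[[ ! -e $filename ]]` is true when the file is absent. The loop reports `snap001.txt` as not empty, `empty.txt` as empty, and `missing.txt` as not existing.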

For example, to skip `snap001.txt` in the loop, we could write:

In [24]:
for filename in $(cat snap2run.list)
do
    if [[ $filename != "snap001.txt" ]]
    then
        echo "Copying $filename ..." && cp $filename original-$filename && echo "$filename copied successfully"
    fi
done

Copying snap002.txt ...
snap002.txt copied successfully
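
A conditional expression can also be combined with the substring expansion described earlier to avoid creating `original-original-` files when the copy loop is run more than once. The following is one possible sketch, run on stand-in files in a temporary directory:

```bash
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch=$(mktemp -d)
cd "$scratch"

# Simulate a directory where the loop has already been run once for snap001.txt.
touch snap001.txt snap002.txt original-snap001.txt

for filename in *.txt
do
    # "original-" is 9 characters long, so compare the first 9 characters.
    if [[ ${filename:0:9} == "original-" ]]
    then
        echo "Skipping $filename"
    else
        cp "$filename" "original-$filename"
    fi
done

ls
```

The list `*.txt` is expanded once, before the first iteration, so copies created inside the loop are not visited. Afterwards the directory holds `snap001.txt`, `snap002.txt` and one `original-` copy of each, with no `original-original-` files.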
