# Expansions

Individual words and operators on the command line are called tokens. The shell performs expansions on the
tokens, replacing each token with a new string or strings.

There are seven kinds of expansions:

1. brace expansion
2. tilde expansion
3. parameter expansion
4. command substitution
5. arithmetic expansion
6. word splitting
7. filename expansion

The list above shows the order in which expansions are performed, with one exception: tilde expansion,
parameter expansion, command substitution, and arithmetic expansion (2 through 5) have the same precedence
and are performed together, left to right.
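
The ordering has practical consequences. As a small sketch (the variable name `n` is only an illustration), because brace expansion happens before parameter expansion, a variable cannot be used as an endpoint of a sequence expression (sequence expressions are described in the next section):

```bash
n=3

# Brace expansion runs first and sees the literal text {1..$n}, which is not
# a valid sequence expression, so the braces are left unchanged. Parameter
# expansion then replaces $n, producing the string {1..3}.
result=$(echo {1..$n})
echo "$result"
```

The output is the literal string `{1..3}` rather than the numbers 1 through 3.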

<div class="alert alert-block alert-info">
    The code in this notebook assumes that the current working directory is
    <code>./scripts/expansions</code>. Run the next cell once when using this notebook.
    <br /><br />
    If you see errors related to missing files then the working directory is probably incorrect.
</div>

In [None]:
# run exactly once before using this notebook
cd scripts/expansions

## Brace expansion

Brace expansion is a mechanism for generating strings that follow a pattern.
Brace expansion has the form *pre*`{`*expression*`}`*post* where:

* *pre* is called the preamble and is optional
* *post* is called the postscript and is optional
* *expression* is either
    * a series of two or more comma separated strings, or
    * a sequence expression

The preamble is the string that appears at the beginning of every generated string.
The postscript is the string that appears at the end of every generated string.
The *expression* generates the part between the preamble and postscript of every generated string.


In [None]:
echo {a,b,c}

In [None]:
echo before_{and,\&,also}_after

A sequence expression has the form `x..y..incr` where `x` and `y` are integers or single characters
and `incr` is the optional integer increment (step size):

In [None]:
echo {0..9}

In [None]:
echo {9..0}

In [None]:
echo {10..20..2}

In [None]:
echo {a..g}

In [None]:
echo {a..z..2}

Brace expansions can be used to create files or directories named using year-month:

In [None]:
mkdir {2019..2022}-{01..12}
ls

We can create the files `file00.txt` through `file30.txt` with a brace expansion:

In [None]:
touch file{00..30}.txt
ls
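
Both examples above rely on zero-padded endpoints. A minimal sketch, assuming Bash 4 or later (older versions do not pad): when an endpoint of an integer sequence is written with a leading zero, every generated number is padded to the same width. Using `echo` first is a cheap way to preview what a brace expansion will produce before handing it to `mkdir` or `touch`.

```bash
# Endpoints written with a leading zero produce zero-padded numbers
padded=$(echo {08..11})
echo "$padded"

# Without the leading zero there is no padding
unpadded=$(echo {8..11})
echo "$unpadded"

# Preview the generated names before creating anything
preview=$(echo file{00..03}.txt)
echo "$preview"
```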

The Bash manual gives the following example (slightly modified) that illustrates a nested brace expansion:

In [None]:
echo /usr/{ucb/{ex,edit},lib/{a,how_ex}}

## Tilde expansion

In an earlier notebook, it was mentioned that `~` expands to the absolute path of the user's 
home directory.

In [None]:
echo ~

If there are multiple users on your system, then `~`*username* expands to the absolute path
of the home directory of *username*: 

In [None]:
echo ~cisc220
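
One detail worth knowing: tilde expansion only applies to an unquoted `~` at the start of a word. The following sketch assumes the `HOME` variable is set, which is where Bash takes the value of a plain `~` from.

```bash
# Unquoted: the tilde is expanded to the home directory
unquoted=$(echo ~)
echo "$unquoted"

# Quoted: the tilde is kept as a literal character
quoted=$(echo "~")
echo "$quoted"

# Not at the start of a word: no expansion either
middle=$(echo file~)
echo "$middle"
```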

## Parameter expansion

Parameter expansion occurs when accessing the value of a variable or parameter.

See the *Variables* notebook.
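
As a brief reminder, here is a minimal sketch of the two common forms, `$name` and `${name}` (the variable `course` is made up for illustration). The braces mark where the variable name ends, which matters when the value is followed directly by characters that could be part of a name.

```bash
course="cisc220"

# Plain form
echo "$course"

# Braces are needed here; without them the shell would look for
# a variable named course_notes instead of course
labelled="${course}_notes"
echo "$labelled"
```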

## Command substitution

Command substitution allows the output of a command to replace the command itself. Command substitution
has the form

`$(`*command*`)`

The *command* is executed in a subshell and the standard output of the command is substituted.
Command substitution is often used to store the output of a command in a variable or to embed the output
in a string. The following example is from https://mywiki.wooledge.org/CommandSubstitution

In [None]:
echo "Today is $(date +%A), it's $(date +%H:%M)"

In the above example, the outputs of the two `date` commands are substituted into the string printed
by `echo` using command substitutions.

The output of the `cowsay` command can be saved in a variable like so:

In [None]:
moo=$(cowsay "Mooooo!")

Running the previous cell produces no output because the output of `cowsay` is saved in the variable `moo`
instead of being sent to standard output. Of course, the value of `moo` can be printed or used for some
other computation:

In [None]:
echo "$moo"
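
The double quotes around `$moo` matter because the output spans several lines. The sketch below uses `printf` as a stand-in for `cowsay` (so it runs even where `cowsay` is not installed) and shows two things: command substitution removes trailing newlines, and an unquoted expansion loses its line breaks and indentation to word splitting.

```bash
# printf stands in for a command that prints several lines
art=$(printf 'line one\n   indented line\n\n\n')

# Quoted: line breaks and indentation are preserved; the trailing
# newlines were already removed by the command substitution
echo "$art"

# Unquoted: word splitting turns the value into separate words,
# which echo prints separated by single spaces
echo $art
```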

## Arithmetic expansion

Arithmetic expansion has the form:

`$((` *expression* `))`

where *expression* is an integer arithmetic expression. The arithmetic expansion is evaluated and the value
is substituted.

In [None]:
echo $(( 1 + 1 ))

In [None]:
# silly example
mkdir dir$((2 * 5))
ls

Arithmetic expansions are somewhat uncommon when using the command line, but are frequently
used in scripts. See the *Arithmetic* notebook for details.
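
A short sketch of arithmetic expansion with variables (the names `width` and `height` are illustrative). Inside `$(( ))` variables can be written without the leading `$`, and division is integer division.

```bash
width=7
height=2

area=$(( width * height ))
echo "$area"

# Integer division discards the remainder; % gives the remainder
quotient=$(( width / height ))
remainder=$(( width % height ))
echo "$quotient remainder $remainder"
```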

## Word splitting

After all of the previous expansions have occurred, the results of parameter expansion, command substitution, and arithmetic expansion that did not occur inside double quotes undergo word splitting.
Word splitting splits a string into separate parts or fields where each field is delimited by one or more
field separator characters. Word splitting results in a sequence (list) of one or more fields.

The default field separator characters are the space, the tab character, and the
newline character.

The occurrence of word splitting is why spaces are discouraged in filenames. You may have noticed the file
with name `CISC220 Assignment solutions.zip`. If you try to unzip the file like so:

In [None]:
unzip CISC220 Assignment solutions.zip

you will see that the `unzip` program complains that it cannot find the file `CISC220`. The unquoted spaces have
caused the filename to be broken up into three separate words: `CISC220`, `Assignment`, and `solutions.zip`.
Quoting the filename or escaping its spaces is needed to keep it as a single word.
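
A small sketch makes the difference visible. The helper `count_args` below is hypothetical and simply reports how many arguments it received, standing in for `unzip`.

```bash
# hypothetical helper: prints the number of arguments it was given
count_args() {
    echo "$#"
}

count_args CISC220 Assignment solutions.zip      # three words
count_args "CISC220 Assignment solutions.zip"    # quoted: one word
count_args CISC220\ Assignment\ solutions.zip    # escaped spaces: one word

# The same care is needed when the name is stored in a variable
file="CISC220 Assignment solutions.zip"
count_args $file      # unquoted expansion is word split into three words
count_args "$file"    # quoted expansion stays one word
```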

The shell uses the *internal field separator* variable `IFS` to store the field separator characters.
By default `IFS` contains the space, tab, and newline characters. Each character currently stored in `IFS`
is considered to be a valid field separator. If two or more of the whitespace separators (space, tab, newline)
occur adjacent to each other then the shell treats them as a single field separator; adjacent non-whitespace
separators, such as two commas in a row, instead delimit an empty field.
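
Because the default separators are all whitespace, printing `IFS` directly shows nothing useful. One way to inspect it, sketched here, is the `%q` format of Bash's built-in `printf`, which prints a value in a quoted form that makes whitespace visible.

```bash
# Show the current value of IFS with its whitespace made visible
printf '%q\n' "$IFS"
```

With the default value of `IFS` this prints `$' \t\n'`: a space, a tab, and a newline.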

Note that word splitting occurs after all other expansions except for filename expansion. This can cause
surprising results. Consider the following string containing multiple default field separators:

In [None]:
str="field1    field2       field3


field4"

To use the value of `str` we use the parameter expansion `$str`; however, word splitting occurs *after
the parameter expansion*. Try printing the value of `str` using echo:

In [None]:
echo $str

Notice that the multiple spaces between `field1` and `field2`, `field2` and `field3`, and the three newline
characters between `field3` and `field4` are each replaced with a single space.
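
Placing the expansion inside double quotes suppresses word splitting, so the value is passed to `echo` as a single argument with its whitespace intact. A minimal sketch using the same value of `str`:

```bash
str="field1    field2       field3


field4"

# Unquoted: split into four words, which echo prints with single spaces between them
echo $str

# Quoted: no word splitting, so the spaces and newlines are preserved
echo "$str"
```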

It is sometimes useful
to manipulate `IFS` to modify how word splitting is performed. For example, suppose that you have a string
containing comma separated fields (in this case, a student number, last name, first name):

In [None]:
str="123456,Ester,Polly"

By temporarily changing the value of `IFS` we can modify the behavior of word splitting:

In [None]:
# temporarily change IFS
OLDIFS=$IFS
IFS=","

echo $str

# restore IFS so that word splitting behaves as normal
IFS=$OLDIFS

The sequence of fields produced by word splitting can be processed using a `for` loop:

In [None]:
# temporarily change IFS
OLDIFS=$IFS
IFS=","

for field in $str; do
    echo "$field"
done

# restore IFS so that word splitting behaves as normal
IFS=$OLDIFS

Alternatively, the list of fields can be passed to another command or function for processing.
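
Another common approach, sketched here as an alternative to saving and restoring `IFS`, is to set `IFS` only for a single `read` command. An assignment placed in front of a command applies to that command alone, so the shell's own `IFS` is left untouched. The variable names `id`, `last`, and `first` are illustrative.

```bash
str="123456,Ester,Polly"

# IFS is changed only for this one read command;
# the here-string <<< feeds the value of str to read as input
IFS="," read -r id last first <<< "$str"

echo "student number: $id"
echo "last name: $last"
echo "first name: $first"
```

Because nothing has to be restored afterwards, this form is harder to get wrong than the save-and-restore pattern.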

There are many rules describing the precise behavior of word splitting. These rules are described in the Bash manual: https://www.gnu.org/software/bash/manual/html_node/Word-Splitting.html

## Filename expansion

The last expansion performed is filename expansion, which occurs when wildcards are used. See the *Filename expansion* notebook for details.
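
Because filename expansion happens after word splitting, the filenames it produces are not split again, even when they contain spaces. The sketch below demonstrates this in a temporary directory created with `mktemp -d` (the filenames are made up for illustration).

```bash
# work in a throwaway directory so no existing files are touched
tmpdir=$(mktemp -d)
touch "$tmpdir/my notes.txt" "$tmpdir/todo.txt"

# *.txt expands to one word per matching file, spaces and all
count=0
for f in "$tmpdir"/*.txt; do
    echo "found: $(basename "$f")"
    count=$(( count + 1 ))
done
echo "$count files"

# clean up the temporary directory
rm -r "$tmpdir"
```

The loop runs twice, once per file, even though one of the names contains a space.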