# Filename expansion

Shell commands frequently take filenames as arguments.
Filename expansion, often called *globbing*, uses special characters called wildcards to specify
groups of filenames succinctly. If an unquoted `*`, `?`, or `[` appears in a word, the shell treats
the word as a pattern and replaces it with an alphabetically sorted list of the filenames
that match the pattern.
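
The effect of quoting can be seen in a small experiment. The sketch below is only an illustration: it builds a throwaway directory with `mktemp -d`, and the filenames `one.txt` and `two.txt` are hypothetical, not files from this notebook.

```shell
# Create a scratch directory with hypothetical files so the demo is self-contained.
demo=$(mktemp -d)
touch "$demo/one.txt" "$demo/two.txt"

# Unquoted: the shell replaces the pattern with the matching filenames.
unquoted=$(cd "$demo" && echo *.txt)

# Quoted: the pattern reaches echo as a literal string.
quoted=$(cd "$demo" && echo "*.txt")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
rm -r "$demo"
```

With default shell settings, the unquoted form should print the two filenames and the quoted form should print the pattern itself.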

The `glob` man page can be found by typing:

```
man 7 glob
```

The official documentation can be found at:

* [GNU Bash manual page for Pattern Matching](https://www.gnu.org/software/bash/manual/html_node/Pattern-Matching.html)

## Wildcards

The characters `*`, `?`, and `[` are called wildcards. A string is a wildcard pattern if it contains
at least one wildcard. The following table describes how the wildcards match filenames:

| Wildcard | Description |
| :--- | :--- |
| `*` | matches any string including the empty string |
| `?` | matches any single character |
| `[`*characters*`]` | matches any single character in the set *characters* |
| `[!`*characters*`]` | matches any single character not in the set *characters* |
| `[[:`*class*`:]]` | matches any single character in the specified POSIX class |

Some examples of wildcard patterns are shown in the table below:

| Pattern | Matches |
| :--- | :--- |
| `*` | All files whose names do not start with `.` (hidden files are skipped by default) |
| `a*` | All files starting with `a` |
| `A*` | All files starting with `A` |
| `*.txt` | All files ending with `.txt` |
| `a*.txt` | All files starting with `a` and ending with `.txt` |
| `???` | Any three-character filename |
| `x?z` | Any three-character filename starting with `x` and ending with `z` |
| `x[yY12]z` | `xyz` or `xYz` or `x1z` or `x2z` |
| `x[a-z]z` | Any three-character filename starting with `x`, followed by a lowercase letter, and ending with `z` |
| `x[0-9]z` | Any three-character filename starting with `x`, followed by a digit, and ending with `z` |
| `[0-9][0-9].pdf` | Any filename starting with two digits and ending in `.pdf` |
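
Several of the patterns in the table can be tried without touching the notebook's own files. The following is a minimal sketch that assumes a scratch directory populated with hypothetical names (`xyz`, `x1z`, `01.pdf`, and so on); it also includes a hidden file to show that `*` skips names starting with `.` under default settings.

```shell
# Scratch directory with hypothetical filenames chosen to exercise the patterns.
demo=$(mktemp -d)
touch "$demo/xyz" "$demo/x1z" "$demo/x2z" "$demo/abc" \
      "$demo/01.pdf" "$demo/notes.txt" "$demo/.hidden"

# Each pattern is expanded inside the scratch directory.
all=$(cd "$demo" && echo *)                 # everything except the hidden file
three=$(cd "$demo" && echo ???)             # names exactly three characters long
xz=$(cd "$demo" && echo x?z)                # x, any one character, z
digit=$(cd "$demo" && echo x[0-9]z)         # x, one digit, z
pdf=$(cd "$demo" && echo [0-9][0-9].pdf)    # two digits followed by .pdf

echo "*              -> $all"
echo "???            -> $three"
echo "x?z            -> $xz"
echo "x[0-9]z        -> $digit"
echo "[0-9][0-9].pdf -> $pdf"

# Clean up the scratch directory.
rm -r "$demo"
```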

A wildcard pattern that contains no `/` is matched against the names in the current working directory.
If it matches one or more names, the pattern is replaced with the sorted list of those filenames.
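
When nothing matches, Bash with default settings leaves the pattern in place as an ordinary word (the `nullglob` shell option changes this, but it is not used here). A minimal sketch, assuming a scratch directory that holds a single hypothetical file `report.txt`:

```shell
# Scratch directory containing one hypothetical file.
demo=$(mktemp -d)
touch "$demo/report.txt"

# This pattern matches, so it is replaced by the filename.
match=$(cd "$demo" && echo *.txt)

# This pattern matches nothing, so by default it is passed through unchanged.
nomatch=$(cd "$demo" && echo *.pdf)

echo "matching pattern:     $match"
echo "non-matching pattern: $nomatch"

# Clean up the scratch directory.
rm -r "$demo"
```

This is why a command such as `ls *.pdf` in a directory without PDF files complains about a file literally named `*.pdf`.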

In the directory `./scripts/filename_expansion` you will find many files. The files are empty with the
exception of `A.txt` and `a.txt`. You can test the examples above by switching into the directory containing
the files and using `echo` to print the list of matching filenames, or by using `ls` to list the matching
filenames.

In [None]:
# run this cell exactly once each time you open this notebook
cd ./scripts/filename_expansion

In [None]:
# change the wildcard pattern to see the effects on matching
echo *

In [None]:
# change the wildcard pattern to see the effects on matching
ls *

Many commands (such as `ls`) accept a list of filenames, which makes filename expansion a powerful tool for
supplying arguments to them. For example, we can concatenate the contents of all the `.txt` files
in the directory using a wildcard pattern:

In [None]:
cat *.txt
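
The shell performs the expansion before the command runs, so the command receives each matching filename as a separate argument. The sketch below illustrates this under stated assumptions: the scratch directory and the files `a.txt`, `b.txt`, and `c.log` are hypothetical, and `set --` is used only as a convenient way to count the expanded words.

```shell
# Scratch directory with two hypothetical text files and one unrelated file.
demo=$(mktemp -d)
printf 'first\n' > "$demo/a.txt"
printf 'second\n' > "$demo/b.txt"
touch "$demo/c.log"

# cat receives a.txt and b.txt as two arguments; c.log does not match.
combined=$(cd "$demo" && cat *.txt)

# set -- stores the expanded words as positional parameters; $# counts them.
count=$(cd "$demo" && set -- *.txt && echo "$#")

echo "$combined"
echo "number of arguments: $count"

# Clean up the scratch directory.
rm -r "$demo"
```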

## POSIX character classes

POSIX defines named classes of characters, written with the class name enclosed by `[:` and `:]`.
A POSIX character class may be used only inside the square brackets of a wildcard, so matching a
digit is written `[[:digit:]]`, not `[:digit:]`. The classes are described in the table below:

| POSIX character class | Description |
| :--- | :--- |
| `[:alnum:]` | Alphanumeric characters made up of `[:alpha:]` and `[:digit:]` |
| `[:alpha:]` | Alphabetic characters made up of `[:lower:]` and `[:upper:]` |
| `[:blank:]` | The blank characters space and tab |
| `[:cntrl:]` | Control characters (mostly non-printing) |
| `[:digit:]` | The digits 0 through 9 |
| `[:graph:]` | Graphical characters made up of `[:alnum:]` and `[:punct:]`  |
| `[:lower:]` | The lowercase letters a through z |
| `[:print:]` | The printable characters made up of `[:alnum:]`, `[:punct:]`, and space |
| `[:punct:]` | The punctuation characters: `` ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { \| } ~ `` |
| `[:space:]` | All whitespace characters: tab, newline, vertical tab, form feed, carriage return, and space |
| `[:upper:]` | The uppercase letters A through Z |
| `[:xdigit:]` | The hexadecimal digits 0 through 9, A through F, and a through f |

Some examples of wildcard patterns using the POSIX classes are shown in the table below:

| Pattern | Matches |
| :--- | :--- |
| `[[:upper:]]*` | All files starting with an uppercase letter |
| `[![:upper:]]*` | All files not starting with an uppercase letter |
| `x[[:digit:]]` | Any two-character filename starting with `x` followed by one digit |

In [None]:
echo [[:upper:]]*
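
The three patterns from the table above can also be checked against a known set of names. This is a minimal sketch that assumes a scratch directory with the hypothetical files `Alpha`, `beta`, `x7`, and `xy`:

```shell
# Scratch directory with hypothetical filenames.
demo=$(mktemp -d)
touch "$demo/Alpha" "$demo/beta" "$demo/x7" "$demo/xy"

# Names that start with an uppercase letter.
upper=$(cd "$demo" && echo [[:upper:]]*)

# Names that do not start with an uppercase letter.
not_upper=$(cd "$demo" && echo [![:upper:]]*)

# Two-character names: x followed by exactly one digit.
x_digit=$(cd "$demo" && echo x[[:digit:]])

echo "[[:upper:]]*  -> $upper"
echo "[![:upper:]]* -> $not_upper"
echo "x[[:digit:]]  -> $x_digit"

# Clean up the scratch directory.
rm -r "$demo"
```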