## What Is the Command Line?

The command line. What is it, and why is it so important? In this lesson, we'll answer a few fundamental questions, including the following:

- What is the command line?
- Why is it important?
- How do we use it?

We've provided many hyperlinks in this lesson in case you want to explore anything we introduce in more detail, but it's not necessary to read them to complete this lesson. We recommend that you wait to explore the hyperlinks until after you've completed the lesson.

Let's get started!

Most people interact with computers using [graphical user interfaces](https://en.wikipedia.org/wiki/Graphical_user_interface) (GUIs). A GUI is an interface that allows its users to interact with computers using icons and [pointing devices](https://en.wikipedia.org/wiki/Pointing_device), like mice and trackpads.

Examples of GUIs include familiar operating systems like Android (phones are computers, too!), iOS, OS X, Ubuntu, and Windows, as well as the most popular internet browsers like Google Chrome and Mozilla Firefox — even the Dataquest platform you're using now!

Below are some screenshots of GUIs. We can see on the left a file explorer on an Android phone, and on the right, we can see the desktop of a laptop running Ubuntu - both of which are based on the [Linux kernel](https://en.wikipedia.org/wiki/Linux_kernel) (the computer program that makes Linux what it is).



But people haven't always interacted with computers this way. Historically, when computers were expensive monstrosities owned only by large institutions and shared by multiple people, we worked on computers using [command-line interfaces](https://en.wikipedia.org/wiki/Command-line_interface) (CLIs). A CLI is a text-only interface into which users type text-based instructions in a [console](http://www.linfo.org/console.html) (or terminal), using specific syntax.

As computers became more common, we began [emulating terminals](https://en.wikipedia.org/wiki/Terminal_emulator) within GUIs. A terminal emulator is also known as a terminal window or just a terminal. These terminal emulators look like the window you see on the right side of the following screenshot. On the left, we see a browser window, which is another GUI.

The instructions mentioned above are commands, which, once input, are [interpreted](https://en.wikipedia.org/wiki/Interpreter_(computing)) by a type of program called a [shell](https://en.wikipedia.org/wiki/Shell_(computing)) or command language interpreter, and then your machine runs the commands. Some of the most popular shells are Bash, Z shell, KornShell, Command Prompt, and Windows PowerShell. If you have installed Anaconda on Windows, then you might have noticed one of its programs is Anaconda Prompt, which is another shell. It's basically the Command Prompt with a few settings on top.

Strictly speaking, the terms command-line interface, command language interpreter, shell, console, and terminal window don't all mean the same thing, but in practice they are often used interchangeably.

The most common operating systems for laptops and desktop computers are Linux, OS X and Windows, and they all come equipped with a terminal. Linux and OS X are Unix-like (or *nix) operating systems; that means they behave like Unix (an old, still-in-use operating system), and this similarity is shared by their respective command language interpreters, whereas Windows' shells are very different from the others.

Because it's more common to use Unix-like operating systems in data science, in this lesson, we'll use a Unix shell called Bash. It's also possible to run Unix shells on Windows by using a compatibility layer: either Windows Subsystem for Linux (displayed in the screenshot above), an official Windows tool available if you're using Windows 10 and have a 64-bit CPU, or Cygwin. (You can access our installation guides for these tools here and here.) If you're using Windows, we recommend you wait until the end of the lesson before installing one of these alternatives.

Despite the fact that Unix shells are very similar, there are still some differences between them. In order to ensure that what you learn here is as universally applicable as possible, we'll focus on learning portable commands, by following a set of standards for Unix-like operating systems called POSIX. This way, we'll actually be studying several shells simultaneously. We'll let you know when a command isn't POSIX-compliant.

Now that you know what the CLI is, let's talk about why you should learn it.




## Why Learn the Command Line?

"But, Dataquest," you say, "I've been doing just fine without a CLI. Why should I start using one now?"

Have you ever used a program that fit your needs almost perfectly, but there were a couple of simple features missing that you wished you could add? When you use programs in the CLI, you can do exactly that!

Here are a few more reasons why it's important to learn the CLI:

- It's a very popular technology, which means there is a lot of support.
- Employers value this skill.
- It's a very common tool in data science.
- It allows you to automate repetitive tasks.
- It has extremely powerful utilities for data processing/[data-driven programming](https://en.wikipedia.org/wiki/Data-driven_programming) like [AWK](https://en.wikipedia.org/wiki/AWK) and [sed](https://en.wikipedia.org/wiki/Sed).
- It's one of the best ways to use cloud services, which will be useful when you need more computing power than you can access locally. As an example, see the screenshot below, taken from the tutorial [Get Started with Deep Learning Using the AWS Deep Learning AMI](https://aws.amazon.com/blogs/machine-learning/get-started-with-deep-learning-using-the-aws-deep-learning-ami/). We can see a command line being used to set up AWS with Jupyter Notebook:

![image.png](attachment:9060ef08-fd86-411f-8f9c-07bac19f5c5c.png)

For a more in-depth look at why the CLI is important, we suggest you read [this](https://www.dataquest.io/blog/why-learn-the-command-line/) Dataquest blog post after you complete this lesson.

Remember that people were also doing just fine before the internet. The shell isn't just another tool; it's a productivity powerhouse that will make your life easier. Welcome to the revolution!

## The Filesystem

You've probably browsed through directories using a [file manager](https://en.wikipedia.org/wiki/File_manager) like Explorer in Windows, or [Finder](https://en.wikipedia.org/wiki/Finder_(software)) in OS X. The framework that allows us to browse like this is a filesystem. In this lesson, we're going to learn how to explore the [filesystem](https://en.wikipedia.org/wiki/File_system) from the command line.

A filesystem is, among other things, an organizational system for our files. It helps us keep things tidy by placing files together in groups called [directories](https://en.wikipedia.org/wiki/Directory_(computing)) (or folders). Directories can themselves be grouped together to form another directory, and a single directory can hold both files and other directories.

The elements of a group are said to be contained in the directory that encapsulates them. Each file or directory can be contained in at most one directory. When a directory contains another, we say that the contained directory is a subdirectory of the containing one, and that the containing directory is the parent directory of the contained one. When a directory isn't contained in any other directory, it is called a root.

This relationship of groups and containment is at the heart of what we call a [hierarchical directory structure](https://en.wikipedia.org/wiki/Directory_structure). Below, we can see an example diagram of one.

![image.png](attachment:46ff897e-4b5b-4e4d-900c-aa7e6af22196.png)

In this example, the following statements are true:

- date is contained in the directory bin.
- bus and sys are subdirectories of proc.
- usr is the parent directory of include.
- learn contains the files east and west.
- dq contains both files and directories.

## Working Directory

In the first lesson, when we defined the command prompt, we mentioned a "current working directory." The [current working directory](https://en.wikipedia.org/wiki/Working_directory) (or just working directory) is the directory where our terminal's session is located. Whenever a shell is running, it's located somewhere in the directory structure. 

In our case, the prompt shows us the current working directory, /home/dq. As we'll see below, this indicates that dq is a subdirectory of the home folder. Before we learn how to read the string of characters /home/dq, we need to learn a bit more about Unix directory structures.

Contrary to what happens in Windows, Unix-like operating systems don't have the concept of a "drive letter." In Windows, each drive letter represents a storage device (basically its own filesystem), and so it is common to have more than one root. In *nix systems there is only one root, called the [root directory](https://en.wikipedia.org/wiki/Root_directory), represented by a single slash: /. This is one of the characteristics of Unix-like systems' directory structures. There are [other conventions](https://en.wikipedia.org/wiki/Unix_filesystem#Conventional_directory_layout) whose details vary by implementation, but these implementations are very, very similar.

The diagram displayed on the previous screen is a graphical sample of a Linux directory structure. In fact, it's a sample of the filesystem we're using in this course. Below, we see a screenshot of Windows 10 File Explorer displaying the roots of a computer.

![image.png](attachment:8cfb8a19-169a-4883-a4d2-a0108a75248b.png)

Now we can fully understand /home/dq. The first slash represents the root directory. home is a subdirectory of the root directory — this relationship is translated into characters by /home. Similarly, dq is a directory contained in home, hence /home/dq.

Seen differently, /home/dq is also a [path](https://en.wikipedia.org/wiki/Path_(computing)), a sequence of symbols that specifies the location of a file or directory in the directory structure. You can also think of it as the way to go from / to the directory dq in the diagram we saw on the previous screen. The character /, when not at the beginning of a path, is called a directory separator.

In our case, our command prompt contains the current working directory. What if it didn't? How would we know where in the directory structure we are located?

There's a command to help us with that: pwd.
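As a small sketch of what this looks like in practice, the snippet below runs pwd and then stores its output in a variable. The variable name current_dir is ours and purely illustrative, and the path that gets printed depends on where your shell happens to be.

```shell
# Print the full path of the current working directory.
pwd

# The same value can be captured for later use.
# current_dir is an illustrative name, not something the shell defines for us.
current_dir=$(pwd)
echo "We are in: $current_dir"
```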

## Inspecting Directories

On the previous screen, you ran the command pwd. It's an acronym that stands for print working directory. It prints to the terminal window the path of the directory we're located in. By default, the directory we will be in when we open our terminal window will always be /home/<username>, where <username> is a placeholder for the user account of whoever happens to be logged in to the shell.

The above-mentioned directory is known as the home directory, and it's where people usually save their files. Your home directory is /home/dq. Note that the home directory is not the directory home.

Now that we know where we are, it's time to determine the contents of the directory we are currently in. There is a command that lists the contents of any directory that we can access; it is called ls (short for list), and it is one of the most commonly used commands. It also accepts options and arguments. Below, we see the result of running ls -p /dev in the terminal to the right.

![image.png](attachment:ad1edfbd-1191-43c1-bdc2-bee619fe58f6.png)

Let's look at this command:

- First comes the command ls.
- It is followed by the option -p. This option signals which items are directories by appending a slash to the end of their names, allowing us to distinguish directories from regular files.
- We end with /dev, which is an argument. This directory is the location of [special (or device) files](https://en.wikipedia.org/wiki/Device_file). We advise you not to touch it, lest you break your OS or worse, lest you break Dataquest!

We see that it's possible to list the content of directories in which we are not located. This isn't always the case because we might lack permissions to access certain directories.

When we're located in a folder whose contents we wish to list, the argument is optional. In the example above, if we were located in /dev, running ls -p would yield the same result.
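Here is a minimal sketch of both forms, with and without an argument. We use the root directory / as the argument because it exists on every Unix-like system; what gets listed will differ from machine to machine.

```shell
# List the contents of the root directory; -p appends a slash to directory names.
ls -p /

# With no argument, ls lists the current working directory instead.
ls -p
```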

Let's try this command.

## Inspecting Directories Thoroughly

Perhaps you know about the concept of a hidden file. Wikipedia defines a hidden file as a file that, by default, filesystem utilities don't display when showing a directory listing. We can, however, also list hidden files by using the -A option. Once displayed, you can tell that a file is a hidden file if its name starts with a full stop (.). This whole paragraph is also true if you replace the word file with the word directory, even in this sentence!
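To see the difference for ourselves, here is a sketch that builds a scratch directory containing one regular file and one hidden file. The setup lines use commands this lesson hasn't covered (mktemp, which is widely available but not part of POSIX, and touch, which creates empty files), and the names sandbox, visible.txt, and .hidden.txt are made up for the demonstration.

```shell
# Setup: a scratch directory with one visible and one hidden file.
sandbox=$(mktemp -d)
touch "$sandbox/visible.txt" "$sandbox/.hidden.txt"

# Plain ls skips names that start with a dot.
ls "$sandbox"

# -A also shows the hidden file.
ls -A "$sandbox"
```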

A file holds more data than its contents; it also has [metadata](https://en.wikipedia.org/wiki/Metadata). In this context, metadata is data about the file itself, not its contents. Examples of metadata are (the most recent) modification time and date of the file, its size, its name, and so on. When listing files, the -l option allows us to display some of the metadata. The l is for long, as it also lists the file in long format. Here is the output of running ls -lp /.

![image.png](attachment:6710731e-8ff0-4e25-8234-33181ffb6357.png)

The first row tells us the amount of disk space (in kilobytes) the folder uses. In this case, it's 2592. Then we have a table whose columns are, in order:

- File type and [file or folder permissions](https://en.wikipedia.org/wiki/File_system_permissions) (e.g., drwxr-xr-x)
- The number of [hard links](https://en.wikipedia.org/wiki/Hard_link) the file or folder has (e.g., 1)
- [Owner](http://www.gnu.org/software/libc/manual/html_node/File-Owner.html) of the file or folder (e.g., root)
- Group owner of the file or folder (e.g., root)
- The size of the file or folder in bytes (e.g., 4096)
- The modification date and time of the file or folder (e.g., Jan 24 16:03)
- The file or folder's name (e.g., bin)

Don't worry if you don't know all of these concepts. We will revisit the most interesting ones throughout the course. One useful thing to note is the first character in the permissions column of the table. Whenever we see a d there, it means it's a directory. As a result, when we use long format to list the contents of a directory, we don't need to use the -p option.

Another useful option for the ls command is -h. The h is for human readable and it formats the sizes in a more intelligible manner. For instance, instead of total 2592 it displays 2.6M, which means 2.6 megabytes.
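The options above can be combined, as this short sketch shows. One caveat: -h is supported by the GNU and BSD versions of ls, but as far as we know it is an extension rather than something older editions of POSIX require, so treat it as possibly non-portable.

```shell
# Long format: one row per entry, with the metadata columns described above.
ls -l /

# Same listing, with sizes shown in units such as K, M, and G.
ls -lh /

# Options can be stacked; -A adds hidden entries to the long listing.
ls -lAh /
```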

You may have noticed in the previous exercise that there is a directory called prize_winners in your home directory. We'll focus on it in the following exercise.

## Navigating the Filesystem

Now that we know how to inspect directories, it's time to learn how to navigate between them. To do so, we use the command cd, an acronym for "change directory." A common use of the cd command looks like cd [path], where [path] is a placeholder for a path. Let us revisit the following diagram of an example hierarchical directory structure.

![image.png](attachment:759990b4-9edb-4e4c-9894-8bb92e49824e.png)

Let's say our working directory is /home/dq. If we wish to change the working directory to /home/learn, we can run cd /home/learn/ and our working directory becomes /home/learn. Note that the slash after learn is optional; cd /home/learn would work just fine.

Disclaimer: the directory /home/learn does not exist in the provided filesystem. You will see it in the rest of this lesson merely as a demonstration.
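Since /home/learn may not exist on your machine, here is the same idea sketched with directories that we assume are present on practically every Unix-like system: the root directory / and /usr.

```shell
# Move to the root directory and confirm where we are.
cd /
pwd

# Move to /usr by giving its full path; the trailing slash is optional.
cd /usr/
pwd
```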

## Absolute and Relative Paths

Typing the full path every time we want to change directories is cumbersome. Fortunately, we don't have to type the whole path whenever we want to access a directory that is a child of the current working directory. Let's say our working directory is /home. If we wish to change to /home/dq, it suffices to run cd dq and the shell will know that we mean cd /home/dq.

We just discovered the concepts of absolute path and relative path. An absolute path is any path that starts with a slash (/). Any other path is called a relative path, because it is relative to the current working directory.

In the diagram below, the following statements are true:

- /bin/date is an absolute path.
- bin/diff is a relative path, and it's only valid if the working directory is /usr.
- fs is a relative path, and it's only valid if the working directory is /proc.

![image.png](attachment:1c82c43a-02ad-4e2a-9fe3-49ac70d6fa6c.png)

We can use relative paths with every command, not just with cd. In fact, every time we passed a filename as an argument to a command, we were using its relative path. Here's another example: suppose we're in /home, and we want to list the contents of dq. Instead of ls /home/dq, we can run ls dq.
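The sketch below replays this idea using / and /usr, which we assume exist on your system, in place of /home and /home/dq. The variable after_relative is an illustrative name of our own.

```shell
# Start from the root directory.
cd /

# Relative path: "usr" is resolved against the working directory, so this means /usr.
cd usr
after_relative=$(pwd)
echo "$after_relative"

# Relative paths work with other commands too: from /, this lists the contents of /usr.
cd /
ls usr
```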

## Shortcuts for Filesystem Navigation

In the previous exercise, we asked you to use a new option: -a. This option is very similar to -A, but not identical. Relative to the command ls -A /home/dq/prize_winners that you ran on a previous screen, the command you ran in the most recent exercise printed two additional entries: a single dot (.) and a double dot (..). If you look at the first character in their respective rows, you'll see a d. These are actually directories, and very special ones.

Informally, they mean "current directory" and "parent directory" respectively. This is true regardless of where in the directory structure you are! So cd .. will take you to the parent directory of your current directory, while cd . will just take you to where you already are. In the following example, we start in /home/learn and then do the following:

1. We move to the parent directory of our starting directory by running cd ..
2. We print the current directory.
3. We re-enter the learn directory.
4. We move up to the parent directory of the parent directory of /home/learn.

```
/home/learn$ cd ..
/home$ pwd
/home
/home$ cd learn
/home/learn$ cd ../..
/$
```

Running cd ~ will take you to your home directory, and so will running cd with no arguments.

cd ~ is a special case of the more general cd ~[username], which takes us to the home directory of the user we insert instead of [username].
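Here is a hedged, runnable variation on these shortcuts that uses /usr (assumed to exist) as the starting point instead of /home/learn. The variable parent_dir is an illustrative name; the last two pwd calls print whatever your own home directory is.

```shell
# Start in /usr and move up to its parent, which is the root directory.
cd /usr
cd ..
parent_dir=$(pwd)
echo "$parent_dir"

# cd . leaves us where we already are.
cd .

# Both of these go to the home directory of the current user.
cd ~
pwd
cd /usr
cd
pwd
```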

## Creating Directories

Earlier in this course, we learned about the CLI and how to use it to explore the filesystem. We will now learn how to make changes to the filesystem. Here are some of the things we'll learn:

- Creating directories
- Deleting directories and files
- Copying directories and files
- Moving and renaming directories and files

In this and in following lessons, we'll be using a fictitious user named learn. Let's list the contents of the home directory of this user (this won't be accessible in our terminal).

/home/learn$ ls

There are two files in the root of the home folder of learn. We can clean things up a bit by grouping the files east and west in a directory for these files and moving them there. Let's begin by learning how to create directories.

To create a directory, we use the command mkdir, an abbreviation of make directory. As with the other commands we learned, this command takes directory paths as parameters. Usually people create directories in the working directory. To create a directory called my_directory in the working directory, we run mkdir my_directory.

Now that we know how to create a directory, let's make one for east and west in /home/learn. We'll name it koasts.

/home/learn$ mkdir koasts

Let's confirm that a directory named koasts was created in the home folder of user learn:

![image.png](attachment:a1f31755-e436-4b93-8f18-a7166da07d5c.png)

In the diagram above, we can see the following:

- The result of running ls while in /home/learn (in the black background).
- The contents of the directory /home/learn after running the above command (in the white background).
  - Note that, indeed, koasts was created.

In this lesson, you'll see diagrams like the one above. The icons will always represent the state of the directory indicated on the bottom-right of the diagram, after running the commands.

We can create more than one directory at a time. To create three directories named dir2, dir3, and dir5, we can run the command mkdir dir2 dir3 dir5.

![image.png](attachment:4f982107-8d70-4ddc-bca7-b83f6020b652.png)
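If you'd like to try this without touching your real files, here is a sketch that works inside a scratch directory. The name sandbox is ours, and mktemp is a widely available tool that is not part of POSIX; we use it only to get a safe place to experiment.

```shell
# Setup: create a scratch directory and move into it.
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a single directory in the working directory.
mkdir my_directory

# Create several directories with one command.
mkdir dir2 dir3 dir5

# Confirm the result; -p marks each directory with a trailing slash.
ls -p
```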

The POSIX standards are relatively lenient regarding which [characters they allow](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_170) in directory and file names:

- Everything is allowed except for / and something called the [null character](https://en.wikipedia.org/wiki/Null_character).

It's also a good idea to avoid the characters >, <, |, :, &, ;, ?, and * because they have special meanings in the shell. For [fully portable](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_282) file names, you should stick to characters in the character range [a-zA-Z0-9._-].

Let's try out mkdir.

## Deleting Directories

In the example on the previous screen, we misspelled the word "coasts" as "koasts." We can fix this by renaming the directory koasts appropriately — a technique we'll learn later.

For now, we're going to work around this. Since this screen is called "Deleting Directories," we're going to delete koasts and create a new directory called coasts – otherwise it would be an awkward title. This works because koasts is empty, so we won't lose any data.

To delete empty directories, we can use the rmdir (remove directory) command (we'll learn about deleting non-empty directories later in the lesson). Running rmdir koasts deletes the directory created in the example on the previous screen. We see this in the diagram below.

![image.png](attachment:af3af14d-f202-442b-8c52-59f0c19bcdf4.png)

Just as mkdir can create more than one directory at a time, rmdir can delete several directories with only one command. The usage is similar to creating several directories: we simply pass the directories we wish to delete as arguments to rmdir, like so:

![image.png](attachment:fc67cdce-1d1d-44e1-80df-54fda04ea5e4.png)

## Failing to Delete Directories

On the previous screen, we mentioned that rmdir deletes empty directories. So what happens when we try to delete a non-empty directory with rmdir?

Recall that prize_winners in /home/dq has a hidden file, namely .mike (the hidden directories . and .. do not count).

Trying to delete a non-empty directory will result in an error, and nothing will happen.
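The following sketch reproduces both outcomes in a scratch directory. The names sandbox, full_dir, and .hidden_file are made up, mktemp is not part of POSIX, and touch (which creates an empty file) isn't covered in this lesson. The exact wording of the error message depends on your system.

```shell
# Setup: two empty directories and one directory that holds a hidden file.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir dir2 dir3 full_dir
touch full_dir/.hidden_file

# Empty directories can be removed, several at a time.
rmdir dir2 dir3

# A directory that holds even one hidden file is not empty, so rmdir refuses.
rmdir full_dir || echo "rmdir refused: full_dir is not empty"

# full_dir is still there.
ls -p
```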

Later in the lesson, we'll learn how to delete non-empty directories. For now, let's move on.

## Copying Files

Now that we have created a directory for our files, we can move on to copying them into it. To copy the files east and west to coasts, we can use the command cp. This command's usage typically looks like this: cp source_files destination.

We want to copy the source files east and west to the directory coasts. We do this by running cp east west coasts — the command interprets the last argument as the destination and not as a file to be copied.

As with the previous commands, this one takes paths as arguments. In the next example, we'll use relative paths — the names of the files and directories will suffice.

![image.png](attachment:cbf36757-124d-41ff-86e6-12a743689342.png)

We can see that the files east and west are now in the directory coasts, as well as in /home/learn, because they were copied. You might have also noticed that we switched the order of east and west when we ran the command. This is to demonstrate that the order of the source files doesn't matter to cp.

The parameter destination doesn't have to be a directory; it can be a filename. If we want to have a copy of the file west in /home/learn called california_love, we could run cp west california_love.
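Both uses of cp can be tried safely with the sketch below. It recreates east, west, and coasts as empty stand-ins inside a scratch directory; sandbox is an illustrative name, and mktemp and touch are setup tools not covered in this lesson.

```shell
# Setup: a scratch directory containing east, west, and an empty coasts directory.
sandbox=$(mktemp -d)
cd "$sandbox"
touch east west
mkdir coasts

# The last argument is the destination; the order of the sources doesn't matter.
cp west east coasts

# The destination can also be a new filename.
cp west california_love

# The originals are still here, and coasts now holds copies.
ls -p . coasts
```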

## Hidden Dangers of Copying Files

We should be careful when copying files because cp will silently overwrite files with the same name.

For instance, running cp east coasts would replace the contents of /home/learn/coasts/east with those of /home/learn/east, and this would be problematic if they happened to be different.

To protect ourselves against this, we can enter the interactive mode of cp with the option -i, as we can see in the .gif below:

![image.png](attachment:f64aa50b-3c14-487f-87b3-6f26016d52ad.png)
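To make the danger concrete, here is a sketch in which the two files named east start out with different contents. The words "new" and "old" are placeholder contents of our own. Writing text with echo and >, and displaying it with cat, are techniques this lesson doesn't cover; we use them only so the overwrite is visible.

```shell
# Setup: east in the working directory and a different east inside coasts.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir coasts
echo "new" > east
echo "old" > coasts/east

# No warning is shown: coasts/east is replaced by the contents of east.
cp east coasts
cat coasts/east

# Running "cp -i east coasts" instead would ask for confirmation before overwriting.
```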

## Copying Directories

Copying a directory isn't as simple as copying files. Let's try to make a copy of coasts called beaches by running cp coasts beaches.

![image.png](attachment:2480907d-445c-45c8-a21b-3b415dc04921.png)

The message cp: omitting directory ‘coasts’ is a bit cryptic, but we can see in the diagram that beaches wasn't even created. 

The error message is saying that, of all the sources passed to the command, cp is skipping the directory coasts. This directory was the only source, so nothing was copied; had we passed other sources, cp would still have tried to copy them.

In other words, our attempt at copying coasts didn't work. To successfully copy a directory, we need to use the -R option with cp. The R stands for recursive: cp descends into the directory's subdirectories, and into these subdirectories' subdirectories, and so on, copying everything. Here's how we can use it.

![image.png](attachment:2f4a5810-5153-4921-ba0e-a485a983084e.png)

Great success!

On the previous screen, we learned that we need to be careful when copying files so we don't overwrite them. Copying directories has a different behavior. Now that beaches exists, we're going to run the same command that we just ran.

![image.png](attachment:cd056b5b-8fd8-460e-ad4b-003a4f8582ca.png)

It created a copy of coasts inside beaches, leaving everything else untouched. This is the behavior we also experience in graphical user interfaces.
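The whole sequence from this screen can be replayed in a scratch directory with the sketch below. The setup names and tools (sandbox, mktemp, touch) are our own additions, and the exact text of the error from the first cp depends on your version of cp.

```shell
# Setup: coasts with two files inside.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir coasts
touch coasts/east coasts/west

# Without -R, cp skips the directory and beaches is not created.
cp coasts beaches || echo "cp did not copy the directory"

# beaches doesn't exist yet, so it is created as a copy of coasts.
cp -R coasts beaches

# beaches exists now, so this places a copy of coasts inside it.
cp -R coasts beaches

ls -p beaches
```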

## Deleting Directories and Files

Now we'll learn how to delete files. Contrary to what happens in graphical user interfaces, recovering deleted files or directories isn't a simple task. We should always be very careful with what we delete.

To delete a file called my_file in the working directory, we use the command rm (for remove) like so: rm my_file.

The behavior of rm is very similar to that of the other commands we learned in this lesson. It also has an interactive mode, just like cp, which is accessible through the -i option. If you're not going to use the -i option when deleting files, you should make sure you're deleting the correct files.

Here's how we can delete the files east and west in the home directory:

![image.png](attachment:7b560946-3e29-4350-8103-acc8b7a7d50d.png)

Naively trying to delete a directory with this command will result in an error. As you can see below, we couldn't get rid of beaches.

![image.png](attachment:ff0c4f86-0fdc-405f-a19e-136694387c74.png)

To delete directories, empty or not, we can use this very same command with the option -R. The behavior is similar to what we have already seen on the previous screen with cp. To delete a directory called my_directory in the current working directory, run rm -R my_directory. Let's look at an example:

![image.png](attachment:f834c5db-61e5-4dba-8ea8-701d43369390.png)
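Because deletions are hard to undo, it's worth practicing in a throwaway location first. The sketch below does so in a scratch directory; sandbox, mktemp, and touch are setup details of our own, and the wording of the error from rm beaches varies between systems.

```shell
# Setup: two files and a non-empty directory.
sandbox=$(mktemp -d)
cd "$sandbox"
touch east west
mkdir beaches
touch beaches/east

# Delete two files with one command.
rm east west

# Without -R, rm refuses to delete a directory.
rm beaches || echo "rm did not delete the directory"

# With -R, the directory and everything inside it are deleted.
rm -R beaches

# Nothing is left, not even hidden files.
ls -A
```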

## Moving and Renaming Directories and Files

On this screen, we'll learn about moving files and directories, as well as renaming them. When moving a file, the original file is deleted, and a copy is placed in the location that the user specifies in the command. This new file may or may not have the same name.

The command we'll use to move files is mv. Assuming our current directory is /home/learn, to move the files coasts/east and coasts/west back to the home directory, we can run mv coasts/east coasts/west /home/learn. We are, however, going to perform this action in a slightly different manner.

![image.png](attachment:42d434af-a489-43b0-959b-669f93560e3d.png)

Note the destination directory (./). In the first lesson, we learned about the directories . and .. (dot and dot-dot), which refer to the current directory and to the parent directory of the current directory, respectively. This is an example of how referring to our current directory can be useful. The slash is optional; running mv coasts/east coasts/west . would accomplish the same thing.

Contrary to what happens with cp and rm, mv doesn't require a recursive flag to move directories; it works without it. However, similarly to what happens with the aforementioned commands, mv has an interactive mode that is accessible by using the -i option. This is because, just like cp, mv will replace files with the same name. For example, if we move a file named my_file to a directory where there already is a file called my_file, the latter will be replaced by the former.

Unix shells don't have a built-in rename command. The action of renaming a file or directory is realized by using mv. Let's say we want to capitalize the first letter of the name of the file east in /home/learn. Here's how we accomplish this.

![image.png](attachment:aa24ea1c-6877-4ccc-b86a-26ab7ee83424.png)
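Here is a sketch of moving and renaming in a scratch directory. The directory name shores is hypothetical and only there to show that mv handles directories without an extra option; sandbox, mktemp, and touch are setup details not covered in this lesson.

```shell
# Setup: coasts with two files inside.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir coasts
touch coasts/east coasts/west

# Move both files into the current directory (.).
mv coasts/east coasts/west .

# Renaming is just moving to a new name.
mv east East

# Directories are moved or renamed the same way, with no recursive option needed.
mv coasts shores

ls -p
```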

Great work so far! In the next lesson, we're going to learn the following:

- Glob patterns and wildcards
- How to use wildcards
- How to search for files from the command line

## Wildcards and Globbing Patterns

We've been using files and their names in the other lessons in this course. With what we've learned so far, the only way to copy, for example, one hundred files in the command line is to pass each of the files' names as arguments to cp.

It seems like graphical user interfaces have a definite advantage in this regard, except they don't! In fact, the command line is usually much more powerful for this sort of task.

The shell gives us a way to specify groups of files by creating patterns to match filenames. In this lesson, we will learn about this feature.



The patterns we create to match filenames are [glob patterns](https://en.wikipedia.org/wiki/Glob_%28programming%29). They work in a similar way to regular expressions, which you learned earlier, only the characters and their roles are a bit different. We build glob patterns from special characters called [wildcards](https://en.wikipedia.org/wiki/Wildcard_character) and from regular characters.

Before we see how all of this works in the shell, let's see it on Google's search engine! In the .gif below, we look for exact matches of the sequence of words Largest * in the world. The character * is one example of a wildcard; it acts as a placeholder for any word. In fact, it is a placeholder for any number of characters, including zero characters and spaces, which means it can also match several words.

![image.png](attachment:a61c3396-5ec9-4046-b275-cbce8285c59b.png)

We see in the first few result previews, in bold, that * matched the words countries, epic, country, and city. In the shell, the behavior of * is almost the same: it will match any character, any number of times, except for leading dots (.). To draw a parallel with regular expressions, * behaves like the regex pattern .*, although there are some differences.

## The * Wildcard

We will see * in action with the ls command. We already learned that ls allows us to inspect the contents of directories.

Let's continue with the example from the previous lesson. Here's the state of the directory /home/learn as we left it:

![image.png](attachment:f5fd7105-3053-4a2a-89eb-6716797ee1ab.png)

Recall that ls also accepts filenames as parameters. Because both East and west are in /home/learn, passing them to ls makes the shell print the names of the files to the screen.

Earlier in this lesson, we mentioned that * will match any character, any number of times, except for leading dots (.).

Passing * as an argument to ls will cause it to list all non-hidden files and directories in the working directory, plus all files at the root of the listed directories. Let's see what this looks like:

/home/learn$ ls *

Here's what happens in the background:

1. For each file/directory, the shell checks to see if * matches its name. As we've learned, it will match every name except for the names of any hidden files.
2. The names that are matched are passed as parameters to ls. Since the matched names are coasts, East, and west, running ls * in this instance is the same as running ls coasts East west.
3. Therefore, the contents of coasts are listed (it happens to be empty), in addition to the contents of the working directory.
4. To facilitate reading, the results are [pretty-printed](https://en.wikipedia.org/wiki/Prettyprint).
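The steps above can be sketched in a throwaway directory. This is a minimal sketch, not the lesson's actual environment: the temporary directory and the file named .hidden are assumptions made for illustration. It uses echo, which simply prints its arguments, to reveal what the shell substitutes for * before any command runs.

```shell
# Hypothetical sandbox that mimics /home/learn (names chosen for illustration).
demo=$(mktemp -d)
cd "$demo"
mkdir coasts
touch East west .hidden

# echo prints its arguments, so this shows the names the shell puts in place of *.
# The hidden file is skipped; the order of the names depends on the locale.
expanded=$(echo *)
echo "$expanded"

# ls receives exactly those names, so both commands produce the same listing.
via_glob=$(ls *)
via_names=$(ls coasts East west)
echo "$via_glob"
```

Because the shell does the matching, ls never sees the * character at all in this case.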


We can also use wildcards in conjunction with other characters to form more complex patterns, just like regular expressions. We do this by [concatenating](https://en.wikipedia.org/wiki/Concatenation) wildcards with other characters in a way that fits our intent.

Let's look at an example. If we want to list all the files (or directory content) in /home/learn with names ending in st, we can do this by running ls *st.

/home/learn$ ls *st

The pattern we're using to match filenames is *st. It's the concatenation of the wildcard * with st. * matches both Ea (in East) and we (in west), and st matches st in both filenames. Nothing else is matched because no other file or directory names end with st.
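Here is a small sketch of the same idea in a scratch directory (the temporary directory is an assumption; the file names come from the example above). We use echo instead of ls so that only the matched names are printed.

```shell
# Hypothetical sandbox with the same names as /home/learn.
demo=$(mktemp -d)
cd "$demo"
mkdir coasts
touch East west

# * stands for "Ea" and "we"; the literal st must close the name.
ends_in_st=$(echo *st)
echo "$ends_in_st"

# The wildcard can sit anywhere in the pattern: names that start with w.
starts_with_w=$(echo w*)
echo "$starts_with_w"
```

Note that coasts is not matched by *st, because its name ends with ts.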

## The ? Wildcard

We mentioned that the wildcard * behaves like the regex pattern .*, despite being different in some cases. In the regular expressions lesson, you learned that . by itself matches (almost) any character exactly once, while appending * makes it match (almost) any character any number of times.

There's also a wildcard equivalent of . – again, with some differences. It is the character ?.

The wildcard ? matches any character exactly once. For example, if we use the pattern ?its, it will match any filename that is four characters long and ends with its. Names that match include @its, fits, and hits, as well as many others.

Here is another example with ls:

/home/learn$ ls w??t

We used the pattern w??t, which matches any four-character word that starts with w and ends with t. Because west is the only file/directory with a name that conforms to these rules, the output is west.
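The sketch below contrasts ? with *. It assumes a scratch directory that holds the same three names as the example; the directory itself is hypothetical.

```shell
# Hypothetical sandbox with the same names as /home/learn.
demo=$(mktemp -d)
cd "$demo"
mkdir coasts
touch East west

# Each ? stands for exactly one character, so ???? means "any four-character name".
four_chars=$(echo ????)
echo "$four_chars"

# w??t: starts with w, ends with t, exactly two characters in between.
w_pattern=$(echo w??t)
echo "$w_pattern"
```

Unlike *, the ? wildcard fixes the length of the match, which is why coasts (six characters) is excluded from ????.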

## Escaping Characters

You may wonder what happens when a filename uses the character ? (or *, for that matter) in its name.

As with regular expressions, we can use a character's literal meaning (as opposed to its special meaning) by "escaping" it with a backslash (\). In these circumstances, a backslash is an [escape character](https://en.wikipedia.org/wiki/Escape_character).

In the next code snippet, we'll create a directory with a name that contains a question mark, and we'll achieve this by escaping the wildcard ?.

![image.png](attachment:6cd05413-7eae-4e7d-bedc-5e8f1971141e.png)

We escaped the question mark in the command mkdir coast\? so that the shell would interpret it literally.

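Suppose we want to list the files and directory contents with names that are six characters long, start with coast, and end with a question mark. Running ls coast? will not do this: because the unescaped ? matches any single character, the pattern matches both coasts and coast?.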

/home/learn$ ls coast?

If we want it to only match coast?, we need to escape ?. Here's one way to do it:

/home/learn$ ls coast\?

This will result in no output since coast? is an empty directory.

We can also escape special characters by using single quotes around the word we're trying to escape. So, we could also do what we did above with mkdir 'coast?' and ls 'coast?'. Escaping * works similarly.
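The following sketch compares the unescaped pattern with both ways of escaping. It assumes a scratch directory, and it uses ls -d (which lists a directory's name rather than its contents) so that there is something to print even though the directories are empty.

```shell
# Hypothetical sandbox; 'coast?' is a directory whose name really ends in a question mark.
demo=$(mktemp -d)
cd "$demo"
mkdir coasts 'coast?'

# Unescaped: ? is a wildcard, so both directory names match.
unescaped=$(echo coast?)
echo "$unescaped"

# Escaped with a backslash: only the name with a literal question mark.
backslash=$(ls -d coast\?)
echo "$backslash"

# Escaped with single quotes: same result.
quoted=$(ls -d 'coast?')
echo "$quoted"
```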

Potential problems like this are part of the reason why we learned in the previous lesson that certain characters — although allowed in file and directory names — are worth avoiding.

## The Wildcard []

We also have a wildcard that allows us to match specific characters, as opposed to any character. We call it the square brackets wildcard. In order to match either a, i, or u, we can use the wildcard [aiu]. This will match only one occurrence, just like ?.

If we want to list all files or directory content with names that start with either a, i, or u, we can do so by running ls [aiu]*. The wildcard [aiu] will ensure that the name of the file/directory starts with one of the listed vowels, while * will match everything that follows.

We must pass the characters we wish to match into [] without separators, because a character inside [] generally takes its literal meaning. (There are exceptions for specific sequences of characters; we'll let you know about them when we learn about variables in the shell in a later course.)

Let's see an example. We're going to list all directory content and files with names that satisfy the following requirements:

- First letter must be a vowel.
- Is four characters long.
- Last letter must be s, t, or u.

[aeiouAEIOU]??[stu]

Let's examine this pattern:

- First letter must be a vowel: begin the pattern with [aeiouAEIOU].
- Is four characters long: fill the two middle positions with ??, so that the whole pattern matches exactly four characters.
- Last letter must be s, t, or u: end the pattern with [stu].

We need to remember that lowercase letters are different characters than uppercase letters when we're creating patterns.
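To see the pattern at work, here is a minimal sketch in a scratch directory. The file names oats and ease are hypothetical additions chosen to exercise each requirement.

```shell
# Hypothetical sandbox; oats and ease are made-up names for illustration.
demo=$(mktemp -d)
cd "$demo"
touch East west oats ease

# Vowel (either case), two arbitrary characters, then s, t, or u.
# west fails on the first letter; ease fails on the last letter.
any_case=$(echo [aeiouAEIOU]??[stu])
echo "$any_case"

# Listing only lowercase vowels excludes East, because E and e are different characters.
lower_only=$(echo [aeiou]??[stu])
echo "$lower_only"
```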

The behavior of [] is similar in regular expressions. In regular expressions, the pattern [^aiu] matches any character that isn't a, i, or u. Wildcards have a similar power, only it's ! instead of ^. So, for example, the pattern Data[!qQ]uest will match any variation of the pattern Data?uest, in which the fifth character is not q and is not Q.
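A short sketch of the negated form follows. The three file names are hypothetical and exist only to show which names the pattern accepts and rejects.

```shell
# Hypothetical sandbox with made-up variations of the name Dataquest.
demo=$(mktemp -d)
cd "$demo"
touch Dataquest Dataguest Data7uest

# [!qQ] matches exactly one character that is neither q nor Q.
not_q=$(echo Data[!qQ]uest)
echo "$not_q"
```

Dataquest is left out because its fifth character is q; the other two names are matched.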

We should also remember what happens when we use a pattern that doesn't match anything together with ls. Consider the pattern ???.

Since there aren't any files or directories in the working directory with names that are three characters long, ??? didn't match any filename, so ??? was passed as an argument to ls without its special meaning. This led ls to try to list a file/directory that was literally called ???, hence the error message.
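We can observe this with echo in a scratch directory. This sketch assumes the default behavior of sh and bash; other shells, or bash with certain options enabled, may handle unmatched patterns differently.

```shell
# Hypothetical sandbox in which no name is three characters long.
demo=$(mktemp -d)
cd "$demo"
touch East west

# Nothing matches, so the shell leaves the pattern as it is and echo prints it literally.
unmatched=$(echo ???)
echo "$unmatched"

# ls receives the literal string ??? and fails, because no such file exists.
ls ??? || ls_failed=yes
echo "ls failed: $ls_failed"
```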

## Other Wildcards

When using regular expressions, we can refer to sets of characters instead of listing them one by one. The shell's glob patterns have the same functionality. We can use character ranges like [a-z], [A-Z] and [0-9].

We can also use character classes like [:alpha:] (the usual letters), [:digit:] (the numbers 0 through 9), [:lower:] (lowercase letters), [:upper:] (uppercase letters), and [:alnum:] (letters and numbers). You can read more about character classes [here](https://www.gnu.org/software/grep/manual/html_node/Character-Classes-and-Bracket-Expressions.html).

Character ranges and character classes are not square bracket wildcards. They are wildcards just like ? or *, although we must use them inside square brackets; otherwise, they will be interpreted literally. Here are a couple of usage examples:

- To list all files (and the content of directories) in the working directory with names that end in ., directly followed by three lowercase letters, we can run ls *.[[:lower:]][[:lower:]][[:lower:]].
- To list all files (and the content of directories) in the working directory with names that do not start with an uppercase letter and end with a number, we can run ls [![:upper:]]*[[:digit:]].
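Both usage examples can be tried in a scratch directory. In this sketch every file name is hypothetical, and echo is used in place of ls so that only the matched names are shown.

```shell
# Hypothetical sandbox with made-up file names.
demo=$(mktemp -d)
cd "$demo"
touch notes.txt data.csv archive.tar.gz README report2 Report3

# Names ending in a dot followed by exactly three lowercase letters.
# archive.tar.gz is excluded because only two letters follow its last dot.
three_letter_ext=$(echo *.[[:lower:]][[:lower:]][[:lower:]])
echo "$three_letter_ext"

# Names that do not start with an uppercase letter and that end with a digit.
lower_start_digit_end=$(echo [![:upper:]]*[[:digit:]])
echo "$lower_start_digit_end"
```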

We don't recommend using character ranges, and we should also be careful with character classes. Here's an example that illustrates why:

![image.png](attachment:b133ebcd-ca4b-4123-af2b-782558ea2e53.png)

Trying to capture the contents of a and z by using the character range [a-z] yields some counterintuitive results:

/home/learn$ ls [a-z]

It outputs A (which was unexpected) and not Z (which was expected but odd given that A was printed). This happened because of something called locale.

The POSIX documentation defines [locale](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_01) as "a definition of the subset of a user's environment that depends on language and cultural conventions." This refers to, among other things, settings like keyboard layout, date and time formats, currency, and also the order of characters.

Z wasn't output and A was because, on this platform, the locale settings are such that A is in the range [a-z] and Z isn't. (This happens, for example, when the locale orders letters as a, A, b, B, and so on up to z, Z, so that Z falls after z.) We'll have a chance to see this happen again when we learn about the command sort.

A safer bet is to use character classes:

/home/learn$ ls [[:lower:]]

But even those aren't without problems unless the system uses a [POSIX-compliant locale](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_01). This can become difficult as we get further and further away from American standards.
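One way to observe the effect of locale is to force the POSIX locale, named C, for the current session. This is only a sketch under that assumption, not something this lesson's platform does: LC_ALL is the standard environment variable that overrides all locale settings, and the file names are hypothetical.

```shell
# Hypothetical sandbox with two lowercase and two uppercase names.
demo=$(mktemp -d)
cd "$demo"
touch a B y Z

# Assumption for this sketch: switch to the C (POSIX) locale, where
# characters are ordered by their byte values.
LC_ALL=C
export LC_ALL

# In the C locale, the range covers the lowercase ASCII letters only.
range=$(echo [a-z])
echo "$range"

# The character class gives the same answer here.
class=$(echo [[:lower:]])
echo "$class"
```

In other locales the range may also pick up names like B, which is the kind of surprise described above.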

Don't worry if some of the character ranges we saw at the beginning of the screen don't work on your computer; that's probably a consequence of the locale.

## Summary and Practice

We've only used wildcards with the ls command, but they work with most commands that we've learned so far: cp, mv, rm, and rmdir, to name a few.

We reiterate that you should be very careful when using wildcards with commands like rm, cp and mv because they can have disastrous consequences. Before using filesystem-altering commands with wildcards, make sure they'll work as you intend by using them with ls first.
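The "preview with ls first" habit might look like the sketch below. The scratch directory and the .tmp file names are hypothetical.

```shell
# Hypothetical sandbox with made-up file names.
demo=$(mktemp -d)
cd "$demo"
touch report1.tmp report2.tmp report.txt

# Step 1: preview exactly which names the pattern matches.
ls *.tmp

# Step 2: only after checking the preview, reuse the same pattern with rm.
rm *.tmp

# report.txt survives because it never matched *.tmp.
remaining=$(ls)
echo "$remaining"
```

If the preview had listed a file we wanted to keep, we could have adjusted the pattern before anything was deleted.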

Here is a table that summarizes and exemplifies some of what we've learned about wildcards.

![image.png](attachment:f2a061d9-6962-4086-96c8-71b9d7ec6221.png)

## Searching for Files

One basic functionality of using the command line to interact with the filesystem that we haven't explored yet is searching for files.

We can search for files by using the find command. This command is extremely powerful. In addition to letting us search for files by name, it allows us to search for files by criteria such as the following:

- Last time the file was accessed
- Last time the file was modified
- What type of file it is (directory, regular file, special file, etc.)

There are many other criteria as well. It also allows us to perform actions on the files that are found. On this screen, we'll only learn how to use find to search by name, but it's good to know how powerful this tool can be.

A simplified usage of this command looks like find [location] -name ['filename']. Let's examine this:

- find refers to the command itself.
- location refers to the directory in which we'll perform the search. The search will also include all subdirectories of this directory.
- -name tells find that the criterion we're using is the filename.
- 'filename' is the name of the file that we're searching for.

Before we look at an example, recall what the directory /home/learn looks like at the moment:

![image.png](attachment:8ab8098b-a02b-455f-8006-3864d6981b69.png)

Let's try using the find command to locate all the files named East.

/home/dq/learn$ find / -name 'East'

Recall that / indicates the root of the filesystem; this ensures that we search the entire system.

The output is quite confusing, so let's interpret it. The results we're interested in are in the lines that don't end with "Permission denied". Those lines are the paths of all files named East. The other lines are simply directories that we can't access due to lack of permissions, hence the error messages.

When searching for files, a GUI will usually ignore the case of the characters in the name. We can do the same with find by using -iname instead of -name. Running find / -iname 'east' would find all of the results we received above, and perhaps more (if there happen to be any files named east, for example).
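To compare -name and -iname without scanning the whole system, we can search a small scratch directory instead of /. The directory layout in this sketch is hypothetical.

```shell
# Hypothetical sandbox: a learn directory that contains a coasts subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/learn/coasts"
touch "$demo/learn/East" "$demo/learn/west" "$demo/learn/coasts/east_notes"

# -name is case-sensitive: only a file called exactly East is reported.
exact=$(find "$demo" -name 'East')
echo "$exact"

# A lowercase search term finds nothing with -name...
wrong_case=$(find "$demo" -name 'east')
echo "matches for east: $wrong_case"

# ...but -iname ignores case and finds East again.
any_case=$(find "$demo" -iname 'east')
echo "$any_case"
```

Note that east_notes is never reported: without wildcards, the whole name has to match, not just a part of it.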

The reason why we waited until now to learn about find is that we now know about glob patterns! And we can use them to full effect with find.

In the following example, we will use a glob pattern to search for any files or directories in /home with names that are two characters long and end with q:

/home/learn$ find /home -name '?q'

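You may have noticed that we used single quotes throughout the examples. Quoting prevents the shell from expanding the pattern itself before find runs, so find receives the pattern unchanged and can match it against every file it visits. To ensure that find behaves as we expect it to with regard to wildcards, we should use single quotes.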
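The sketch below shows what can go wrong without the quotes. It assumes a scratch directory with a hypothetical file named aq in the working directory, next to a home directory that we want to search.

```shell
# Hypothetical sandbox: home/dq/learn plus a stray file named aq in the working directory.
demo=$(mktemp -d)
cd "$demo"
mkdir -p home/dq/learn
touch aq

# Quoted: find receives the pattern ?q and matches it against every name under home.
quoted=$(find home -name '?q')
echo "$quoted"

# Unquoted: the shell first replaces ?q with aq (the only match in the working
# directory), so find looks for a file named aq under home and finds nothing.
unquoted=$(find home -name ?q)
echo "unquoted result: $unquoted"
```

The unquoted command can appear to work when nothing in the working directory matches the pattern, which makes this mistake easy to miss.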

In this lesson, we modified the functionality of find to not output the error messages, so don't think it's odd that you don't see them when you solve this exercise.

We're almost done with the first course! In the next lesson we'll learn about the following:

- Multiuser systems
- Users
- Permissions
- Ownership