In this file, we will learn:

* What the command line is
* Why it is important
* The basics of the command line

Most people interact with computers exclusively through graphical user interfaces (GUIs). A GUI is a user interface that lets users interact with a computer through icons and pointing devices, such as a mouse or a finger on a touchscreen.

Historically, when computers were expensive, gigantic machines owned only by large institutions and shared by many people, they were operated through command-line interfaces (CLIs). A CLI is a text-only interface through which users interact with a computer by typing instructions, in a specific syntax, into a [console](http://www.linfo.org/console.html) (or terminal).

As technology evolved and computers became ubiquitous, terminals were emulated within GUIs, giving rise to [terminal emulators](https://en.wikipedia.org/wiki/Terminal_emulator) (also terminal window or just terminal).

Instructions sent through the terminal are called **commands**. Once entered, they are [interpreted](https://en.wikipedia.org/wiki/Interpreter_(computing)) by a type of program called a [shell](https://en.wikipedia.org/wiki/Shell_(computing)), or **command language interpreter**, and then run by our machine. Some of the most popular shells are Bash, Z shell, KornShell, Command Prompt and Windows PowerShell. If we have installed Anaconda on Windows, we might have noticed that one of its programs is Anaconda Prompt, which is another shell. It's basically the Command Prompt with a few settings on top.

Even though they do not all mean the same thing, the terms command-line interface, command language interpreter, shell, console, terminal window and other variants are often used interchangeably.

The most common operating systems for laptops and desktop computers nowadays are Linux, macOS (formerly OS X) and Windows, and they all come equipped with a terminal. Linux and macOS are [Unix-like](https://en.wikipedia.org/wiki/Unix-like) (or *nix) operating systems, that is, they behave like Unix (an old operating system that is still in use), and this similarity carries over to their command language interpreters. Windows' shells, by contrast, were developed independently and are very different from the others.

Because it's more common to use Unix-like operating systems in data science, we'll be using a Unix shell, namely Bash.

It is also possible to run Unix shells on Windows by using a [compatibility layer](https://en.wikipedia.org/wiki/Compatibility_layer) such as [Windows Subsystem for Linux](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux), an official Windows tool available on Windows 10 with a 64-bit CPU, or [Cygwin](https://en.wikipedia.org/wiki/Cygwin). Install guides for these tools are available [here](https://www.dataquest.io/blog/tutorial-install-linux-on-windows-wsl/) and [here](https://cygwin.com/install.html), respectively.

Despite the fact that Unix shells are very similar, there are still some differences between them. In order to ensure that what we learn here is as universally applicable as possible, we'll be focusing on learning [portable](https://en.wikipedia.org/wiki/Software_portability) commands, by following a set of standards for Unix-like operating systems called [POSIX](https://en.wikipedia.org/wiki/POSIX) (pronounced pahz-icks as in positive). This way, we'll actually be studying several shells simultaneously, while exemplifying with Bash. We'll let you know when a command isn't POSIX compliant.

Now that we know what the CLI is, it's time to understand why we should learn it.

Have you ever used a program that fit your needs almost perfectly, but there were a couple of simple features missing that you wished you could add? When using programs in the CLI, we are able to accommodate those needs!

As we become more experienced with it, we'll often prefer to build our own tools to solve the problems we have by combining commands, the shell's building blocks.

Here are a few more reasons why it is important to learn the CLI:

* It is a very popular technology, which means support abounds.
* It is valued by employers.
* As was already mentioned, it is a very common tool in data science.
* It allows us to automate repetitive tasks.
* It has extremely powerful utilities for data processing/[data-driven programming](https://en.wikipedia.org/wiki/Data-driven_programming) like [AWK](https://en.wikipedia.org/wiki/AWK) and [sed](https://en.wikipedia.org/wiki/Sed).
* It is one of the best ways to use cloud services, which will be useful when we need more computing power than we can access locally.

# Let's get started

The text **\$** indicates that the CLI is ready to accept input. To give it some input, we can click just to the right of $ for the terminal to get focus, which is indicated by a blinking cursor. Once it gets focus, the terminal will be ready to receive input from the keyboard. We can then type a command and hit "Enter" to tell the shell to run it.

Type **date** into the terminal and press Enter.
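As a minimal sketch, the line below is exactly what we would type after the prompt. The output depends on the current time, the time zone and the system's language settings, so none is shown here.

```shell
# Print the current date and time.
date
```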

The piece of text **/home/waqas\$** is called the command prompt, or just the prompt, because it prompts the user to enter a command. The command prompt is customizable and can vary in appearance, depending on the settings. Typically it ends with $. To the left of that, there can be anything from nothing at all to variations of **username/machinename/directory**, in which username, machinename and directory are placeholders for, respectively, the user, the name of the computer where the shell is running, and the current working directory.

It's important to note that the command prompt is provided automatically by the terminal; we do not have to type it in ourselves. In fact, we shouldn't do that.

The screen area to the right of the prompt is called a command line and it is where we, as users, enter commands.

If we type some random letters and hit Enter, we will most likely not execute a valid command and we will get an error message.

Run an invalid command, such as **alsoNotRandom**.
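The sketch below shows one way to observe the failure, assuming no program named alsoNotRandom exists on our machine. The wording of the error message varies between shells and systems, but POSIX shells report exit status 127 when a command cannot be found. The `||` used here is not covered in this file: it runs the part on its right only if the command on its left fails.

```shell
# alsoNotRandom is assumed not to exist, so the shell cannot find it
# and prints an error message.
# $? holds the exit status of the command that just failed.
alsoNotRandom || echo "exit status: $?"
```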

Every item in a command is case sensitive. For instance, running **Date** is not the same as running **date**.

A command is made up of a **utility** and its **parameters**. The **utility** (also **command** or **program**) is the first item in the instruction. By definition, a parameter is either an **option** or an **argument**. An option is a string of symbols that modifies the behavior of the command, and it always starts with a dash (-). Other possible names for it are **flag** and **switch**, although depending on who we ask these terms might not be fully interchangeable. An **argument**, or **operand**, is an object upon which the command acts.
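To make these terms concrete, here is a small sketch built on the date utility we ran earlier. The -u option and the +format argument are both part of the POSIX specification of date. The particular format string is just an illustration.

```shell
# Utility only: print the local date and time.
date

# Utility + option: -u switches the output to UTC.
date -u

# Utility + option + argument: +%Y-%m-%d asks for year-month-day only.
date -u +%Y-%m-%d
```

In the last line, date is the utility, -u is an option because it starts with a dash, and +%Y-%m-%d is an argument.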

Because these are relatively small files, it's easy to eyeball their differences. If that weren't the case, we could find their differences with the **diff** command. Running **diff -y west east** prints the contents of both files side by side and marks each row that differs with a **|** between the two columns.
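The contents of west and east are not reproduced in this text, so the sketch below creates two hypothetical three-line files to try the command on. It overwrites any files named west and east in the current directory, so it is best run in a scratch directory. Note that -y is widely supported (for example by GNU diff) but is not one of the options POSIX defines for diff.

```shell
# Hypothetical sample files; > sends printf's output into a file,
# overwriting any existing file with that name.
printf 'Alice\nBob\nCarol\n' > west
printf 'Alice\nBen\nCarol\n' > east

# Compare the files side by side. Only the second row differs,
# so only that row is marked with |.
# diff signals "the files differ" through a non-zero exit status;
# "|| true" keeps that from stopping a script.
diff -y west east || true
```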

Let us dissect this command:

* diff is the command name.
* The command has three parameters:
  * -y is an option, as it starts with a dash.
  * west is an argument, as it is one of the objects upon which diff acts.
  * east is the last parameter and the second argument.

We may wonder how we can tell what options are available for each command. As with Python, there is documentation for the shell. Before we learn how to access it and read it, we're going to need some more knowledge, so we'll leave it for later.

For now, let's look at an excerpt taken from the diff documentation.

![image.png](attachment:image.png)

Note the synopsis section. It is illustrative of the most common appearance of a command:

command -options arguments

As we may have noticed, the documentation mentions short options and long options. So what are these? An option is said to be a long option if it starts with --; otherwise it is a short option. Most of the time, we can choose between the short form and the long form of an option, and the output is the same either way.

The documentation tells us that we can use --side-by-side in place of -y. Thus, **diff -y west east** and **diff --side-by-side west east** yield the same result. The only real advantage of long options is that they are usually more descriptive, which makes them:

* Easier to remember
* Easier to guess what they do
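One way to check that the two forms behave identically is to capture both outputs and compare them. This sketch again uses hypothetical sample files, and it relies on two shell features not covered in this file: `$(...)` to capture a command's output in a variable, and `[ ... ]` to compare two strings.

```shell
# Hypothetical sample files (the originals are not shown in this text).
printf 'Alice\nBob\nCarol\n' > west
printf 'Alice\nBen\nCarol\n' > east

# Capture the output of the short and the long form.
# "|| true" absorbs diff's non-zero exit status for differing files.
short_form=$(diff -y west east) || true
long_form=$(diff --side-by-side west east) || true

# Compare the two captured outputs as plain strings.
if [ "$short_form" = "$long_form" ]; then
    echo "identical output"
fi
```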

A non-obvious disadvantage of long options is that they are not POSIX-compliant, as can be seen in one of the guidelines [here](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html):

* Each option name should be a single alphanumeric character (...)
* Multi-digit options should not be allowed.

Getting back to the diff documentation excerpt, we might have noticed that one of the options is -q, which reports only whether the files differ. This is useful when we care only that the files are different and not what their differences are (especially if the files are very large), as we won't want our terminal window filled with uninformative text.
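A short sketch of -q in action, using hypothetical sample files. For two differing files named west and east, the command prints the single line Files west and east differ.

```shell
# Hypothetical sample files (the originals are not shown in this text).
printf 'Alice\nBob\nCarol\n' > west
printf 'Alice\nBen\nCarol\n' > east

# Report only whether the files differ, not how.
# "|| true" absorbs diff's non-zero exit status for differing files.
diff -q west east || true
```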

A very nice feature of options is that they can be combined. To do so, we write the options one after another, separated by spaces, just like any other parameter. It's a best practice to list options in alphabetical order.

The command **diff -q -y west east** merely prints Files west and east differ; it doesn't print the files side by side, despite the presence of the -y option. This illustrates that options are sometimes incompatible, and we need to check on a case-by-case basis how a given combination behaves.

Short options can also be grouped behind a single dash, so the previous command can be written as **diff -qy west east**.
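The two spellings can be tried one after the other, here on hypothetical sample files. Both should print the same brief message.

```shell
# Hypothetical sample files (the originals are not shown in this text).
printf 'Alice\nBob\nCarol\n' > west
printf 'Alice\nBen\nCarol\n' > east

# Options given separately.
# "|| true" absorbs diff's non-zero exit status for differing files.
diff -q -y west east || true

# The same options grouped behind one dash.
diff -qy west east || true
```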

Sometimes we will want to run past commands repeatedly or make slight modifications to them. It would be annoying to have to type the commands from scratch every time. Fortunately, most shells offer an interactive way of accessing the command history by means of the up and down arrow keys.

Alas, this feature isn't available in all shells (for instance, dash doesn't support it). But there are other ways to access command history.

In Bash, we can access the command history by running the **history** command. Its output shows a number at the start of each line, next to the command. This number is useful for another (non-portable) Bash feature called **history expansion**.

History expansion lets us quickly rerun a line from the command history by referencing the number to its left. For example, running **!7** reruns command number 7.

We can also reference commands counting from the end, just like we can with Python lists. 

* To run the last command, we can run !-1 or, alternatively, !!.
* To run the next-to-last command we can run !-2.
* And so on.

Many people find a screen filled with irrelevant information distracting. This happens frequently after using history. To clear the screen, we can use the command **clear**.

When we're done using the command line, we can end our session by pressing the "X" button to close the terminal window.

Alternatively, we can run the **exit** command. If there's only one running session, this will also close the window. 