<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Intro" data-toc-modified-id="Intro-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Intro</a></span><ul class="toc-item"><li><span><a href="#WHY-CLI" data-toc-modified-id="WHY-CLI-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>WHY CLI</a></span></li><li><span><a href="#Go-For-The-Gold" data-toc-modified-id="Go-For-The-Gold-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Go For The Gold</a></span></li><li><span><a href="#Diff" data-toc-modified-id="Diff-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Diff</a></span></li></ul></li></ul></div>

# Intro

In this mission, we will learn:

- What the command line is
- Why it is important
- The basics of the command line

There are a lot of hyperlinks in case you wish to dive deeper into the concepts, but it's not necessary to read them to understand the material. This course is self-contained, so we recommend that you wait until you've completed the course and then go back and read through them. Let's start learning.

Most people interact with computers exclusively through [graphical user interfaces](https://en.wikipedia.org/wiki/Graphical_user_interface) (GUIs). A GUI is a user interface that allows its users to interact with computers by means of icons and [pointing devices](https://en.wikipedia.org/wiki/Pointing_device), such as a mouse or, on a touchscreen, a finger.

Examples of GUIs include the operating systems Android (phones are computers, too!), iOS, OS X, Ubuntu and Windows, the most popular internet browsers like Google Chrome and Mozilla Firefox, and even the Dataquest platform you've been using.

Below are some screenshots of GUIs. We can see on the left a file explorer on an Android phone, and on the right the desktop of a laptop running Ubuntu - both of which are based on the [Linux kernel](https://en.wikipedia.org/wiki/Linux_kernel) (the computer program that makes Linux what it is).

But people haven't always interacted with computers this way. Historically, at a time when computers were expensive, gigantic monstrosities owned only by large institutions and shared by multiple people, computers were operated through [command-line interfaces](https://en.wikipedia.org/wiki/Command-line_interface) (CLIs). A CLI is a text-only interface through which users interact with computers by typing text instructions in a [console](http://www.linfo.org/console.html) (or **terminal**), using specific syntax.

The above-mentioned instructions are called **commands** which, once input, are [interpreted](https://en.wikipedia.org/wiki/Interpreter_(computing)) by a type of program called a shell or **command language interpreter**, and then run by your machine. Some of the most popular shells are Bash, Z shell, KornShell, Command Prompt and Windows PowerShell. If you have installed Anaconda on Windows, then you might have noticed one of its programs is Anaconda Prompt, which is another shell. It's basically the Command Prompt with a few settings on top.

Even though they do not all mean the same thing, the terms command-line interface, command language interpreter, shell, console, terminal window and other variants are often used interchangeably.

The most common operating systems for laptops and desktop computers nowadays are Linux, OS X and Windows, and they all come equipped with a terminal. Linux and OS X are Unix-like (or *nix) operating systems, that is, they behave like Unix (an old operating system that is still in use), and this similarity carries over to their respective command language interpreters. Windows' shells, on the other hand, are independent and very different from the others.

Because Unix-like operating systems are more common in data science, in this course and its successors we'll be using a Unix shell, namely Bash. It is also possible to run Unix shells on Windows by using a compatibility layer: either Windows Subsystem for Linux (displayed in the screenshot above), an official Windows tool available if you are on Windows 10 and have a 64-bit CPU, or Cygwin. You can access our install guides for these tools here and here, respectively. If you are on Windows, we recommend waiting until the end of the mission before installing one of these alternatives, but we do recommend installing one, so you can explore and practice what you learn here on your own.

Despite the fact that Unix shells are very similar, there are still some differences between them. In order to ensure that what you learn here is as universally applicable as possible, we'll be focusing on learning portable commands, by following a set of standards for Unix-like operating systems called POSIX (pronounced pahz-icks as in positive). This way, we'll actually be studying several shells simultaneously, while exemplifying with Bash. We'll let you know when a command isn't POSIX compliant.

Now that you know what the CLI is, it's time to understand why you should learn it. Let's move on to the next screen.

## WHY CLI
"But Dataquest, I've been doing just fine without a CLI, why should I start using one now?" Have you ever used a program that fit your needs almost perfectly, but there were a couple of simple features missing that you wished you could add? When using programs in the CLI, you are able to accomodate those needs!

As you become more experienced with it, you'll often prefer to simply build your own tools to solve the problems you have by using commands — the shell's building blocks.

Here are a few more reasons why it is important to learn the CLI:

- It is a very popular technology, which means support abounds.
- It is valued by employers.
- As was already mentioned, it is a very common tool in data science.
- It allows you to automate repetitive tasks.
- It has extremely powerful utilities for data processing/[data-driven programming](https://en.wikipedia.org/wiki/Data-driven_programming) like [AWK](https://en.wikipedia.org/wiki/AWK) and [sed](https://en.wikipedia.org/wiki/Sed).
- It is one of the best ways to use cloud services, which will be useful when you need more computing power than you can access locally. As an example, see the screenshot below, taken from the tutorial [Get Started with Deep Learning Using the AWS Deep Learning AMI](https://aws.amazon.com/blogs/machine-learning/get-started-with-deep-learning-using-the-aws-deep-learning-ami/). We can see a command line being used to set up [AWS](https://en.wikipedia.org/wiki/Amazon_Web_Services) with Jupyter Notebook:

For a more in-depth view as to why the CLI is important, we suggest you read this Dataquest blog post after you complete the mission.

On the right part of this screen we have a terminal window. The text `/home/dq$` indicates that the CLI is ready to accept input. To give it some input, we can click just to the right of `$` for the terminal to get focus, which is indicated by a blinking cursor. Once it gets focus, the terminal will be ready to receive input from the keyboard. We can then type a command and hit "Enter" to tell the shell to run it.
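As a minimal sketch of what entering a command looks like, here is `date`, a utility that prints the current date and time. We type only the command itself, not the `/home/dq$` prompt. The output depends on when and where the command is run, so none is shown here.

```shell
# Ask the system for the current date and time.
# Type just the word below and press Enter; the prompt is not part of the command.
date
```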

## Go For The Gold
On the previous screen, we ran the command `date`, which printed the current time and date.

The piece of text `/home/dq$` is called the [command prompt](https://en.wikipedia.org/wiki/Command-line_interface#Command_prompt), or just the prompt, because it prompts the user to enter a command. The command prompt is customizable and can vary in appearance, depending on the settings. Typically it will end with `$`. To the left of the `$`, there can be anything from nothing at all to variations of `[username]@[machinename]` `[directory]`, in which `[username]`, `[machinename]` and `[directory]` are placeholders for, respectively, the user, the name of the computer where the shell is running, and the current working directory. In our case, we have only the current working directory, `/home/dq`. We'll learn about all of these concepts later.

It's important to note that the command prompt is provided automatically by the terminal; we do not have to type it in ourselves. In fact, we shouldn't do that.

The screen area to the right of the prompt is called a [command line](http://www.linfo.org/command_line.html) and it is where we, as users, enter commands.

If we type some random letters and hit Enter, we will most likely not execute a valid command and we will get an error message. Here's an example.

`/home/dq$ notSoRandom`
`bash: notSoRandom: command not found`

We tried to run the "command" `notSoRandom`, and a message was printed to the screen saying that the command wasn't found. Now let's try running an invalid command ourselves.

## Diff
The commands we pass to the shell, in their most elementary form, look like this:

`utility_name parameter1 parameter2 ... parameterN`

for some [non-negative integer](https://en.wikipedia.org/wiki/Natural_number) `N` (i.e. `N` can be `0`, `1`, `2`, `3`, and so on). As we've seen with the `date` command, parameters aren't always mandatory. Every item in a command is case sensitive. For instance, running `Date` is not the same as running `date`.

By definition, a **parameter** is either an **option** or an **argument**. An option is a string of symbols that modifies the behaviour of the command, and it always starts with a dash (`-`). Other names for an option are **flag** and **switch**, although depending on who you ask these terms might not always be interchangeable. An argument (or **operand**) is an object upon which the command acts. The **utility** (also **command** or **program**) is the first item in the instruction.
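To make the general form concrete, here is a small sketch that runs `date` with zero, one and two parameters. The option `-u` (report the time in UTC) and the argument `+%Y-%m-%d` (a format asking for year-month-day only) aren't covered in this mission; we use them here only as illustrations, and both are part of the POSIX description of `date`. The output depends on when the commands are run, so none is shown.

```shell
# N = 0: the utility name alone, with no parameters.
date

# N = 1: one parameter, the option -u (it starts with a dash).
date -u

# N = 2: the option -u plus one argument, +%Y-%m-%d,
# which tells date to print only the year, month and day.
date -u +%Y-%m-%d
```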

Let us see another example. We will compare the text file `west`:

`West side is the best!
Windows is the best!
Dataquest is the best!`

with the text file `east`:

`East side is the best!
Linux is the best!
Dataquest is the best!`

Because these are relatively small files, it's easy to eyeball their differences, but if that weren't the case, we could get their differences by using the `diff` command. Running `diff -y west east` prints the content of both files side by side and marks each row that differs with a `|` next to the second file's version of that row.
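To try this comparison outside the Dataquest terminal, the sketch below recreates the two files in a throwaway directory and runs the same command. The use of `mktemp`, `printf` and the `>` redirection to create the files is our own assumption for illustration; these haven't been introduced yet. Note also that `-y` is not among the options POSIX specifies for `diff`; it is provided by GNU `diff` and some other common implementations, so it may not be available everywhere.

```shell
# Work in a temporary directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two text files from the lesson, one line per quoted string.
printf '%s\n' 'West side is the best!' 'Windows is the best!' 'Dataquest is the best!' > west
printf '%s\n' 'East side is the best!' 'Linux is the best!' 'Dataquest is the best!' > east

# Side-by-side comparison: rows that differ are marked with "|".
# diff reports "files differ" through its exit status, which is expected
# here, so "|| true" keeps that from being treated as a failure.
diff -y west east || true
```

The first two rows should carry the `|` marker, while the third row, which is identical in both files, should not.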

Let us dissect this command:

- `diff` is the command name.
- The command has three parameters:
    - `-y` is an option, as it starts with a dash.
    - `west` is an argument, as it is one of the objects upon which `diff` acts.
    - `east` is the last parameter and second argument.

We'll learn more about `diff` in a later mission.
In this mission's exercises you'll have access to the text files `augustus`, `violet`, `veruca` and `tv`.