<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Using the Command Line

_Author: Kiefer Katovich and Dave Yerrington | DSI-SF. Adapted by: Vlada Rozova_

---


## Learning Objectives
*After this lesson, you will be able to:*
- Create folders and files using the command line (`mkdir`, `touch`).
- Change directories and list directory content (`cd`, `ls`).
- Check the current working directory (`pwd`).


## Student Pre-Work
*Before this lesson, you should already be able to:*
- Complete the [GA pre-work](http://generalassembly.github.io/prework/cl/#/).
- Open the terminal.
- Familiarize yourself with the [UNIX][1] commands `cd`, `pwd`, `mkdir`, `ls`, and `touch`. (Don't worry,
    we'll be going over these commands in class; just make sure you are able to describe what each of them does.)

[1]: http://mally.stanford.edu/~sr/computing/basic-unix.html    "UNIX"

## Table of Contents

---
- [Command Line](#command_line)
- [Paths](#paths)
- [Independent Practice](#independent_practice)
- [Editing and Examining Files](#editing_files)
- [Check for Understanding](#check)
- [Environments for Data Science](#ide)
- [Conclusion](#conclusion)

<a id='command_line'></a>


## Why Command Line?

---

Everything you can do in a windowed environment, you can do in the terminal — and faster. When you use the mouse, click on buttons, confirm operations, or wait for windows to load, you waste time! Terminal commands can assist you with:

* Running processes
* Finding files
* Substring match of file contents
* Assessing performance
* Remote operations
* Installing packages
* Managing your development environment
* Digging into your computer's state and resources
* Allowing more dynamic use of your computing resources


**Windows Users**

In this class, we will use a popular UNIX shell called `bash`. If you have not already, we recommend installing [Git Bash](https://git-for-windows.github.io/).

**What Is Git Bash?**
Git Bash provides a set of executables that emulate `bash` commands in the Windows shells.

**Why?**
Since UNIX is the preferred operating system for many programmers and developers, almost all of the lessons at GA have been written for interaction with UNIX.

**Pro Tip:** UNIX commands and Windows commands are not polar opposites — in fact, most of their commands and interactions are quite similar.

<a id='paths'></a>

## Paths

---

Every file or folder in a file system can be read, written, and deleted by referencing its position inside that system. When we talk about the position of a file or a folder in a file system, we refer to its "path." There are two different kinds of paths we can use to refer to a file — **absolute paths** and **relative paths**.

**Directory** is an important term that is typically used interchangeably with *folder*. Technically, a folder contains the files themselves, whereas a directory is just a listing of them (e.g. a student directory). That said, when we say "navigate to your project directory," think of it as "navigate to your project folder."

### What Is an Absolute Path?

An **absolute path** is the specific location of a file or folder as accessed from the **root** directory, typically shown as `/` (or `\` in DOS/Windows systems). The root directory is the starting point from which all other folders are defined and is not usually the same as your **home** directory, which is normally found at `/Users/[Your Username]` on macOS (or `/home/[Your Username]` on most Linux systems).

### Let's have a look at our home directory

Typing **`cd`** — a command for "change directory" — with no parameters takes us to our home directory.

```bash
cd
```

If we type in `pwd` — a command for "print working directory" — from that folder, we can see where we are in relation to the root directory. The `pwd` command will always give you the absolute path of your current location:

```bash
[Your Username]:~$ pwd
/Users/[Your Username]
```

Notice that this path starts from the `/` directory, which is the **root** directory on every Linux/Unix machine.

### What Is a Relative Path?

A relative path is a reference to a file or folder **relative** to your current position, also called the present working directory (the location `pwd` prints). If we are in the folder `/a/b/` and we want to go to a subfolder `c`, we can simply type:

```bash
cd c
```

Similarly, if we are in the folder `/a/b/` and want to open the file that has the absolute path `/a/b/c/file.txt`, we can type (`open` is the macOS command for opening a file with its default application):

```bash
open c/file.txt
```

or

```bash
open ./c/file.txt
```

We can also use the absolute path at any time. An absolute path always begins with a slash (`/`) and spells out every folder from the root down, so `/a/b/c/file.txt` works no matter where we currently are. The absolute path of a file or folder is the same regardless of the current working directory, whereas its relative path changes depending on where we start. Directory structures are laid out like so: `directory/subdirectory/subsubdirectory`.
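
To see that a relative path and an absolute path can lead to the same place, here is a minimal sketch. It builds its own scratch folders with `mktemp -d` and `mkdir -p`, two commands not covered in this lesson, and the variable names (`demo`, `via_relative`, `via_absolute`) are made up for illustration. Note that it changes your current directory.

```bash
# Build a scratch directory tree to experiment in.
demo=$(mktemp -d)            # creates an empty temporary directory and prints its path
mkdir -p "$demo/a/b/c"       # -p also creates the missing parent folders a and b

cd "$demo/a/b"
cd c                         # relative path: resolved from the current directory
via_relative=$(pwd)

cd /                         # move somewhere unrelated
cd "$demo/a/b/c"             # absolute path: works from anywhere
via_absolute=$(pwd)

echo "$via_relative"
echo "$via_absolute"         # prints the same location as the line above
```

The relative `cd c` only worked because we were already inside `a/b`; the absolute path worked even from `/`.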

**Check:** What is the difference between an absolute path and a relative path?


### Navigating Using the Command Prompt

* Changing directories
* Listing files
* Creating directories and files
* Removing files

The tilde (`~`) character is an alias for your home directory. Use it to quickly return home.

```bash
cd ~
```

Or, even more simply, we can type:

```bash
cd
```

The tilde is useful for shortening paths that would otherwise be absolute. For example, to navigate to your desktop, we can type:

```bash
cd ~/Desktop
```

Another useful example is going to the parent directory, one level up. We can do this by typing:

```bash
cd ..
```

The `ls` command lists files and directories in the current folder.
```bash
ls
```

It can also be used to list files located in any directory. For example, to list your applications, you can type:
```bash
ls /Applications
```

To make a new directory, type:
```bash
mkdir my_folder
```
Then type `ls` to see the new folder.

To create a new file, type:
```bash
touch file1
```

To remove a file, type:
```bash
rm file1
```
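
The commands above can be strung together into one short session. This is only a sketch: it works inside a temporary directory created with `mktemp -d` (not covered in this lesson) so that nothing in your own folders is touched, and the variable names `before` and `after` are made up for illustration.

```bash
cd "$(mktemp -d)"      # start in an empty scratch directory
mkdir my_folder        # create a folder
cd my_folder           # step into it
touch file1 file2      # touch accepts several file names at once
before=$(ls)           # remember the listing before removing anything
rm file1               # remove one of the two files
after=$(ls)            # and the listing afterwards
cd ..                  # go back up to the parent directory

echo "before: $before"
echo "after: $after"
```

After `rm file1`, only `file2` is left in `my_folder`.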

### General Format for Commands

`<command> -<options> <arguments>`
* `<command>` is the action we want the computer to take.
* `<options>` (or "flags") modify the behavior of the command.
* `<arguments>` are the things we want the command to act on.

For example, to see all the files in the directory including the hidden ones, type:

```bash
ls -a
```

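You cannot remove a directory with plain `rm`; instead, add the options `-r` (recursive) and `-f` (force). Be careful: this deletes the directory and everything inside it without asking for confirmation.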

```bash
rm -rf directory_name
```
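
As a cautious way to try this out, the sketch below removes a directory inside a temporary scratch folder (created with `mktemp -d`, which is not covered in this lesson). It uses `-r` without `-f`, which is enough here and a little safer.

```bash
cd "$(mktemp -d)"                  # work in an empty scratch directory
mkdir directory_name
touch directory_name/file1

# Plain rm refuses to delete a directory; 2>/dev/null hides its error message.
rm directory_name 2>/dev/null || echo "plain rm refuses to delete a directory"

rm -r directory_name               # -r removes the directory and its contents
ls -a                              # only . and .. remain
```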

We used an option to check the Python version on your computers (`-V` prints the version, whereas `-h` prints the help text):

```bash
python -V
```

Compare the following two commands:
```bash
du directory_name
```

and 

```bash
du -h directory_name
```

Sometimes you can use options to limit the command output:
```bash
wc file.txt
```

This command shows the number of lines, words, and bytes in the file. To output only the number of words, try:

```bash
wc -w file.txt
```
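
To check what `wc` reports against a file whose contents you know, here is a small sketch. The file is created with `printf` and the `>` redirection operator, neither of which is covered above, and the names `scratch`, `words`, and `lines` are made up for illustration.

```bash
scratch=$(mktemp -d)                                   # temporary folder for the demo
printf 'one two three\nfour five\n' > "$scratch/file.txt"

wc "$scratch/file.txt"                 # 2 lines, 5 words, 24 bytes, then the file name
words=$(wc -w < "$scratch/file.txt")   # -w: words only (reading via < omits the file name)
lines=$(wc -l < "$scratch/file.txt")   # -l: lines only

echo "words:" $words
echo "lines:" $lines
```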

**To display what options are available for a command use ```man```, e.g.:**

```bash
man ls
```

### Using Wildcards in the Command Prompt

The wildcard symbol (`*`) is useful for using commands to operate on multiple
files. To provide an example, first create a folder on your desktop and add some
files.
```bash
mkdir ~/Desktop/example_folder
cd ~/Desktop/example_folder
touch cat.txt
touch dog.txt
touch bird.txt
touch fish.txt
```

You can then use the wildcard `*` to operate on subsets of files. List any
file with "d" in the file name, for example:
```bash
ls *d*
```

Or, remove any file with "i":
```bash
rm *i*
ls
```
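
Because `rm` deletes files for good, it can help to preview what a wildcard matches before removing anything. One way, sketched below, is to hand the same pattern to `echo` first. The example works in a temporary directory (via `mktemp -d`, not covered in this lesson), and the variable names are made up for illustration.

```bash
cd "$(mktemp -d)"                 # empty scratch directory
touch cat.txt dog.txt bird.txt fish.txt

matches_d=$(echo *d*)             # the shell expands *d* before echo runs
matches_i=$(echo *i*)
echo "$matches_d"                 # bird.txt dog.txt
echo "$matches_i"                 # bird.txt fish.txt

rm *i*                            # removes exactly the files previewed above
remaining=$(echo *)
echo "$remaining"                 # cat.txt dog.txt
```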

### Hidden Directories Can Also Be Found

There are hidden files and directories all over your file system, mainly to save you from yourself. Their names begin with a dot (`.`). By passing the options `-lha` to `ls`, we can find them.

```bash
ls -lha
```
<pre>
-l: &nbsp;&nbsp; Use the long listing format: one entry per line, with permissions, owner, size, and modification date.
-h: &nbsp;&nbsp; When used with the -l option, use unit suffixes: byte, kilobyte, megabyte, etc.
-a: &nbsp;&nbsp; Include directory entries whose names begin with a dot (.).
</pre>
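
To see the effect of `-a` in isolation, the following sketch creates one ordinary file and one hidden file in a temporary directory (made with `mktemp -d`, which is not covered in this lesson). The file and variable names are made up for illustration.

```bash
cd "$(mktemp -d)"                 # empty scratch directory
touch visible.txt .hidden.txt     # a leading dot makes a file hidden

plain=$(ls)                       # hidden entries are left out
all=$(ls -a)                      # -a also lists ., .., and .hidden.txt

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"
```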

<a id='independent_practice'></a>

## Independent Practice

---

Try out the `mkdir`, `touch`, `cd`, `pwd`, and `ls` commands on your own. If you want, try out the wildcard character (`*`) as well.

<a id='editing_files'></a>

## Editing and Examining Files

---

At times it's helpful to edit files in a pinch. We can accomplish this by using the terminal editor `nano`.

Use the following syntax to edit files from the terminal with `nano`:

`nano [filename]`

These hotkeys are available:

* **ctrl-w**: Search within file.
* **ctrl-o**: Save file as [filename].
* **ctrl-x**: Exit editor.

*The bottom of the editor contains the most common operations.*

### Echo File Content to the Terminal

Sometimes it's nice to view the contents of files as text. There are a variety of ways to do this. The commands `cat`, `head`, and `tail` will allow us to view the entire or partial contents of a target file.

The `cat` command is typically used to quickly display short files. (`cat` is also used for concatenating files, hence its name!)

```bash
cat /etc/passwd
```

Traditionally, the /etc/passwd file is used to keep track of every registered user with access to a system.

**Only the first few lines of a file**

This command is useful when looking at files that might be too large to open in a traditional editor such as Sublime or Atom.
```bash
head /etc/passwd
```

**Only the last few lines of a file**
```bash
tail /etc/passwd
```

You can also pass the option `-n` followed by a number to `head` and `tail` to control how many lines are displayed.
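
Here is a small sketch of `-n` in action on a file with known contents. The five-line file is generated with `printf` and the `>` redirection operator (not covered above), and the names `scratch`, `first_two`, and `last_one` are made up for illustration.

```bash
scratch=$(mktemp -d)                                   # temporary folder for the demo
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > "$scratch/sample.txt"

first_two=$(head -n 2 "$scratch/sample.txt")   # the first two lines
last_one=$(tail -n 1 "$scratch/sample.txt")    # the last line only

echo "$first_two"
echo "$last_one"
```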

### Searching Inside Files: `grep`

The `grep` command searches within files; with the `-r` option, it also descends into subdirectories.

Find all files with the word "the" inside:
```bash
grep -r "the" *
```

Omitting `-r` will cause `grep` to only look at files in the current directory:

```bash
grep "the" *
```

Using `-i` will make `grep` ignore the casing of characters, but at the expense of efficiency.
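
The difference between these variants is easiest to see on a tiny example. The sketch below sets up two small files in a temporary directory (using `mktemp -d`, `printf`, and `>`, which are not covered above). All file and variable names are made up for illustration.

```bash
cd "$(mktemp -d)"                          # empty scratch directory
mkdir sub
printf 'The cat\nthe dog\n' > top.txt
printf 'another theme\n' > sub/nested.txt

current_only=$(grep "the" *.txt)           # current directory only, case-sensitive
recursive=$(grep -r "the" .)               # -r also searches inside sub/
ignore_case=$(grep -i "the" top.txt)       # -i matches "The" as well as "the"

echo "$current_only"                       # the dog
echo "$recursive"                          # matches from top.txt and sub/nested.txt
echo "$ignore_case"                        # both lines of top.txt
```

When `grep` searches more than one file, as it does with `-r`, it prefixes each match with the name of the file it came from.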


<a id='ide'></a>
## Intro to Development Environments
---

In addition to running shell commands in the terminal on Unix-like systems (including macOS) or on Windows, we can also launch other tools from the command line, such as the Python interpreter, Java, and Git.

In your terminal, you can enter into a Python shell by simply typing `python`.

Within the Python shell, we can execute Python expressions:

```python
>>> # assigning a variable
>>> x = 'hello world'

>>> # printing a variable's content
>>> print(x)
hello world
```

Writing and troubleshooting a lot of code in the terminal can be tedious, as it is hard to write multi-line scripts there. Most developers don't actually write their scripts in the command line; instead, they use text editors or development environments to write their code.

Try writing a `for` loop in the Python shell:

```python
listo = [1, 5, 9]

for item in listo:
    print(itm)
```

We made an error in the second line of the `for` loop (`itm` instead of `item`), but we still have to retype the entire loop; we can't go back and just edit our mistake inline.

## Common Environments for Data Science

---
An IDE (Integrated Development Environment) is a program that provides an **all-in-one environment** to programmers. For example, often in development you will open many programs all at once, e.g. Finder, a text editor, debugging terminal, a terminal window for displaying output, and a graphics editor. An IDE will provide all of these (and more!) inside a single application. That said, IDEs are often sluggish. So, many professional Python programmers prefer a plain text editor.


The Anaconda distribution we installed earlier comes with two useful Python-based development environments, `Spyder` and `Jupyter`. A common third-party environment is `PyCharm`.

**Jupyter Notebooks**

Jupyter uses cell based execution, which means you can run all the code in a cell simultaneously. Jupyter notebooks also have markdown and slide show integration, which means they make great blog and instructional resources!  _All of the lessons in this class have been written in a Jupyter Notebook_.

Jupyter Notebooks open in your default browser and can be opened from the Anaconda Navigator or from the command line by executing ```jupyter notebook```.

- [28 Jupyter Notebook tips, tricks and shortcuts](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
- [Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Here-Cheatsheet)

**Spyder IDE (Integrated Development Environment)**

Spyder offers selection-based execution, which allows you to run just the code that you have _selected_. Spyder is very similar to RStudio, supports Jupyter notebooks, and has several customizable windows for displaying output, variables, and computer usage.

Spyder is desktop software that opens in its own window. It can be opened from the Anaconda Navigator or from the command line by executing ```spyder```.

- [Introduction to Spyder (Video)](https://www.youtube.com/watch?v=8JiWEZEnJ40)

**PyCharm IDE (by JetBrains)**

PyCharm is an excellent fully-featured commercial IDE for writing Python code files. It has a free-of-charge Community Edition which has no restrictions. PyCharm is often used for developing larger Python applications because it offers powerful features that other environments lack, such as debugging capabilities, intelligent code refactoring, and integration with Git.

- [Free PyCharm Community Edition](https://www.jetbrains.com/pycharm/download)


None of these environments is better than the others; you will even find many Python developers who only use a text editor to edit their code! That said, Jupyter Notebook is often preferred for quick experimentation and initial data exploration, whereas Spyder and PyCharm are preferred for writing non-notebook code files.

### Text Editors

In addition to IDEs, developers also use text editors to create or edit code and files. Text editors are more commonly used for files that are executed via the command line, as well as for software and website development.

Some common text editors that you may see or use include:
- [Sublime](https://www.sublimetext.com/)
- [Atom](https://atom.io/)
- [Notepad++](https://notepad-plus-plus.org/) (Windows)
- [Vim](http://www.vim.org/)


<a id='conclusion'></a>

## Lesson Review: Command Line

---

Today, we learned about the command line and some of its common commands.  We also learned about file structures, absolute paths, and relative paths.

Additionally, we reviewed running Python in the command line as well as in other environments. There is a [Jupyter Notebook exercise](./IpythonNotebookPractice/ipynb_practice1.ipynb) available if you want to practice working in the Jupyter IDE.

In time, you might find that simple operations are actually faster to perform from the command line.

## Independent Practice (set up Github first)
---
Let's navigate to course-info.


By far, the most useful operation from the terminal is finding files. `locate` finds files all over your file system. The `find` command lists files under a starting directory (here `.`, the current working directory) and is often combined with a pipe to filter its output. The following expression will find all notebook files within the current working directory and its subdirectories:

```bash
find . | grep ipynb
```
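
`find` can also do the filtering itself through its `-name` option, which takes a quoted wildcard pattern. The sketch below compares the two approaches in a temporary directory (built with `mktemp -d` and `mkdir -p`, which are not covered in this lesson). The folder, file, and variable names are made up for illustration.

```bash
cd "$(mktemp -d)"                          # empty scratch directory
mkdir -p lessons/week1                     # -p creates both folder levels
touch lessons/week1/intro.ipynb lessons/notes.txt

piped=$(find . | grep ipynb)               # list everything, then filter with grep
by_name=$(find . -name "*.ipynb")          # let find match on the file name itself

echo "$piped"
echo "$by_name"                            # same file as the line above
```

The quotes around `"*.ipynb"` keep the shell from expanding the wildcard before `find` sees it. Note that the `grep` version would also match any path that merely contains "ipynb" somewhere, such as a folder name.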


<a id='check'></a>
### Check for Understanding
---

Recall the Ames dataset from the last lesson and try to do the following using your command line:
* Define the path to the dataset (hint: look for AmesHousing.csv)
* Navigate to the directory containing the Ames dataset
* Count the number of lines in the file (hint: use ```man``` for options)
* Show the first three lines of the file
* Find all the lines containing "GrnHill"
* _(optional)_ Count the number of records containing "GrnHill"


**Have Questions About "piping"?**

Here's some optional (but highly recommended) reading about pipe and I/O redirection on the command line:


* [I/O redirection](http://linuxcommand.org/lc3_lts0070.php)
* [Good examples of piping commands together](http://unix.stackexchange.com/questions/30759/whats-a-good-example-of-piping-commands-together)
