Overview
--------

_Readings: [Appendix A of Learn Python the Hard Way](http://learnpythonthehardway.org/book/appendixa.html) also discusses the material below._

Modern data science is impossible without some understanding of the Unix command line.  Unix is a family of computer operating systems that includes the Mac's OS X and Linux (technically, Linux is a Unix clone); Windows also has Unix emulators, which allow running Unix commands.  In our class, we use Linux (specifically, the Ubuntu distribution).
Depending on the class, the server may be running on Amazon Web Services (AWS), on Google Cloud, or in a cloud at NYU.

(_**Note**: In IPython/Jupyter, to run a shell command, you add an exclamation mark before the command. That is why all the commands in this notebook are preceded by a `!` character._)

Let's start by seeing where we are on our server. The `pwd` (Print Working Directory) command will tell us.

In [1]:
!pwd

/home/class_share/nhwclass/1-UNIX_Basics


### Understanding the folder structure

Basic concepts
* Hierarchical directory structure
* Absolute vs. relative paths
* Parent (..) and current (.) directories
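
To make these concepts concrete, here is a minimal sketch meant for a terminal or a `%%bash` cell (shell variables and `cd` do not carry over between separate `!` lines). The directory names `projects` and `notes` and the variable names are hypothetical, chosen only for illustration; everything happens inside a throwaway temporary directory.

```shell
# Build a small throwaway tree: $demo/projects/notes
demo=$(mktemp -d)
mkdir -p "$demo/projects/notes"

cd "$demo/projects/notes"   # absolute path: spelled out from the root /
cd ..                       # relative path: .. is the parent directory (projects)
parent=$(basename "$PWD")
cd ./notes                  # relative path: . is the current directory
child=$(basename "$PWD")
echo "$parent $child"       # prints: projects notes

cd / && rm -r "$demo"       # clean up the temporary tree
```

An absolute path works from anywhere, while a relative path is interpreted starting from the current directory.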


### `pwd`

Prints the current working directory. Type `pwd` at the shell prompt to see the full path of the directory you are in.

### `ls`

Lists the contents of a directory or provides information about the specified file. Typical usage:

`ls [options] [files or directories]`

To see the contents of the current directory, type `ls`.

In [2]:
!ls

A-Basic_Unix_Shell_Commands.ipynb  D-Running_Tasks_In_The_Background.ipynb
B-Fetching_Data_Using_CURL.ipynb   location.json
C-Pipes_Filters_Redirection.ipynb  parameters.json
cronhelp.ipynb			   sample.txt
data				   sorted.csv


By default, `ls` simply lists the contents of the current directory. Several options, when used with `ls`, give more detailed information about the files or directories being queried. Here is a sample:

+ `-A`: list all of the contents of the queried directory, even hidden files.
+ `-l`: detailed format, display additional info for all files and directories.
+ `-R`: recursively list the contents of any subdirectories.
+ `-t`: sort files by the time of the last modification.
+ `-S`: sort files by size.
+ `-r`: reverse any sort order.
+ `-h`: when used in conjunction with `-l`, shows file sizes in human-readable units (for example `8.2K` or `3.0M`).
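
The options above can be combined freely. The following is a small sketch for a terminal or a `%%bash` cell; the file names (`small.txt`, `big.txt`, `.hidden`) and variable names are made up for the example, and the files live in a temporary directory so nothing in the class folder is touched.

```shell
# Create a scratch directory with two files of different sizes and one hidden file
demo=$(mktemp -d)
printf 'a' > "$demo/small.txt"            # 1 byte
printf 'aaaaaaaaaa' > "$demo/big.txt"     # 10 bytes
touch "$demo/.hidden"                     # names starting with . are hidden

plain=$(ls "$demo")              # hidden file is not listed
all=$(ls -A "$demo")             # hidden file is listed
by_size=$(ls -S "$demo")         # largest file first
by_size_rev=$(ls -Sr "$demo")    # -r reverses the order: smallest first

echo "$all"
ls -lh "$demo"                   # long format with human-readable sizes

rm -r "$demo"                    # clean up
```

Single-letter options can be written together, so `ls -Sr` means the same as `ls -S -r`.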



Now let's run `ls` with a different set of options:

In [3]:
!ls -lh

total 3.1M
-rwxrwxr-x+ 1 nhw1 nhw1 8.2K Aug 19 12:42 A-Basic_Unix_Shell_Commands.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1  21K Aug 16 10:02 B-Fetching_Data_Using_CURL.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1  63K Aug 19 12:45 C-Pipes_Filters_Redirection.ipynb
-rw-rw-r--+ 1 nhw1 nhw1 5.0K Aug 16 10:02 cronhelp.ipynb
drwxrwxr-x+ 2 nhw1 nhw1 4.0K Aug 16 10:02 data
-rwxrwxr-x+ 1 nhw1 nhw1  13K Aug 16 10:02 D-Running_Tasks_In_The_Background.ipynb
-rw-rw-r--+ 1 nhw1 nhw1  568 Aug 19 12:49 location.json
-rw-rw-r--+ 1 nhw1 nhw1  283 Aug 19 12:42 parameters.json
-rw-rw-r--+ 1 nhw1 nhw1  201 Aug 19 12:43 sample.txt
-rw-rw-r--+ 1 nhw1 nhw1 3.0M Aug 16 10:02 sorted.csv


### `cd`

Change the current directory. Usage: 

`cd [directory to move to]`

For example, to change to the `/home/ubuntu` directory:

In [4]:
!cd /home/ubuntu

/bin/sh: 1: cd: can't cd to /home/ubuntu


The error above means that `/home/ubuntu` does not exist, or is not accessible, on this server. We can also use the `cd` command alone to move to our home directory.

In [5]:
!cd

If we want to run two commands in a row, we separate them using the `;` character. For example, to change to a directory and show its contents:

In [6]:
!cd ; ls -l

total 4
drwxrwx---+ 3 nhw1 nhw1 4096 Aug 19 16:37 jupyter
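
One detail worth knowing in a notebook: each `!` line runs in its own temporary shell, so a `cd` affects only the commands that follow it on the same line. The sketch below, intended for a terminal or a `%%bash` cell, imitates this with a subshell; the variable names are illustrative only.

```shell
start=$PWD

# $( ... ) runs its commands in a subshell, much like a single "!" line:
# the cd applies to the pwd after the ";" but not to the parent shell.
inside=$(cd / ; pwd)
after=$PWD

echo "inside: $inside"       # prints: inside: /
[ "$after" = "$start" ] && echo "parent directory unchanged"
```

This is why `!cd ; ls -l` lists the home directory, while a later `!ls` in a separate cell still lists the notebook's own folder.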


### `mkdir`

Creates a new folder. For example, to create a new folder named `DealingWithData` under the current folder, we type:


In [7]:
!mkdir DealingWithData
!ls -lA

total 3152
-rwxrwxr-x+ 1 nhw1 nhw1    8295 Aug 19 12:42 A-Basic_Unix_Shell_Commands.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1   20754 Aug 16 10:02 B-Fetching_Data_Using_CURL.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1   63950 Aug 19 12:45 C-Pipes_Filters_Redirection.ipynb
-rw-rw-r--+ 1 nhw1 nhw1    5048 Aug 16 10:02 cronhelp.ipynb
drwxrwxr-x+ 2 nhw1 nhw1    4096 Aug 16 10:02 data
drwxrwxr-x+ 2 nhw1 nhw1    4096 Aug 31 15:48 DealingWithData
-rwxrwxr-x+ 1 nhw1 nhw1   13121 Aug 16 10:02 D-Running_Tasks_In_The_Background.ipynb
-rw-rw-r--+ 1 nhw1 nhw1    6148 Aug 16 10:02 .DS_Store
drwxrwxr-x+ 2 nhw1 nhw1    4096 Aug 16 10:02 .ipynb_checkpoints
-rw-rw-r--+ 1 nhw1 nhw1     568 Aug 19 12:49 location.json
-rw-rw-r--+ 1 nhw1 nhw1     283 Aug 19 12:42 parameters.json
-rw-rw-r--+ 1 nhw1 nhw1     201 Aug 19 12:43 sample.txt
-rw-rw-r--+ 1 nhw1 nhw1 3066312 Aug 16 10:02 sorted.csv


### `rmdir` 

Removes a folder. (The folder must be empty for the command to succeed.)

In [8]:
!rmdir DealingWithData
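
To illustrate the "must be empty" rule, here is a minimal sketch for a terminal or a `%%bash` cell. The folder name `full`, the file name `file.txt`, and the variable `first_try` are hypothetical and exist only inside a temporary directory.

```shell
demo=$(mktemp -d)
mkdir "$demo/full"
touch "$demo/full/file.txt"

# rmdir refuses to remove the folder because it still holds a file
if rmdir "$demo/full" 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi

rm "$demo/full/file.txt"     # empty the folder first
rmdir "$demo/full"           # now rmdir succeeds
echo "$first_try"            # prints: refused

rmdir "$demo"                # the scratch directory is empty too, so this works
```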

### `cp` 

Copies a file. Usage:

`cp [source file] [destination file]`

It can also be used to copy multiple files into a directory.

`cp [source file1] [source file2] ... [destination directory]`

For example, to copy the file `A-Basic_Unix_Shell_Commands.ipynb` and name the copy `NotebookA.ipynb`:

In [9]:
!cp A-Basic_Unix_Shell_Commands.ipynb NotebookA.ipynb
!ls -l 

total 3137
-rwxrwxr-x+ 1 nhw1 nhw1    8295 Aug 19 12:42 A-Basic_Unix_Shell_Commands.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1   20754 Aug 16 10:02 B-Fetching_Data_Using_CURL.ipynb
-rwxrwxr-x+ 1 nhw1 nhw1   63950 Aug 19 12:45 C-Pipes_Filters_Redirection.ipynb
-rw-rw-r--+ 1 nhw1 nhw1    5048 Aug 16 10:02 cronhelp.ipynb
drwxrwxr-x+ 2 nhw1 nhw1    4096 Aug 16 10:02 data
-rwxrwxr-x+ 1 nhw1 nhw1   13121 Aug 16 10:02 D-Running_Tasks_In_The_Background.ipynb
-rw-rw-r--+ 1 nhw1 nhw1     568 Aug 19 12:49 location.json
-rwxrwxr-x+ 1 nhw1 nhw1    8295 Aug 31 15:48 NotebookA.ipynb
-rw-rw-r--+ 1 nhw1 nhw1     283 Aug 19 12:42 parameters.json
-rw-rw-r--+ 1 nhw1 nhw1     201 Aug 19 12:43 sample.txt
-rw-rw-r--+ 1 nhw1 nhw1 3066312 Aug 16 10:02 sorted.csv


Or we can copy the file to another folder. For example, the following commands copy the file `A-Basic_Unix_Shell_Commands.ipynb` to the folder `DealingWithData` and name the new file `NotebookA.ipynb`:

In [10]:
!mkdir DealingWithData
!cp A-Basic_Unix_Shell_Commands.ipynb DealingWithData/NotebookA.ipynb
!ls -lA DealingWithData

total 1
-rwxrwxr-x+ 1 nhw1 nhw1 8295 Aug 31 15:48 NotebookA.ipynb
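
The multi-file form of `cp` described earlier can be sketched as follows, for a terminal or a `%%bash` cell. The names `a.txt`, `b.txt`, and `backup` are invented for this example and are created in a temporary directory.

```shell
demo=$(mktemp -d)
echo "one" > "$demo/a.txt"
echo "two" > "$demo/b.txt"
mkdir "$demo/backup"

# Several source files, one destination directory: each copy keeps its name
cp "$demo/a.txt" "$demo/b.txt" "$demo/backup"

copied=$(ls "$demo/backup" | wc -l | tr -d ' ')
originals=$(ls "$demo"/*.txt | wc -l | tr -d ' ')
echo "$copied copies, $originals originals"   # prints: 2 copies, 2 originals

rm -r "$demo"                                 # clean up
```

When the last argument is an existing directory, `cp` places the copies inside it instead of treating it as a new file name.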


### `rm` 

The `rm` command deletes a file. Deleted files do not go to a trash folder, so use it with care.

`rm -r`: deletes a folder and everything inside it, recursively.

In [11]:
!rm DealingWithData/NotebookA.ipynb
!rm NotebookA.ipynb

In [12]:
#clean up
!rmdir DealingWithData
!ls


A-Basic_Unix_Shell_Commands.ipynb  D-Running_Tasks_In_The_Background.ipynb
B-Fetching_Data_Using_CURL.ipynb   location.json
C-Pipes_Filters_Redirection.ipynb  parameters.json
cronhelp.ipynb			   sample.txt
data				   sorted.csv
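
The recursive form `rm -r` is not run in the cells above, so here is a minimal sketch for a terminal or a `%%bash` cell. The names `project`, `sub`, and `notes.txt` are hypothetical, and the example works only inside a temporary directory because the deletion cannot be undone.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/project/sub"
touch "$demo/project/sub/notes.txt"

# rm -r removes the folder together with its subfolders and files
rm -r "$demo/project"

if [ -e "$demo/project" ]; then
    left="still there"
else
    left="gone"
fi
echo "$left"       # prints: gone

rmdir "$demo"      # the scratch directory is now empty
```

Unlike `rmdir`, `rm -r` does not require the folder to be empty, so it is worth double-checking the path before running it.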


### `mv`

The `mv` command is similar to `cp`, but it moves the file instead of copying it. Effectively, it performs a `cp` followed by an `rm` of the original file. It is also the usual way to rename a file.
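
A short sketch of `mv`, for a terminal or a `%%bash` cell. The file and folder names (`draft.txt`, `final.txt`, `archive`) are made up for illustration and live in a temporary directory.

```shell
demo=$(mktemp -d)
mkdir "$demo/archive"
echo "hello" > "$demo/draft.txt"

mv "$demo/draft.txt" "$demo/final.txt"   # same folder, new name: a rename
mv "$demo/final.txt" "$demo/archive"     # destination is a folder: the name is kept

moved=$(cat "$demo/archive/final.txt")
echo "$moved"                            # prints: hello

# After the moves, the files no longer exist at their old locations
if [ -e "$demo/draft.txt" ] || [ -e "$demo/final.txt" ]; then
    old="present"
else
    old="absent"
fi

rm -r "$demo"                            # clean up
```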

## Exercise

* Create two new directories, `dir1` and `dir2`, with the `mkdir` command.
* Use `ls` to confirm that they exist.
* Copy the file `/home/ubuntu/data/titanic.xls` to `dir1` and name it `file1.xls`.
* Copy the file `/home/ubuntu/data/imdb.sql` to `dir2` and name it `file2.sql`.
* Move each file to the other directory (`file1.xls` to `dir2` and `file2.sql` to `dir1`) with the `mv` command.
* Delete both directories with the `rm -r` command.


In [13]:
# your code here