# Intro to Data Science

[Gina Sprint](https://ginasprint.com/)

# OS Commands

### Learner Objectives
What are our learning objectives for this lesson?
* Understand what powershell/terminal is and how it can be used
* Navigate a file system using the command line
* Perform basic file manipulations using the command line
* Run a Python program from command line

Content used in this lesson is based upon information in the following sources:
* None to report

## The Command Line
Instead of using graphical user interfaces (GUIs) to interact with our OS, we can type commands at the command line, AKA powershell (Windows) or terminal (Unix-based machines). The OS you use determines what commands are available. The three main OSs in use today are Windows, Mac OS X, and Linux. Both Mac OS X and Linux are Unix-based OSs, meaning they are derived from or modeled on the [Unix OS](https://en.wikipedia.org/wiki/Unix). In this lesson, we are going to cover several OS commands for both Windows and Unix-based machines.

## Launching Powershell/Terminal
### Windows
Either one of these methods will launch powershell in Windows:
1. Hit the Windows key then type "powershell" and press enter.
1. Open up the start menu and navigate to All Apps -> Windows System -> Powershell

### Unix-based
Any one of these methods will launch a terminal in Unix-based machines:
1. On a Mac, press Command + Space to open Spotlight, then type "terminal" and press enter.
1. On a Mac, open the Applications folder, then open the Utilities folder, then select the Terminal application.
1. On many Linux desktops (e.g. Ubuntu), press Ctrl + Alt + T.

## Command Help
`man <command>` (short for manual): display the reference manual page for `<command>`

```
man man
```
provides the instructions on how to use the `man` command (the output is too long to include here).

Note: many commands have different options you can specify using a *switch*. In the Windows Command Prompt, a switch is typically specified using a forward slash, such as `/i` to ignore case (PowerShell's own commands use a dash instead). In Unix-based machines, a switch is specified using a dash, such as `-i` to ignore case.
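As a small illustration of a Unix-style switch, the sketch below runs `grep` (a text-search command that is not otherwise covered in this lesson) with and without `-i`. The two lines of sample text are made up for the example.

```shell
# Two lines of sample text that differ only in case.
# Without a switch, grep is case-sensitive and matches only "hello".
sensitive=$(printf 'Hello\nhello\n' | grep hello)

# With the -i switch, grep ignores case and matches both lines.
insensitive=$(printf 'Hello\nhello\n' | grep -i hello)

echo "without -i:"
echo "$sensitive"
echo "with -i:"
echo "$insensitive"
```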

## Navigating the File System
Your OS manages your file system. You can think of a file system as a tree structure, with the root node being the top-level folder (e.g. C:\ on Windows or / on Unix-based machines). Child nodes of the root node are folders/files in the top-level folder, and so on and so forth. Leaf nodes are files and empty folders.

We typically use a GUI such as Windows Explorer or Finder to access our files, manage our files, etc. All of the common tasks we use GUIs for have equivalent command line commands.

### Current Working Directory
When you launch the command line, it is operating in a folder (AKA a directory). The current folder is called the *current working directory*, or the CWD. At the command line, the prompt to the left of the cursor typically shows the CWD. You can also display the CWD with the following command:

`pwd` (short for print working directory): print the full path of the current working directory

```
gsprint@gsprint-x1:~/Documents/L1-2$ pwd
/home/gsprint/Documents/L1-2
```

### Paths
When you refer to a file or folder by name, the OS looks for it in the CWD.

If a file you want to interact with is in a directory other than the CWD, you will have to specify its path. The location of a file is represented by its path, the sequence of folders that the file is stored in, plus the file's name. There are two ways to specify a path:
1. Relative path: a path to a file or directory relative to the current directory.
1. Absolute path: a path to a file or directory specified by its exact location on your file system. An absolute path *uniquely* identifies the location of a file/folder on a file system.

#### Windows
On a Windows machine, folders and file names in a path are separated by backslashes "\".

Relative path example: "files\transactions.txt" refers to a file ("transactions.txt") in a directory ("files") in the CWD.

Absolute path example:  "C:\Users\gsprint\cpts215\lessons\files\transactions.txt" refers to a file ("transactions.txt") in the folder "C:\Users\gsprint\cpts215\lessons\files" on my C:\ drive.

Note: Windows' absolute paths start with the drive, such as C:\, as the top-level directory.

Note: The backslash has a special purpose in Python: it escapes certain characters, such as a newline "\n". Therefore, when you write a Windows path in a Python string, you have to escape each backslash with a second backslash, "`\\`": `"files\\transactions.txt"`. Alternatively, you can specify your path as a raw string: `r"files\transactions.txt"`. On a Unix-based machine (e.g. Mac, Linux distributions), the forward slash "/" is used in paths and you don't have to worry about this issue.

#### Unix-based
On a Unix-based machine, the forward slash "/" is used in paths.

Relative path example: "files/transactions.txt" refers to a file ("transactions.txt") in a directory ("files") in the CWD.

Absolute path example: "/home/gsprint/cpts215/lessons/files/transactions.txt" refers to a file ("transactions.txt") in the folder "/home/gsprint/cpts215/lessons/files" on my hard drive.

Note: Unix-based absolute paths start with root, /, as the top-level directory.

Note: The CWD can be referred to by a single period, `.`. The folder the CWD is in (i.e. CWD's parent folder) can be referred to by two periods, `..`. We will learn more about this when we talk about changing the CWD.
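To make the difference between the two kinds of paths concrete, here is a minimal sketch for a Unix-based machine. It builds a throwaway folder with `mktemp -d` (an assumption made so the example does not touch your real files) and reads the same file through a relative path, a relative path that starts with `.`, and an absolute path. The folder and file names mirror the examples above but are otherwise made up.

```shell
# Build a small practice tree in a temporary directory.
workdir=$(mktemp -d)
mkdir "$workdir/files"
echo "sample transaction" > "$workdir/files/transactions.txt"

# Make the practice directory the CWD.
cd "$workdir"

# Relative path: resolved starting from the CWD.
relative_result=$(cat files/transactions.txt)

# The same relative path, written with "." to name the CWD explicitly.
dot_result=$(cat ./files/transactions.txt)

# Absolute path: starts at the root, so it works from any CWD.
cd /
absolute_result=$(cat "$workdir/files/transactions.txt")

echo "$relative_result"
echo "$dot_result"
echo "$absolute_result"

# Clean up the practice tree.
rm -r "$workdir"
```

All three `echo` lines print the same text, because all three paths name the same file.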

### List of Files in the CWD
To display the files/folders in the CWD:
* `ls`: prints a listing
* `ls -l`: prints a long listing (includes more details about the files/folders)
* `ls -R`: prints a listing of the folders/files in the CWD and, recursively, in all of its subfolders

```
gsprint@gsprint-x1:~$ ls -l
total 48
drwxrwxr-x 20 gsprint gsprint 4096 Sep 29  2016 anaconda3
drwxr-xr-x  2 gsprint gsprint 4096 Aug 29  2016 Desktop
drwxr-xr-x  4 gsprint gsprint 4096 May  2 14:13 Documents
drwxr-xr-x  3 gsprint gsprint 4096 Sep 29  2016 Downloads
-rw-r--r--  1 gsprint gsprint 8980 Aug 30  2016 examples.desktop
drwxr-xr-x  2 gsprint gsprint 4096 Aug 29  2016 Music
drwxr-xr-x  2 gsprint gsprint 4096 May  2 13:38 Pictures
drwxr-xr-x  2 gsprint gsprint 4096 Aug 29  2016 Public
drwxr-xr-x  2 gsprint gsprint 4096 Aug 29  2016 Templates
drwxr-xr-x  2 gsprint gsprint 4096 Aug 29  2016 Videos
```
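The three forms of `ls` can be compared side by side on a small practice tree. This is only a sketch: the temporary directory and the file names are made up, and the exact layout of the `ls -l` and `ls -R` output varies between systems.

```shell
# Build a small practice tree in a temporary directory.
workdir=$(mktemp -d)
mkdir "$workdir/files"
echo "sample" > "$workdir/notes.txt"
echo "sample" > "$workdir/files/transactions.txt"
cd "$workdir"

# Names only: shows "files" and "notes.txt".
plain_listing=$(ls)
echo "$plain_listing"

# Long listing: one line per entry with permissions, owner, size, and date.
ls -l

# Recursive listing: also shows what is inside the "files" subfolder.
recursive_listing=$(ls -R)
echo "$recursive_listing"

# Clean up the practice tree.
cd /
rm -r "$workdir"
```

In the long listing, a leading `d` in the first column marks a directory and a leading `-` marks a regular file.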

### Change the CWD
We can change the CWD by specifying a relative or absolute path to move to. 
1. Relative path
    1. `..`: Move "up" a folder (change the CWD to the parent folder, i.e. the folder containing the CWD)
    1. Move "down" a folder (change the CWD to a child folder, i.e. a sub folder in the CWD)
1. Absolute path
    1. Move to a folder specified by an absolute path
        
`cd <path>`: Change directory to `<path>`

```
gsprint@gsprint-x1:~$ pwd
/home/gsprint
gsprint@gsprint-x1:~$ cd Documents
gsprint@gsprint-x1:~/Documents$ pwd
/home/gsprint/Documents
```
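The moves described above (down into a child folder, up to the parent folder, and directly to an absolute path) can be sketched as follows. The temporary directory and the folder names `lessons` and `files` are assumptions made for the example.

```shell
# Build a small practice tree in a temporary directory.
workdir=$(mktemp -d)
mkdir "$workdir/lessons"
mkdir "$workdir/lessons/files"

# Absolute path: jump straight to a folder from anywhere.
cd "$workdir/lessons"
start=$(pwd)

# Relative path: move "down" into a child folder of the CWD.
cd files
after_down=$(pwd)

# Relative path: move "up" to the parent folder with "..".
cd ..
after_up=$(pwd)

echo "start:      $start"
echo "after down: $after_down"
echo "after up:   $after_up"

# Clean up the practice tree.
cd /
rm -r "$workdir"
```

After `cd files` followed by `cd ..`, the CWD is back where it started.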

### Create a Directory
To make a new directory:

`mkdir <path>`: Create a directory

```
gsprint@gsprint-x1:~/Documents/L1-2$ ls
gsprint@gsprint-x1:~/Documents/L1-2$ mkdir temp
gsprint@gsprint-x1:~/Documents/L1-2$ ls
temp
```
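`mkdir` can also create several levels of folders in one step with the `-p` switch, which creates any missing parent folders along the way. A minimal sketch, using a temporary directory and made-up folder names:

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a single directory in the CWD.
mkdir temp

# Create nested directories in one step; -p makes the missing parents too.
mkdir -p cpts215/lessons/files

# Check what was created.
top_level=$(ls)
nested=$(ls cpts215/lessons)
echo "$top_level"
echo "$nested"

# Clean up.
cd /
rm -r "$workdir"
```

Without `-p`, `mkdir cpts215/lessons/files` would fail here because `cpts215` and `cpts215/lessons` do not exist yet.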

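### Remove a Directory/File
To remove (delete) a file or directory:

`rm <path>`: deletes a file. To delete a directory and everything in it, use `rm -r <path>`; to delete an empty directory, use `rmdir <path>`.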

```
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile.txt  a.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ rm a.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile.txt
```
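The sketch below shows three ways of deleting on a Unix-based machine: a file, an empty directory, and a directory that still has contents. The temporary directory and all names in it are made up for the example.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir empty_dir full_dir
echo "sample" > a.txt
echo "sample" > full_dir/b.txt

# Delete a single file.
rm a.txt

# Delete an empty directory.
rmdir empty_dir

# Delete a directory together with everything inside it.
rm -r full_dir

# Nothing is left in the practice directory.
remaining=$(ls)
echo "remaining: '$remaining'"

# Clean up.
cd /
rm -r "$workdir"
```

Files deleted at the command line do not go to a trash or recycle folder, so double-check the path before running `rm`, especially with `-r`.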

### Copy a Directory/File
`cp <source> <destination>`: copy a file from `<source>` to `<destination>`. To copy a directory and its contents, add the `-r` switch: `cp -r <source> <destination>`.

```
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ cp afile.txt afile2.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile2.txt  afile.txt
```
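Copying a whole directory needs the `-r` (recursive) switch. A minimal sketch, using a temporary directory; the name `temp_backup` is hypothetical:

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir temp
echo "sample" > temp/afile.txt

# Copy a single file.
cp temp/afile.txt temp/afile2.txt

# Copy a directory and everything in it.
cp -r temp temp_backup

# The backup contains copies of both files.
copied=$(ls temp_backup)
echo "$copied"

# Clean up.
cd /
rm -r "$workdir"
```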

### Move a Directory/File
To move a file to a different location (can also be used to rename a file):

`mv <source> <destination>`: move (rename) files or directories from `<source>` to `<destination>`

```
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile2.txt  afile.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ mv afile2.txt ../afile2.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ ls
afile.txt
gsprint@gsprint-x1:~/Documents/L1-2/temp$ cd ..
gsprint@gsprint-x1:~/Documents/L1-2$ ls
afile2.txt  temp
```
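The two uses of `mv`, renaming in place and moving into another folder, can be sketched as follows. The temporary directory and the names `draft.txt`, `final.txt`, and `archive` are made up for the example.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir archive
echo "sample" > draft.txt

# Rename: source and destination are in the same folder.
mv draft.txt final.txt

# Move: the destination is an existing folder, so the file keeps its name.
mv final.txt archive/

# The file now lives in archive, and only archive is left in the CWD.
moved=$(ls archive)
top_level=$(ls)
echo "in archive: $moved"
echo "in CWD:     $top_level"

# Clean up.
cd /
rm -r "$workdir"
```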

## View File Contents
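* `cat <file>`: concatenate files and print them on the standard output (shows the whole file at once)
* `more <file>`: display a file one screen at a time
* `less <file>`: similar to `more`, but also allows scrolling backward; press `q` to quit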

```
gsprint@gsprint-x1:~/Documents/L1-2$ cat afile2.txt
Adding some text!!
gsprint@gsprint-x1:~/Documents/L1-2$ more afile2.txt
Adding some text!!
gsprint@gsprint-x1:~/Documents/L1-2$ less afile2.txt
```
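The name `cat` comes from "concatenate": when given more than one file, it prints them one after another. A minimal sketch with two made-up files in a temporary directory:

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
echo "first file" > a.txt
echo "second file" > b.txt

# Print both files, one after the other.
combined=$(cat a.txt b.txt)
echo "$combined"

# Clean up.
cd /
rm -r "$workdir"
```

This prints `first file` followed by `second file` on the next line.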

## Open Text File
Opening and editing a text file in a terminal is common on Unix-based machines using vi, vim, or emacs. On a Windows machine, it is more common to edit a text file using the graphical program Notepad.

* `vim <file>`: edit a file in the terminal using command and insert modes.

```
gsprint@gsprint-x1:~/Documents/L1-2$ vim afile2.txt
```

Then in vim, I added some text to afile2.txt... See [vim commands](https://www.fprintf.net/vimCheatSheet.html) to learn more about how to use vim.

## Python from Command Line
Now we are going to run the `python` command to run Python interactively and in script mode. As we checked earlier, we know python.exe is in the PATH environment variable, so we can invoke the Python interpreter from the command line.

### Interactive Mode
`python`: starts the Python interpreter in interactive mode

The command prompt should now change to ">>>". Try typing our simple hello world program from last lesson: `print("Hello World!")`. You can type `exit()` to exit the Python interpreter.

### Script Mode
At the command line, navigate to the folder that contains the hello_world.py script we created and saved in the last lesson. Enter `python hello_world.py` to run the hello_world.py script.

Check out the options the `python` command accepts by executing `python -h` (on Windows, `python /?` also works).
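Script mode can be tried end to end with the sketch below, written for a Unix-based machine. It recreates a one-line `hello_world.py` in a temporary directory rather than using the file from the last lesson, and it assumes the interpreter is installed under the name `python3` or `python` (which name is available depends on your setup).

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a one-line Python script.
echo 'print("Hello World!")' > hello_world.py
script_text=$(cat hello_world.py)

# Assumption: the interpreter is called python3 on some machines and python on others.
if command -v python3 >/dev/null 2>&1; then
    py=python3
else
    py=python
fi

# Run the script in script mode and show what it printed.
output=$("$py" hello_world.py)
echo "$output"

# Clean up.
cd /
rm -r "$workdir"
```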

## Practice Problem
Using the OS commands described above as needed, we are going to create a folder hierarchy suitable for this class. Perform the following:
1. Open the command line (e.g. terminal, powershell, etc)
1. Navigate to a folder where you would like to store files for this class (e.g. Documents, Desktop, etc.)
1. Create a folder called something like `IntroDataScience`
1. Navigate into `IntroDataScience`
1. Create the following folders:
    1. `InClassPractice`
    1. `Project`
1. Navigate into `InClassPractice`
1. Create a directory called `HelloWorld`
1. Navigate into `HelloWorld`
1. Create a new .py file called `hello_world.py`
1. Navigate back to `IntroDataScience`
1. Get a long listing of the files in `IntroDataScience`
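
Try the steps on your own first. For comparison, one possible sequence of commands on a Unix-based machine is sketched below. It uses a temporary directory as a stand-in for the folder you pick in step 2, and it creates the empty .py file with `touch`, a command not covered above; both choices are assumptions made for the example.

```shell
# Stand-in for the folder chosen in step 2 (e.g. Documents).
class_root=$(mktemp -d)
cd "$class_root"

# Create the class folder and move into it.
mkdir IntroDataScience
cd IntroDataScience

# Create the two subfolders.
mkdir InClassPractice Project

# Create HelloWorld inside InClassPractice and move into it.
cd InClassPractice
mkdir HelloWorld
cd HelloWorld

# Create an empty file named hello_world.py.
touch hello_world.py

# Go back up two levels to IntroDataScience and get a long listing.
cd ../..
ls -l
```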