## Manipulating files and directories

### Where am I?

In [None]:
!pwd

### How can I identify files and directories?

In [None]:
!ls

### How else can I identify files and directories?

In [None]:
!ls readme.md

### How can I move to another directory?

In [None]:
!cd Test

### How can I move up a directory?

In [None]:
!cd ~

In [None]:
!cd ~/../.

In [None]:
!ls ~
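The difference between .. (the parent directory) and ~ (the home directory) can be sketched as below. This is only an illustration: the Test/inner directory tree is made up and built in a temporary location. The commands are chained in one script because each notebook line that starts with ! runs in its own subshell, so a cd issued in one cell does not carry over to the next.

```shell
# Build a small, made-up directory tree in a temporary location.
start_dir=$(pwd)
workdir=$(mktemp -d)
mkdir -p "$workdir/Test/inner"

cd "$workdir/Test/inner"
cd ..                                # '..' is the parent directory
one_up=$(basename "$(pwd)")
echo "after 'cd ..' we are in: $one_up"

home_dir=$(cd ~ && pwd)              # '~' expands to the home directory
echo "'cd ~' would take us to: $home_dir"

# Clean up the temporary tree.
cd "$start_dir"
rm -rf "$workdir"
```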

### How can I copy files?

In [None]:
!cp seasonal/summer.csv backup/summer.bck

In [None]:
!cp seasonal/spring.csv seasonal/summer.csv backup

### How can I move a file?

In [None]:
!mv seasonal/spring.csv seasonal/summer.csv backup

### How can I rename files?

In [None]:
!mv winter.csv winter.csv.bck

### How can I delete files?

In [None]:
!rm seasonal/summer.csv

### How can I create and delete directories?

In [None]:
!rmdir people

In [None]:
!mkdir yearly

In [None]:
!mkdir yearly/2017

## Manipulating data


### How can I view a file's contents?

In [None]:
!cat course.txt

### How can I view a file's contents piece by piece?
If you give less the names of several files, you can type :n (colon and a lower-case 'n') to move to the next file, :p to go back to the previous one, or :q to quit.

In [None]:
!less seasonal/spring.csv seasonal/summer.csv

### How can I look at the start of a file?

In [None]:
!head seasonal/summer.csv

### How can I control what commands do?

In [None]:
!head -n 5 seasonal/winter.csv

### How can I list everything below a directory?

In [None]:
!ls -R -F

### How can I get help for a command?

In [None]:
!man tail

### How can I select columns from a file?

In [None]:
!cut -d , -f 1 seasonal/spring.csv

In [None]:
!cut -d, -f1 seasonal/spring.csv

### How can I repeat commands?
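Run history to print a numbered list of recent commands. To re-run one, type ! followed by its number (for example, !55). Typing ! followed by a command name (for example, !head) re-runs the most recent use of that command.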

In [None]:
!history

### How can I select lines containing specific values?
grep selects the lines that match a pattern. Some of its more common flags:

-c: print a count of matching lines rather than the lines themselves

-h: do not print the names of files when searching multiple files

-i: ignore case (e.g., treat "Regression" and "regression" as matches)

-l: print the names of files that contain matches, not the matches

-n: print line numbers for matching lines

-v: invert the match, i.e., only show lines that don't match

In [None]:
!grep molar seasonal/autumn.csv

In [None]:
!grep -v -n molar seasonal/autumn.csv

In [None]:
!grep -c incisor seasonal/autumn.csv seasonal/winter.csv
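A minimal sketch of how a few of these flags change grep's output. The file contents are invented for illustration and follow the Date,Tooth layout assumed for the seasonal files.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up sample data: a header plus three records.
printf 'Date,Tooth\n2017-01-11,canine\n2017-02-03,molar\n2017-03-08,Molar\n' > autumn.csv

count_exact=$(grep -c molar autumn.csv)        # count lines containing "molar"
count_nocase=$(grep -c -i molar autumn.csv)    # -i also counts "Molar"
non_matching=$(grep -v -n molar autumn.csv)    # numbered lines without "molar"

echo "exact matches:            $count_exact"
echo "case-insensitive matches: $count_nocase"
echo "lines that do not match:"
echo "$non_matching"

cd "$start_dir"
rm -rf "$workdir"
```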

## Combining tools

### How can I store a command's output in a file?

In [None]:
!tail -n 5 seasonal/winter.csv > last.csv

### How can I use a command's output as an input?

In [None]:
!head -n 1 last.csv

### What's a better way to combine commands?
Use cut to select all of the tooth names from column 2 of the comma delimited file seasonal/summer.csv, then pipe the result to grep, with an inverted match, to exclude the header line containing the word "Tooth".

In [None]:
!cut -d , -f 2 seasonal/summer.csv | grep -v Tooth

### How can I combine many commands?

In [None]:
!cut -d , -f 2 seasonal/summer.csv | grep -v Tooth | head -n 1
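The pipelines above can be tried on a small stand-in file. This is a sketch with invented records, not the real summer.csv: each stage reads the previous stage's output, so adding head -n 1 keeps only the first tooth name.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up sample data in the Date,Tooth layout.
printf 'Date,Tooth\n2017-01-11,canine\n2017-01-18,wisdom\n2017-02-01,molar\n' > summer.csv

# Two stages: select column 2, then drop the header line.
tooth_names=$(cut -d , -f 2 summer.csv | grep -v Tooth)

# Three stages: the same pipeline, keeping only its first line.
first_tooth=$(cut -d , -f 2 summer.csv | grep -v Tooth | head -n 1)

echo "all tooth names:"
echo "$tooth_names"
echo "first tooth name: $first_tooth"

cd "$start_dir"
rm -rf "$workdir"
```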

### How can I count the records in a file?
Count how many records in seasonal/spring.csv have dates in July 2017. To do this, use grep with a partial date to select the lines and pipe this result into wc with an appropriate flag to count the lines.

In [None]:
!grep 2017-07 seasonal/spring.csv | wc -l

### How can I specify many files at once?
Write a single command using head to get the first three lines from both seasonal/spring.csv and seasonal/summer.csv, a total of six lines of data, but not from the autumn or winter data files. Use a wildcard instead of spelling out the files' names in full.

In [None]:
!head -n 3 seasonal/s*
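To see what a wildcard matches before handing it to a command, it can help to echo it first. The sketch below builds four placeholder files (their contents are invented) and shows that seasonal/s* picks up only spring and summer.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Four placeholder files with made-up contents.
mkdir seasonal
for season in spring summer autumn winter; do
    printf 'Date,Tooth\n2017-01-01,canine\n' > "seasonal/$season.csv"
done

# The shell expands the wildcard before the command runs.
matched=$(echo seasonal/s*)
echo "seasonal/s* expands to: $matched"

# head therefore receives the two matching filenames as arguments.
head -n 3 seasonal/s*

cd "$start_dir"
rm -rf "$workdir"
```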

### How can I sort lines of text?
By default, sort arranges lines in ascending alphabetical order. The flags -n and -r sort numerically and reverse the order of the output, -b ignores leading blanks, and -f folds case (i.e., is case-insensitive).

Sort the names of the teeth in seasonal/winter.csv (not summer.csv) in descending alphabetical order by extending the cut and grep pipeline used above with a sort step.

In [None]:
!cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort -r
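A small sketch of how the -n and -r flags change the result, using three made-up numbers. Sorted as text, 100 comes before 9 because the comparison goes character by character; -n compares the values as numbers instead.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

printf '10\n9\n100\n' > numbers.txt

# paste -s joins the sorted lines onto one line for compact display.
alphabetical=$(sort numbers.txt | paste -s -d ' ' -)
numeric=$(sort -n numbers.txt | paste -s -d ' ' -)
numeric_desc=$(sort -n -r numbers.txt | paste -s -d ' ' -)

echo "default (text) order: $alphabetical"
echo "numeric order (-n):   $numeric"
echo "reversed (-n -r):     $numeric_desc"

cd "$start_dir"
rm -rf "$workdir"
```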

### How can I remove duplicate lines?
Write a pipeline to:

get the second column from seasonal/winter.csv;

remove the word "Tooth" from the output so that only tooth names are displayed;

sort the output so that all occurrences of a particular tooth name are adjacent; and

display each tooth name once along with a count of how often it occurs.

In [None]:
!cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort | uniq -c
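The sort step matters because uniq only collapses duplicate lines that are adjacent. The sketch below, run on invented records, compares the pipeline with and without sort.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up data in which "molar" appears twice, but not on adjacent lines.
printf 'Date,Tooth\n2017-01-03,molar\n2017-01-05,canine\n2017-01-09,molar\n' > winter.csv

# Without sort, the two "molar" lines are counted separately.
unsorted_counts=$(cut -d , -f 2 winter.csv | grep -v Tooth | uniq -c)

# With sort, duplicates become adjacent and are counted together.
sorted_counts=$(cut -d , -f 2 winter.csv | grep -v Tooth | sort | uniq -c)

echo "without sort:"
echo "$unsorted_counts"
echo "with sort:"
echo "$sorted_counts"

cd "$start_dir"
rm -rf "$workdir"
```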

### How can I save the output of a pipe?

In [None]:
!> result.txt head -n 3 seasonal/winter.csv

### How can I stop a running program?
If you decide that you don't want a program to keep running, you can type Ctrl + C to end it.

In [None]:
!^C

### Wrapping up
Use wc with appropriate parameters to list the number of lines in all of the seasonal data files. (Use a wildcard for the filenames instead of typing them all in by hand.)

Add another command to the previous one using a pipe to remove the line containing the word "total".

Add two more stages to the pipeline that use sort -n and head -n 1 to find the file containing the fewest lines.

In [None]:
!wc -l seasonal/* | grep -v total | sort -n | head -n 1
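The same pipeline can be checked against files whose lengths are known in advance. In this sketch the three files and their records are invented, and an extra awk step (not part of the exercise) pulls the filename out of the winning line.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up files with 3, 2, and 4 lines respectively.
mkdir seasonal
printf 'Date,Tooth\n2017-09-01,molar\n2017-09-02,canine\n' > seasonal/autumn.csv
printf 'Date,Tooth\n2017-03-01,incisor\n' > seasonal/spring.csv
printf 'Date,Tooth\n2017-12-01,molar\n2017-12-02,wisdom\n2017-12-03,canine\n' > seasonal/winter.csv

# Count lines per file, drop the "total" row, sort by count, keep the smallest.
shortest=$(wc -l seasonal/* | grep -v total | sort -n | head -n 1)

# Illustrative extra step: extract just the filename (second field).
shortest_file=$(echo "$shortest" | awk '{print $2}')
echo "file with the fewest lines: $shortest_file"

cd "$start_dir"
rm -rf "$workdir"
```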

## Batch processing

### How does the shell store information?

In [None]:
!set | grep HISTFILESIZE

### How can I print a variable's value?


In [None]:
!echo $OSTYPE

### How else does the shell store information?

In [None]:
!testing=seasonal/winter.csv; head -n 1 $testing
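A minimal sketch of creating and using a shell variable, run against a made-up winter.csv. The assignment has no spaces around =, and the value is read back by putting $ in front of the name. The last two lines illustrate why the assignment and its use need to happen in the same shell: a variable set in a child shell (which is what each ! line in a notebook starts) is not visible afterwards.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up stand-in for the real data file.
mkdir seasonal
printf 'Date,Tooth\n2017-01-03,bicuspid\n' > seasonal/winter.csv

testing=seasonal/winter.csv            # no spaces around '='
first_line=$(head -n 1 "$testing")     # "$testing" is replaced by its value
echo "testing holds:  $testing"
echo "its first line: $first_line"

# A variable assigned in a child shell does not exist in the parent shell.
sh -c 'child_only=hello'
echo "child_only in the parent shell: '${child_only:-}'"

cd "$start_dir"
rm -rf "$workdir"
```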

### How can I repeat a command many times?
The structure is for ...variable... in ...list... ; do ...body... ; done, which has three parts:

The list of things the loop is to process (in our case, the words gif, jpg, and png).

The variable that keeps track of which thing the loop is currently processing (in our case, filetype).

The body of the loop that does the processing (in our case, echo $filetype).

In [None]:
!for filetype in gif jpg png; do echo $filetype; done
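The three parts can be seen at work in the sketch below, which writes the same loop across several lines and adds a counter (the counter is an addition for illustration only). The body runs once per word in the list, and the loop variable keeps its last value after the loop ends.

```shell
count=0
for filetype in gif jpg png      # variable: filetype, list: gif jpg png
do
    echo "processing: $filetype" # body: runs once for each item in the list
    count=$((count + 1))
done

echo "the body ran $count times"
echo "filetype still holds its last value: $filetype"
```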

### How can I repeat a command once for each file?

In [None]:
!for file in people/*; do echo $file; done

### How can I record the names of a set of files?

In [None]:
!files=seasonal/*.csv; for f in $files; do echo $f; done

### A variable's name versus its value

In [None]:
!files=seasonal/*.csv
!for f in files; do echo $f; done
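The effect of leaving out the $ can be sketched with two made-up files. Without the $, the loop runs once over the literal word files; with it, the shell substitutes the variable's value and then expands the wildcard.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Two placeholder files with invented contents.
mkdir seasonal
printf 'Date,Tooth\n' > seasonal/spring.csv
printf 'Date,Tooth\n' > seasonal/summer.csv

files=seasonal/*.csv

# Name only: the list contains the single word "files".
without_dollar=$(for f in files; do echo "$f"; done)

# Value: $files is replaced by seasonal/*.csv, which then expands to the filenames.
with_dollar=$(for f in $files; do echo "$f"; done)

echo "without \$: $without_dollar"
echo "with \$:"
echo "$with_dollar"

cd "$start_dir"
rm -rf "$workdir"
```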

### How can I run many commands in a single loop?
Write a loop that produces the same output as

grep -h 2017-07 seasonal/*.csv

In [None]:
!for file in seasonal/*.csv; do grep -h 2017-07 $file; done

## Creating new tools
Nano is a simple text editor. A few of its control-key combinations:

Ctrl + K: delete a line.

Ctrl + U: un-delete a line.

Ctrl + O: save the file ('O' stands for 'output').

Ctrl + X: exit the editor.

In [None]:
!nano names.txt

### How can I record what I just did?
Copy the files seasonal/spring.csv and seasonal/summer.csv to your home directory.

Use grep with the -h flag (to stop it from printing filenames) and -v Tooth (to select lines that don't match the header line) to select the data records from spring.csv and summer.csv in that order and redirect the output to temp.csv.

Pipe history into tail -n 3 and redirect the output to steps.txt to save the last three commands in a file. (You need to save three instead of just two because the history command itself will be in the list.)

In [None]:
!cp seasonal/s* ~

In [None]:
!grep -h -v Tooth s*.csv > temp.csv

In [None]:
!history | tail -n 3 > steps.txt

### How can I save commands to re-run later?
Use nano dates.sh to create a file called dates.sh that contains this command:

cut -d , -f 1 seasonal/*.csv

Use bash to run the file dates.sh.

In [None]:
!nano dates.sh

In [None]:
!bash dates.sh
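Nano is interactive, so it is awkward to drive from a notebook cell. One alternative, sketched here as an assumption rather than the course's method, is to write the script with a here-document and then run it with bash as before. The two data files are invented.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up stand-ins for the seasonal data files.
mkdir seasonal
printf 'Date,Tooth\n2017-03-01,canine\n' > seasonal/spring.csv
printf 'Date,Tooth\n2017-06-11,molar\n' > seasonal/summer.csv

# Write dates.sh without an editor; quoting 'EOF' keeps the text literal.
cat > dates.sh <<'EOF'
cut -d , -f 1 seasonal/*.csv
EOF

# Run the saved command.
dates=$(bash dates.sh)
echo "$dates"

cd "$start_dir"
rm -rf "$workdir"
```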

### How can I re-use pipes?

In [None]:
!nano teeth.sh

In [None]:
!bash teeth.sh

In [None]:
!cat teeth.out

### How can I pass filenames to scripts?
Edit the script count-records.sh with Nano so that it runs on all of the filenames passed to it (inside a script, $@ stands for all of the command-line arguments).

Run count-records.sh on seasonal/*.csv and redirect the output to num-records.out using >.

In [None]:
!nano count-records.sh 

In [None]:
!bash count-records.sh seasonal/*.csv > num-records.out

### How can I process a single argument?
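get-field.sh takes a filename, a line number, and a field number as its three arguments. It contains:

In [None]:
# $1 = filename, $2 = line number, $3 = field number
head -n "$2" "$1" | tail -n 1 | cut -d , -f "$3"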

In [None]:
!bash get-field.sh seasonal/summer.csv 4 2

This should select the second field from line 4 of seasonal/summer.csv.
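A sketch of the whole round trip on invented data: the script is written with a here-document, then called with a filename, a line number, and a field number, which it receives as $1, $2, and $3.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up stand-in for seasonal/summer.csv (line 4 is the molar record).
mkdir seasonal
printf 'Date,Tooth\n2017-01-11,canine\n2017-01-18,wisdom\n2017-02-01,molar\n' > seasonal/summer.csv

cat > get-field.sh <<'EOF'
# $1 = filename, $2 = line number, $3 = field number
head -n "$2" "$1" | tail -n 1 | cut -d , -f "$3"
EOF

# Select field 2 of line 4.
field=$(bash get-field.sh seasonal/summer.csv 4 2)
echo "line 4, field 2: $field"

cd "$start_dir"
rm -rf "$workdir"
```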

### How can I write loops in a shell script?

In [None]:
# Print the first and last data records of each file.
# Quoting "$@" and "$filename" keeps filenames that contain spaces intact.
for filename in "$@"
do
    head -n 2 "$filename" | tail -n 1
    tail -n 1 "$filename"
done
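The loop can be tried by saving it to a script and running it on a couple of files. In this sketch the script name first-last.sh and both data files are hypothetical. One filename contains a space to show why the arguments are quoted.

```shell
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir"

# Made-up data files; one name contains a space on purpose.
mkdir seasonal
printf 'Date,Tooth\n2017-01-11,canine\n2017-01-18,wisdom\n2017-02-01,molar\n' > seasonal/summer.csv
printf 'Date,Tooth\n2017-09-05,incisor\n2017-10-02,bicuspid\n' > 'seasonal/late autumn.csv'

# Hypothetical script name; the body is the loop from this section.
cat > first-last.sh <<'EOF'
# Print the first and last data records of each file.
for filename in "$@"
do
    head -n 2 "$filename" | tail -n 1
    tail -n 1 "$filename"
done
EOF

# Each file contributes two lines: its first and its last data record.
records=$(bash first-last.sh seasonal/*.csv)
echo "$records"

cd "$start_dir"
rm -rf "$workdir"
```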