# Bash
## Streams, Redirection, and Control Structures

## Warm Up
- Write a simple bash script that takes in a file name as an argument, and does the following:
    - Sorts that file, and outputs the results to the screen
    - Pastes that file side by side with another file whose name is the same but with all o's replaced with e's, and outputs the result to the screen


In [109]:
./src/shell/demo1.sh data/noodles


Gnochi
Penne
Ramen
Rice
Soba
Ramen	Sharp
Rice	Embroidery
Penne	Beading
Gnochi	Doll
Soba	Tapestry
	Leather


## Streams
- STDIN
- STDOUT
- STDERR

## Output Redirection
- The greater than symbol (**>**) is used to redirect output
    - With no additional symbols, this redirects STDOUT to the specified location
    - **1>** also redirects STDOUT to the specified location, but this form is not normally used
    - **2>** redirects STDERR to the specified location
    - **&>** redirects both STDOUT and STDERR to the same specified location
    - **>>** appends STDOUT to the specified file
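
The **1>** and **&>** forms can be sketched as follows. This is a minimal illustration, not part of the original demo: the `noisy` function and the scratch directory are made up for this example, and `>&2` is used inside the helper only to produce something on STDERR.

```bash
# Hypothetical helper: writes one line to STDOUT and one line to STDERR
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Work in a scratch directory so no real files are touched
tmpdir=$(mktemp -d)

# &> sends both streams to the same file
noisy &> "$tmpdir/both.txt"

# 1> is the explicit form of >, so only STDOUT lands in the file
noisy 1> "$tmpdir/out_only.txt" 2> /dev/null

both_lines=$(wc -l < "$tmpdir/both.txt")
out_lines=$(wc -l < "$tmpdir/out_only.txt")
echo "lines in both.txt: $both_lines"
echo "lines in out_only.txt: $out_lines"

rm -r "$tmpdir"
```

The first file should hold both lines, the second only the STDOUT line.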
    

In [110]:
echo "Hello" > data/hello.txt

In [111]:
more data/hello.txt

Hello


In [112]:
echo "World" >> data/hello.txt

In [113]:
more data/hello.txt

Hello
World


In [114]:
gcc no_file.c

gcc: error: no_file.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.


: 1

In [115]:
gcc no_file.c 2> data/gcc_errors.txt

: 1

In [116]:
more data/gcc_errors.txt

gcc: error: no_file.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.


In [117]:
more src/python/out_and_err.py

#!/usr/bin/python
from __future__ import print_function
import sys

def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

print("I am on STDOUT")
eprint("I am on STDERR")


In [118]:
./src/python/out_and_err.py > out 2> err

In [119]:
more out

I am on STDOUT


In [120]:
more err

I am on STDERR


## /dev/null
- Unix has a special device that allows streams to be redirected to it but doesn't save any of the redirected text
- By redirecting to **/dev/null** you are throwing away that stream
    - Can be very useful to ignore errors, but many commands have a quiet option built in

In [121]:
gcc no_file 2>/dev/null

: 1

## Input Redirection
- The less than symbol (**<**) is used to redirect a file to STDIN
    - There are fewer variations of this operator than of output redirection
    - Two less than symbols (**<<**) are used to create a here document, which has its own slide

In [122]:
more src/python/simple.py

#!/usr/bin/python
number = int(raw_input("Please enter a number: "))
print "The square of %d is %d" % (number, number * number)


In [127]:
./src/python/simple.py < data/numbers.txt 

Please enter a number: The square of 40 is 1600


## Here Documents
- A here document takes any string and allows it to be passed to a command as if it were coming from STDIN
    - For commands that take multiple arguments, you may see the dash (**-**) being used to explicitly indicate which argument should use STDIN
    - The **<<** must be followed by a delimiter, which is then used to mark the end of the here document
    - Using **<<-** will remove leading tabs, which can be useful for formatting nice looking scripts

## Here Strings
- If all you want to redirect is a single line, you can use three less than symbols (**<<<**) with no delimiter to indicate a here string
    - Any variables in a here string (or here document) are expanded before being redirected
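
Variable expansion in here documents and here strings can be sketched as below. The variable `name` is made up for illustration. The quoted-delimiter form (`<<'EOF'`) is not covered by the bullets above; it is included as a contrast because it turns expansion off.

```bash
name="World"   # hypothetical variable used only for this illustration

# Unquoted delimiter: $name is expanded before the text reaches cat
expanded=$(cat <<EOF
Hello, $name
EOF
)

# Quoted delimiter: the text is passed through literally
literal=$(cat <<'EOF'
Hello, $name
EOF
)

# A here string is expanded in the same way as an unquoted here document
from_here_string=$(cat <<< "Hello, $name")

echo "$expanded"
echo "$literal"
echo "$from_here_string"
```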

In [128]:
more data/numbers.txt

40
1
2
3


In [129]:
diff - data/numbers.txt <<EOF
40
1
2
3
EOF

In [130]:
diff - data/numbers.txt <<< "Hello"

1c1,4
< Hello
---
> 40
> 1
> 2
> 3


: 1

## Pipes
- Often the output of one command will function as the input to a second command
- Rather than redirecting output to a temporary file and then using that file as input, use the pipe operator (**|**)
    - The STDERR stream can be redirected *along with* the STDOUT stream using **|&**
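
The difference between **|** and **|&** can be seen by counting the lines that reach the second command. This is a small sketch with a made-up helper function; note that **|&** is shorthand for `2>&1 |` and needs bash 4 or later.

```bash
# Hypothetical helper: one line on STDOUT, one line on STDERR
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# A plain pipe carries only STDOUT; STDERR is discarded here to keep the screen clean
stdout_only=$(noisy 2>/dev/null | wc -l)

# |& carries STDERR through the pipe as well
both_streams=$(noisy |& wc -l)

echo "lines through |:  $stdout_only"
echo "lines through |&: $both_streams"
```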

In [131]:
ls -lh | wc -l

20


In [132]:
find ~/ -size +100M 2>/dev/null | head

/home/bryan/Utils/unix-privesc-check/files_cache.27006
/home/bryan/Utils/unix-privesc-check/files_cache.11354
/home/bryan/Utils/unix-privesc-check/files_cache.489
/home/bryan/Utils/unix-privesc-check/files_cache.21443
/home/bryan/Utils/unix-privesc-check/files_cache.16819
/home/bryan/Utils/unix-privesc-check/files_cache.9765
/home/bryan/Utils/unix-privesc-check/files_cache.14036
/home/bryan/Documents/StabilizedExample2.mp4
/home/bryan/Documents/June91-1.mp4
/home/bryan/Documents/untitled.mp4


## Redirection and Pipe Practice

- Combine the `find` and `sort` commands to produce a sorted list of all files over 10M in a directory. Redirect the output to a file called big_files.txt

## Tee
- The `tee` command takes in a stream as input, and outputs that stream both to STDOUT and to the specified file
    - Used following a pipe operator
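
A self-contained sketch of `tee`, using a scratch directory and made-up input rather than a real install log. The `-a` option, which appends instead of overwriting, is shown as an extra and is not mentioned in the bullets above.

```bash
# Scratch directory so no real files are touched
tmpdir=$(mktemp -d)

# tee writes its input to the file and also passes it on to STDOUT
shown=$(printf '%s\n' one two | tee "$tmpdir/copy.log")
echo "$shown"

# -a appends to the file instead of overwriting it
printf '%s\n' three | tee -a "$tmpdir/copy.log" > /dev/null

saved=$(cat "$tmpdir/copy.log")
echo "$saved"

rm -r "$tmpdir"
```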

In [133]:
pip2 install -U scipy |& tee scipy.log

Collecting scipy
  Using cached scipy-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
Collecting numpy>=1.8.2 (from scipy)
  Using cached numpy-1.14.0-cp36-cp36m-manylinux1_x86_64.whl
Installing collected packages: numpy, scipy
Successfully installed numpy-1.14.0 scipy-1.0.0


In [134]:
more scipy.log

Collecting scipy
  Using cached scipy-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
Collecting numpy>=1.8.2 (from scipy)
  Using cached numpy-1.14.0-cp36-cp36m-manylinux1_x86_64.whl
Installing collected packages: numpy, scipy
Successfully installed numpy-1.14.0 scipy-1.0.0


## Redirecting From Multiple Commands
- Sometimes you may need to combine the output of multiple commands and pass this on to a third or fourth command
- You could use temporary files, but process substitution fills this need nicely
- The syntax for process substitution is **<(_command_)**
    - This relies on certain operating system features, so it isn't truly portable, but in practice it can be assumed to be available
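
Process substitution can be tried without any data files by generating the input inline. In this sketch the input values are made up, and `printf` stands in for whatever commands would normally produce the two streams.

```bash
# paste joins the output of two commands column by column (tab separated)
pasted=$(paste <(printf '%s\n' 1 2 3) <(printf '%s\n' a b c))
echo "$pasted"

# sort -m merges two already sorted streams into one sorted stream
merged=$(sort -m <(printf '%s\n' apple fig pear) <(printf '%s\n' banana kiwi))
echo "$merged"
```

Each `<(...)` is replaced by a file name that the outer command reads from, which is why it works with commands that expect file arguments.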

In [135]:
diff <(ls -lh .) <(ls -lh ~/Teaching/CMSC433)

In [136]:
head -n1 data/part1.tsv

head: cannot open 'data/part1.tsv' for reading: No such file or directory


: 1

In [None]:
head -n1 data/part2.csv

In [None]:
paste <(cut -f2 data/part1.tsv) <(cut -f2 data/part2.csv -d,)

## Process Substitution Practice
- Use process substitution to shuffle two files, concatenate them together, and shuffle the final results
    - data/numbers.txt - The list of numbers from before
    - data/letters.txt - A list of the letters of the alphabet, one per line

## xargs
- Theoretically, you could pass the `rm` command a long list of directories to delete
    - When this list of arguments becomes too long, the call may fail, because the operating system limits the total size of a command's arguments
    - It is better to call `rm` on each of the directories in turn
- `xargs` allows us to process a string, determine what the arguments are and how to split them up, and decide how many times to call a command
    - Very useful for calling a command on the output of `find`
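
How `xargs` splits its input and pairs with `find` can be sketched as below. The file names and scratch directory are made up. The `-print0` and `-0` options are extensions found in GNU and BSD tools rather than something the bullets above mention.

```bash
# -n 2 tells xargs to pass at most two arguments per call, so echo runs three times
grouped=$(echo 1 2 3 4 5 | xargs -n 2 echo)
echo "$grouped"

# Scratch directory with made-up file names, one of which contains a space
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b c.txt"

# -print0 and -0 separate names with a NUL byte, so the space does not split the name
count=$(find "$tmpdir" -type f -print0 | xargs -0 -n 1 basename | wc -l)
echo "files processed: $count"

rm -r "$tmpdir"
```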

In [None]:
echo 1 2 3 4 | xargs ls

In [None]:
ls *.ipynb | xargs file

In [None]:
ls img/*.png | xargs -I{} convert {} {}.jpg

In [None]:
rm img/*.jpg
ls img/*.png > pngs
more pngs
xargs -IFILE convert FILE FILE.jpg < pngs
ls img/*.jpg

## If-Then-Else
- The `if` block must end with `fi`
- The `then` keyword is required in bash
    - For both `elif` and `if`
    - Must be on a different line or follow on the same line after a semicolon
```bash
if CONDITIONAL; then
#CODE
elif CONDITIONAL; then
#CODE
else
#CODE
fi
```

- Equivalently, `then` can be placed on its own line instead of after a semicolon
```bash
if CONDITIONAL
then
#CODE
elif CONDITIONAL
then
#CODE
else
#CODE
fi
```
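
A runnable instance of the template might look like the sketch below. The function name `classify` is made up for illustration, and the numeric comparison operators it uses are the ones listed under Conditionals on Numbers.

```bash
# Hypothetical function: classify a number using if / elif / else
classify() {
    if [ "$1" -lt 0 ]; then
        echo "negative"
    elif [ "$1" -eq 0 ]; then
        echo "zero"
    else
        echo "positive"
    fi
}

classify -5
classify 0
classify 12
```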

## Conditional Expression in Bash
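- Conditional expressions in bash are evaluated
    - Using the `test` command
    - Using the `[` command (a synonym for `test` that requires a closing `]`)
    - Using the `[[` syntax
- Results are returned as an exit status (0 for true, non-zero for false)
    - These forms are normally used as the condition of an `if` or a loop rather than on their own
- Whitespace is very important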
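
Although these forms are rarely run on their own, doing so shows the exit status directly. A minimal sketch, with made-up variable names:

```bash
# All three forms evaluate the same comparison and report it through the exit status
test 3 -gt 2;  status_test=$?
[ 3 -gt 2 ];   status_bracket=$?
[[ 3 -gt 2 ]]; status_double=$?

# A false comparison gives a non-zero exit status, captured here after ||
[ 3 -gt 5 ] || status_false=$?

echo "test: $status_test, [: $status_bracket, [[: $status_double, false case: $status_false"

# Whitespace matters: writing [3 -gt 2] would make bash look for a command named "[3"
```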

## [ and test vs [[
- `[` and `test` are commands
- `[[` is part of bash syntax
    - Allows for easier composition of conditionals using `&&` and `||`
    - Parentheses don't have to be escaped
    - Can do pattern matching and regular expressions as a conditional
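
The pattern matching and regular expression features of `[[` can be sketched as follows. The file name is a made-up value, and storing the regular expression in a variable (`re`) is a common convention rather than a requirement.

```bash
filename="report_2024.csv"   # made-up value for illustration
year=""

# == with an unquoted right-hand side performs glob-style pattern matching
if [[ $filename == *.csv ]]; then
    kind="csv"
else
    kind="other"
fi

# =~ matches against an extended regular expression;
# captured groups are stored in the BASH_REMATCH array
re='_([0-9]{4})\.'
if [[ $filename =~ $re ]]; then
    year=${BASH_REMATCH[1]}
fi

# && combines conditions inside a single [[ ]]
if [[ $kind == csv && -n $year ]]; then
    echo "CSV file from $year"
fi
```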

## Conditional Operators
- Bash has three types of conditional operators
    - numeric operators
    - string operators
    - file operators
- You can always negate a comparison by using `!` in front of it

## Conditionals on Numbers
- Equal: -eq
- Not Equal: -ne
- Greater Than: -gt
- Greater Than or Equal: -ge
- Less Than: -lt
- Less Than or Equal: -le

In [None]:
if [ 1 -eq 7 ]; then
echo "What math are you doing?"
else
echo "One is not equal to 7"
fi

In [None]:
if [ 1 -ne 7 ]; then
echo "One is not equal to 7" 
else
echo "What math are you doing?"
fi

In [None]:
if [ ! 1 -eq 7 ]; then
echo "One is not equal to 7"
else
echo "What math are you doing?"
fi

In [None]:
a=1
b=2
if [ $a -lt $b ]; then
echo "$a is smaller than $b"
else
echo "$b is smaller than $a"
fi

In [None]:
a=1
b=2
if [[ $a -lt $b && $b -gt $a ]]; then
echo "$a is smaller than $b"
else
echo "$b is smaller than $a"
fi

## Conditionals on Strings
- Equal: =
- Not Equal: !=
- Is Empty: -z
- Is Not Empty: -n

In [None]:
string1="A string"
string2="Another string"
string3=
if [[ $string1 = $string1 ]]; then
echo "The strings are the same"
fi

In [None]:
if [[ -z $string3 ]]; then
echo "The string is empty"
fi

In [None]:
if [[ -n $string2 ]]; then
echo "The string is not empty"
fi

## Conditionals on Files
- There are about 20 different tests that can be performed on a file
    - `man test` shows them all
- Some common ones are:
    - Existence: -e
    - Is a file: -f
    - Is a directory: -d 
    - Is readable/writable/executable: -r/-w/-x
    - Isn't empty: -s

In [None]:
more data/a_missing_file

In [None]:

# Test the same path that is written to and displayed
if [[ ! -e 'data/a_missing_file' ]]; then
echo "Let's make a file" > data/a_missing_file
fi

more data/a_missing_file

In [None]:
touch an_empty_file
if [[ -e 'an_empty_file' ]]; then
echo "An empty file exists"
fi
if [[ -s 'an_empty_file' ]]; then
echo "The file isn't empty"
fi

In [None]:
if [ -f . ]; then
echo "This directory isn't a file...something is messed up"
else
echo "All is right in the world"
fi

## If Statement Practice
- Write a simple bash script that prints "Be Careful" if the argument passed to it is
    - A file and
    - Writable and
    - Not empty

## Switch Statements
- Switch statements start with the keyword `case` and end with the keyword `esac`
- Each clause is a pattern to match the expression against
    - The pattern in a clause ends with a right parenthesis **)**
    - A clause must end with two semicolons (**;;**)

In [None]:
expression="This is a String"

case $expression in
    0)
        echo "The variable is 0"
        ;;
    *ing)
        echo "The variable ends in ing"
        ;;
    *String)
        echo "The variable ends in String"
        ;;
    *)
        echo "This is the default"
        ;;
esac

## For Loops
- Bash has traditionally used a foreach style loop (similar to Python)
- Can loop over any type of array
    - Can also loop over files
- Both the foreach style loop and the C-style loop have the general syntax of
```bash
for EXPRESSION(S); do
# CODE_GOES_HERE
done
```

## Foreach Style Loop
- The foreach style loop uses the setup of 
```bash
for variable in list; do
```
- list can be
    - a space separated list
    - an expanded array
    - a shell glob pattern (filename expansion)
    - the output of a command


In [None]:
for x in 1 2 3; do
    echo $x;
done

In [None]:
my_array=(1 2 3)
for y in "${my_array[@]}"; do
    echo $y
done

In [None]:
for f in *.ipynb; do
    wc -l "$f"
done
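
The remaining kind of list, the output of a command, can be sketched as below, along with a look at why quoting an expanded array matters. All values here are made up for illustration.

```bash
# The output of a command: each whitespace separated word becomes one item
words=""
for w in $(printf '%s\n' soba ramen penne | sort); do
    words="$words$w "
done
echo "sorted: $words"

# An array whose second element contains a space (made-up values)
pastas=("Penne" "Angel Hair")

quoted=0
for p in "${pastas[@]}"; do      # quoted: one iteration per element
    quoted=$((quoted + 1))
done

unquoted=0
for p in ${pastas[@]}; do        # unquoted: "Angel Hair" is split into two words
    unquoted=$((unquoted + 1))
done

echo "quoted: $quoted, unquoted: $unquoted"
```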

## For Loop Practice
- Write a for loop that finds the most common line in each file in the data directory
    - Hint: use head to find **most** common

## C-Style Loop
- The C-style loop is supported by bash, but not by all shells
- The syntax for the C-style loop is:
```bash
for (( START ; END ; CHANGE)); do
```
- The variable isn't prefixed with the dollar sign (**$**) inside the loop definition 

In [None]:
for ((x = 1; x < 4; x++)); do
    echo $x
done

In [None]:
for ((x = 1; x < 4; x += 2)); do
    echo $x
done

## seq Command
- There are many other ways to get the effect of a C-style loop while using the traditional syntax
- One option is the `seq` command, which returns a list of numbers 
- The syntax of the `seq` command is
```bash
seq START INCREASE? END
```


In [None]:
for i in $(seq 1 3); do
    echo $i
done

In [None]:
for i in $(seq 0 2 10); do
    echo $i
done

## Brace Expansion
- Another feature of bash that is often, but not exclusively, used with loops is brace expansion
- Bash will expand a brace expression into a list of words
- Braces can take two forms:
```bash
{A_LIST,OF,OPTIONS}
```
or

```bash
{START..END}
```
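
Both forms can be tried with `echo` alone, with no files required. A short sketch (the letter range is an extra illustration of the `{START..END}` form):

```bash
# A comma separated list expands to one word per option
list_form=$(echo Lecture0{1,2,3}.ipynb)
echo "$list_form"

# A range expands to every value from START to END
range_form=$(echo {1..5})
echo "$range_form"

# Ranges also work with single letters
letters=$(echo {a..e})
echo "$letters"
```

The expansion happens before the command runs, so `echo` simply receives the resulting words as separate arguments.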

In [None]:
echo Lecture0{0,1,2,3,4,5}.ipynb | xargs ls -lh | cut -f6,7,8  -d' '

In [None]:
for i in {0..5}; do
    ls -lh Lecture0$i.ipynb | cut -f6,7,8 -d' '
done

## While Loops
- While loops also use the `do` keyword after the condition
- The syntax for a while loop is
```bash
while CONDITION; do
    #CODE_HERE
done
```

In [None]:
string='Some Characters'
while [[ -n $string ]]; do
    echo ${string:0:1}
    string=${string:1}
done

## Until Loops
- The `until` loop is almost identical to the `while` loop, but runs until the condition becomes true
- The `until` condition is still placed at the top of the loop and checked before entering it
- The syntax of `until` is 
```bash
until CONDITIONAL; do
    #CODE GOES HERE
done
```

In [None]:
string='Some Characters'
until [[ -z $string ]]; do
    echo ${string:0:1}
    string=${string:1}
done