# bash
## I/O, Processes, and Math

## User Input
- User input is read with the `read` command
- The general syntax is
```bash
read [OPTIONS] variable_name
```
- Common options are:
    - `-p <text>`: Prompt the user with text before getting input
    - `-s`: Do not display the text the user types (for passwords, etc.)
    - `-t <time>`: Time out after the given number of seconds
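
To try `read` without a keyboard, input can be supplied through redirection. This is a minimal sketch, assuming a here-string stands in for what the user would type; the variable names are illustrative, and the `-r` option (not listed above) stops `read` from treating backslashes as escape characters.

```bash
# A here-string supplies the "typed" input, so this runs non-interactively
read -r first rest <<< "hello big world"

# The first word goes to the first variable; the last variable gets the remainder
echo "First word: $first"
echo "Everything else: $rest"
```

When more words are entered than there are variables, the last variable receives everything that is left over.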

In [None]:
#Example Code Can't be Run in Browser/Jupyter
echo "Enter some text:"
read text
echo "You entered $text"

In [None]:
#Example Code Can't be Run in Browser/Jupyter
read -p "Enter some more text: " more_text 
echo "Now you are telling me $more_text"

In [None]:
#Must be -sp, -ps means "s" is the argument of -p 
read -sp "Enter the secret word: " secret

#Not printing characters means that we need to 
#explicitly move to the next line
echo
echo "Was I supposed to keep $secret a secret?"

In [None]:
echo -n "Enter something quickly!: "
read -t5 user_input
if [[ -n $user_input ]]; then 
    echo "Congrats! You beat the clock"
else 
    echo
    echo "Too Slow! Better luck next time" 
fi

## Mapfile
- The `mapfile` command reads STDIN into an array, breaking it up at newlines
- Even though it reads from STDIN, it is primarily used with redirects or the output of other commands
    - Not used for user interaction
- The syntax is 
```bash
mapfile [OPTIONS] array_variable
```
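
As a rough sketch of reading another command's output into an array, the example below uses process substitution, `< <(command)`, rather than a pipe, since a pipe would run `mapfile` in a subshell and the array would be lost. The `printf` call and the array name `lines` are placeholders for illustration; `-t` removes the trailing newline from each element.

```bash
# printf stands in for any command that prints one item per line
mapfile -t lines < <(printf '%s\n' "first line" "second line" "third line")

echo "Read ${#lines[@]} lines"
echo "Second element: ${lines[1]}"
```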

In [4]:
# -t strips the trailing newline from each element
mapfile -t numbers <<HERE
1
2
3
4
5
HERE

for number in "${numbers[@]}"; do
    echo -n "$number, "
done
echo

1, 2, 3, 4, 5, 


## Reading A File with a Loop
- The `mapfile` command is generally more efficient, but it was only added in bash 4.0, so older systems may not have it
- If you want to do something more than just read the lines in, it can still be useful to use a loop
- Reading a file in a loop combines three techniques
    - A `while` loop
    - A `read` command
    - Input redirection
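
One case where the loop is worth it is doing work on each line as it is read. The sketch below sums the numbers in a file; the temporary file and its contents are made up for the demonstration.

```bash
# Create a throwaway input file so the example is self-contained
tmpfile=$(mktemp)
printf '%s\n' 40 1 2 3 > "$tmpfile"

total=0
# IFS= and -r keep each line exactly as it appears in the file
while IFS= read -r line; do
    total=$((total + line))
done < "$tmpfile"

rm -f "$tmpfile"
echo "Sum: $total"
```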

In [5]:
# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line; do
    echo "$line"
done < numbers.txt

40
1
2
3



## Formatted Output
- The `printf` command allows output to be formatted with more control than echo
- It uses a syntax similar to most formatted strings you are familiar with
    - Based on printf from C
- Newlines are not automatically added
- The variables to print are given as arguments to the `printf` command after the format string
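
A common use of this control is lining up columns. The following is a small sketch with made-up item names and counts: `%-10s` left-aligns a string in 10 characters and `%5d` right-aligns an integer in 5.

```bash
# The format string is applied once for each pair of arguments
report=$(printf "%-10s|%5d\n" apple 3 banana 12 cherry 150)
echo "$report"
```

Because the format consumes two arguments at a time, six arguments produce three rows.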

In [8]:
printf "%d is a number\n" 30
printf "%10d is a number\n" 30
printf "%010d is a number\n" 30
printf "%-10d is a number\n" 30
printf "%d is a big number\n" 10000000000
printf "%'d is a big number that is easier to read" 10000000000

30 is a number
        30 is a number
0000000030 is a number
30         is a number
10000000000 is a big number
10,000,000,000 is a big number that is easier to read

In [12]:
printf "%f is a float\n" 30
printf "%f is a float\n" 30.1345
printf "%.2f is a rounded float\n" 30.12345
printf "%'.2f is a rounded, yet big, float" 3000000000.12345

30.000000 is a float
30.134500 is a float
30.12 is a rounded float
3,000,000,000.12 is a rounded, yet big, float

In [13]:
printf "%s is a string\n" "Hello there"
# The format string is reused until every argument has been printed
printf "%s was passed as an argument\n" Hello there
printf "%3s doesn't truncate the string\n" "A long string"
printf "%.3s does truncate the string\n" "A long string"
printf "%10.3s truncates the string\
, but prints with a width of 10" "A long string"

Hello there is a string
Hello was passed as an argument
there was passed as an argument
A long string doesn't truncate the string
A l does truncate the string
       A l truncates the string, but prints with a width of 10

## Running Other Scripts
- Other scripts can be run like any other command, simply by calling them
- To get access to the variables and functions a script defines, run it in the current shell with the `source` command
    - The single dot `.` is a synonym for the `source` command
    
```bash
. lots_of_definitions
source other_definitions
```

In [14]:
more definitions.sh

#!/bin/bash
pi=3.1415
e=2.7182
zero=0.0000
alphabet=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)


In [18]:
./definitions.sh
echo $pi




In [19]:
. definitions.sh
echo ${alphabet[*]}

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z


## Process Management
- When calling other commands, it is useful to know how to control processes
- Common process control commands are
    - `COMMAND &` - executes the command in the background
    - `bg JOB_SPEC` - resumes a stopped job in the background
    - `fg JOB_SPEC` - brings a background job to the foreground
- If you are using the shell interactively
    - `jobs` lists all jobs launched from this shell
    - `ps` lists running processes
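
The commands that work inside a script can be sketched as follows; `bg` and `fg` depend on interactive job control, so they are left out. The `sleep` command is a stand-in for real work, and `kill -0` is used here only as a way to check that a process exists.

```bash
# Start a background job; $! holds the PID of the most recent one
sleep 1 &
bg_pid=$!
echo "Started background job with PID $bg_pid"

# List the jobs started by this shell
jobs

# kill -0 sends no signal; it only reports whether the process exists
if kill -0 "$bg_pid" 2>/dev/null; then
    running=yes
else
    running=no
fi
echo "Still running: $running"

# Block until the background job finishes, then record its exit status
wait "$bg_pid"
status=$?
echo "Job finished with status $status"
```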

## `ps`  Command
- When you have many processes running, it's useful to know how to query them
- By default, the `ps` command displays the PIDs of processes launched from this shell
- Common options are
    - `-A`: display all processes on the system
    - `-f`: display more information, such as who started the process
    - `-F`: display even more information
    - `-o <format>`: customize the information displayed
    - `-u <user>`: display all processes launched by user
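
As an illustration of `-o`, the sketch below asks for single columns about the current shell. It also uses `-p`, which is not listed above and selects a process by PID; `$$` is the PID of the current shell.

```bash
# A trailing "=" after a column name suppresses that column's header
shell_name=$(ps -o comm= -p $$)
parent_pid=$(ps -o ppid= -p $$)

# $parent_pid is left unquoted so the padding spaces ps adds are dropped
echo "This shell is '$shell_name', started by PID" $parent_pid
```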

In [20]:
ps

  PID TTY          TIME CMD
 8135 pts/22   00:00:00 bash
 9448 pts/22   00:00:00 ps


In [None]:
ps -f -ubryan | more

UID        PID  PPID  C STIME TTY          TIME CMD
bryan     2202     1  0 Sep07 ?        00:00:00 /lib/systemd/systemd --user
bryan     2203  2202  0 Sep07 ?        00:00:00 (sd-pam)         
bryan     2384     1  0 Sep07 ?        00:00:00 perl -MDevel::IPerl -e Devel::IP
erl::main kernel /run/user/1001/jupyter/kernel-472fef61-1aad-4a0d-8dff-c82fa8a2d
e7f.json
bryan     2458     1  0 Sep07 ?        00:00:00 perl -MDevel::IPerl -e Devel::IP
erl::main kernel /run/user/1001/jupyter/kernel-e887628b-dee5-469f-8dd4-9cb88f658
2a4.json
bryan     4435     1  0 Sep07 ?        00:01:22 perl -MDevel::IPerl -e Devel::IP
erl::main kernel /run/user/1001/jupyter/kernel-4427.json
bryan     4804     1  0 Sep07 ?        00:00:00 perl -MDevel::IPerl -e Devel::IP
erl::main kernel /run/user/1001/jupyter/kernel-4796.json
bryan     5865 20942  0 Sep18 ?        00:00:07 /usr/bin/python -m ipykernel_lau
ncher -f /run/user/1001/jupyter/kernel-9c7a518b-35a2-42a8-a939-a9855d1a4018.json
bryan     9060 25974  0 10

## Kill
- Despite its name, `kill` is a more general command than one that just ends processes
- The `kill` command can send signals to running processes
    - The signal can be sent using either its numerical value or name
        - -9 or -SIGKILL
    - To see a full list use `kill -l`
- Syntax
```bash
kill SIGNAL PID
```
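
A minimal sketch of sending a signal by name and then seeing how the process ended. The `sleep` job is a placeholder; in bash, `wait` reports 128 plus the signal number when the process was terminated by a signal.

```bash
sleep 30 &
pid=$!

# Signal names and numbers are interchangeable: -TERM is the same as -15
kill -TERM "$pid"

# Record the exit status of the terminated job
status=0
wait "$pid" || status=$?
echo "Exit status: $status"   # 128 + 15 = 143
```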

In [1]:
# Launch a random background job
htop &

[1] 9922


: 1

In [3]:
kill -15 9922

In [6]:
jobs

In [5]:
kill -9 9922

: 1

## The nohup Command
- One signal sent to processes is `SIGHUP`, which is sent when a terminal closes
    - The name comes from "hang up"
    - This will generally kill processes
- If we have a long-running background task that we want to continue after the terminal is closed, use the `nohup` command
```bash
nohup COMMAND &
```
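
In practice the output usually needs to go somewhere, since the terminal will be gone. When its output is a terminal, `nohup` writes to a file named `nohup.out`; an explicit redirect chooses the file instead. The sketch below assumes a short `sh -c` command as the "long" task and a temporary log file.

```bash
log_file=$(mktemp)

# Redirect both stdout and stderr so nothing depends on the terminal
nohup sh -c 'sleep 1; echo "long task finished"' > "$log_file" 2>&1 &

# Only needed for this demonstration; normally the terminal would be closed here
wait

result=$(cat "$log_file")
rm -f "$log_file"
echo "$result"
```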

## Command Substitution
- We've used it a few times, but formally, command substitution runs a command and substitutes its output in place
- You may encounter two forms
    - `` `command` ``
    - `$(command)`
- Always use `$(command)`
    - It is nestable
    - It is safer
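
The difference in nesting can be sketched with two standard utilities, `dirname` and `basename`. The path is only an example string; the file does not need to exist.

```bash
path="/home/bryan/CMSC433/Lecture05.html"

# $( ) nests without any escaping
parent=$(basename "$(dirname "$path")")
echo "$parent"

# The backtick form needs the inner backticks escaped
parent_bt=`basename \`dirname "$path"\``
echo "$parent_bt"
```

Both forms print `CMSC433`, but the second becomes hard to read after even one level of nesting.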

In [7]:
html_files=$(find . -name "*.html")
echo $html_files

./Lecture05.html ./Lecture01.html ./Untitled.html ./Lecture08.html ./Lecture06.html ./Lecture00.html ./Lecture02.html ./Lecture04.html ./Lecture03.html ./Lecture07.html


In [8]:
ps_out=$(ps)

In [9]:
echo ${ps_out::10}

PID TTY


In [10]:
nesting=$(echo $(ls))
echo $nesting

433Fall17 airline_tweets.tsv a_missing_file anchored.pl an_empty_file a.out bad_alt.pl capture.pl cla_examples.sh definitions.sh err fast.pl fb_messenger.png fb_messenger.png.jpg fb_verify.png fb_verify.png.jpg gcc_errors.txt good_alt.pl greedy.pl hello.sh hello_simple.sh hello.txt Lecture00.html Lecture00.ipynb Lecture01.html Lecture01.ipynb Lecture02.html Lecture02.ipynb Lecture03.html Lecture03.ipynb Lecture04.html Lecture04.ipynb Lecture05.html Lecture05.ipynb Lecture06.html Lecture06.ipynb Lecture07.html Lecture07.ipynb Lecture08.html Lecture08.ipynb noncapture.pl nongreedy.pl numbersaa numbersab numbersac numbersad numbersae numbers.txt out out_and_err.py part1.tsv part2.csv pngs read_example.sh read_p_example.sh read_ps_example.sh read_t_example.sh re_example.pl regex_starter_code registers.png registers.png.jpg rolling_stone_500_greatest_2010.txt scipy.log simple.py slow.pl to_sort1.txt to_sort2.txt to_sort3.txt to_sort4.txt unanchored.pl Untitled.html Untitled.ipynb xaa xab xa

## Chaining Commands
- The `&&`, `||`, and `;` operators are used to chain commands together
    - `command1 && command2` only executes command2 upon successful exit of command1
    - `command1 || command2` only executes command2 upon unsuccessful exit of command1
    - `command1 ; command2` always executes command2

In [11]:
rm /home 2> /dev/null || echo "You can't do that"
[[ 1 -eq 1 ]] && echo "That is true 1"
[[ 1 -eq 2 ]] && echo "That is true 2"
[[ 1 -eq 2 ]] || echo "That isn't true 2"

You can't do that
That is true 1
That isn't true 2


## Subshells
- A subshell is a group of commands run in a separate shell from the current process
- Changes to variables in the subshell will not be reflected in the main script
- Can also be used to send an entire group of commands to the background
- Syntax is 
```bash
( COMMANDS )
```
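
A small sketch of the variable isolation described above, using an illustrative variable named `count`:

```bash
count=1
(
    # This assignment only exists inside the subshell
    count=99
    echo "Inside the subshell: $count"
)
echo "Back in the main shell: $count"
```

The subshell starts with a copy of `count`, so it prints 99 inside while the main shell still sees 1.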

In [12]:
echo $(pwd)
(
    cd ~
    echo $(pwd)
)
echo $(pwd)

/home/bryan/CMSC433
/home/bryan
/home/bryan/CMSC433


In [13]:
printf "%'d is a big number\n" 1000000
(
    LANG=es_ES.UTF-8
    printf "%'d is a big number\n" 1000000
)
printf "%'d is a big number\n" 1000000

1,000,000 is a big number
1.000.000 is a big number
1,000,000 is a big number


## Parallel Execution
- Parallel execution can be achieved easily using subshells and backgrounding processes
- Bash has a builtin command `wait` that will pause the execution of the script until all child processes have returned
- For more complex parallel applications, we will look at the GNU parallel suite of tools
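
`wait` can also be given a specific PID, in which case it returns that process's exit status. The sketch below assumes two placeholder subshells, one of which fails on purpose, and collects both results.

```bash
# Two background subshells with different exit codes
( sleep 0.2; exit 0 ) &
ok_pid=$!
( sleep 0.1; exit 3 ) &
bad_pid=$!

# Waiting on each PID retrieves that job's own exit status
ok_status=0;  wait "$ok_pid"  || ok_status=$?
bad_status=0; wait "$bad_pid" || bad_status=$?

echo "first: $ok_status, second: $bad_status"
```

This is one way to find out which of several parallel tasks failed, rather than only knowing that all of them have finished.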

In [14]:
# Suppress notification of completed background jobs
set +m

(
    for letter in {A..Z}; do
        echo "$letter ";
        sleep 0.5;
    done;
)& 

(
    for number in 1 2 3 4 5 6 7; do
        echo  "$number ";
        sleep 0.25;
    done
)&

wait
echo "EVERYTHING IS AWESOME"

[1] 9951
A 
[2] 9953
1 
B 
2 
3 
C 
4 
5 
D 
6 
7 
E 
F 
G 
H 
I 
J 
K 
L 
M 
N 
O 
P 
Q 
R 
S 
T 
U 
V 
W 
X 
Y 
Z 
EVERYTHING IS AWESOME


## GNU Parallel
- GNU parallel is a collection of utilities to manage processes executing in parallel
- The `parallel` command executes a command in parallel given a list of arguments separated by `:::`
```bash
    parallel echo ::: A B C ::: 1 2 3
```
- `parallel --pipe` allows parallel processing of STDIN
- The `sem` command is useful to combine with backgrounded subprocesses to limit how many run at a time

In [15]:
parallel echo ::: A B C ::: 1 2 3

A 1
A 2
A 3
B 1
B 2
B 3
C 1
C 2
C 3


In [16]:
parallel jupyter-nbconvert {} --to html ::: *.ipynb

[NbConvertApp] Converting notebook Lecture07.ipynb to html
[NbConvertApp] Writing 249617 bytes to Lecture07.html
[NbConvertApp] Converting notebook Lecture08.ipynb to html
[NbConvertApp] Writing 249611 bytes to Lecture08.html
[NbConvertApp] Converting notebook Lecture02.ipynb to html
[NbConvertApp] Writing 320734 bytes to Lecture02.html
[NbConvertApp] Converting notebook Lecture06.ipynb to html
[NbConvertApp] Writing 294507 bytes to Lecture06.html
[NbConvertApp] Converting notebook Lecture00.ipynb to html
[NbConvertApp] Writing 939128 bytes to Lecture00.html
[NbConvertApp] Converting notebook Lecture05.ipynb to html
[NbConvertApp] Writing 319691 bytes to Lecture05.html
[NbConvertApp] Converting notebook Lecture04.ipynb to html
[NbConvertApp] Writing 309209 bytes to Lecture04.html
[NbConvertApp] Converting notebook Lecture01.ipynb to html
[NbConvertApp] Writing 374546 bytes to Lecture01.html
[NbConvertApp] Converting notebook Untitled.ipynb to html
[NbConvertApp] Writing 249190 bytes to

In [18]:
time (grep -P "\d\d\d-\d\d\d-\d\d\d\d" ~/wackypediaFlat.slim | wc -l)
#grep -P "\d\d\d-\d\d\d-\d\d\d\d" ~/wackypediaFlat.slim | wc -l

264

real	0m4.880s
user	0m4.608s
sys	0m0.268s


In [20]:
time parallel --pipe --block 100M 'grep -P "\d\d\d-\d\d\d-\d\d\d\d" | wc -l' < ~/wackypediaFlat.slim

11
18
20
16
13
11
17
10
9
7
16
14
21
15
8
12
13
10
9
12
2

real	0m2.714s
user	0m6.184s
sys	0m4.304s


In [22]:
# There are better ways to do this, i.e. all in one search

for letter in {A..Z}; do
(
        n=$(grep -P "($letter)\1" ~/wackypediaFlat.slim | wc -l)
        echo "$n double $letter's found"
        sleep 0.5;
)&
done;

wait

[1] 10799
[2] 10800
[3] 10801
[4] 10803
[5] 10806
[6] 10808
[7] 10813
[8] 10818
[9] 10821
[10] 10825
[11] 10829
[12] 10831
[13] 10835
[14] 10837
[15] 10839
[16] 10841
[17] 10843
[18] 10847
[19] 10849
[20] 10851
[21] 10852
[22] 10853
[23] 10854
[24] 10855
[25] 10856
[26] 10857
9561 double E's found
195 double Q's found
30234 double C's found
5724 double X's found
7226 double D's found
898 double U's found
33718 double B's found
36813 double A's found
8725 double T's found
1438 double V's found
6470 double N's found
13430 double W's found
155280 double I's found
11706 double L's found
4702 double R's found
1669 double H's found
5630 double F's found
392 double Y's found
3628 double O's found
1248 double J's found
1071 double Z's found
8736 double M's found
2873 double K's found
1726 double G's found
10254 double P's found
47183 double S's found


In [24]:
# There are better ways to do this, i.e. all in one search

for letter in {A..Z}; do
(
        
        n=$(sem --id $$ -j3 grep "${letter}${letter}" ~/wackypediaFlat.slim | wc -l)
        echo "$n double $letter's found"
        sleep 0.5;
)&
done;

sem --wait --id $$

[1] 11494
[2] 11495
[3] 11496
[4] 11498
[5] 11500
[6] 11501
[7] 11504
[8] 11506
[9] 11508
[10] 11514
[11] 11518
[12] 11525
[13] 11527
[14] 11528
[15] 11531
[16] 11532
[17] 11533
[18] 11536
[19] 11539
[20] 11542
[21] 11549
[22] 11555
[23] 11561
[24] 11566
[25] 11570
[26] 11572
1071 double Z's found
36813 double A's found
155280 double I's found
8736 double M's found
898 double U's found
33718 double B's found
13430 double W's found
6470 double N's found
1438 double V's found
9561 double E's found
3628 double O's found
11706 double L's found
1669 double H's found
392 double Y's found
5630 double F's found
5724 double X's found
30234 double C's found
2873 double K's found
10254 double P's found
7226 double D's found
1726 double G's found
1248 double J's found
4702 double R's found
47183 double S's found
8725 double T's found
195 double Q's found


## Splitting a File
- Splitting a file comes in handy when doing parallel processing, if you don't want to or can't use `parallel --pipe`
- The `split` command splits a file according to various metrics and creates new files with suffixes like "aa", "ab", ...
- Common options
    - `-n N`: Split into N chunks
    - `-l L`: Split into files with L lines each
    - `-b B`: Split into files with B bytes each
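
A self-contained sketch of splitting by line count and putting the pieces back together. The working directory, file contents, and the `part_` prefix are all made up for the demonstration.

```bash
work_dir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 > "$work_dir/numbers.txt"

# Two lines per chunk; the last argument is the prefix for the output files
split -l 2 "$work_dir/numbers.txt" "$work_dir/part_"
pieces=$(cd "$work_dir" && ls part_*)
echo $pieces

# Concatenating the chunks in glob order reproduces the original file
if [ "$(cat "$work_dir"/part_*)" = "$(cat "$work_dir/numbers.txt")" ]; then
    rejoined=yes
else
    rejoined=no
fi
echo "Chunks match the original: $rejoined"

rm -r "$work_dir"
```

Five lines split two at a time give three files, with the last one holding the single remaining line.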

In [32]:
split -l1 numbers.txt          # default prefix: xaa, xab, ...
split -l1 numbers.txt numbers  # custom prefix: numbersaa, numbersab, ...

In [34]:
ls x*

xaa  xab  xac  xad  xae


In [35]:
more numbersaa

40


## Arithmetic
- bash supports only integer arithmetic natively
- The syntax to indicate arithmetic is double parentheses **(( EXPRESSION ))**
- Variables do not need to be expanded inside the double parentheses (no $ needed)
- Standard operators are supported
    - `%` is the modulus operator
    - `**` is used for exponentiation
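
A short sketch of exponentiation and of using variables by name inside the double parentheses; the variable names are illustrative.

```bash
base=2
exponent=10

# Variables are referenced by name, without $
echo $((base ** exponent))

# Used as a command, (( )) succeeds when the expression is non-zero
if (( base ** exponent > 1000 )); then
    echo "That is more than 1000"
fi

remainder=$((17 % 5))
echo "17 % 5 = $remainder"
```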

In [36]:
echo $((0 + 11))
echo $((10/6))
echo $((10 * 6))
echo $((10 % 6))

11
1
60
4


In [37]:
x=10
((x++))
echo $((x += 1))
echo $((x += 1))

12
13


In [38]:
echo $((3.14 + 11 ))

bash: 3.14 + 11 : syntax error: invalid arithmetic operator (error token is ".14 + 11 ")


: 1

## Floating Point Arithmetic
- To perform floating point math, use the `bc` command
    - It reads its input from STDIN
- The syntax is very similar to C
    - To set the number of decimal places in the output, prefix the math with `scale=PRECISION;`
    - Without `-l`, the default scale is 0, so division results are truncated to integers

In [41]:
bc <<< "0+5"
bc <<< "scale=2;10/6"
bc <<< "scale=2;3.14 + 11"
bc <<< "scale=2; sqrt(9)"
echo "scale=2; c(0)" | bc -l
echo "scale=2; s(0)" | bc -l

5
1.66
14.14
3.00
.99
0
