# bash
## I/O, Processes, and Math

## User Input
- User input is read with the `read` command
- The general syntax is
```bash
read [OPTIONS] variable_name
```
- Common options are:
    - `-p <text>`: Prompt the user with text before getting input
    - `-s`: Do not display the text the user types (for passwords, etc.)
    - `-t <time>`: Time out after the given number of seconds

In [None]:
#Example Code Can't be Run in Browser/Jupyter
echo "Enter some text:"
read text
echo "You entered $text"

In [None]:
#Example Code Can't be Run in Browser/Jupyter
read -p "Enter some more text: " more_text 
echo "Now you are telling me $more_text"

In [None]:
# Must be -sp: with -ps, "s" would be taken as the argument of -p
read -sp "Enter the secret word: " secret

#Not printing characters means that we need to 
#explicitly move to the next line
echo
echo "Was I supposed to keep $secret a secret?"

In [None]:
echo -n "Enter something quickly!: "
read -t5 user_input
if [[ -n $user_input ]]; then 
    echo "Congrats! You beat the clock"
else 
    echo
    echo "Too Slow! Better luck next time" 
fi
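
The exit status of `read` can also be tested directly, since it is non-zero on a timeout or at end of input. The sketch below wraps that in a hypothetical helper named `ask` (not part of bash) and feeds it input from a here-string so it can run without a terminal.

```bash
# Hypothetical helper: print the user's reply, or a default if none arrives
ask() {
    local prompt=$1 default=$2 reply
    # read fails on timeout or end of input; also treat an empty reply as "no answer"
    if read -r -t 5 -p "$prompt " reply && [[ -n $reply ]]; then
        echo "$reply"
    else
        echo "$default"
    fi
}

# Input comes from a here-string instead of the keyboard
answer=$(ask "Continue? [y/n]" n <<< "y")
echo "Answer: $answer"
```

When run interactively without the here-string, the prompt is shown and the default is used if nothing is typed within five seconds.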

## Mapfile
- The `mapfile` command reads STDIN into an array, breaking it up at newlines
- Even though it reads from STDIN, it is primarily used with redirects rather than typed input
    - Piping into `mapfile` runs it in a subshell, so the array is lost when the pipeline ends
    - Not used for user interaction
- The syntax is 
```bash
mapfile [OPTIONS] array_variable
```

In [None]:
# -t strips the trailing newline from each element
mapfile -t numbers <<HERE
1
2
3
4
5
HERE

for number in "${numbers[@]}"; do
    echo -n "$number, "
done
echo
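
To capture the output of another command, one option is process substitution, which keeps `mapfile` in the current shell. This is a minimal sketch; the `< <(...)` form and the `-t` option (strip trailing newlines) are additions not shown above.

```bash
# Read the output of a command into an array, one element per line
mapfile -t lines < <(printf '%s\n' alpha beta gamma)

echo "Read ${#lines[@]} lines; the second is ${lines[1]}"
```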

## Reading A File with a Loop
- The `mapfile` command is generally more efficient, but it was only added in bash 4.0, so older shells (such as the bash 3.2 shipped with macOS) lack it
- If you want to do something more than just read the lines in, it can still be useful to use a loop
- Reading a file in a loop combines three techniques
    - A `while` loop
    - A `read` command
    - Input redirection

In [None]:
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal
while IFS= read -r line; do
    echo "$line"
done < numbers.txt
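
As an illustration of doing more than just reading lines, the sketch below numbers the non-empty lines of a file. It builds its own temporary input file, so the file name and contents are assumptions made for the example.

```bash
# Create a small sample file (hypothetical content)
input_file=$(mktemp)
printf '%s\n' "first line" "" "  indented line" > "$input_file"

count=0
while IFS= read -r line; do
    # Skip empty lines
    [[ -z $line ]] && continue
    count=$((count + 1))
    printf '%d: %s\n' "$count" "$line"
done < "$input_file"

rm -f "$input_file"
```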

## Processing a File Practice

- Read in a file named data/words.txt, and find the longest word in the file

## Formatted Output
- The `printf` command allows output to be formatted with more control than echo
- It uses a syntax similar to most formatted strings you are familiar with
    - Based on printf from C
- Newlines are not automatically added
- The variables to print are given as arguments to the `printf` command after the format string

In [None]:
printf "%d is a number\n" 30
printf "%10d is a number\n" 30
printf "%010d is a number\n" 30
printf "%-10d is a number\n" 30
printf "%d is a big number\n" 10000000000
printf "%'d is a big number that is easier to read" 10000000000

In [None]:
printf "%f is a float\n" 30
printf "%f is a float\n" 30.1345
printf "%.2f is a rounded float\n" 30.12345
printf "%'.2f is a rounded, yet big, float" 3000000000.12345

In [None]:
printf "%s is a string\n" "Hello there"
# The format string is reused until all arguments have been printed
printf "%s was passed as an argument\n" Hello there
printf "%3s doesn't truncate the string\n" "A long string"
printf "%.3s does truncate the string\n" "A long string"
printf "%10.3s truncates the string\
, but prints with a width of 10" "A long string"
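
Because the format string is reused for leftover arguments, a single `printf` call can lay out a small table. The sketch below uses made-up names and sizes, and also shows the `-v` option of bash's `printf`, which stores the result in a variable instead of printing it.

```bash
# Header row, then one format string reused for each (name, size) pair
printf '%-8s|%6s\n' Name Size
printf '%-8s|%6d\n' notes 120 report 30445

# -v writes the formatted text into a variable
printf -v padded '%05d' 42
echo "Zero padded: $padded"
```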

## Other Uses of `printf`

- Two rather unusual format types are
    - `%q` will escape your string into an appropriate format for bash
    - `%(fmt)T` converts seconds since the Unix epoch into a user-specified date string
        - `fmt` consists of date format codes, similar to the `strftime` function in C

In [None]:
printf %q "A directoryname with spaces/"
printf "\n"
# -1 means the current time
printf "%(%A the %d of %B, %Y, at %r)T\n" -1
# 1 means one second after the Unix epoch
printf "%(%A the %d of %B, %Y, at %r)T" 1
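
A short sketch of both format types with predictable results. The file name is made up, and `TZ=UTC` is set here only so that the date does not depend on the local time zone.

```bash
# %q escapes a string so that bash would read it back as a single word
printf -v quoted '%q' "my file.txt"
echo "Escaped: $quoted"

# %(fmt)T formats seconds since the epoch; 0 is the epoch itself
epoch_date=$(TZ=UTC printf '%(%Y-%m-%d)T' 0)
echo "The epoch starts on $epoch_date"
```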

## Running Other Scripts
- Other scripts can always be run like other commands, simply by calling them
- If you want to have access to all the variables, including function definitions, use the `source` command
    - The single dot `.` is an alias for the `source` command
    
```bash
. lots_of_definitions
source other_definitions
```

In [None]:
more src/shell/definitions.sh

In [None]:
./src/shell/definitions.sh
echo $pi

In [None]:
. src/shell/definitions.sh
echo ${alphabet[*]}
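
The difference between running and sourcing a script can be reproduced with a throwaway file. In this sketch the file and the `greeting` variable are made up for the demonstration.

```bash
# Write a tiny script that only defines a variable
defs=$(mktemp)
echo 'greeting="hello from the sourced file"' > "$defs"

# Running it as a separate process leaves the current shell untouched
bash "$defs"
echo "After executing: '${greeting:-}'"

# Sourcing it runs the same lines inside the current shell
source "$defs"
echo "After sourcing: '$greeting'"

rm -f "$defs"
```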

## Process Management
- When calling other commands it is useful to know how to control processes
- Common process control commands are
    - `COMMAND &` - executes the command in the background
    - `bg JOB_SPEC` - resumes a stopped job in the background
    - `fg JOB_SPEC` - brings a background job to the foreground
- If you are using the shell interactively
    - `jobs` lists all jobs launched from this shell
    - `ps` lists running processes (by default, only those started from this terminal)
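
In a script, a background job is usually tracked through its process ID. The sketch below relies on the special variable `$!` (the PID of the most recent background command) and on `wait`, neither of which is listed above.

```bash
# Start a job in the background and remember its PID
sleep 1 &
bg_pid=$!
echo "Started background job with PID $bg_pid"

# List the jobs known to this shell
jobs

# Block until that job finishes and collect its exit status
wait "$bg_pid"
status=$?
echo "Job finished with exit status $status"
```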

## `ps`  Command
- When you have many processes running, it's useful to know how to query them
- The `ps` command by default displays the PIDs of processes launched from this shell
- Common options are
    - `-A`: display all processes on the system
    - `-f`: display more information, such as who started the process
    - `-F`: display even more information
    - `-o <format>`: customize the information displayed
    - `-u <user>`: display all processes launched by user

In [None]:
ps

In [None]:
ps -f -ubryan | more

## Kill
- Despite its name, `kill` is a more general command than just ending processes
- The `kill` command can send signals to running processes
    - The signal can be sent using either its numerical value or name
        - -9 or -SIGKILL
    - To see a full list use `kill -l`
- Syntax
```bash
kill SIGNAL PID
```

In [None]:
# Launch a random background job
htop &

In [None]:
kill -15 9922

In [None]:
jobs

In [None]:
kill -9 9922
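
Rather than hard-coding a PID, a script can capture it with `$!` when the job is launched. This sketch sends `SIGTERM` by name and then inspects the exit status that `wait` reports, which for a process ended by a signal is 128 plus the signal number.

```bash
# Launch a long-running job and remember its PID
sleep 30 &
pid=$!

# Ask it to terminate (SIGTERM is signal 15)
kill -TERM "$pid"

# Collect the exit status; 128 + 15 indicates death by SIGTERM
status=0
wait "$pid" 2>/dev/null || status=$?
echo "sleep ended with status $status"
```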

## The nohup Command
- One signal sent to processes is `SIGHUP`, which is sent when a terminal closes
    - The name comes from "hang up"
    - This will generally kill processes
- If we have a long-running background task that we want to continue after the terminal is closed, use the `nohup` command
```bash
nohup COMMAND &
```

## Command Substitution
- We've used it a few times, but formally, command substitution runs a command and returns its output
- You may encounter two forms
    - `` `command` ``
    - `$(command)`
- Always use `$(command)`
    - It is nestable
    - It is safer

In [None]:
notebook_files=$(find . -name "*.ipynb")
echo $notebook_files

In [None]:
ps_out=$(ps)

In [None]:
echo ${ps_out::10}

In [None]:
nesting=$(echo $(ls))
echo $nesting
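
Nesting is most useful when one command needs the output of another as its argument. In this sketch the path is hypothetical; each `$(...)` is quoted separately, which is awkward to get right with backticks.

```bash
tool_path="/usr/local/bin/tool"   # hypothetical path, it does not need to exist

# dirname strips the last component, basename keeps only the last one
parent=$(basename "$(dirname "$tool_path")")
echo "Parent directory: $parent"
```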

## Command Substitution Practice
- Use command substitution to print all the `ipynb` files in the directory, with `ipynb` removed
    - Hint: Use `${var//pattern/substitute}`

## Chaining Commands
- The `&&`, `||`, and `;` operators are used to chain commands together
    - `command1 && command2` only executes command2 upon successful exit of command1
    - `command1 || command2` only executes command2 upon unsuccessful exit of command1
    - `command1 ; command2` always executes command2

In [None]:
rm /home 2> /dev/null || echo "You can't do that"
[[ 1 -eq 1 ]] && echo "That is true 1"
[[ 1 -eq 2 ]] && echo "That is true 2"
[[ 1 -eq 2 ]] || echo "That isn't true 2"

## Subshells
- A subshell is a group of commands run in a separate shell from the current process
- Changes to variables in the subshell will not be reflected in the main script
- Can also be used to send an entire group of commands to the background
- Syntax is 
```bash
( COMMANDS )
```

In [None]:
echo $(pwd)
(
    cd ~
    echo $(pwd)
)
echo $(pwd)

In [None]:
printf "%'d is a big number\n" 1000000
(
    LANG=es_ES.UTF-8
    printf "%'d is a big number\n" 1000000
)
printf "%'d is a big number\n" 1000000

## Parallel Execution
- Parallel execution can be achieved easily using subshells and backgrounding processes
- Bash has a builtin command `wait` that will pause the execution of the script until all child processes have returned
- For more complex parallel applications, we will look at the GNU parallel suite of tools

In [None]:
#Suppress notification of completed background jobs
set +m

(
    for letter in {A..Z}; do
        echo "$letter ";
        sleep 0.5;
    done;
)& 

(
    for number in 1 2 3 4 5 6 7; do
        echo  "$number ";
        sleep 0.25;
    done
)&

wait
echo "EVERYTHING IS AWESOME"
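
A bare `wait` does not say which jobs failed. One possible pattern, sketched below with made-up jobs, is to record each PID and wait on them one at a time, since `wait PID` returns that job's exit status.

```bash
pids=()
for n in 1 2 3; do
    # Each subshell pretends to do some work; odd-numbered jobs fail
    ( sleep 0.1; exit $(( n % 2 )) ) &
    pids+=("$!")
done

failures=0
for pid in "${pids[@]}"; do
    # wait returns the exit status of that particular job
    wait "$pid" || failures=$((failures + 1))
done

echo "$failures of ${#pids[@]} jobs failed"
```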

## GNU Parallel
- GNU parallel is a collection of utilities to manage processes executing in parallel
- The `parallel` command executes a command in parallel given a list of arguments separated by `:::`
```bash
    parallel echo ::: A B C ::: 1 2 3
```
- `parallel --pipe` allows parallel processing of STDIN
- The `sem` command is useful to combine with backgrounded subprocesses to limit how many run at a time

In [None]:
parallel echo ::: A B C ::: 1 2 3

In [None]:
parallel jupyter-nbconvert {} --to html ::: *.ipynb

In [None]:
time (grep -P "\d\d\d-\d\d\d-\d\d\d\d" ~/Research/Data/wackypediaFlat.slim | wc -l)
#grep -P "\d\d\d-\d\d\d-\d\d\d\d" ~/wackypediaFlat.slim | wc -l

In [None]:
time parallel --pipe --block 100M 'grep -P "\d\d\d-\d\d\d-\d\d\d\d" | wc -l' <  ~/Research/Data/wackypediaFlat.slim

In [None]:
# There are better ways to do this, i.e. all in one search

for letter in {A..Z}; do
(
        n=$(grep -P "($letter)\1" ~/wackypediaFlat.slim | wc -l)
        echo "$n double $letter's found"
        sleep 0.5;
)&
done;

wait

In [None]:
# There are better ways to do this, i.e. all in one search

for letter in {A..Z}; do
(
        
        n=$(sem --id $$ -j3 grep "${letter}${letter}" ~/wackypediaFlat.slim | wc -l)
        echo "$n double $letter's found"
        sleep 0.5;
)&
done;

sem --wait --id $$

## Splitting a File
- Splitting a file comes in handy when doing parallel processing, if you don't want to or can't use `parallel --pipe`
- The split command will automatically split a file according to various metrics, and create new files with a suffix like "aa"
- Common options
    - -n: Split into N chunks
    - -l: Split into files with L lines
    - -b: Split into files with B bytes in them

In [None]:
split -l1 numbers.txt numbers

In [None]:
ls numbers*

In [None]:
more numbersaa
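
The naming scheme is easier to see on a file of known size. This sketch works in a temporary directory with a made-up `chunk_` prefix and splits ten lines into pieces of at most four lines each.

```bash
workdir=$(mktemp -d)
seq 1 10 > "$workdir/numbers.txt"

# Produces chunk_aa, chunk_ab, chunk_ac with 4, 4, and 2 lines
split -l 4 "$workdir/numbers.txt" "$workdir/chunk_"

chunks=("$workdir"/chunk_*)
echo "Created ${#chunks[@]} chunks"
wc -l "${chunks[@]}"

rm -rf "$workdir"
```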

## Arithmetic
- bash supports only integer arithmetic natively
- The syntax to indicate arithmetic is double parentheses **(( EXPRESSION ))**
- Variables do not need to be expanded inside the double parentheses (no $ needed)
- Standard operators are supported
    - `%` is the modulo operator
    - `**` is used for exponentiation

In [None]:
echo $((0 + 11))
echo $((10/6))
echo $((10 * 6))
echo $((10 % 6))

In [None]:
x=10
((x++))
echo $((x += 1))
echo $((x += 1))

In [None]:
# This fails with a syntax error: bash arithmetic is integer-only
echo $((3.14 + 11 ))
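
A short sketch of the remaining points: variables used without `$`, the `**` operator, and `(( ))` as a condition in an `if` statement. The variable names are made up for the example.

```bash
width=7
height=3

# Variables can be named directly inside the double parentheses
area=$(( width * height ))
cube=$(( height ** 3 ))
echo "Area: $area, cube of height: $cube"

# (( )) succeeds when the expression is non-zero, so it works as a test
if (( area > 20 )); then
    echo "The area is larger than 20"
fi
```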

## Floating Point Arithmetic
- In order to perform floating point math, the `bc` command is used
    - The input is STDIN
- The syntax is very similar to C
    - To determine the precision of the output, prefix the math with `scale=PRECISION;`
    - The default scale is 0, so the results of division are truncated to integers (the `-l` option sets it to 20)

In [None]:
bc <<< "0+5"
bc <<< "scale=2;10/6"
bc <<< "scale=2;3.14 + 11"
bc <<< "scale=2; sqrt(9)"
echo "scale=2; c(0)" | bc -l
echo "scale=2; s(0)" | bc -l