# Introduction to Unix

## Professor Matthew Loose

### Week 1 Workshop 2

Unix Workshop - Week 1

matt.loose@nottingham.ac.uk


## Learning Objectives:

- Understand Unix streams and pipes
- UNIX text processing tools
- Bash scripting
- Using variables in bash
- For loops in bash



# Introduction to UNIX Streams and Pipes

![ghostbusters](https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExejhtamY1OHdoaXg3M253MGQxOXZ2and1aW05a3ZtaHRuc3hlYjR6NSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/3o72FiKtrMAjIb0Rhu/giphy.gif)




---

## What Are UNIX Streams?

- **Streams** are a series of bytes of data that flow from one place to another.
- There are three standard streams:
  1. **Standard Input (stdin)**: Data coming into a program.
  2. **Standard Output (stdout)**: Data the program outputs.
  3. **Standard Error (stderr)**: Error messages from the program.

---




---

## Standard Streams Breakdown

- **stdin**: Usually from the keyboard but can be from files or other programs.
- **stdout**: Default is your terminal window, but it can be redirected elsewhere.
- **stderr**: Like stdout but meant for error messages, so it can be handled separately.

---



### 1. **Basic Example of UNIX Streams**

Here’s a simple example of using `fortune` to generate output, which is written to stdout:

```bash
fortune 
```

- **Explanation**: 
  - `fortune` outputs a random quote. It is commonly installed on Linux systems and may also be available on macOS.



In [2]:
%%bash

fortune

I hear what you're saying but I just don't care.






---
## What Are UNIX Pipes?

- **Pipes (`|`)**: A method of connecting the output of one command directly into the input of another command.
  
  - Example:
    ```bash
    command1 | command2
    ```
  - Output of `command1` becomes input of `command2`.

---



#### 1. **Basic Example of UNIX Pipes**

Here’s a simple example of using `fortune` to generate output, which can be displayed using `cowsay`:

```bash
fortune | cowsay
```

- **Explanation**: 
  - `fortune` outputs a random quote.
  - `|` pipes that output into `cowsay`, which formats it as a speech bubble around an ASCII cow.

---


In [4]:
%%bash

fortune | cowsay



 ________________________________________ 
/ Parents often talk about the younger   \
| generation as if they didn't have much |
\ of anything to do with it.             /
 ---------------------------------------- 
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||






---
## Benefits of Pipes

- **Modular Processing**: Each command in the pipeline does one job.
- **Efficiency**: Avoids creating temporary files.
- **Flexibility**: You can combine simple commands to perform complex tasks.

---




---
## Real-Life Example of Pipes

- Find all text files in a directory and count the number of lines:
  ```bash
  find . -name "*.txt" | xargs wc -l
  ```
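Plain `xargs` splits its input on whitespace, so a filename containing spaces would be passed on as several arguments. A minimal sketch of a safer variant, assuming your `find` and `xargs` support `-print0` and `-0` (the GNU and BSD versions do); the directory and file names are made up for illustration:

```bash
# Work in a throwaway directory so nothing real is touched (hypothetical files)
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/plain.txt"
printf 'c\nd\ne\n' > "$workdir/name with spaces.txt"

# -print0 / -0 separate filenames with NUL bytes instead of whitespace
find "$workdir" -name "*.txt" -print0 | xargs -0 wc -l

# Same idea, keeping only the grand total (tr removes padding some wc versions add)
total=$(find "$workdir" -name "*.txt" -print0 | xargs -0 cat | wc -l | tr -d ' ')
echo "Total lines: $total"

rm -r "$workdir"
```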

---



In [6]:
%%bash

find . -name "*.txt" | xargs wc -l

      11 ./file_7.txt
      22 ./file_27.txt
      13 ./file_26.txt
      11 ./file_6.txt
       9 ./file_4.txt
      10 ./file_18.txt
       9 ./file_24.txt
      11 ./file_30.txt
      20 ./file_25.txt
       8 ./file_19.txt
      11 ./file_5.txt
       6 ./numbers.txt
       9 ./file_1.txt
      12 ./file_21.txt
       9 ./file_20.txt
      13 ./file_0.txt
       9 ./file_2.txt
      14 ./file_22.txt
      10 ./file_23.txt
       9 ./file_3.txt
       5 ./log.txt
       5 ./processes.txt
      10 ./file_12.txt
      10 ./file_13.txt
      10 ./file_11.txt
      52 ./file_10.txt
       9 ./file_8.txt
      10 ./file_14.txt
      10 ./file_28.txt
       8 ./file_29.txt
      11 ./file_15.txt
      10 ./file_9.txt
       5 ./data.txt
       9 ./file_17.txt
       9 ./file_16.txt
     399 total






---
## Redirecting Streams

- Use **`>`** to redirect stdout to a file:
  ```bash
  echo "Hello, World!" > output.txt
  ```
- Use **`>>`** to append to a file:
  ```bash
  echo "More data" >> output.txt
  ```
- Redirect **stderr** with **`2>`**:
  ```bash
  some_command2 2> error_log.txt
  ```

---



In [8]:
%%bash
 

some_command2 2>file.logc  

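CalledProcessError: Command 'b' \n\nsome_command2 2>file.logc  \n'' returned non-zero exit status 127.

Exit status 127 means the shell could not find `some_command2`. The "command not found" message itself went to `file.logc`, because `2>` redirected stderr.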







---
## Combining stdout and stderr

- Redirect both stdout and stderr:
  ```bash
  some_command2 > output.txt 2>&1
  ```
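To see the difference between separate and combined redirection, here is a small sketch. The `report` function is a hypothetical stand-in for any program that writes to both streams, and the file names are arbitrary:

```bash
# Hypothetical function that writes one line to each stream
report() {
    echo "result: 42"                  # goes to stdout
    echo "warning: check input" >&2    # goes to stderr
}

workdir=$(mktemp -d)

# Separate files for the two streams
report > "$workdir/out.txt" 2> "$workdir/err.txt"

# Both streams in one file: 2>&1 must come after the > redirection
report > "$workdir/both.txt" 2>&1

out_lines=$(wc -l < "$workdir/out.txt" | tr -d ' ')
both_lines=$(wc -l < "$workdir/both.txt" | tr -d ' ')
echo "out.txt has $out_lines line(s), both.txt has $both_lines line(s)"

rm -r "$workdir"
```

The order matters because redirections are processed left to right: `2>&1` points stderr at wherever stdout is going at that moment.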

---







---
## Filters in Pipes

- **Filters**: Commands that process input and produce output.
  - Examples: `grep`, `sort`, `cut`, `awk`, `sed`.
  
- Example:
  ```bash
  ps aux | grep "python" | sort -nrk 3
  ```
  - Finds lines mentioning Python in the process list and sorts them by CPU usage, highest first (column 3 of `ps aux` is `%CPU`; use `-k 4` to sort by `%MEM`).
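The same filter chain can be tried on fixed input, which makes the result predictable. A sketch with made-up process lines whose columns (user, PID, %CPU, %MEM, command) mirror the first columns of `ps aux`:

```bash
# Made-up sample in the column order USER PID %CPU %MEM COMMAND
sample='alice 101 0.5 2.0 python script.py
bob 102 7.1 0.3 bash
alice 103 3.2 9.4 python server.py
carol 104 1.0 12.5 python notebook.py'

# Keep python lines, then sort numerically (-n), reversed (-r), on column 3 (%CPU)
top_cpu=$(echo "$sample" | grep "python" | sort -nrk 3 | head -n 1)
echo "Highest CPU: $top_cpu"

# Column 4 is %MEM, so -k 4 ranks by memory instead
top_mem=$(echo "$sample" | grep "python" | sort -nrk 4 | head -n 1)
echo "Highest memory: $top_mem"
```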

---



In [13]:
%%bash

ps aux | grep python | sort -nrk 3 | head -n 5 | cowsay

 _________________________________________ 
/ mattloose 90257 0.2 0.1 412304128 84784 \
| ?? Ss 1:24pm 0:00.85                    |
| /Users/mattloose/miniconda3/envs/lectur |
| es/bin/python -m ipykernel_launcher -f  |
| /Users/mattloose/Library/Jupyter/runtim |
| e/kernel-504bc752-4741-401a-a519-72db47 |
| bcf334.json mattloose 88564 0.1 0.2     |
| 412343328 133904 s013 S+ 12:23pm        |
| 0:23.65                                 |
| /Users/mattloose/miniconda3/envs/lectur |
| es/bin/python                           |
| /Users/mattloose/miniconda3/envs/lectur |
| es/bin/jupyter-lab mattloose 90486 0.0  |
| 0.0 410734336 1616 ?? S 1:33pm 0:00.00  |
| grep python mattloose 88599 0.0 0.1     |
| 412148416 73904 ?? Ss 12:23pm 0:01.32   |
| /Users/mattloose/miniconda3/envs/lectur |
| es/bin/python -m ipykernel_launcher -f  |
| /Users/mattloose/Library/Jupyter/runtim |
| e/kernel-b96b7b73-fcdf-4928-8278-8b9aff |
| 9e6994.json mattloose 77444 0.0 0.0     |
| 414308192 224 ?? S Thu01pm 1:2





---
## Summary

- **Streams**: stdin, stdout, stderr – standard communication channels.
- **Pipes**: Connect output of one command to the input of another.
- **Redirection**: Modify where input/output goes, even to files.
- **Filters**: Tools to manipulate data within pipes for flexible processing.


### Note: Most command line bioinformatics programs can be used with streams, pipes, redirection and filters.
---



# UNIX Text Processing Tools: grep, sort, cut, awk, and sed

---

## Introduction to `grep`

- **`grep`** is used to search for patterns within files.
- It stands for **global regular expression print**.
  
### Syntax:
```bash
grep [options] pattern [file...]
```

### Example:
```bash
grep "error" log.txt
```
Searches for the word "error" in log.txt.

---

---

## grep Options

- `-i`: Case-insensitive search.
- `-v`: Invert match (show lines that don't match the pattern).
- `-r`: Search directories recursively.


Example:
```bash
grep -i "warning" log.txt
```

Case-insensitive search for "warning" in log.txt.
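A short sketch of these options on a made-up log file. The file contents are invented for illustration, and `-c` (count matching lines) is an extra standard `grep` option not listed above:

```bash
workdir=$(mktemp -d)
# Hypothetical log file for illustration
cat > "$workdir/log.txt" <<'EOF'
INFO start
Warning: disk almost full
ERROR failed to open file
warning: retrying
INFO done
EOF

grep -i "warning" "$workdir/log.txt"    # matches both "Warning" and "warning"
grep -v "INFO" "$workdir/log.txt"       # every line that does NOT contain INFO

# -c prints the number of matching lines instead of the lines themselves
n_warn=$(grep -ic "warning" "$workdir/log.txt")
echo "warning lines: $n_warn"

rm -r "$workdir"
```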

---

---

## Introduction to sort

sort is used to sort lines of text files.

Syntax:

```bash
sort [options] [file...]
```

Example:
```bash
sort data.txt
```

Sorts the contents of data.txt in alphabetical order.

---

---

## sort Options

- `-r`: Sort in reverse order.
- `-n`: Numeric sort.
- `-k`: Sort by a specific column.

Example:
```bash

sort -k 2,2 -n data.txt
```

Sorts data.txt numerically based on the second column.
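Why `-n` matters is easiest to see on a tiny example. The gene names and counts here are made up:

```bash
# Made-up two-column data: name and count
data='geneC 10
geneA 9
geneB 100'

echo "$data" | sort -k 2,2       # text comparison on column 2: "100" sorts before "9"
echo "$data" | sort -k 2,2 -n    # numeric comparison on column 2: 9, 10, 100

# Combine -n and -r to get the largest value first
largest=$(echo "$data" | sort -k 2,2 -nr | head -n 1)
echo "Largest: $largest"
```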

---

---

## Introduction to cut

cut is used to extract specific fields from files.

Syntax:
```bash
cut [options] [file...]
```

Example:
```bash
cut -d "," -f 1,3 data.csv
```

Extracts the 1st and 3rd columns from data.csv using , as a delimiter.

---

---

## cut Options

- `-d`: Specify the delimiter.
- `-f`: Specify fields to extract.

Example:
```bash
cut -d " " -f 2-4 data.txt
```

Extracts the 2nd to 4th fields from data.txt using a space delimiter.
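One thing to watch for: `cut` treats every single delimiter character as a field boundary, so columns padded with several spaces produce empty fields. A small sketch with a made-up line, using `tr -s` to squeeze repeated spaces first:

```bash
line='chr1   100   200'    # columns separated by three spaces (made-up data)

# cut counts each space as a delimiter, so field 2 is empty here
with_cut=$(echo "$line" | cut -d " " -f 2)

# Squeezing runs of spaces into one gives the expected field
squeezed=$(echo "$line" | tr -s " " | cut -d " " -f 2)

echo "cut alone: '$with_cut'"
echo "tr -s then cut: '$squeezed'"
```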


---

## Introduction to awk

awk is a powerful text processing tool, particularly for structured data.

Syntax:
```bash
awk 'pattern {action}' [file...]
```

Example:

```bash
awk '{print $1, $3}' data.txt
```

Prints the 1st and 3rd columns of each line from data.txt.

## awk Example with Conditions

You can use conditions within awk to filter data:

Example:
```bash
awk '$3 > 100 {print $1, $3}' data.txt
```

Prints the 1st and 3rd columns where the value in the 3rd column is greater than 100.
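As a sketch of the pattern/action idea on known input, the made-up table below has a sample name, a condition and a read count. The `END` block, which runs once after the last line, is an additional `awk` feature not covered above:

```bash
# Made-up table: sample name, condition, read count
data='s1 control 80
s2 treated 150
s3 treated 300'

# Pattern + action: print columns 1 and 3 only for rows where column 3 exceeds 100
echo "$data" | awk '$3 > 100 {print $1, $3}'

# Accumulate a sum over all rows and print it once at the end
total=$(echo "$data" | awk '{sum += $3} END {print sum}')
echo "total reads: $total"
```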

## Introduction to sed

sed is a stream editor for filtering and transforming text.

Syntax:
```bash
sed 'command' [file...]
```

Example:
```bash
sed 's/error/ERROR/g' log.txt
```

Replaces all occurrences of "error" with "ERROR" in log.txt.

## sed Options and Examples

- `-i`: Edit files in place.
- `s/pattern/replacement/`: Replace pattern with replacement.

Example:
```bash
sed -i 's/warning/NOTICE/g' log.txt
```

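Replaces "warning" with "NOTICE" in log.txt and saves the changes. With GNU `sed` (Linux) `-i` works as shown; BSD `sed` (macOS) requires a backup suffix after `-i`, which may be empty: `sed -i '' 's/warning/NOTICE/g' log.txt`.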
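Because in-place editing overwrites the file, it can help to preview the change and keep a backup. A sketch that should behave the same with GNU and BSD `sed`, assuming a throwaway log file with invented contents; attaching a suffix directly to the flag, as in `-i.bak`, keeps the original as `log.txt.bak`:

```bash
workdir=$(mktemp -d)
printf 'warning: low memory\ninfo: ok\nwarning: retry\n' > "$workdir/log.txt"

# Preview first: without -i the file itself is untouched
sed 's/warning/NOTICE/g' "$workdir/log.txt"

# Edit in place, keeping the original as log.txt.bak
sed -i.bak 's/warning/NOTICE/g' "$workdir/log.txt"

# Count lines that still contain "warning" in each file
remaining=$(grep -c "warning" "$workdir/log.txt")
backup_hits=$(grep -c "warning" "$workdir/log.txt.bak")
echo "warnings left: $remaining, in backup: $backup_hits"

rm -r "$workdir"
```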



## Combining Tools: A Practical Example

Use a combination of grep, cut, and sort to process data:

Example:
```bash
grep "error" log.txt | cut -d " " -f 2,4 | sort -n
```

Finds lines with "error" in log.txt, extracts the 2nd and 4th fields, and sorts them numerically.
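To make the pipeline concrete, here it is run on a made-up log whose four space-separated fields are date, a numeric code, level and duration. The layout is an assumption chosen so that fields 2 and 4 are meaningful:

```bash
workdir=$(mktemp -d)
# Made-up log: date, code, level, duration
cat > "$workdir/log.txt" <<'EOF'
2024-01-03 30 error 1.5s
2024-01-01 7 info 0.2s
2024-01-02 12 error 0.9s
2024-01-04 4 error 2.1s
EOF

# Keep error lines, extract code and duration, sort by code
result=$(grep "error" "$workdir/log.txt" | cut -d " " -f 2,4 | sort -n)
echo "$result"

first=$(echo "$result" | head -n 1)
rm -r "$workdir"
```

Each stage sees only the output of the previous one, so you can build the pipeline up one command at a time and inspect the intermediate output.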



## Summary

- **grep**: Searches for patterns.
- **sort**: Sorts lines.
- **cut**: Extracts specific columns.
- **awk**: Processes structured text with conditions.
- **sed**: Edits and transforms text streams.




## Bash Examples for Each Tool



1. grep Examples:
```bash
# Basic usage to search for a pattern
grep "error" log.txt

# Case-insensitive search
grep -i "error" log.txt

# Search recursively through directories
grep -r "error" /var/log/
```

2. sort Examples:
```bash
# Sort alphabetically
sort data.txt

# Sort numerically
sort -n numbers.txt

# Sort by the second column
sort -k 2,2 -n data.txt
```

3. cut Examples:

```bash
# Extract first and third fields from a CSV file
cut -d "," -f 1,3 data.csv

# Extract fields from 2 to 4 using space as delimiter
cut -d " " -f 2-4 data.txt
```

4. awk Examples:

```bash
# Print the 1st and 3rd columns of a file
awk '{print $1, $3}' data.txt

# Print lines where the 3rd column is greater than 100
awk '$3 > 100 {print $1, $3}' data.txt
```

5. sed Examples:
```bash
# Replace "error" with "ERROR" in a file
sed 's/error/ERROR/g' log.txt

# Edit a file in place, replacing "warning" with "NOTICE"
sed -i 's/warning/NOTICE/g' log.txt
```

6. Combining Tools:

```bash
# Combine grep, cut, and sort to process data
grep "error" log.txt | cut -d " " -f 2,4 | sort -n

```

# Introduction to Bash Scripting

---

## What is Bash?

- **Bash** (Bourne Again SHell) is a command-line shell used for interacting with the Unix/Linux operating system.
- It allows users to run commands, automate tasks, and write complex scripts to control the system.

---


---

## Why Learn Bash Scripting?

- **Automation**: Perform repetitive tasks automatically.
- **Efficiency**: Combine and execute multiple commands in a single script.
- **Customizability**: Customize your workflow and environment.
- **Portability**: Write scripts that run across multiple Unix-like systems.

---

## Writing Your First Bash Script



### Steps to Create a Script:

1. Open a text editor and type the script content.
2. Save the file with a `.sh` extension.
3. Make the file executable using the `chmod` command.
4. Run the script from the terminal.

### Example Script:

```bash
#!/bin/bash
# This is a simple Bash script
echo "Hello, World!"
```


Try it now!

- **Explanation**:
  - `#!/bin/bash`: The **shebang** line, which tells the system which interpreter to use.
  - `echo`: A command that prints text to the terminal.
  - Save this as `hello.sh`, then make it executable with `chmod +x hello.sh` and run it using `./hello.sh`.
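The four steps can also be run end to end from the command line. In this sketch a here-document stands in for the text editor, and the script is written to a temporary directory (an assumption made only to keep the example self-contained):

```bash
workdir=$(mktemp -d)

# Steps 1-2: write the script content to a file ending in .sh
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/bash
# This is a simple Bash script
echo "Hello, World!"
EOF

# Step 3: make the file executable
chmod +x "$workdir/hello.sh"

# Step 4: run it and capture what it prints
output=$("$workdir/hello.sh")
echo "$output"

rm -r "$workdir"
```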

---


## Comments in Bash

- Comments are ignored by the Bash interpreter and help explain the code.

```bash
# This is a single-line comment
echo "Hello"  # Inline comment
```






In [17]:

%%bash
echo "Hello"  # Inline comment


Hello


---

## Basic Bash Commands

- **echo**: Prints text to the terminal.
- **cd**: Changes directories.
- **ls**: Lists files and directories.
- **cp**: Copies files.
- **mv**: Moves or renames files.
- **rm**: Deletes files.
- **pwd**: Prints the current directory path.

### Example

```bash
echo "Current directory:"
pwd  # Prints the current directory path
```

---


## Variables in Bash

- Variables store data that can be used throughout the script.
- Assign variables using the `=` operator.


---

In [21]:
%%bash
greeting="Hello"
name="Alice"
echo "$greeting, $name!"


Hello, Alice!


---

## Conditionals in Bash

- **If statements** allow you to run commands based on conditions.

```bash
if [ condition ]
then
    # Commands to run if condition is true
else
    # Commands to run if condition is false
fi
```

### Example


In [None]:
%%bash
age=18
if [ "$age" -ge 18 ]
then
    echo "You are an adult."
else
    echo "You are a minor."
fi
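Conditions are not limited to numbers. As a sketch, the standard file tests `-e` (exists) and `-s` (exists and is not empty) can be combined with `elif`; the filename is hypothetical:

```bash
# Hypothetical filename used only for illustration
file="reads.fastq"
workdir=$(mktemp -d)
touch "$workdir/$file"    # creates an empty file

if [ ! -e "$workdir/$file" ]; then
    status="missing"
elif [ -s "$workdir/$file" ]; then
    status="has data"
else
    status="empty"
fi
echo "$file is $status"

rm -r "$workdir"
```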


---

## Loops in Bash

- **For Loops**: Iterate over a list of items.

```bash
for i in 1 2 3
do
    echo "Number: $i"
done
```

- **While Loops**: Repeat commands while a condition is true.

```bash
count=1
while [ "$count" -le 5 ]
do
    echo "Count: $count"
    count=$((count + 1))
done
```
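In practice a `for` loop often runs over files rather than a fixed list. A minimal sketch, assuming a directory of made-up `.txt` files created just for the example:

```bash
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/sample_1.txt"
printf 'c\n' > "$workdir/sample_2.txt"

count=0
# The glob expands to one item per matching file; quoting "$f" protects odd names
for f in "$workdir"/*.txt
do
    lines=$(wc -l < "$f" | tr -d ' ')
    echo "$(basename "$f"): $lines lines"
    count=$((count + 1))
done
echo "Processed $count files"

rm -r "$workdir"
```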

---


---

## Functions in Bash

- Functions group a set of commands that can be reused in a script.

```bash
greet() {
    echo "Hello, $1!"
}

greet "Alice"
```

- **Explanation**: Functions can take arguments. `$1` refers to the first argument.
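A function's printed output can be captured like that of any other command. The sketch below uses a hypothetical helper, `max_of`, that takes two arguments (`$1` and `$2`):

```bash
# Hypothetical helper: prints the larger of two integers
max_of() {
    local a="$1"
    local b="$2"
    if [ "$a" -ge "$b" ]; then
        echo "$a"
    else
        echo "$b"
    fi
}

# Capture what the function prints
largest=$(max_of 7 12)
echo "Largest: $largest"
```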

---

---

## Command Substitution

- Capture the output of a command and store it in a variable.

```bash
current_time=$(date)
echo "Current time: $current_time"
```

- **Explanation**: The `$(command)` syntax runs the command and stores the result.
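Command substitution works with whole pipelines, not just single commands. A sketch that counts files in a temporary directory (the file names are invented):

```bash
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.log"

# Capture the output of a pipeline: how many .txt files are there?
n_txt=$(find "$workdir" -name "*.txt" | wc -l | tr -d ' ')
echo "Found $n_txt text files"

# A substitution can also sit directly inside a quoted string
echo "Checked on $(date +%Y-%m-%d)"

rm -r "$workdir"
```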

---

---

## Redirecting Input and Output

- **>**: Redirects output to a file (overwrites).
- **>>**: Appends output to a file.
- **<**: Takes input from a file.

```bash
echo "Hello" > output.txt  # Write to file
cat output.txt             # Display file contents
```
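The example above only uses `>`. Here is a sketch of `<` as well, with a made-up file of fruit names:

```bash
workdir=$(mktemp -d)
printf 'banana\napple\ncherry\n' > "$workdir/fruit.txt"

# sort reads the file through stdin (<) and writes its result to a new file (>)
sort < "$workdir/fruit.txt" > "$workdir/sorted.txt"

# >> adds to the end of the file instead of overwriting it
echo "date" >> "$workdir/sorted.txt"

first=$(head -n 1 < "$workdir/sorted.txt")
n_lines=$(wc -l < "$workdir/sorted.txt" | tr -d ' ')
echo "First: $first, lines: $n_lines"

rm -r "$workdir"
```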

---

---

## Shell Scripting Best Practices

- **Use comments**: Make your script easier to understand.
- **Quote variables**: Prevent issues with spaces or special characters.
- **Test your script**: Always run your script with test cases before deploying.
- **Error handling**: Check for possible errors using conditionals.
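As one possible illustration of the quoting and error-handling points, the sketch below checks an input file before using it. The path is hypothetical and deliberately does not exist:

```bash
# Hypothetical input path; it does not exist, so the error branch runs
input="/no/such/dir/data.txt"

if [ -r "$input" ]; then
    status="ok"
    wc -l "$input"
else
    status="failed"
    echo "Error: cannot read $input" >&2    # report problems on stderr
    # In a real script you would typically stop here with: exit 1
fi

# Any command can be the condition: if tests its exit status
if grep -q "ERROR" /dev/null; then
    found="yes"
else
    found="no"
fi
echo "status=$status found=$found"
```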

---


---

## Conclusion

- Bash scripting allows you to automate tasks and improve your productivity in Unix/Linux environments.
- With knowledge of variables, conditionals, loops, and functions, you can create powerful scripts.
- Practice by writing small scripts and gradually build more complex ones.

---

# REPRODUCIBILITY!!!

# Introduction to Variables in Bash Scripting

---

## What is a Variable?

- A **variable** is a way to store data that can be referenced and manipulated within a script.
- Bash variables do not require explicit declaration of data types (e.g., string, integer). Everything is treated as a string.

---

---

## Basic Syntax

```bash
variable_name=value
```

- No spaces around the equal sign.
- **variable_name**: The name of the variable (should not start with a number).
- **value**: The data that is stored in the variable.

---

## Simple Example

```bash
greeting="Hello, World!"
echo $greeting
```

- **Output:**
  ```
  Hello, World!
  ```

---

---

## Accessing Variables

- Use the `$` symbol to access the value of a variable.

```bash
name="Alice"
echo "Hello, $name!"
```

- **Output:**
  ```
  Hello, Alice!
  ```

---

---

## Variable Types

### Local Variables

- Ordinary (non-exported) variables exist only in the current shell or script; programs started from it do not see them.

```bash
greeting="Hello"
echo $greeting
```

### Environment Variables

- Variables marked with `export` are passed on to every program and script started from the current shell.

```bash
export MY_VAR="Exported value"
echo $MY_VAR
```

### Positional Parameters

- Variables assigned automatically to command-line arguments in a script.

```bash
echo "First argument: $1"
echo "Second argument: $2"
```
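To experiment without writing a separate script file, `set --` can be used to fill in the positional parameters by hand. A sketch with invented argument values, which also shows `$#` (the number of arguments) and `"$@"` (all arguments):

```bash
# set -- replaces the positional parameters, as if the script had been run as:
#   ./script.sh sample.fastq 30
set -- sample.fastq 30

echo "First argument: $1"
echo "Second argument: $2"
echo "Number of arguments: $#"

# "$@" expands to every argument, each kept as a separate word
for arg in "$@"
do
    echo "arg: $arg"
done
n_args=$#
```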

---

---

## Read-Only Variables

- Prevent modification by marking a variable as read-only.

```bash
readonly PI=3.14159
echo $PI
```

- Trying to modify `PI` will result in an error.

---

In [None]:
%%bash

readonly PI=3.14159
echo $PI





---

## Using `read` to Assign Variables

- Use `read` to accept input from the user.

```bash
read -p "Enter your name: " user_name
echo "Hello, $user_name!"
```

- **Output (after inputting "Alice"):**
  ```
  Hello, Alice!
  ```

---

In [49]:
%%bash

read -p "Enter your name: " user_name


CalledProcessError: Command 'b'bash\nread -p "Enter your name: " user_name\n'' returned non-zero exit status 1.

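Note: `read` waits for input on stdin, which a notebook `%%bash` cell cannot supply, hence the error above. Switch to a terminal and run the command in bash explicitly.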

---

## Arithmetic Operations

- Use `$(( ))` (arithmetic expansion) for integer arithmetic; variables inside it need no `$` prefix.

```bash
x=5
y=3
sum=$((x + y))
echo "Sum: $sum"
```

- **Output:**
  ```
  Sum: 8
  ```
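Shell arithmetic works on integers only, so division discards the remainder. A short sketch with arbitrary values:

```bash
x=17
y=5

echo "Difference: $((x - y))"
echo "Product: $((x * y))"

quotient=$((x / y))     # integer division drops the fractional part
remainder=$((x % y))    # % gives the remainder
echo "$x / $y = $quotient remainder $remainder"
```

For decimal results you would need an external tool such as `awk` or `bc`.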

---

---

## Variable Scope

- Variables assigned inside a function are global by default; declare them with `local` to restrict them to the function.
- Variables defined outside any function are global and can be read inside functions.

### Example of Local Variable:

```bash
my_function() {
    local var="I'm local"
    echo $var
}
my_function
echo $var  # Prints an empty line: var only existed inside the function
```

### Example of Global Variable:

```bash
var="I'm global"
my_function() {
    echo $var
}
my_function
```
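In Bash, an assignment inside a function creates or changes a global variable unless it is declared with `local`. A sketch contrasting the two cases, with hypothetical function and variable names:

```bash
set_without_local() {
    leaked="visible outside"      # no local: this becomes a global variable
}
set_with_local() {
    local hidden="only inside"    # local: gone once the function returns
}

set_without_local
set_with_local

echo "leaked: '${leaked}'"
echo "hidden: '${hidden}'"
```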

---

---

## String Manipulation

### String Concatenation

- You can concatenate strings using variables.

```bash
first="Hello"
second="World"
greeting="$first, $second!"
echo $greeting
```

- **Output:**
  ```
  Hello, World!
  ```

### Getting the Length of a String

- Use `${#variable}` to get the length of a string.

```bash
str="Hello, World!"
echo ${#str}
```

- **Output:**
  ```
  13
  ```
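Two further expansions are handy when building output filenames: `${var##pattern}` removes the longest match from the start, and `${var%pattern}` removes the shortest match from the end. A sketch with a made-up path:

```bash
path="/data/run1/sample_01.fastq"    # made-up path

filename="${path##*/}"        # remove everything up to the last "/"
stem="${filename%.fastq}"     # remove the ".fastq" suffix
output="${stem}.trimmed.fastq"

echo "$filename"
echo "$stem"
echo "$output"
```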

---

---

## Best Practices

- **Use meaningful names**: Always name your variables descriptively.
- **Quote variable references**: Use `"$variable"` to avoid issues with spaces or special characters.
- **Avoid confusing names**: Do not reuse the names of commands (e.g., `echo`, `read`) or of existing environment variables such as `PATH`.
- **Scope your variables**: Prefer local variables when working inside functions.
- **Test before using**: You can check whether a variable is unset or empty with:

  ```bash
  if [ -z "$var" ]; then
      echo "Variable is unset or empty."
  fi
  ```

---


---

## Conclusion

- Variables are fundamental to bash scripting and are used to store and manipulate data.
- With proper variable usage, you can make your scripts more dynamic and powerful.

---