# 9. `for` Loop and Conditions in `awk`

In addition to using `for` loops in shell scripts, you can also utilize loops and conditions within the `awk` command. `awk` is a powerful text processing tool that allows you to perform various operations on text files.


By combining loops and conditions in `awk`, you can perform complex text processing operations based on specific criteria and patterns found in the input data.

**Remember that `awk` has its own syntax and capabilities for loops and conditions, so make sure to refer to the `awk` documentation or resources for more in-depth guidance on utilizing these features.**

## 9.1 Conditional Statement in `awk`

The `if` statement evaluates a condition that is interpreted as either __true__ or __false__. If the condition is satisfied, the commands you specify are executed. You can also define what the script should do if the condition is not satisfied. The syntax of the simplest **one-liner** conditional statement is as follows:

```bash
awk '{if(condition) {command1}}' file
```

`if-else` structure
```bash
awk '{if(condition) {command1} else {command2}}'  file
```
- `condition` is the expression that evaluates to true or false.
- If the `condition` is true, `command1` will be executed.
- If the `condition` is false, `command2` will be executed.

You can use this one-liner conditional statement in various scenarios where you want to execute different commands based on a condition. It provides a concise way of handling conditional logic within a single line of code.

`else if` structure (as many conditions as we want)
```bash
awk '{if(condition1) {command1} else if (condition2) {command2} else {command3}}'  file
```

__One-liner example:__

Contents of `numbers.dat`
```bash
1
2
3
4
5
6
7
8
9
```

In [None]:
awk '{if($1>5) {print $1" is greater than 5"} else {print $1" is not greater than 5"}}' files/numbers.dat

This can be improved by handling the case where the number is exactly 5 separately:

In [None]:
awk '{if($1>5) {print $1" > 5"} else if($1==5) {print $1" == 5"} else {print $1" < 5"}}' files/numbers.dat

As you can see, more complex one-liner commands can be less readable. When writing such a command in a script, it's worth formatting it properly to improve the readability of the code. 

__Example:__

The script checks whether the numbers in the file are greater than/less than/equal to 5. Of course, error handling such as handling letters or other non-numeric characters instead of numbers is not included in this example.

Content of `ifnumber_awk.bash`:
```bash
#!/bin/bash
awk '{
    if ($1 == 5) {
        print "Congratulations, your number is exactly 5"
    } else if ($1 > 5) {
        print "Your number is greater than 5"
    } else {
        print "Your number is less than 5"
    }
}' "$1"
```

In this example, the `awk` command is used with an inline `if-else` statement to evaluate the numbers and print corresponding messages based on the conditions. The code is properly formatted with indentation, making it easier to understand the flow of the conditional logic.

In [None]:
bash scripts/ifnumber_awk.bash files/numbers.dat
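
The script above assumes that every line holds a number. The following is a minimal sketch of one way to add the missing check, assuming that a valid input is an optionally signed integer or decimal. The regular expression and the inline sample values are illustrative and are not part of `ifnumber_awk.bash`:

```shell
#!/bin/bash
# Sample input (hypothetical): two numbers and one non-numeric line.
result=$(printf '7\nabc\n2.5\n' | awk '{
    # Assumption: a "number" is an optional sign, digits, optional decimal part.
    if ($1 !~ /^[-+]?[0-9]+(\.[0-9]+)?$/) {
        print $1 " is not a number"
    } else if ($1 > 5) {
        print $1 " is greater than 5"
    } else if ($1 == 5) {
        print $1 " is equal to 5"
    } else {
        print $1 " is less than 5"
    }
}')
echo "$result"
```

Placing the format check first means the numeric comparisons only ever see values that look like numbers.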

## 9.2 Conditional Statement - Regular Expression (regex)

__Example with a string condition: checking whether each email address in the file has the correct format, using a regular expression (enclosed in slashes, `/REGEX/`):__

Content of `mail_list.dat`:
```bash
surname@server.com
name.surname@server.edu.com
adres@email
123456
```

Content of `ifemail_awk.bash`:
```bash
#!/bin/bash
awk '{
    if ($1 ~ /^[A-Za-z0-9\._%+-]+@[A-Za-z0-9\.-]+\.[A-Za-z]{2,4}$/)
    {
        print "Address "$1" is valid"
    }
    else
    {
        print "Address "$1" is invalid"
    }
}' files/mail_list.dat
```

In this example, the `awk` command reads each line of the `mail_list.dat` file.

The condition `($1 ~ /^[A-Za-z0-9\._%+-]+@[A-Za-z0-9\.-]+\.[A-Za-z]{2,4}$/)` checks whether the first field (`$1`) matches the regular expression for an email address: a local part, an `@` sign, a domain name, and a final dot followed by 2 to 4 letters.

If the condition is met, it prints the message: **"Address `email_address` is valid"**

Otherwise, it prints: **"Address `email_address` is invalid"**

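Note that the regular expression used in this example is a simplified version for demonstration purposes and may not cover all possible valid email address formats. Also be aware that some `awk` implementations (for example, older versions of `mawk`) do not support the `{2,4}` interval syntax.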

In [None]:
bash scripts/ifemail_awk.bash
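
The match can also be inverted with the `!~` operator, for example to list only the lines that fail the check. The following is a minimal sketch with inline sample data. The pattern is an assumption made for brevity and is deliberately looser than the one in `ifemail_awk.bash`:

```shell
#!/bin/bash
# Inline sample data, so the sketch runs without files/mail_list.dat.
invalid=$(printf 'surname@server.com\nadres@email\n123456\n' | awk '{
    # !~ is true when the field does NOT match the regular expression.
    if ($1 !~ /^[^@]+@[^@]+\.[A-Za-z]+$/) {
        print $1
    }
}')
echo "$invalid"
```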

**In `awk`, numbers (including real numbers) and strings are compared using the same `if (...)` construction and the same operators. Unlike in `bash`, there is no need to choose between different types of brackets such as `[ ]`, `[[ ]]`, or `(( ))`.**
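
A short sketch of that point, using hypothetical inline data with a name in the first field and a real number in the second. Both comparisons use the same `if (...)` form:

```shell
#!/bin/bash
out=$(printf 'kiwi 3.75\norange 1.20\n' | awk '{
    if ($1 == "kiwi") {          # string comparison
        print $1 " matched by name"
    }
    if ($2 > 2.5) {              # numeric comparison with a real number
        print $1 " costs more than 2.5"
    }
}')
echo "$out"
```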

## 9.3 `for` loop in `awk`

The `for` loop executes a specific command repeatedly (known as iterating) a predetermined number of times. Each time a new iteration of the loop runs, the same action can be performed with a different value of the loop variable. In `awk`, the `for` loop has a slightly different form than the list-based loop commonly used in `bash`: it is a C-style loop with a counter that is initialized, tested against an end condition, and updated on every iteration. The syntax for the `for` loop is as follows:

```bash
awk '{for(initialize_variable; end_condition; increment_variable) {action}}' file
```

The loop can run over a range of numbers or over the elements of an array. Here are two examples.

__1) Example of a `for` loop with a range of numbers:__

Contents of `for_num_awk.bash`:
```bash
#!/bin/bash
awk 'BEGIN{
for(i=1; i<=10; i++)
    print i
}'
```

In [None]:
bash scripts/for_num_awk.bash  # on each iteration of the loop, the value of i is incremented by 1
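
Outside `BEGIN`, a counting loop is often run over the fields of each input line, using the built-in variable `NF` (the number of fields on the current line). The following is a minimal sketch that sums every line of some made-up inline data:

```shell
#!/bin/bash
sums=$(printf '1 2 3\n10 20\n' | awk '{
    total = 0
    # NF is the number of fields on the current line; $i is the i-th field.
    for (i = 1; i <= NF; i++) {
        total += $i
    }
    print "line " NR ": sum = " total
}')
echo "$sums"
```

Resetting `total` at the start of the block matters, because `awk` variables keep their values from one input line to the next.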

__2) `for` loop with array:__

Contents of `for_array.bash`:
```bash
#!/bin/bash
awk 'BEGIN{
    fruits["pineapple"] = "yellow";
    fruits["orange"] = "orange";
    fruits["kiwi"] = "green";
    fruits["pomegranate"] = "red";
    # i takes each key (index) of the array in turn; the order is not guaranteed
    for (i in fruits)
        print i
}'
```

In [None]:
bash scripts/for_array.bash
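
The loop variable in `for (i in fruits)` holds the key, and the value is read with `fruits[i]`. The sketch below prints both for a shortened version of the array. Because `awk` does not guarantee any particular order for `for ... in`, the output is piped through `sort` here to make it predictable:

```shell
#!/bin/bash
pairs=$(awk 'BEGIN {
    fruits["pineapple"] = "yellow"
    fruits["kiwi"] = "green"
    for (name in fruits) {
        # name is the key, fruits[name] is the stored value
        print name " is " fruits[name]
    }
}' | sort)
echo "$pairs"
```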

## 9.4 Breaking with `break` and continuing with `continue` in `awk`

In `awk`, you can also use the `break` and `continue` statements to control the flow of loops. Here's how they work:

- `break`: The `break` statement is used to exit the current loop immediately. It allows you to prematurely terminate the loop and continue executing the next statements after the loop.

- `continue`: The `continue` statement is used to skip the rest of the current iteration and move to the next iteration of the loop. It allows you to skip specific iterations and continue with the next iteration.

These statements can be useful when you want to conditionally break out of a loop or skip certain iterations based on certain conditions.

Please note that the usage of `break` and `continue` in `awk` follows a similar pattern as in other programming languages, and they are used within the context of `if` or other conditional statements to control the loop behavior.

Content of `for_break.bash`:<br>
Example script that, for each number in the file, finds its smallest divisor greater than 1 or reports that the number is prime.

```bash
#!/bin/bash
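awk '{
    number = $1;
    # Trial division: a composite number has a divisor no larger than its square root.
    for (divisor = 2; divisor * divisor <= number; divisor++)
    {
        if (number % divisor == 0)
        {
            break    # first divisor found, leave the loop early
        }
    };

    if (number < 2)
    {
        printf("%d is neither prime nor composite\n", number)
    }
    else if (divisor * divisor <= number)
    {
        # The loop ended through break, so divisor divides number.
        printf("The smallest divisor of %d is %d\n", number, divisor)
    }
    else
    {
        printf("%d is a prime number\n", number)
    }
}' files/numbers.dat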
```

In [None]:
bash scripts/for_break.bash

**Alternatively, you can skip the iteration of a loop when a specific condition is met using the `continue` command.**

Content of `for_continue.bash`:

Here the loop will iterate 21 times (for `x` from 0 to 20), but when the value of the variable is not divisible by 5, the loop will skip the remaining commands and move to the next iteration.

```bash
#!/bin/bash
awk 'BEGIN {
    for (x = 0; x <= 20; x++)
    {
        if (x % 5)
        {
            continue
        }
        printf "%d is divisible by 5\n", x
    }
}'
```

In [None]:
bash scripts/for_continue.bash
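
Both statements can appear in the same loop. The following is a minimal sketch over the fields of one made-up input line, assuming that negative values should be skipped and that a `0` marks the end of the useful data:

```shell
#!/bin/bash
kept=$(printf '4 -1 7 0 9\n' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i < 0) {
            continue    # skip negative values, go on with the next field
        }
        if ($i == 0) {
            break       # a zero ends the loop; later fields are ignored
        }
        print "kept " $i
    }
}')
echo "$kept"
```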

## 9.5 Generating Commands Using `awk`

Using `awk` and leveraging loops, it is also easy to generate commands that will be executed in `bash`. This is a clever way to quickly execute multiple commands, providing more formatting options and allowing for error detection before running the command.

For example, you can generate a list of commands that run the `gcd.sh` script, which calculates the greatest common divisor (GCD) of two numbers.

In [None]:
awk 'BEGIN{for(i=1;i<=5;i=i+0.5)printf("scripts/gcd.sh 10 %.1f\n",i)}'

In [None]:
awk 'BEGIN{for(i=1;i<=5;i=i+0.5)printf("scripts/gcd.sh 10 %.1f\n",i)}' | bash

Another example is generating commands to perform operations on files, such as copying or creating files/directories.

In [None]:
awk 'BEGIN{for(i=2;i<=16;i=i+2)printf("touch file_%d\n",i)}'

In [None]:
awk 'BEGIN{for(i=2;i<=16;i=i+2)printf("touch file_%d\n",i)}' | bash  # will generate even-numbered files

For comparison, here is a similar command written entirely in `bash`:

In [None]:
for i in $(seq 1 2 16); do echo "touch file_$i"; done | bash  # will generate odd-numbered files

In [None]:
ls file_*
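
One way to use the error-detection step mentioned at the start of this section is to store the generated commands, inspect them, and only then pass them to `bash`. The following is a minimal sketch, assuming a throwaway directory created with `mktemp -d` so that no real files are touched. The variable names are illustrative, and the generated commands would need quoting if the directory path contained spaces:

```shell
#!/bin/bash
workdir=$(mktemp -d)    # scratch directory for the demonstration

# Step 1: generate the commands, but do not run them yet.
commands=$(awk -v dir="$workdir" 'BEGIN {
    for (i = 2; i <= 6; i = i + 2) {
        printf("touch %s/file_%d\n", dir, i)
    }
}')

# Step 2: review what would be executed.
echo "$commands"

# Step 3: run the commands once they look correct.
echo "$commands" | bash

created=$(ls "$workdir")
echo "$created"
rm -r "$workdir"
```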