# Lesson 02: Built-in Functions

Awk comes with a variety of built-in functions. They are specified in the [man page](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/awk.1.html).

## Math functions

Basic mathematical functions are available:

In [1]:
awk -v pi=3.1415 'BEGIN {print exp(1), log(exp(1)), sqrt(2), sin(pi), cos(pi), atan2(pi, 2) }' 

2.71828 1 1.41421 9.26536e-05 -1 1.00387
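The hardcoded `pi=3.1415` above is why `sin(pi)` comes out as a small non-zero number. As a minimal sketch, a more precise value can be derived inside Awk with `atan2(0, -1)`. The `printf` format used here is just one illustrative choice.

```shell
# atan2(0, -1) evaluates to pi, so no -v assignment is needed.
awk 'BEGIN { pi = atan2(0, -1); printf "%.6f %.6f %.6f\n", pi, sin(pi), cos(pi) }'
# prints: 3.141593 0.000000 -1.000000
```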


The `rand` function generates random numbers in the range [0, 1), that is, 0 <= n < 1.

In [4]:
awk 'BEGIN { print rand(); print rand() }' 

0.237788
0.291066


By default, Awk starts with the same seed on every invocation, so running this command twice in a row returns the same result:

In [5]:
awk 'BEGIN { print rand(); print rand() }' 

0.237788
0.291066


The `srand` function can be used to set the seed:

In [8]:
awk 'BEGIN { srand(10); print rand(); print rand() }' 

0.255219
0.898883


In [7]:
awk 'BEGIN { srand(10); print rand(); print rand() }' 

0.255219
0.898883
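A common use of `rand` together with `int` is picking a random integer in a range. The sketch below simulates a six-sided die. It assumes that calling `srand()` with no argument seeds from the time of day, as POSIX describes, so in most implementations two runs within the same second can still produce the same value.

```shell
# srand() with no argument seeds from the current time.
# int(rand() * 6) is an integer from 0 to 5; adding 1 shifts it to 1..6.
awk 'BEGIN { srand(); print int(rand() * 6) + 1 }'
```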


The `int` function returns "the nearest integer to x, located between x and zero and truncated toward zero".

In [6]:
awk 'BEGIN { print "int(0.9)  =  " int(0.9); print "int(-0.9) = " int(-0.9) }' 

int(0.9)  =  0
int(-0.9) = 0
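Because `int` truncates rather than rounds, rounding to the nearest integer needs a small helper. The `round` function below is a hypothetical user-defined function, not an Awk built-in. It is a sketch that rounds halves away from zero.

```shell
# round() is defined here for illustration; Awk does not provide it.
awk 'function round(x) { return x < 0 ? int(x - 0.5) : int(x + 0.5) }
     BEGIN { print round(0.9), round(-0.9), round(2.4) }'
# prints: 1 -1 2
```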


## String functions

### `substr`

`substr(s, m, n)` returns `the n-character substring of s that begins at position m counted from 1`.

In [9]:
awk '{ print $1, substr($1, 2, 3) }' ./data/field_data.txt

Roses ose
Violets iol
Sugar uga
And nd
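The length argument is optional. When `n` is omitted, `substr(s, m)` returns everything from position `m` to the end of the string. A small sketch using a literal string instead of the data file:

```shell
# Without a length, substr returns the rest of the string.
awk 'BEGIN { s = "Violets"; print substr(s, 4), substr(s, 1, 3) }'
# prints: lets Vio
```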


### `index`
`index(s, t)` returns `the position in s where the string t occurs, or 0 if it does not.`

`index` searches for `t` as a literal string, not as a regular expression.

In [10]:
awk '{ print $1, index($1, "s") }' ./data/field_data.txt

Roses 3
Violets 7
Sugar 0
And 0
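`index` pairs naturally with `substr` when a string has to be cut at a known delimiter. The sketch below splits a made-up `key=value` string at the first `=`. The input string and variable names are illustrative only.

```shell
# i is the 1-based position of the first "=" in s.
awk 'BEGIN { s = "key=value"; i = index(s, "="); print substr(s, 1, i - 1), substr(s, i + 1) }'
# prints: key value
```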


### `match`
`match(s, r)` returns `the 1-based position in s where the regular expression r occurs, or 0 if it does not. The  variables RSTART and RLENGTH are set to the position and length of the matched string.`

`match` is like `index` except the pattern is a regular expression.

In [11]:
awk '{ print $1, match($1, "[sS]") }' ./data/field_data.txt

Roses 3
Violets 7
Sugar 1
And 0


In [12]:
# Find the first run of three lowercase letters; RLENGTH holds the match length
awk '{ match($1, "[a-z]{3}"); print $1, "\tpattern start:", RSTART, "\tpattern length:", RLENGTH }' ./data/letters.txt

a 	pattern start: 0 	pattern length: -1
bb 	pattern start: 0 	pattern length: -1
ccc 	pattern start: 1 	pattern length: 3
dddd 	pattern start: 1 	pattern length: 3
ggg 	pattern start: 1 	pattern length: 3
hh 	pattern start: 0 	pattern length: -1
i 	pattern start: 0 	pattern length: -1
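`RSTART` and `RLENGTH` are exactly the arguments `substr` needs, so the matched text itself can be extracted. A minimal sketch, using a made-up input string and a regular expression literal:

```shell
# match() returns 0 (false) when nothing matches, so it can guard the print.
awk 'BEGIN { s = "order 12345 shipped"; if (match(s, /[0-9]+/)) print substr(s, RSTART, RLENGTH) }'
# prints: 12345
```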


### `split`

`split(s, a, fs)` splits `the string s into array elements a[1], a[2], ..., a[n], and returns n.`

`The separation is done with the regular expression fs or with the field separator FS if fs is not given. An empty string as field separator splits the string into one array element per character.`

In [11]:
awk 'BEGIN { print split("It-was_the-best_of-times", output_array, "[-_]"), output_array[2], output_array[4] }'

6 was best
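Since `split` returns the number of elements, that return value can drive a loop over the array. The sketch below uses an illustrative comma-separated string and a hypothetical array name `colors`.

```shell
# n is the element count; the array is indexed from 1 to n.
awk 'BEGIN {
    n = split("red,green,blue", colors, ",")
    for (i = 1; i <= n; i++) print i, colors[i]
}'
# prints:
# 1 red
# 2 green
# 3 blue
```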


### `sub`
`sub(r, t, s)` substitutes `t for the first occurrence of the regular expression r in the string s. If s is not given, $0 is used.`

`s` must be something that can be assigned to (a variable, a field, or an array element), because `sub` modifies it in place. Instead of returning the substituted string, it returns the number of substitutions made (0 or 1).

In [1]:
awk 'BEGIN { s = "It was the best of times, it was the worst of times"; \
             print "Num. matches replaced:",  sub("times", "gifs", s ); \
             print s  }'

Num. matches replaced: 1
It was the best of gifs, it was the worst of times
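Two details of `sub` are worth a quick sketch: with no third argument it edits `$0`, and an `&` in the replacement stands for the matched text. The input line here is made up for illustration.

```shell
# "&" in the replacement is replaced by whatever the regex matched.
echo "hello world" | awk '{ sub(/world/, "[&]"); print }'
# prints: hello [world]
```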


### `gsub`
`gsub` does the `same as sub except that all occurrences of the regular expression are replaced; sub and gsub return the number of replacements.`

In [8]:
awk 'BEGIN { s = "It was the best of times, it was the worst of times"; \
             print "Num. matches replaced:", gsub("times", "cats", s ); \
             print s  }'

Num. matches replaced: 2
It was the best of cats, it was the worst of cats
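Because `gsub` returns the number of replacements, it can double as a way to count matches. In this sketch each match is replaced with itself via `&`, so the string is left unchanged. The sample string is illustrative.

```shell
# Replacing each "a" with "&" (the match itself) counts without altering s.
awk 'BEGIN { s = "banana"; n = gsub(/a/, "&", s); print n, s }'
# prints: 3 banana
```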


### `sprintf`
`sprintf(fmt, expr, ... )` returns `the string resulting from formatting expr ...  according to the printf(3) format fmt`

In [21]:
awk 'BEGIN { x = sprintf("[%8.3f]", 3.141592654); print x }'

[   3.142]
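The same format strings handle zero padding and column alignment. The sketch below shows a zero-padded integer, a left-justified string, and a right-justified string. The values and the `|` separators are chosen only to make the padding visible.

```shell
# %05d pads with zeros, %-6s left-justifies, %6s right-justifies.
awk 'BEGIN { s = sprintf("%05d|%-6s|%6s|", 42, "ab", "cd"); print s }'
# prints: 00042|ab    |    cd|
```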
