# Lesson 02: Built-in Functions
Awk comes with a variety of built-in functions. They are documented in the man page: https://linux.die.net/man/1/awk

## Math functions
Basic mathematical functions are available:

In [6]:
awk --version

GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2016 Free Software Foundation.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.


In [7]:
awk -v pi=3.1415 'BEGIN {print exp(1), log(exp(1)), sqrt(2), sin(pi), cos(pi), atan2(pi, 2) }'


2.71828 1 1.41421 9.26536e-05 -1 1.00387
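
The truncated constant `pi=3.1415` is why `sin(pi)` above prints `9.26536e-05` rather than a value closer to zero. A common idiom, sketched here, is to derive pi from `atan2(0, -1)`. The shell variable `trig` is just an illustrative name.

```shell
# Derive pi from atan2 instead of passing a truncated constant with -v.
trig=$(awk 'BEGIN {
    pi = atan2(0, -1)      # angle of the point (-1, 0), which is pi
    printf "%.6f %.6f %.6f\n", pi, sin(pi), cos(pi)
}')
echo "$trig"    # prints: 3.141593 0.000000 -1.000000
```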


In [11]:
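# rand() returns a pseudo-random number in [0, 1).
# Without a call to srand, every run starts from the same seed, so both invocations below print the same sequence.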
awk 'BEGIN { print rand(); print rand(); }'
echo
awk 'BEGIN { print rand(); print rand(); }'

0.237788
0.291066

0.237788
0.291066


In [12]:
# The srand function can be used to set the seed:

awk 'BEGIN { srand(10); print rand(); print rand() }'


0.255219
0.898883
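
Calling `srand()` with no argument seeds the generator from the time of day, so successive runs usually differ (two runs within the same second may still repeat if the implementation seeds with whole seconds). A minimal sketch of a die roll built from `rand()` and `int()`; `roll` is an illustrative name.

```shell
roll=$(awk 'BEGIN {
    srand()                      # no argument: seed from the current time
    print int(rand() * 6) + 1    # rand() is in [0, 1), so the result is 1..6
}')
echo "$roll"
```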


In [13]:
# The int function returns "the nearest integer to x, located between x and zero and truncated toward zero".
awk 'BEGIN { print "int(0.9)  =  " int(0.9); print "int(-0.9) = " int(-0.9) }'

int(0.9)  =  0
int(-0.9) = 0
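
Because `int` truncates rather than rounds, rounding to the nearest integer has to be built on top of it. One possible sketch follows; `round_half_away` is a hypothetical helper name, not an awk built-in.

```shell
rounded=$(awk '
# Hypothetical helper: round half away from zero using int().
function round_half_away(x) { return (x < 0) ? int(x - 0.5) : int(x + 0.5) }
BEGIN { print round_half_away(0.9), round_half_away(-0.9), round_half_away(2.4) }')
echo "$rounded"    # prints: 1 -1 2
```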


## String functions



In [15]:
### substr

# substr(s, m, n) returns the substring of s that starts at position m (counting from 1) and is at most n characters long.

cat ./data/field_data.txt
echo
awk '{ print $1, substr($1, 2, 3) }' ./data/field_data.txt

Roses are red,
Violets are blue,
Sugar is sweet,
And so are you.

Roses ose
Violets iol
Sugar uga
And nd
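
When the length argument is omitted, `substr(s, m)` returns everything from position m to the end of the string. A small sketch, using inline input instead of the data file so it runs on its own:

```shell
tails=$(printf 'Roses\nAnd\n' | awk '{ print substr($1, 2), length($1) }')
echo "$tails"
# prints:
# oses 5
# nd 3
```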


In [16]:
### index
# index(s, t) returns the position in s where the string t first occurs, or 0 if it does not occur.

# The search string t is matched literally; it is NOT a regular expression.

awk '{ print $1, index($1, "s") }' ./data/field_data.txt


Roses 3
Violets 7
Sugar 0
And 0
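
`index` is case-sensitive, which is why `Sugar` reports 0 above. One way to search without regard to case, sketched here with inline input, is to lowercase the string first with `tolower`:

```shell
positions=$(printf 'Roses\nSugar\nAnd\n' | awk '{ print $1, index(tolower($1), "s") }')
echo "$positions"
# prints:
# Roses 3
# Sugar 1
# And 0
```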


In [17]:
### match
# match(s, r) returns the position in s where the regular expression r first matches, or 0 if it does not match.
# The variables RSTART and RLENGTH are set to the position and length of the matched string.

# match is like index except that the pattern is a regular expression.

awk '{ print $1, match($1, "[sS]") }' ./data/field_data.txt

Roses 3
Violets 7
Sugar 1
And 0


In [18]:
# Find the first run of three lowercase letters (the letters need not be identical).
# RLENGTH is the length of the match, not its end position; when nothing matches, RSTART is 0 and RLENGTH is -1.
awk '{ match($1, "[a-z]{3}"); print $1, "\tpattern start:", RSTART, "\tpattern length:", RLENGTH }' ./data/letters.txt

a 	pattern start: 0 	pattern length: -1
bb 	pattern start: 0 	pattern length: -1
ccc 	pattern start: 1 	pattern length: 3
dddd 	pattern start: 1 	pattern length: 3
ggg 	pattern start: 1 	pattern length: 3
hh 	pattern start: 0 	pattern length: -1
i 	pattern start: 0 	pattern length: -1
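
Since `RSTART` and `RLENGTH` describe where the match sits, they can be passed straight to `substr` to extract the matched text. A minimal sketch with a made-up input string:

```shell
digits=$(echo 'order-1234-x' | awk '{
    if (match($0, /[0-9]+/))    # sets RSTART and RLENGTH on success
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')
echo "$digits"    # prints: 7 4 1234
```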


In [19]:
### split
# split(s, a, fs) splits the string s into array elements a[1], a[2], ..., a[n], and returns n.

# The separation is done with the regular expression fs or with the field separator FS if fs is not given.
# An empty string as field separator splits the string into one array element per character.

# Inside a bracket expression a leading "-" is literal, so no backslash is needed (gawk warns about "\-" in a string).
awk 'BEGIN { print split("It-was_the-best_of-times", output_array, "[-_]"), output_array[2], output_array[4] }'

6 was best
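
The return value of `split` is the natural upper bound for a loop over the resulting array. A short sketch (the array name `words` is arbitrary):

```shell
parts=$(awk 'BEGIN {
    n = split("It-was_the-best_of-times", words, /[-_]/)
    for (i = 1; i <= n; i++)    # elements are numbered 1..n
        print i, words[i]
}')
echo "$parts"
```

This prints one numbered word per line, from `1 It` through `6 times`.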


In [20]:
### sub
# sub(r, t, s) substitutes t for the first occurrence of the regular expression r in the string s.
# If s is not given, $0 is used.

# s must be a variable, field, or array element, because sub modifies it in place.
# Instead of returning the substituted string, it returns the number of substitutions made (0 or 1).

awk 'BEGIN { s = "It was the best of times, it was the worst of times"; \
             print "Num. matches replaced:",  sub("times", "gifs", s ); \
             print s  }'


Num. matches replaced: 1
It was the best of gifs, it was the worst of times
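
In the replacement text, an unescaped `&` stands for whatever the regular expression matched, which allows wrapping a match rather than replacing it. A minimal sketch:

```shell
marked=$(awk 'BEGIN {
    s = "It was the best of times"
    sub(/best/, "[&]", s)    # & stands for the text that matched
    print s
}')
echo "$marked"    # prints: It was the [best] of times
```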


In [21]:
### gsub
# gsub does the same as sub except that all occurrences of the regular expression are replaced; 
# sub and gsub return the number of replacements.

awk 'BEGIN { s = "It was the best of times, it was the worst of times"; \
             print "Num. matches replaced:", gsub("times", "cats", s ); \
             print s  }'

Num. matches replaced: 2
It was the best of cats, it was the worst of cats
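
As with `sub`, leaving out the third argument makes `gsub` operate on `$0`. The sketch below collapses runs of spaces in an input line and reports how many runs were replaced; the input line is made up for illustration.

```shell
squeezed=$(printf 'Sugar  is   sweet,\n' | awk '{
    n = gsub(/ +/, " ")    # no third argument: operates on $0
    print n ": " $0
}')
echo "$squeezed"    # prints: 2: Sugar is sweet,
```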


In [22]:
### sprintf
# sprintf(fmt, expr, ... ) returns the string resulting from formatting expr ...  
# according to the printf(3) format fmt

awk 'BEGIN { x = sprintf("[%8.3f]", 3.141592654); print x }'

[   3.142]
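
Several conversions can be combined in one format string. This sketch shows a left-justified string (`%-6s`), a right-justified integer (`%4d`), and a zero-padded float (`%05.1f`); the values are arbitrary.

```shell
row=$(awk 'BEGIN { print sprintf("%-6s|%4d|%05.1f", "awk", 42, 3.14159) }')
echo "$row"    # prints: awk   |  42|003.1
```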
