In [1]:
%load_ext iawk

> For the purpose of this tutorial, the long-form versions of the arguments are used.

## Inline help

In [2]:
%%awk_input?

```ipython
Docstring: Sets input text data.
File:      ~/src/projects/awkupy/iawk/__init__.py
```

In [3]:
%%awk?

```ipython
  %awk [-f FILE] [-F FIELD_SEPARATOR] [-p] [-s SAVE_AS] [-d] [-v VARIABLE] [-e SOURCE] [input]

Execute awk via awkupy API.

Examples::

    %load_ext iawk

    %%awk LICENSE
    /^  [0-9]/{print $0}

    output = %awk -e '/^  [0-9]/{print $0}' LICENSE
    output.splitlines()[0]

positional arguments:
  input

optional arguments:
  -f FILE, --file FILE  external awk program
  -F FIELD_SEPARATOR, --field-separator FIELD_SEPARATOR
                        separate column using a string
  -p, --print-output    print output to stdout
  -s SAVE_AS, --save-as SAVE_AS
                        save output to a file
  -d, --debug           debug awk code
  -v VARIABLE, --variable VARIABLE
                        pass variables to awk, for example, var={value}
  -e SOURCE, --source SOURCE
                        program text for oneliners
File:      ~/src/projects/awkupy/iawk/__init__.py
```

## Set input to be reused

In [4]:
%%awk_input
1 2 3
4 5 6
7 8 9 10 text

## Simple arithmetic over rows

In [5]:
%awk --source '{print $1+$2*$3}'

'7\n34\n79\n'

In [6]:
print(_.splitlines())

['7', '34', '79']
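Because `%awk` returns a plain string, the values can be post-processed with ordinary Python. The following is a minimal sketch that does not call `iawk` at all: the `raw` string is simply copied from the output above, and the name `values` is only illustrative.

```python
# Output string copied verbatim from the %awk call above.
raw = '7\n34\n79\n'

# splitlines() drops the trailing newline, so no empty entry appears.
values = [int(line) for line in raw.splitlines()]

print(values)       # the three per-row results as integers
print(sum(values))  # their total
```

Converting to `int` works here because every row produced a whole number; for fractional results `float` would be the safer choice.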


> N.B.: The output can also be printed directly to standard output with the `-p` / `--print-output` argument. At the moment this works only in the IPython console.

### Debug the magic command

In [7]:
%awk --debug --source '{print $1+$2*$3}'

Code:
{print $1+$2*$3}
Input:
1 2 3
4 5 6
7 8 9 10 text



'7\n34\n79\n'

### Save to a file

In [8]:
%awk --save-as output.txt --source '{print $1+$2*$3}'

In [9]:
%cat output.txt

7
34
79


## Awk-like arguments `-f`, `-F`, `-v`

In [10]:
%%awk_input
1,2,3
4, 5, 6
7 ,8, 9,10,text

### Custom field separators

In [11]:
%awk --field-separator=, --source '{print $1+$2*$3}'

'7\n34\n79\n'

> N.B.: Awk copes well with poorly formatted CSVs: the stray spaces around the fields in the input above do not affect the arithmetic.
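To see why the stray spaces are harmless, it can help to reproduce the same computation in plain Python. This is only a rough analogue, not how Awk or `iawk` is implemented: Python's `float()` happens to ignore surrounding whitespace, much as Awk ignores it when it converts a field to a number. The helper name `first_three_as_numbers` is hypothetical.

```python
# The same three rows that were passed to %%awk_input above.
rows = ["1,2,3", "4, 5, 6", "7 ,8, 9,10,text"]


def first_three_as_numbers(row):
    """Split on commas and convert the first three fields to floats."""
    fields = row.split(",")
    # float() tolerates leading and trailing whitespace, e.g. " 5" or "7 ".
    return [float(field) for field in fields[:3]]


results = []
for row in rows:
    a, b, c = first_three_as_numbers(row)
    results.append(a + b * c)  # mirrors the awk expression $1+$2*$3

print(results)
```

The extra fields in the last row (`10` and `text`) are never touched, which matches the Awk program: it only references `$1`, `$2` and `$3`.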

### Pass Python variables into Awk and retrieve output

> Let us pass `np.pi` into Awk as the variable `x` and parse the result into a NumPy array.

In [12]:
import numpy as np

result = %awk --field-separator=, --variable x={np.pi} --source '{print $1+$2*$3+x}'
print(np.fromstring(result, sep="\n"))

[10.1416 37.1416 82.1416]


## Execute standalone Awk program files

In [13]:
%ls

coins_histogram.awk*   coins_histogram.py  output.txt
coins_histogram.ipynb  coins.txt           tutorial.ipynb


In [14]:
%awk --file coins_histogram.awk coins.txt

'Country: USA  count:  3\nCountry: PRC  count:  1\nCountry: Austria-Hungary  count:  1\nCountry: Canada  count:  1\nCountry: Switzerland  count:  1\nCountry: RSA  count:  2\n'
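The returned string is easier to work with once it is turned into a dictionary. The sketch below assumes the line layout shown in the output above (`Country: <name>  count:  <n>`); the regular expression and the names `pattern` and `counts` are illustrative, and the `raw` string is copied from the output rather than produced by running `coins_histogram.awk`.

```python
import re

# Output string copied verbatim from the %awk --file call above.
raw = (
    'Country: USA  count:  3\n'
    'Country: PRC  count:  1\n'
    'Country: Austria-Hungary  count:  1\n'
    'Country: Canada  count:  1\n'
    'Country: Switzerland  count:  1\n'
    'Country: RSA  count:  2\n'
)

# Lazy match for the country name, then the integer count at end of line.
pattern = re.compile(r"^Country: (.+?)\s+count:\s+(\d+)$")

counts = {}
for line in raw.splitlines():
    match = pattern.match(line)
    if match:  # skip any line that does not follow the assumed layout
        counts[match.group(1)] = int(match.group(2))

print(counts)
print(sum(counts.values()))  # total number of coins counted
```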

## Cell magic

> Complicated Awk programs are best written on multiple lines using the `%%awk` cell magic.

In [15]:
%%awk_input
Pat   100 97 58
Sandy  84 72 93
Chris  72 92 89

In [16]:
%%awk
{
    sum = $2 + $3 + $4
    avg = sum / 3
    print $1, avg
}

'Pat 85\nSandy 83\nChris 84.3333\n'

In [17]:
print(_)

Pat 85
Sandy 83
Chris 84.3333
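As a cross-check, the same averages can be computed in plain Python. This is a sketch for comparison only and does not use `iawk`; the `.6g` format specifier is chosen to mimic Awk's default output format for numbers (`OFMT`, `%.6g`), which is why `84.3333` is shown with four decimals while whole numbers carry none.

```python
# The same rows that were passed to %%awk_input above.
data = """Pat   100 97 58
Sandy  84 72 93
Chris  72 92 89
"""

lines = []
for row in data.splitlines():
    # split() with no argument collapses runs of whitespace, like awk's
    # default field splitting.
    name, *scores = row.split()
    avg = sum(int(score) for score in scores) / 3
    # .6g keeps six significant digits and drops trailing zeros.
    lines.append(f"{name} {avg:.6g}")

output = "\n".join(lines) + "\n"
print(output, end="")
```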

