In [1]:
%load_ext iawk

> For the purpose of this tutorial, the long-form versions of the arguments are used.

## Inline help

In [2]:
%%awk_input?

```ipython
Docstring: Sets input text data.
File:      ~/src/projects/awkupy/iawk/__init__.py
```

In [3]:
%%awk?

```ipython
Docstring:
::

  %awk [-d] [-f FILE] [-F FIELD_SEPARATOR] [-o [PRETTY_PRINT]] [-s SAVE_AS] [-u] [-v VARIABLE] [-e SOURCE] [input]

Execute awk via awkupy API.

Examples::

    %load_ext iawk

    %%awk LICENSE
    /^  [0-9]/{print $0}

    output = %awk -e '/^  [0-9]/{print $0}' LICENSE
    output.splitlines()[0]

positional arguments:
  input

optional arguments:
  -d, --debug           debug awk code
  -f FILE, --file FILE  external awk program
  -F FIELD_SEPARATOR, --field-separator FIELD_SEPARATOR
                        separate column using a string
  -o <[PRETTY_PRINT]>, --pretty-print <[PRETTY_PRINT]>
                        pretty print program to a file; if no file is mentioned the program is saved to awkprof.out
  -s SAVE_AS, --save-as SAVE_AS
                        save output to a file
  -u, --stdout          print to stdout
  -v VARIABLE, --variable VARIABLE
                        pass variables to awk, for example, var={value}
  -e SOURCE, --source SOURCE
                        program text for oneliners
File:      ~/src/projects/awkupy/iawk/__init__.py
```

## Set input to be reused

> The input set below is reused by the `%awk` calls that follow.

In [4]:
%%awk_input
1 2 3
4 5 6
7 8 9 10 text

## Line magic `%awk`

> Simple operations with awk one-liners can be run using a line magic.
> The awk source code is passed using the `-e / --source` argument. Unlike in `gawk`, where the `-e` flag is optional, here it is mandatory so that the program text cannot be mistaken for an input file.

In [5]:
%awk --source '{print $1+$2*$3}'

'7\n34\n79\n'

> The result can be seamlessly processed in the Python world

In [6]:
print(_.splitlines())

['7', '34', '79']
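> As a minimal sketch of that follow-up processing, the returned string can be converted to integers with plain Python. The literal below simply copies the output shown above; the names `output`, `values` and `total` are illustrative.

```python
# Value returned by the `%awk --source ...` call above, copied as a literal.
output = '7\n34\n79\n'

# splitlines() drops the trailing newline, so no empty last item appears.
values = [int(line) for line in output.splitlines()]
total = sum(values)

print(values, total)
```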


### Printing to stdout

> By default the output is returned to the IPython shell, ready to be reused. It can also be printed to standard output using the `-u / --stdout` argument.

In [7]:
%awk --stdout --source '{print $1+$2*$3}'

7
34
79



### Debug the magic command

In [8]:
%awk --debug --stdout --source '{print $1+$2*$3}'

- Code:
  {print $1+$2*$3}
  
- Input:
  1 2 3
  4 5 6
  7 8 9 10 text
  
- Output:
  stdout
  
7
34
79



### Save to a file

In [9]:
%awk --save-as output.txt --source '{print $1+$2*$3}'

CompletedProcess(args=['awk', '{print $1+$2*$3}'], returncode=0)

In [10]:
%cat output.txt

7
34
79
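
> The `CompletedProcess` value above suggests that the magic runs `awk` through Python's `subprocess` module. The following is only a sketch under that assumption, not the actual iawk or awkupy code: `run_awk` and its `save_as` parameter are hypothetical names, and the example is skipped when no `awk` binary is on the `PATH`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_awk(program, text, save_as=None):
    """Hypothetical helper: pipe `text` through awk and return its stdout."""
    completed = subprocess.run(
        ["awk", program],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if awk exits with an error
    )
    if save_as is not None:
        Path(save_as).write_text(completed.stdout)
    return completed.stdout


data = "1 2 3\n4 5 6\n7 8 9 10 text\n"

if shutil.which("awk"):  # skip quietly when awk is not installed
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "output.txt"
        run_awk("{print $1+$2*$3}", data, save_as=target)
        print(target.read_text(), end="")
```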


## Awk-like arguments `-f, -F, -v`

In [11]:
%%awk_input
1,2,3
4, 5, 6
7 ,8, 9,10,text

### Custom field separators

In [12]:
%awk --field-separator=, --source '{print $1+$2*$3}'

'7\n34\n79\n'

> N.B.: Awk copes well with poorly formatted CSVs: the stray spaces around the separators above do not affect the arithmetic.
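
> To illustrate why the stray spaces are harmless, here is a rough pure-Python approximation of the numeric coercion awk applies to each field. It is a simplification: awk uses the leading numeric prefix of a string, whereas this sketch treats any field that does not parse as a whole as 0. `fields_to_numbers` is a hypothetical helper name.

```python
def fields_to_numbers(line, separator=","):
    """Split a line and coerce each field to a number, defaulting to 0."""
    numbers = []
    for field in line.split(separator):
        try:
            # strip() removes the blanks left around the separator
            numbers.append(float(field.strip()))
        except ValueError:
            # non-numeric text such as "text" counts as 0, as in awk
            numbers.append(0.0)
    return numbers


rows = ["1,2,3", "4, 5, 6", "7 ,8, 9,10,text"]

results = []
for row in rows:
    first, second, third = fields_to_numbers(row)[:3]
    results.append(first + second * third)

print(results)
```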

### Pass Python variables into Awk and retrieve output

> Let us pass `np.pi` as `x` into Awk and parse the result into a NumPy array.

In [13]:
import numpy as np

result = %awk --field-separator=, --variable x={np.pi} --source '{print $1+$2*$3+x}'
print(np.fromstring(result, sep="\n"))

[10.1416 37.1416 82.1416]
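
> Two details explain the numbers above. IPython expands `{np.pi}` into the text of the Python value before the magic receives its arguments, and awk prints non-integer numbers with its default output format `%.6g`, which is why only four decimals survive. The sketch below reproduces both steps in plain Python, using `math.pi` in place of `np.pi` (both are the same float); the variable names are illustrative.

```python
import math

# What the magic receives after IPython expands `{np.pi}`.
variable_argument = f"x={math.pi}"

# Per-row values of $1+$2*$3 from the earlier cells.
sums = [7, 34, 79]

# awk's default OFMT is "%.6g", i.e. six significant digits.
formatted = [format(value + math.pi, ".6g") for value in sums]

print(variable_argument)
print(formatted)
```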


## Cell magic

In [14]:
%%awk_input
Pat   100 97 58
Sandy  84 72 93
Chris  72 92 89

> Complicated awk programs are best written over multiple lines using the `%%awk` cell magic.

In [15]:
%%awk --stdout
{
    sum = $2 + $3 + $4
    avg = sum / 3
    print $1, avg
}

Pat 85
Sandy 83
Chris 84.3333
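
> For comparison, a pure-Python sketch of the same computation. Like awk's default field separator, `str.split()` with no argument splits on runs of blanks, so the uneven spacing in the input does not matter. The variable names are illustrative, and the `.6g` format mimics awk's default number formatting.

```python
records = """Pat   100 97 58
Sandy  84 72 93
Chris  72 92 89
"""

lines = []
for record in records.splitlines():
    # First field is the name, the remaining three are the scores.
    name, *scores = record.split()
    avg = sum(int(score) for score in scores) / 3
    lines.append(f"{name} {avg:.6g}")

print("\n".join(lines))
```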



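> With `--stdout` nothing is returned to IPython, so `_` still holds the most recently returned result rather than the averages.

In [16]: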
print(_)

7
34
79



## Execute standalone Awk program files

In [17]:
%ls

[0m[01;31mcoins_histogram.awk[0m*   [32mcoins_histogram.py[0m  [32moutput.txt[0m
[34mcoins_histogram.ipynb[0m  [32mcoins.txt[0m           [34mtutorial.ipynb[0m


In [18]:
%awk --file coins_histogram.awk --stdout coins.txt

Country: USA  count:  3
Country: PRC  count:  1
Country: Austria-Hungary  count:  1
Country: Canada  count:  1
Country: Switzerland  count:  1
Country: RSA  count:  2



See the [coins_histogram](coins_histogram.ipynb) demo for a polyglot-programming version of this example.