# aq_pp command -eval

In this notebook, we'll go over common usage examples of the data preprocessing command `aq_pp`, focusing primarily on the `-eval` option.

## Objective

The objective of this notebook is to familiarize new users of `aq_tool` with the `-eval` option. By the end of this sample, they should understand the basic syntax and usage of the option and be able to perform basic operations comfortably. 
Advanced applications, such as usage with other options, will be covered in the future. 

Before going through this notebook, make sure you're familiar with the following concepts.

* Bash commands
* Regular Expression
* aq_input / input-spec 

We won't go over input, column and output specs in this notebook. They are covered in
- [the aq_input notebook](aq_input.ipynb)
- [the aq_output notebook](aq_output.ipynb)

Also keep the [aq_pp documentation](http://auriq.com/documentation/source/reference/manpages/aq_pp.html#filt) at hand, so you can refer to the details of each option as needed.



## Overview

The `-eval` option of the `aq_pp` command is responsible for data manipulation and column creation. Given an expression and a destination column, it _evaluates_ the expression and stores the result in the destination column. More details are available in the [eval section of the aq_pp documentation](http://auriq.com/documentation/source/reference/manpages/aq_pp.html#eval).

### Syntax

```bash
aq_pp ... -eval ColSpec|ColName Expr
```
where
- `ColSpec`: column spec (type and name) of a new column that receives the result
- `ColName`: name of an existing column that receives the result
- `Expr`: expression to be evaluated


## Data

This sample uses the [Ramen Ratings Dataset](https://www.kaggle.com/residentmario/ramen-ratings) from Kaggle, which contains ratings of 2500 ramen products. 

Review|Brand|Variety|Style|Country|Stars
---|---|---|---|---|---|
2580|New Touch|T's Restaurant Tantanmen|Cup|Japan|3.75
2579|Just Way|Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles|Pack|Taiwan|1
2578|Nissin|Cup Noodles Chicken Vegetable|Cup|USA|2.25
2577|Wei Lih|GGE Ramen Snack Tomato Flavor|Pack|Taiwan|2.75
2576|Ching's Secret|Singapore Curry|Pack|India|3.75
2575|Samyang Foods|Kimchi song Song Ramen|Pack|South Korea|4.75
2574|Acecook|Spice Deli Tantan Men With Cilantro|Cup|Japan|4
2573|Ikeda Shoku|Nabeyaki Kitsune Udon|Tray|Japan|3.75
2572|Ripe'n'Dry|Hokkaido Soy Sauce Ramen|Pack|Japan|0.25
2571|KOKA|The Original Spicy Stir-Fried Noodles|Pack|Singapore|2.5

The columns and corresponding data types of the dataset are as follows.
- `int: Review #`: review ID number; the more recent the review, the larger the number
- `str: Brand`: brand / manufacturer of the product
- `str: Variety`: title of the product
- `str: Style`: style category of the product: cup, pack or tray
- `str: Country`: country of origin
- `float: Stars`: star rating of each product

Several other datasets will be used; they will be introduced along the way.

## Input and Column Specification

Here are the corresponding column specs for the data:<br>
`i:reviewID s:brand s:variety s:style s:country f:stars`

**Note**<br>
When reading in the files with `aq_pp`, we'll be using bash's [variable substitution](http://www.compciv.org/topics/bash/variables-and-substitution/) to keep the commands short and clean. For instance, 
```bash
# assign file name & path to variable 'file'
file='data/aq_pp/fileName.csv'
```
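
The commands in this notebook pass `$cols` unquoted, which relies on bash splitting the variable's value on whitespace so that each column spec arrives as its own argument. The sketch below illustrates that behaviour in plain shell; `count_args` is a hypothetical helper written only for this illustration and is not part of `aq_pp`.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received
count_args() { echo "$#"; }

cols="i:reviewID s:brand s:variety s:style s:country f:stars"

count_args $cols     # unquoted: split on whitespace into 6 arguments
count_args "$cols"   # quoted: passed along as 1 single argument
```

This is why `-d $cols` is written without quotes throughout, while a file path stored in `$file` could be quoted safely.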

Now that we're all set, let's get started with numerical operations.

## Arithmetic

This section covers the numerical, string and bitwise operators that can be used in the `Expr` of `-eval`.

### Numerical Operation

The operators supported for numerical operations are:<br>

_Arithmetic_
- `*`: multiplication
- `/`: division
- `%`: modulus
- `+`: addition
- `-`: subtraction

_Bitwise_
- `&`: AND
- `|`: OR
- `^`: XOR

First, we will double the value of the star rating column and assign it to a new column named `double_rating`. 

In [84]:
# First store filename and column spec in variable to simplify commands
file="data/aq_pp/ramen-ratings-part.csv"
cols="i:reviewID s:brand s:variety s:style s:country f:stars"
# now create a column called double_rating, and assign the value of 2 * stars
aq_pp -f,+1 $file -d $cols -eval f:double_rating '2*stars'

"reviewID","brand","variety","style","country","stars","double_rating"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,7.5
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,2
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,4.5
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,5.5
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,7.5
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,9.5
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,8
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,7.5
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,0.5


Now the new column `double_rating` contains twice the `stars` value.

**A couple of things to note**<br>
- **Column datatype:** the destination column's datatype has to be the same as the datatype of the result of `Expr`. In the example above the result is a float, so we declared `double_rating` as float.
- **Quotations:** `Expr` needs to be quoted, while `ColName|ColSpec` does not need quotes. Single quotes are recommended, because a string constant inside `Expr` requires its own double quotes.

Next, we'll perform the same operation but assign the result to the existing column `stars` instead of creating a new column.

In [85]:
aq_pp -f,+1 $file -d $cols -eval stars '2*stars'

"reviewID","brand","variety","style","country","stars"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",7.5
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",2
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",4.5
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",5.5
2576,"Ching's Secret","Singapore Curry","Pack","India",7.5
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",9.5
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",8
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",7.5
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.5


You can apply any of the other arithmetic operators in the same way. 

In the example above, `Expr` contained only an existing column and a constant. `Expr` can also reference multiple columns.

We'll divide `reviewID` (int) by `stars` (float) and store the result in a new column `div` (float).

In [86]:
aq_pp -f,+1 $file -d $cols -eval f:div 'reviewID/stars'

"reviewID","brand","variety","style","country","stars","div"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,688
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,2579
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,1145.7777777777778
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,937.09090909090912
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,686.93333333333328
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,542.10526315789468
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,643.5
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,686.13333333333333
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,10288


### String Operation 
**+ operator with string**<br>
Besides numeric addition, the `+` operator can also concatenate string values. As an example, we'll create a string column `s:info` that stores `brand` and `country` joined by the string ` - `. 

Note that `+` is the only operator that supports string manipulation.

In [87]:
aq_pp -f,+1 $file -d $cols -eval s:info 'brand+" - "+country'

"reviewID","brand","variety","style","country","stars","info"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,"New Touch - Japan"
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,"Just Way - Taiwan"
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,"Nissin - USA"
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,"Wei Lih - Taiwan"
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,"Ching's Secret - India"
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,"Samyang Foods - South Korea"
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,"Acecook - Japan"
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,"Ikeda Shoku - Japan"
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,"Ripe'n'Dry - Japan"


`brand` and `country` are column names while ` - ` is a string constant, which is why it is double quoted. 
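
To see why single quotes around `Expr` are convenient here, the following shell sketch prints an argument exactly as a program would receive it. `show_arg` is a hypothetical stand-in for any command and is used only for illustration.

```shell
# show_arg is a hypothetical stand-in that prints its first argument verbatim
show_arg() { printf '%s\n' "$1"; }

# single quotes keep the inner double quotes and spaces intact
show_arg 'brand+" - "+country'

# with double quotes on the outside, every inner quote has to be escaped
show_arg "brand+\" - \"+country"
```

Both calls print `brand+" - "+country`; the single-quoted form is simply easier to read and write.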

More complex string manipulations are possible with `aq_pp` by using the `-map` options and/or [`builtin functions / aq-emod`](http://auriq.com/documentation/source/reference/manpages/aq-emod.html), which will be covered in another notebook.

### Bitwise Operation

Let's take a look at the bitwise operators, which perform [bitwise logical operations](https://en.wikipedia.org/wiki/Bitwise_operation) on integers written in decimal notation.

`aq_pp` supports the operators below.


- `&`: AND
- `|`: OR
- `^`: XOR

To show the result of bitwise operations clearly, we'll use a different dataset of decimal integers, shown below.

number|mask
---|---
1|981
290|90
31|12
79|56
10|874

Let's apply the `|` (bitwise OR) operator to the `number` column with the constant 32. The result will be stored in the new column `i:result`.

In [88]:
aq_pp -f,+1 data/aq_pp/bitwise.csv -d i:number i:mask -eval i:result 'number | 32'

"number","mask","result"
1,981,33
290,90,290
31,12,63
79,56,111
10,874,42


**Note**: <br>
`aq_pp` interprets numbers as decimal by default, so the operands are read as decimal numbers and the result is printed in decimal. 
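
Because everything is shown in decimal, it can be hard to see why `290 | 32` stays at 290 while `1 | 32` becomes 33. The shell sketch below repeats the same OR with the shell's own integer arithmetic and prints the binary form next to each value; `to_bin` is a hypothetical helper written for this illustration, and the snippet does not call `aq_pp`.

```shell
# to_bin N: print the non-negative integer N in binary (hypothetical helper)
to_bin() (
  n=$1
  out=""
  if [ "$n" -eq 0 ]; then out=0; fi
  while [ "$n" -gt 0 ]; do
    out="$(( $n % 2 ))$out"   # prepend the lowest bit
    n=$(( $n / 2 ))
  done
  echo "$out"
)

# same inputs as the number column above, OR-ed with 32 (binary 100000)
for n in 1 290 31 79 10; do
  r=$(( $n | 32 ))
  echo "$n | 32 = $r   ($(to_bin "$n") | $(to_bin 32) = $(to_bin "$r"))"
done
```

The results match the `result` column above: 290 is `100100010` in binary, so the bit for 32 is already set and the OR leaves the value unchanged.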

## Builtin Variables
`aq_pp` is equipped with [builtin variables](http://auriq.com/documentation/source/reference/manpages/aq_pp.html#eval) that can be used in place of values. There are a couple of them; here we'll take a look at `$RowNum` and `$Random`.

**`RowNum`**<br>
Represents the row number of the record, starting at 1.

In the example below, we'll create a new integer column `row` and store the row number.

In [89]:
aq_pp -f,+1 $file -d $cols -eval i:row '$RowNum'

"reviewID","brand","variety","style","country","stars","row"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,1
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,2
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,3
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,4
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,5
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,6
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,7
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,8
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,9
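
Builtin variables start with `$`, which is also how the shell marks its own variables, so the quoting style matters. The shell sketch below shows what a program would receive under each style; `show_arg` is a hypothetical helper, and `RowNum` is set to an empty string only to keep the demonstration deterministic.

```shell
# show_arg is a hypothetical stand-in that prints its first argument verbatim
show_arg() { printf '%s\n' "$1"; }

RowNum=""   # assumption for the demo: the shell has no useful value for RowNum

show_arg '$RowNum +1'   # single quotes: the text $RowNum +1 is passed through untouched
show_arg "$RowNum +1"   # double quotes: the shell substitutes its own (empty) variable first
```

With double quotes the program would only see ` +1`, which is why the expressions in this notebook are single quoted.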


Since we skip the header row with the `-f,+1` option, we'll correct the row numbers by adding 1 to each of them (counting the header as row 1).

In [90]:
aq_pp -f,+1 $file -d $cols -eval i:row '$RowNum +1'

"reviewID","brand","variety","style","country","stars","row"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,2
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,3
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,4
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,5
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,6
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,7
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,8
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,9
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,10


**`Random`**<br>
Represents a positive random number, and the value changes every time the variable is referenced.

In this example, we will use `Random` to generate a random integer for every row and store it in an integer column named `random`.

In [91]:
aq_pp -f,+1 $file -d $cols -eval i:random '$random'

"reviewID","brand","variety","style","country","stars","random"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,476707713
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,1186278907
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,505671508
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,2137716191
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,936145377
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,1215825599
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,589265238
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,924859463
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,1182112391


This outputs very large positive integers. Sometimes we need random numbers within a certain range, say a single digit. Applying the modulus operator with 10 maps each value to the range 0 through 9:

In [92]:
aq_pp -f,+1 $file -d $cols -eval i:row '$random%10'

"reviewID","brand","variety","style","country","stars","row"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,3
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,7
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,8
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,1
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,7
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,9
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,8
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,3
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,1


Beyond modulus, you can build more complex numerical expressions with builtin variables. 
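
When a specific inclusive range is needed, the modulus trick generalizes to `lo + r % (hi - lo + 1)`. The shell sketch below applies that formula to two of the raw `$Random` values printed earlier, using the shell's integer arithmetic rather than `aq_pp`. `in_range` is a hypothetical helper for illustration; the same arithmetic should carry over to an `-eval` expression, assuming its expression syntax groups terms the same way.

```shell
# in_range R LO HI: map a non-negative integer R into the inclusive range [LO, HI]
in_range() {
  echo $(( $2 + $1 % ($3 - $2 + 1) ))
}

in_range 476707713 0 9     # same as % 10, prints 3
in_range 476707713 0 10    # % 11 is needed to include 10, prints 9
in_range 1186278907 1 6    # a die roll between 1 and 6, prints 2
```

Note that `% 10` can never produce 10; to include the upper bound, the divisor has to be one larger than the width of the range.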

## Data Conversion

Users can take advantage of the powerful [builtin functions / aq-emod](http://auriq.com/documentation/source/reference/manpages/aq-emod.html), which allow more complex data processing than combinations of `-eval` operators alone. 

While a variety of functions is available, in this section we'll look at the ones for data type conversion, specifically `ToI()` and `ToF()`. 

As a first step, we'll set every column's data type to string in the column spec.

In [93]:
cols="s:reviewID s:brand s:variety s:style s:country s:stars"
# read every column as a string
aq_pp -f,+1 $file -d $cols 

"reviewID","brand","variety","style","country","stars"
"2580","New Touch","T's Restaurant Tantanmen ","Cup","Japan","3.75"
"2579","Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan","1"
"2578","Nissin","Cup Noodles Chicken Vegetable","Cup","USA","2.25"
"2577","Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan","2.75"
"2576","Ching's Secret","Singapore Curry","Pack","India","3.75"
"2575","Samyang Foods","Kimchi song Song Ramen","Pack","South Korea","4.75"
"2574","Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan","4"
"2573","Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan","3.75"
"2572","Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan","0.25"


Notice that the `reviewID` and `stars` values are quoted, showing that `aq_pp` is interpreting them as strings. Let's convert them into appropriate data types with the builtin functions
- `ToF(Val)`: converts `Val` to float
- `ToI(Val)`: converts `Val` to integer

where `Val` can be a constant or a column name of string or numeric data type.

In [94]:
aq_pp -f,+1 $file -d $cols -eval i:int_reviewID 'ToI(reviewID)'

"reviewID","brand","variety","style","country","stars","int_reviewID"
"2580","New Touch","T's Restaurant Tantanmen ","Cup","Japan","3.75",2580
"2579","Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan","1",2579
"2578","Nissin","Cup Noodles Chicken Vegetable","Cup","USA","2.25",2578
"2577","Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan","2.75",2577
"2576","Ching's Secret","Singapore Curry","Pack","India","3.75",2576
"2575","Samyang Foods","Kimchi song Song Ramen","Pack","South Korea","4.75",2575
"2574","Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan","4",2574
"2573","Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan","3.75",2573
"2572","Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan","0.25",2572


We provided a column name as `Val` in the example above, but a string constant works too. Note that string values in the `Expr` of `-eval` must always be quoted. 

In [95]:
aq_pp -f,+1 $file -d $cols -eval i:int_reviewID 'ToI("13")'

"reviewID","brand","variety","style","country","stars","int_reviewID"
"2580","New Touch","T's Restaurant Tantanmen ","Cup","Japan","3.75",13
"2579","Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan","1",13
"2578","Nissin","Cup Noodles Chicken Vegetable","Cup","USA","2.25",13
"2577","Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan","2.75",13
"2576","Ching's Secret","Singapore Curry","Pack","India","3.75",13
"2575","Samyang Foods","Kimchi song Song Ramen","Pack","South Korea","4.75",13
"2574","Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan","4",13
"2573","Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan","3.75",13
"2572","Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan","0.25",13


Builtin functions can also be combined with arithmetic expressions. Let's convert `"13"` (a string constant) and `reviewID` to int, add them together, and store the result in the `i:result` column this time.

In [96]:
aq_pp -f,+1 $file -d $cols -eval i:result 'ToI(reviewID) + ToI("13")'

"reviewID","brand","variety","style","country","stars","result"
"2580","New Touch","T's Restaurant Tantanmen ","Cup","Japan","3.75",2593
"2579","Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan","1",2592
"2578","Nissin","Cup Noodles Chicken Vegetable","Cup","USA","2.25",2591
"2577","Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan","2.75",2590
"2576","Ching's Secret","Singapore Curry","Pack","India","3.75",2589
"2575","Samyang Foods","Kimchi song Song Ramen","Pack","South Korea","4.75",2588
"2574","Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan","4",2587
"2573","Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan","3.75",2586
"2572","Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan","0.25",2585


In the `result` column, 13 has been added to the original `reviewID` value. 

We can also concatenate numeric strings with `+`, convert the result to a numeric data type, and store it in a numeric column. Let's take a look.



In [97]:
aq_pp -f,+1 $file -d $cols -eval i:result 'ToI(reviewID + "13")'

"reviewID","brand","variety","style","country","stars","result"
"2580","New Touch","T's Restaurant Tantanmen ","Cup","Japan","3.75",258013
"2579","Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan","1",257913
"2578","Nissin","Cup Noodles Chicken Vegetable","Cup","USA","2.25",257813
"2577","Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan","2.75",257713
"2576","Ching's Secret","Singapore Curry","Pack","India","3.75",257613
"2575","Samyang Foods","Kimchi song Song Ramen","Pack","South Korea","4.75",257513
"2574","Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan","4",257413
"2573","Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan","3.75",257313
"2572","Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan","0.25",257213

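The last two examples differ only in where the conversion happens: converting first and then adding gives a sum, while joining the strings first and then converting gives the digits placed side by side. The shell sketch below mirrors that difference with plain shell variables; it is an analogy for illustration and does not call `aq_pp`.

```shell
id="2580"                 # a numeric string, like reviewID read as s:reviewID

sum=$(( $id + 13 ))       # convert, then add: like ToI(reviewID) + ToI("13")
joined="${id}13"          # join the strings first: like ToI(reviewID + "13")

echo "$sum"               # prints 2593
echo "$joined"            # prints 258013
```
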

Phew! That was a lot of examples, but now we know how to perform fundamental operations and manipulate data with the `-eval` option. 
Let's take a look at some advanced examples.


### Builtin Functions
Builtin functions (a.k.a. aq-emod) can perform more complicated processing than `aq_pp` alone. 
The following types of functions are available:

- [String property functions](#string_property)
- [Math functions](#math)
- [Comparison functions](#comparison)
- [Data extraction and encode/decode functions](#extract_code)
- [General data conversion functions](#conversion)
- [Date/Time conversion functions](#date_time)
- [Character set encoding conversion functions](#character_encoding)
- [Key hashing functions](#key_hashing)
- [Speciality functions](#speciality)
- [RTmetrics functions](#rtmetrics)
- [Udb specific functions](#udb)

Before we get started on the functions, we will redefine each column's data type by modifying the column spec.
`reviewID` will be an integer and `stars` will be a float, while the rest remain strings.

In [98]:
cols="i:reviewID s:brand s:variety s:style s:country f:stars"
aq_pp -f,+1 $file -d $cols

"reviewID","brand","variety","style","country","stars"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25


<a id='string_property'></a>
#### String Property Functions


- `SHash(Val)`: Returns the numeric hash value of a string.<br>
`Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

    In this example, we'll hash the `style` column, whose values are Cup, Pack or Tray, and store the result in the `style_hash` column.

In [99]:
aq_pp -f,+1 $file -d $cols -eval 'i:style_hash' 'SHash(style)'

"reviewID","brand","variety","style","country","stars","style_hash"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,193488781
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,2090607556
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,193488781
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,2090607556
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,2090607556
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,2090607556
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,193488781
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,2090769765
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,2090607556


You can see that identical string values produce identical hashes.

- `SLeng(Val)`: Returns the length of a string.<br>
`Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

    Again, we'll pass the `style` column, and the result will be stored in the `style_len` column.

In [100]:
aq_pp -f,+1 $file -d $cols -eval 'i:style_len' 'SLeng(style)'

"reviewID","brand","variety","style","country","stars","style_len"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,3
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,4
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,3
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,4
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,4
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,4
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,3
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,4
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,4


<a id='math'></a>
#### Math Functions

**Basics for math functions**

With a few exceptions, math functions take a single argument `Val`, which can be a numeric column, a constant, or an expression that evaluates to a number. 
We will go over just a few of them here, plus the functions with irregular syntax. For the full list of math functions, refer to the [aq-emod documentation](http://auriq.com/documentation/source/reference/manpages/aq-emod.html#math-functions).

- `Ceil(Val)`: Rounds Val up to the nearest integral value and returns the result.

    Val can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

We'll pass the `stars` column, which contains the star rating, and store the result in the `ceiling` column.

In [101]:
aq_pp -f,+1 $file -d $cols -eval 'i:ceiling' 'Ceil(stars)'

"reviewID","brand","variety","style","country","stars","ceiling"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,4
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,3
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,3
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,4
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,5
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,4
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,1



- `Floor(Val)`: Rounds Val down to the nearest integral value and returns the result.

    Val can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

Similarly to `Ceil()`, we'll use the `stars` column again. Notice how the result differs from that of `Ceil()`.


In [102]:
aq_pp -f,+1 $file -d $cols -eval 'i:floor' 'Floor(stars)'

"reviewID","brand","variety","style","country","stars","floor"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,3
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,2
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,2
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,3
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,4
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,3
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,0



- `Round(Val)`: Rounds Val up/down to the nearest integral value and returns the result. Half way cases are rounded away from zero.

    Val can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

Given the `stars` column, each value will be rounded to the nearest integer.

In [103]:
aq_pp -f,+1 $file -d $cols -eval 'i:round' 'round(stars)' 

"reviewID","brand","variety","style","country","stars","round"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,4
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,2
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,3
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,4
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,5
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,4
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,0
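
To compare the three rounding functions side by side, the sketch below reimplements their documented behaviour (round up, round down, round half away from zero) with standard `awk`. `rounding` is a hypothetical helper that only mirrors the descriptions above; it does not call `aq_pp`, and the real functions may differ in edge cases not covered here.

```shell
# rounding X: print floor, ceiling and round-half-away-from-zero of X
rounding() {
  awk -v x="$1" 'BEGIN {
    t = int(x)                                    # int() truncates toward zero
    f = (x < 0 && x != t) ? t - 1 : t             # floor: step down for negative fractions
    c = (x > 0 && x != t) ? t + 1 : t             # ceiling: step up for positive fractions
    r = (x < 0) ? -int(-x + 0.5) : int(x + 0.5)   # halfway cases move away from zero
    printf "%s floor=%d ceil=%d round=%d\n", x, f, c, r
  }'
}

# a few of the star ratings from the dataset
for s in 3.75 2.25 0.25 4; do
  rounding "$s"
done
```

For 2.25 the three results are 2, 3 and 2, matching the `floor`, `ceiling` and `round` columns of the Nissin row in the outputs above.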


- `Sqrt(Val)`: Computes the square root of Val.

    Val can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

In this example, we will pass a constant as the argument for clarity. Note that the destination column is named `squared`, although it holds the square root.

In [104]:
aq_pp -f,+1 $file -d $cols -eval 'i:squared' 'Sqrt(9)' 

"reviewID","brand","variety","style","country","stars","squared"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,3
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,3
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,3
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,3
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,3
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,3
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,3
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,3
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,3


**Math functions with irregular syntax**

These functions require multiple values as their arguments to return the result.

- `Min(Val1, Val2 [, Val3 ...])`: Returns the smallest value among Val1, Val2 and so on.

    Each Val can be a numeric column’s name, a number, or an expression that evaluates to a number.
    If all values are integers, the result will also be an integer.
    If any value is a floating point number, the result will be a floating point number.

We'll pass a constant as well as the `stars` column to compare, and store the result in a column called `smaller`.

In [105]:
aq_pp -f,+1 $file -d $cols -eval 'f:smaller' 'Min(3, stars)' 

"reviewID","brand","variety","style","country","stars","smaller"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,3
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,2.25
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,2.75
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,3
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,3
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,3
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,3
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,0.25


- `Pow(Val, Power)`: Computes Val raised to the power of Power.

    Val and Power can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

In this example, we'll calculate the 8th power of 2, meaning `Val = 2` and `Power = 8`, and the result will be stored in an integer column called `byte`. 

In [106]:
aq_pp -f,+1 $file -d $cols -eval 'i:byte' 'Pow(2, 8)'

"reviewID","brand","variety","style","country","stars","byte"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75,256
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1,256
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25,256
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75,256
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75,256
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75,256
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4,256
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75,256
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25,256


- `IsInf(Val)`: Tests if Val is infinite.

    Returns 1, -1 or 0 if the value is positive infinity, negative infinity or finite respectively.
    Val can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.

This one is a little interesting. To produce negative infinity, we'll pass the expression `-1.0/0` and check that the function returns -1.

**Note:** 
* To get positive or negative infinity, the expression needs to be evaluated as a float (e.g. `1.0/0` instead of `1/0`).
* The column that receives the result needs to be a signed integer datatype, either `is` or `ls`, so that negative values are displayed correctly.

In [107]:
aq_pp -f,+1 $file -d $cols -eval 'is:IsInf' 'IsInf(-1.0/0)' -c IsInf

"IsInf"
-1
-1
-1
-1
-1
-1
-1
-1
-1


<a id='comparison'></a>
### Comparison Functions

Most of the comparison functions compare one or more string constants, patterns, or regular expressions against all or part of a given string or string column. They return 1 if there is a match, and 0 otherwise. 

Let's start with the functions that compare the beginning and the end of a string with given strings.



- `BegCmp(Val, BegStr [, BegStr ...])`: Checks whether the string `Val` starts exactly with `BegStr`. 

    * Returns 1 if there is a match, 0 otherwise.
    * `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
    * Each `BegStr` is a string constant that specifies the starting string to match.
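
The prefix rule can be sketched in a few lines of Python. The helper name `beg_cmp` is hypothetical, the comparison is assumed to be case sensitive, and the `styles` list simply copies the `style` values from the sample rows used in this notebook.

```python
def beg_cmp(val: str, *beg_strs: str) -> int:
    """Return 1 if val starts with any of beg_strs, 0 otherwise."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return int(val.startswith(beg_strs))


styles = ["Cup", "Pack", "Cup", "Pack", "Pack", "Pack", "Cup", "Tray", "Pack"]

# One starting string.
print([beg_cmp(s, "P") for s in styles])        # [0, 1, 0, 1, 1, 1, 0, 0, 1]
# Several starting strings: a match on any of them is enough.
print([beg_cmp(s, "P", "Tr") for s in styles])  # [0, 1, 0, 1, 1, 1, 0, 1, 1]
```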

Let's use the `style` column again to demonstrate this function. I will give "P" as the string to match, so this should return 1 whenever the content of the `style` column starts with "P".

In [108]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'BegCmp(style, "P")' -c style beginWith

"style","beginWith"
"Cup",0
"Pack",1
"Cup",0
"Pack",1
"Pack",1
"Pack",1
"Cup",0
"Tray",0
"Pack",1


We can also provide multiple `BegStr` values to match strings that start with any of them. This time it returns 1 for strings that start with either "P" or "Tr". 

In [109]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'BegCmp(style, "P", "Tr")' -c style beginWith

"style","beginWith"
"Cup",0
"Pack",1
"Cup",0
"Pack",1
"Pack",1
"Pack",1
"Cup",0
"Tray",1
"Pack",1


- `EndCmp(Val, EndStr [, EndStr ...])`: Compares one or more ending string EndStr with the tail of Val. All the comparisons are case sensitive.

    * Returns 1 if there is a match, 0 otherwise.
    * Val can be a string column’s name, a string constant, or an expression that evaluates to a string.
    * Each EndStr is a string constant that specifies the ending string to match.

This function is the same as `BegCmp` above, except that it compares the ending of `Val`. 
Let's see it in action with the `style` column. I will provide 2 `EndStr` values this time as well. (The result column keeps the name `beginWith` from the previous examples.)

In [110]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'EndCmp(style, "ck", "up")' -c style beginWith

"style","beginWith"
"Cup",1
"Pack",1
"Cup",1
"Pack",1
"Pack",1
"Pack",1
"Cup",1
"Tray",0
"Pack",1


1 is returned for `Cup` and `Pack`, each of which ends with one of the given strings.
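
The tail comparison can be sketched in Python in the same spirit. The helper name `end_cmp` is hypothetical, and the `styles` list copies the `style` values from the sample rows.

```python
def end_cmp(val: str, *end_strs: str) -> int:
    """Return 1 if val ends with any of end_strs (case sensitive), else 0."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return int(val.endswith(end_strs))


styles = ["Cup", "Pack", "Cup", "Pack", "Pack", "Pack", "Cup", "Tray", "Pack"]

print([end_cmp(s, "ck", "up") for s in styles])  # [1, 1, 1, 1, 1, 1, 1, 0, 1]
```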

- `SubCmp(Val, SubStr [, SubStr ...])`: Compares one or more substring SubStr with any part of Val. All the comparisons are case sensitive.

    * Returns 1 if there is a match, 0 otherwise.
    * Val can be a string column’s name, a string constant, or an expression that evaluates to a string.
    * Each SubStr is a string constant that specifies the substring to match.

I will provide "Noodle" as `SubStr` for the `variety` column to detect ramen names that contain "Noodle" (case sensitive).

In [111]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'SubCmp(variety, "Noodle")' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",1
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",0
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


Next, I will provide 2 strings, "Noodle" and "Spic" (to match both "Spicy" and "Spice"), to find names that contain **EITHER** of them.



In [112]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'SubCmp(variety, "Noodle", "Spic")' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",1
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",1
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


- `SubCmpAll(Val, SubStr [, SubStr ...])`: Same as `SubCmp()`, except that when multiple `SubStr` are provided, it returns 1 only if `Val` contains every one of them.
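
The difference between `SubCmp()` and `SubCmpAll()` comes down to "any" versus "all". Here is a minimal Python sketch of that rule, with hypothetical helper names and the `variety` values copied from the sample rows:

```python
def sub_cmp(val: str, *sub_strs: str) -> int:
    """Return 1 if val contains at least one of sub_strs, else 0."""
    return int(any(s in val for s in sub_strs))


def sub_cmp_all(val: str, *sub_strs: str) -> int:
    """Return 1 only if val contains every one of sub_strs, else 0."""
    return int(all(s in val for s in sub_strs))


varieties = [
    "T's Restaurant Tantanmen ",
    "Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",
    "Cup Noodles Chicken Vegetable",
    "GGE Ramen Snack Tomato Flavor",
    "Singapore Curry",
    "Kimchi song Song Ramen",
    "Spice Deli Tantan Men With Cilantro",
    "Nabeyaki Kitsune Udon",
    "Hokkaido Soy Sauce Ramen",
]

print([sub_cmp(v, "Noodle", "Spic") for v in varieties])
# [0, 1, 1, 0, 0, 0, 1, 0, 0]
print([sub_cmp_all(v, "Noodle", "Spic") for v in varieties])
# [0, 1, 0, 0, 0, 0, 0, 0, 0]
```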

Similarly to the `SubCmp()` example above, we'll provide "Noodle" and "Spic" to be compared with the `variety` column. This time, though, it returns 1 only if `variety` contains **BOTH** "Noodle" and "Spic".

In [113]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'SubCmpAll(variety, "Noodle", "Spic")' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",0
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",0
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


- `MixedCmp(Val, SubStr, Typ [, SubStr, Typ ...])`: You can think of this function as a more versatile version of `BegCmp`, `EndCmp`, and `SubCmp`. Given `Val`, you provide pairs of `SubStr` and `Typ`, where `Typ` is one of:
    * `BEG` - Match with the head of Val.
    * `END` - Match with the tail of Val.
    * `SUB` - Match with any part of Val.

Note that when more than one `SubStr` is provided, this function returns 1 if **ANY** of the provided `SubStr` matches. This will be demonstrated in the **`SUB`** section later.

**`BEG`**<br>
Let's start with `BEG`. We will use the `style` column and give "C" to find `style` values that begin with "C".

In [114]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'MixedCmp(style, "C", BEG)' -c style beginWith

"style","beginWith"
"Cup",1
"Pack",0
"Cup",1
"Pack",0
"Pack",0
"Pack",0
"Cup",1
"Tray",0
"Pack",0


**`END`**<br>

Next, I will provide "ck" as `SubStr` and `END` as `Typ` to find records whose style ends with "ck" (the Pack type).

In [115]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'MixedCmp(style, "ck", END)' -c style beginWith

"style","beginWith"
"Cup",0
"Pack",1
"Cup",0
"Pack",1
"Pack",1
"Pack",1
"Cup",0
"Tray",0
"Pack",1


**`SUB`**<br>
Lastly, I will provide "Noodle" and "Spic", with the `variety` column as `Val`, to find records that contain EITHER of these strings.
Since we're looking for a substring match at any position in `variety`, `SUB` will be the `Typ`.

In [116]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'MixedCmp(variety, "Noodle", SUB, "Spic", SUB)' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",1
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",1
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


- `MixedCmpAll(Val, SubStr, Typ [, SubStr, Typ ...])`: Takes the same arguments as `MixedCmp()`. Given `Val`, you provide pairs of `SubStr` and `Typ`, where `Typ` is one of:
    * `BEG` - Match with the head of Val.
    * `END` - Match with the tail of Val.
    * `SUB` - Match with any part of Val.
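
To make the `SubStr`/`Typ` pairing concrete, here is a minimal Python sketch of both `MixedCmp()` (a match on any pair is enough) and `MixedCmpAll()` (every pair must match). The helper names are hypothetical, and `Typ` is passed as a plain string in this sketch.

```python
def _match(val: str, sub: str, typ: str) -> bool:
    """Apply one (SubStr, Typ) pair to val."""
    if typ == "BEG":
        return val.startswith(sub)
    if typ == "END":
        return val.endswith(sub)
    if typ == "SUB":
        return sub in val
    raise ValueError(f"unknown Typ: {typ}")


def _pairs(args):
    """Turn (SubStr, Typ, SubStr, Typ, ...) into [(SubStr, Typ), ...]."""
    return list(zip(args[0::2], args[1::2]))


def mixed_cmp(val: str, *args: str) -> int:
    """Return 1 if at least one (SubStr, Typ) pair matches val."""
    return int(any(_match(val, s, t) for s, t in _pairs(args)))


def mixed_cmp_all(val: str, *args: str) -> int:
    """Return 1 only if every (SubStr, Typ) pair matches val."""
    return int(all(_match(val, s, t) for s, t in _pairs(args)))


print(mixed_cmp("Cup", "C", "BEG"))    # 1
print(mixed_cmp("Pack", "ck", "END"))  # 1
name = "Cup Noodles Chicken Vegetable"
print(mixed_cmp(name, "Noodle", "SUB", "Spic", "SUB"))      # 1
print(mixed_cmp_all(name, "Noodle", "SUB", "Spic", "SUB"))  # 0
```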

Same as `MixedCmp()` above, except that when more than one `SubStr` is provided, it returns 1 only if **ALL** of them match. Here, I will demonstrate it using **`SUB`** with the `variety` column.
This should return 1 only for the record that contains both "Noodle" and "Spic" in the `variety` column.

In [117]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'MixedCmpAll(variety, "Noodle", SUB, "Spic", SUB)' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",0
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",0
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


- `Contain(Val, SubStrs)`: Compares the substrings in SubStrs with any part of Val. All the comparisons are case sensitive.

    * Returns 1 if there is a match, 0 otherwise.
    * Val can be a string column’s name, a string constant, or an expression that evaluates to a string.
    * SubStrs is a string constant that specifies what substrings to match. It is a comma-newline separated list of literal substrings of the form “`SubStr1,[\r]\nSubStr2...`”.

Let's test this function by providing "Noodle" and "Spic" as `SubStrs`, and `variety` as `Val`.

In [118]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'Contain(variety, "Noodle,\nSpic")' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",1
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",1
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0


**`ContainAll(Val, SubStrs)`**: Compares the substrings in SubStrs with any part of Val. All the comparisons are case sensitive.

* Returns 1 if all the substrings match, 0 otherwise.
* Val can be a string column’s name, a string constant, or an expression that evaluates to a string.
* SubStrs is a string constant that specifies what substrings to match. It is a comma-newline separated list of literal substrings of the form “`SubStr1,[\r]\nSubStr2...`”.
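
The comma-newline list format is the part that differs from `SubCmp()`, so here is a minimal Python sketch of how such a `SubStrs` value could be split and then checked. The helper names are hypothetical; in Python the `\n` inside the string literal is a real newline character.

```python
import re


def split_substrs(substrs: str) -> list:
    """Split 'SubStr1,[\\r]\\nSubStr2...' into its literal substrings."""
    # A comma followed by an optional carriage return and a newline.
    return re.split(r",\r?\n", substrs)


def contain(val: str, substrs: str) -> int:
    """Return 1 if val contains at least one listed substring."""
    return int(any(s in val for s in split_substrs(substrs)))


def contain_all(val: str, substrs: str) -> int:
    """Return 1 only if val contains every listed substring."""
    return int(all(s in val for s in split_substrs(substrs)))


print(split_substrs("Noodle,\nSpic"))  # ['Noodle', 'Spic']
name = "Spice Deli Tantan Men With Cilantro"
print(contain(name, "Noodle,\nSpic"))      # 1
print(contain_all(name, "Noodle,\nSpic"))  # 0
```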

Same as `Contain()`, except that when `SubStrs` lists multiple substrings, it returns 1 only if all of them are present in `Val`. 
Using the `variety` column with "Noodle" and "Spic" again, we'll see that only the record with **BOTH** words present in its `variety` column has 1 as the return value.

In [119]:
aq_pp -f,+1 $file -d $cols -eval 'is:beginWith' 'ContainAll(variety, "Noodle,\nSpic")' -c variety beginWith

"variety","beginWith"
"T's Restaurant Tantanmen ",0
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Cup Noodles Chicken Vegetable",0
"GGE Ramen Snack Tomato Flavor",0
"Singapore Curry",0
"Kimchi song Song Ramen",0
"Spice Deli Tantan Men With Cilantro",0
"Nabeyaki Kitsune Udon",0
"Hokkaido Soy Sauce Ramen",0



For the examples below, we'll be using data from an airline online ticket search, which looks like this.

ticket|
-----|
From=HNDTo=NGODate=20150506Class=Y
From=NGOTo=OKADate=20150425Class=Y
From=OKATo=NGODate=20150425Class=Y
From=OKATo=NGODate=20150425Class=S
From=OKATo=NGODate=20150425Class=S
From=NGOTo=OKADate=20150425Class=Y
From=NGOTo=OKADate=20150419Class=Y
From=OKATo=NGODate=20150426Class=Y
From=OKATo=NGODate=20150517Class=Y


**`PatCmp(Val, Pattern [, AtrLst])`**: Compares a generic wildcard pattern with Val.

* Returns 1 if it matches, 0 otherwise. `Pattern` must match the _entire_ `Val` to be successful.
* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
* `Pattern` is a string constant that specifies the pattern to match. It is a simple wildcard pattern containing only `*` (matches any number of bytes) and `?` (matches any 1 byte); literal `*`, `?` and `\` in the pattern must be `\` escaped.
* Optional `AtrLst` is a list of `|` separated attributes containing:
    * `ncas` - Perform a case insensitive match (default is case sensitive). For ASCII data only.
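
The wildcard rules above can be sketched in Python by translating the pattern into a regular expression and requiring a full match. This is only an illustration under stated assumptions: the helper names are hypothetical, the sketch works on characters rather than bytes (the same thing for ASCII data), and the `ncas` keyword merely stands in for the `ncas` attribute.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a '*' / '?' wildcard pattern into a regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character literal.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            out.append(".*")  # any number of characters
        elif ch == "?":
            out.append(".")   # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def pat_cmp(val: str, pattern: str, ncas: bool = False) -> int:
    """Return 1 if the pattern matches the entire val, 0 otherwise."""
    flags = re.DOTALL | (re.IGNORECASE if ncas else 0)
    return int(re.fullmatch(wildcard_to_regex(pattern), val, flags) is not None)


ticket = "From=NGOTo=OKADate=20150425Class=Y"
print(pat_cmp(ticket, "From=*To=*Date=*Class=Y"))  # 1
print(pat_cmp(ticket, "Class=Y"))                  # 0, not the entire value
print(pat_cmp(ticket, "*class=y", ncas=True))      # 1
```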

In the example below, we'll use wildcards to look for records whose Class is equal to "Y".

In [120]:
airline="data/aq_pp/airline_sample.csv"
aq_pp -f,+1 $airline -d S:ticket -eval 'is:contains' 'PatCmp(ticket, "From=*To=*Date=*Class=Y")' -c ticket contains

"ticket","contains"
"From=NGOTo=OKADate=20150425Class=Y",1
"From=OKATo=NGODate=20150425Class=Y",1
"From=OKATo=NGODate=20150425Class=S",0
"From=OKATo=NGODate=20150425Class=S",0
"From=NGOTo=OKADate=20150425Class=Y",1
"From=NGOTo=OKADate=20150419Class=Y",1
"From=OKATo=NGODate=20150426Class=Y",1
"From=OKATo=NGODate=20150517Class=Y",1
"From=OKATo=NGODate=20150517Class=Y",1
...


**`RxCmp(Val, Pattern [, AtrLst])`**: Compares a string with a regular expression.

* Returns 1 if they match, 0 otherwise. `Pattern` only needs to match a subpart of `Val` to be successful.
* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
* `Pattern` is a string constant that specifies the regular expression to match.
* Optional `AtrLst` is a list of `|` separated regular expression attributes.
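
The key difference from a whole-value wildcard match is that the regular expression only has to match somewhere inside `Val`. A minimal Python sketch of that behaviour follows; the helper name `rx_cmp` is hypothetical, the `ncas` keyword is only a stand-in for a case-insensitive attribute, and Python's `re` dialect is not guaranteed to be identical to the regex engine `aq_pp` uses.

```python
import re


def rx_cmp(val: str, pattern: str, ncas: bool = False) -> int:
    """Return 1 if the regular expression matches any part of val."""
    flags = re.IGNORECASE if ncas else 0
    # re.search scans the whole string, so a partial match is enough.
    return int(re.search(pattern, val, flags) is not None)


ticket_y = "From=NGOTo=OKADate=20150425Class=Y"
ticket_s = "From=OKATo=NGODate=20150425Class=S"

print(rx_cmp(ticket_y, "Class=Y"))  # 1
print(rx_cmp(ticket_s, "Class=Y"))  # 0
# Anchors can be used when the whole value must match.
print(rx_cmp(ticket_s, r"^From=[A-Z]{3}To=[A-Z]{3}Date=\d{8}Class=[YS]$"))  # 1
```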

Similar to `PatCmp()`, except that this uses a regular expression as the pattern. Again, we'll get the records with Class=Y:

In [121]:
aq_pp -f,+1 $airline -d S:ticket -eval 'is:contains' 'PatCmp(ticket, "Class=Y", pcre)' -c ticket contains

"ticket","contains"
"From=NGOTo=OKADate=20150425Class=Y",1
"From=OKATo=NGODate=20150425Class=Y",1
"From=OKATo=NGODate=20150425Class=S",0
"From=OKATo=NGODate=20150425Class=S",0
"From=NGOTo=OKADate=20150425Class=Y",1
"From=NGOTo=OKADate=20150419Class=Y",1
"From=OKATo=NGODate=20150426Class=Y",1
"From=OKATo=NGODate=20150517Class=Y",1
"From=OKATo=NGODate=20150517Class=Y",1
"From=OKATo=NGODate=20150517Class=Y",1
"From=OKATo=NGODate=20150418Class=Y",1
"From=HNDTo=NGODate=20150815Class=Y",1
"From=HNDTo=NGODate=20150815Class=Y",1
"From=NGOTo=HNDDate=20150815Class=Y",1
"From=NGOTo=HNDDate=20150815Class=Y",1
"From=SDJTo=NGODate=20150830Class=Y",1
"From=NGOTo=SDJDate=20150828Class=Y",1
"From=HNDTo=OKADate=20150820Class=Y",1
"From=HNDTo=OKADate=20150820Class=Y",1
"From=SDJTo=NGODate=20150410Class=Y",1
"From=SDJTo=NGODate=20150410Class=Y",1
"From=SDJTo=NGODate=20151206Class=Y",1
"From=NGOTo=SDJDate=20151205Class=Y",1
"From=NGOTo=SDJDate=20151206Class=Y",1
"From=NGOTo=SPKDate=20151205Class=Y",1
"From

"From=NGOTo=SDJDate=20150926Class=Y",1
"From=NGOTo=SDJDate=20150924Class=Y",1
"From=NGOTo=HNDDate=20151008Class=Y",1
"From=NGOTo=HNDDate=20151008Class=Y",1
"From=NGOTo=HNDDate=20151009Class=Y",1
"From=NGOTo=HNDDate=20151008Class=Y",1
"From=NGOTo=HNDDate=20151007Class=Y",1
"From=NGOTo=HNDDate=20151006Class=Y",1
"From=HNDTo=NGODate=20151006Class=Y",1
"From=HNDTo=NGODate=20151007Class=Y",1
"From=HNDTo=OKADate=20160423Class=Y",1
"From=HNDTo=OKADate=20160422Class=Y",1
"From=HNDTo=OKADate=20160423Class=Y",1
"From=OKATo=HNDDate=20160423Class=Y",1
"From=OKATo=HNDDate=20160424Class=Y",1
"From=OKATo=HNDDate=20160423Class=Y",1
"From=HNDTo=OKADate=20160423Class=Y",1
"From=HNDTo=OKADate=20160422Class=Y",1
"From=HNDTo=OKADate=20160423Class=Y",1
"From=HNDTo=NGODate=20151008Class=Y",1
"From=HNDTo=NGODate=20151009Class=Y",1
"From=NGOTo=SPKDate=20160206Class=Y",1
"From=SPKTo=NGODate=20160206Class=Y",1
"From=SPKTo=NGODate=20160207Class=Y",1
"From=SPKTo=NGODate=20160206Class=Y",1
"From=NGOTo=SPKDate=20160

"From=OKATo=HNDDate=20160502Class=Y",1
"From=OKATo=HNDDate=20160503Class=Y",1
"From=OKATo=HNDDate=20160504Class=Y",1
"From=OKATo=HNDDate=20160505Class=Y",1
"From=OKATo=HNDDate=20160506Class=Y",1
"From=OKATo=HNDDate=20160505Class=Y",1
"From=OKATo=HNDDate=20160504Class=Y",1
"From=OKATo=HNDDate=20160503Class=Y",1
"From=OKATo=HNDDate=20160502Class=Y",1
"From=OKATo=HNDDate=20160430Class=Y",1
"From=OKATo=HNDDate=20160429Class=Y",1
"From=HNDTo=OKADate=20160429Class=Y",1
"From=NGOTo=OKADate=20160429Class=Y",1
"From=NGOTo=OKADate=20160429Class=S",0
"From=NGOTo=OKADate=20160428Class=S",0
"From=NGOTo=OKADate=20160428Class=Y",1
"From=NGOTo=OKADate=20160428Class=S",0
"From=NGOTo=OKADate=20160429Class=S",0
"From=NGOTo=OKADate=20160429Class=Y",1
"From=NGOTo=SPKDate=20160429Class=Y",1
"From=SPKTo=NGODate=20160429Class=Y",1
"From=SPKTo=NGODate=20160430Class=Y",1
"From=SPKTo=NGODate=20160501Class=Y",1
"From=SPKTo=NGODate=20160502Class=Y",1
"From=SPKTo=NGODate=20160503Class=Y",1
"From=SPKTo=NGODate=20160

"From=NRTTo=OKADate=20161206Class=Y",1
"From=NRTTo=OKADate=20161205Class=Y",1
"From=NRTTo=SPKDate=20161205Class=Y",1
"From=NRTTo=SDJDate=20161223Class=Y",1
"From=NRTTo=SDJDate=20161222Class=Y",1
"From=NRTTo=SDJDate=20170108Class=Y",1
"From=SDJTo=NRTDate=20170109Class=Y",1
"From=HNDTo=NGODate=20170317Class=Y",1
"From=NGOTo=HNDDate=20170317Class=Y",1
"From=HNDTo=ISGDate=20170202Class=Y",1
"From=HNDTo=ISGDate=20170203Class=Y",1
"From=HNDTo=ISGDate=20170204Class=Y",1
"From=HNDTo=ISGDate=20170205Class=Y",1
"From=HNDTo=ISGDate=20170206Class=Y",1
"From=HNDTo=ISGDate=20170207Class=Y",1
"From=HNDTo=ISGDate=20170208Class=Y",1
"From=HNDTo=ISGDate=20170209Class=Y",1
"From=HNDTo=ISGDate=20170210Class=Y",1
"From=HNDTo=ISGDate=20170211Class=Y",1
"From=HNDTo=ISGDate=20170212Class=Y",1
"From=HNDTo=ISGDate=20170213Class=Y",1
"From=HNDTo=ISGDate=20170212Class=Y",1
"From=HNDTo=SPKDate=20170207Class=Y",1
"From=HNDTo=SPKDate=20170208Class=Y",1
"From=HNDTo=SPKDate=20170209Class=Y",1
"From=HNDTo=SPKDate=20170

"From=FUKTo=HNDDate=20150621Class=Y",1
"From=HNDTo=FUKDate=20150630Class=Y",1
"From=FUKTo=HNDDate=20150507Class=Y",1
"From=FUKTo=HNDDate=20150518Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=HSGDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=KIXDate=20150501Class=Y",1
"From=HNDTo=ITMDate=20150501Class=Y",1
"From=HNDTo=FUKDate=20150501Class=Y",1
"From=HNDTo=HSGDate=20150501Class=Y",1
"From=NGOTo=FUKDate=20150718Class=Y",1
"From=NGOTo=FUKDate=20150718Class=Y",1
"From=FUKTo=HNDDate=20150621Class=Y",1
"From=FUKTo=HNDDate=20150621Class=Y",1
"From=FUKTo=NRTDate=20150506Class=Y",1
"From=FUKTo=ITMDate=20150506Class=Y",1
"From=ITMTo=HNDDate=20150506Class=Y",1
"From=HNDTo=FUKDate=20150404Class=Y",1
"From=HNDTo=FUKDate=20150405Class=Y",1
"From=HNDTo=HSGDate=20150

"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=KOJTo=HNDDate=20171111Class=Y",1
"From=KOJTo=HNDDate=20171111Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=HNDTo=KOJDate=20171112Class=Y",1
"From=KOJTo=KMQDate=20150406Class=Y",1
"From=KOJTo=KMQDate=20150406Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=HNDTo=KOJDate=20171225Class=Y",1
"From=KOJTo=HNDDate=20180112Class=Y",1
"From=KOJTo=HNDDate=20180

"From=OKATo=ITMDate=20150724Class=S",0
"From=OKATo=KMJDate=20150724Class=S",0
"From=OKATo=KMJDate=20150724Class=Y",1
"From=OKATo=ITMDate=20150724Class=Y",1
"From=OKATo=ITMDate=20150724Class=Y",1
"From=OKATo=ITMDate=20150724Class=Y",1
"From=ITMTo=KOJDate=20150610Class=Y",1
"From=NRTTo=ITMDate=20150907Class=Y",1
"From=OKATo=ITMDate=20150723Class=Y",1
"From=OKATo=UKBDate=20150723Class=Y",1
"From=OKATo=ITMDate=20150723Class=Y",1
"From=ITMTo=OKADate=20150723Class=Y",1
"From=OKATo=ITMDate=20150426Class=Y",1
"From=OKATo=KMIDate=20150426Class=Y",1
"From=OKATo=KMIDate=20150426Class=Y",1
"From=ITMTo=KOJDate=20150610Class=S",0
"From=ITMTo=KOJDate=20150610Class=S",0
"From=KIXTo=FUKDate=20150413Class=Y",1
"From=FUKTo=ITMDate=20150403Class=Y",1
"From=FUKTo=ITMDate=20150731Class=Y",1
"From=FUKTo=ITMDate=20150801Class=Y",1
"From=FUKTo=ITMDate=20150731Class=Y",1
"From=FUKTo=ITMDate=20150801Class=Y",1
"From=FUKTo=ITMDate=20150802Class=Y",1
"From=FUKTo=ITMDate=20150803Class=Y",1
"From=ITMTo=FUKDate=20150

"From=SPKTo=KIXDate=20160101Class=S",0
"From=ITMTo=KMJDate=20160119Class=Y",1
"From=ITMTo=KMJDate=20160119Class=Y",1
"From=ITMTo=NGSDate=20151213Class=Y",1
"From=ITMTo=NGSDate=20151213Class=Y",1
"From=ITMTo=NGSDate=20151213Class=Y",1
"From=FUKTo=ITMDate=20151118Class=Y",1
"From=ITMTo=HNDDate=20151223Class=Y",1
"From=ITMTo=HNDDate=20151223Class=Y",1
"From=KIXTo=SPKDate=20151231Class=S",0
"From=KIXTo=SPKDate=20151231Class=S",0
"From=NGOTo=NGSDate=20151213Class=Y",1
"From=NGOTo=NGSDate=20151213Class=Y",1
"From=NGOTo=NGSDate=20151220Class=Y",1
"From=NGOTo=NGSDate=20151220Class=Y",1
"From=NGOTo=FUKDate=20151220Class=Y",1
"From=FUKTo=ITMDate=20151011Class=Y",1
"From=FUKTo=KIXDate=20151011Class=Y",1
"From=FUKTo=KIXDate=20151012Class=Y",1
"From=KIXTo=FUKDate=20151012Class=Y",1
"From=ITMTo=FUKDate=20151012Class=Y",1
"From=ITMTo=FUKDate=20151013Class=Y",1
"From=OKATo=ITMDate=20151031Class=Y",1
"From=OKATo=KIXDate=20151031Class=Y",1
"From=KIXTo=SPKDate=20151231Class=S",0
"From=KIXTo=SPKDate=20151

"From=FUKTo=HNDDate=20160528Class=Y",1
"From=FUKTo=HNDDate=20160528Class=Y",1
"From=FUKTo=HNDDate=20160528Class=S",0
"From=FUKTo=HNDDate=20160528Class=S",0
"From=FUKTo=HNDDate=20160528Class=Y",1
"From=HNDTo=ITMDate=20160505Class=S",0
"From=FUKTo=HNDDate=20160528Class=Y",1
"From=OKATo=ITMDate=20160618Class=Y",1
"From=OKATo=ITMDate=20160618Class=Y",1
"From=FUKTo=HNDDate=20160528Class=Y",1
"From=ITMTo=FUKDate=20160627Class=Y",1
"From=ITMTo=FUKDate=20160627Class=Y",1
"From=FUKTo=HNDDate=20160528Class=Y",1
"From=ITMTo=FUKDate=20160523Class=Y",1
"From=KIXTo=FUKDate=20160523Class=Y",1
"From=OKATo=ITMDate=20160609Class=Y",1
"From=HNDTo=ITMDate=20160701Class=Y",1
"From=HNDTo=ITMDate=20160701Class=Y",1
"From=OKATo=ITMDate=20160610Class=Y",1
"From=OKATo=ITMDate=20160610Class=S",0
"From=OKATo=KIXDate=20160610Class=S",0
"From=OKATo=KIXDate=20160610Class=Y",1
"From=OKATo=FUKDate=20160712Class=Y",1
"From=ITMTo=SPKDate=20160707Class=Y",1
"From=ITMTo=SPKDate=20160707Class=Y",1
"From=FUKTo=ITMDate=20160

"From=NGSTo=ITMDate=20170314Class=Y",1
"From=ITMTo=KOJDate=20170324Class=Y",1
"From=ITMTo=KOJDate=20170324Class=Y",1
"From=ITMTo=OKADate=20170213Class=S",0
"From=ITMTo=OKADate=20170213Class=S",0
"From=ITMTo=OKADate=20170212Class=S",0
"From=ITMTo=OKADate=20170213Class=S",0
"From=ITMTo=OKADate=20170213Class=S",0
"From=SPKTo=ITMDate=20170311Class=Y",1
"From=FUKTo=HNDDate=20170404Class=S",0
"From=FUKTo=HNDDate=20170404Class=S",0
"From=SPKTo=NGSDate=20170313Class=Y",1
"From=ITMTo=SPKDate=20170308Class=Y",1
"From=HNDTo=NGSDate=20170313Class=Y",1
"From=HNDTo=NGSDate=20170313Class=Y",1
"From=FUKTo=ITMDate=20170429Class=Y",1
"From=NKYTo=OSADate=20170422Class=Y",1
"From=ITMTo=SPKDate=20170308Class=S",0
"From=HNDTo=ITMDate=20170406Class=S",0
"From=HNDTo=ITMDate=20170406Class=S",0
"From=FUKTo=ITMDate=20170318Class=Y",1
"From=KOJTo=ITMDate=20170331Class=Y",1
"From=KOJTo=ITMDate=20170331Class=Y",1
"From=NKYTo=OSADate=20170422Class=Y",1
"From=ITMTo=FUKDate=20170408Class=Y",1
"From=ITMTo=FUKDate=20170

"From=ITMTo=HNDDate=20170617Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170815Class=Y",1
"From=OKATo=HNDDate=20170815Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=HNDTo=ITMDate=20170629Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170815Class=Y",1
"From=OKATo=HNDDate=20170815Class=Y",1
"From=HNDTo=OKADate=20170813Class=Y",1
"From=HNDTo=OKADate=20170812Class=Y",1
"From=HNDTo=OKADate=20170812Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=S",0
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170816Class=Y",1
"From=HNDTo=ISGDate=20170813Class=Y",1
"From=HNDTo=ISGDate=20170813Class=Y",1
"From=HNDTo=ISGDate=20170813Class=Y",1
"From=HNDTo=ISGDate=20170813Class=Y",1
"From=OKATo=ITMDate=20170816Class=Y",1
"From=OKATo=HNDDate=20170

"From=HNDTo=FUKDate=20171124Class=Y",1
"From=HNDTo=FUKDate=20171124Class=Y",1
"From=HNDTo=FUKDate=20171124Class=Y",1
"From=HNDTo=FUKDate=20171124Class=Y",1
"From=SPKTo=HNDDate=20171031Class=Y",1
"From=HNDTo=SYODate=20171201Class=Y",1
"From=HNDTo=SYODate=20171201Class=Y",1
"From=SYOTo=HNDDate=20171202Class=Y",1
"From=SYOTo=HNDDate=20171202Class=Y",1
"From=SYOTo=HNDDate=20171007Class=Y",1
"From=SPKTo=HNDDate=20171031Class=Y",1
"From=SPKTo=SYODate=20171031Class=Y",1
"From=SPKTo=SYODate=20171031Class=Y",1
"From=HNDTo=SYODate=20171031Class=Y",1
"From=SPKTo=OKADate=20180119Class=Y",1
"From=SPKTo=OKADate=20180119Class=Y",1
"From=SPKTo=OKADate=20180119Class=Y",1
"From=SPKTo=OKADate=20180112Class=Y",1
"From=SPKTo=OKADate=20180112Class=Y",1
"From=FUKTo=SDJDate=20171127Class=Y",1
"From=SPKTo=HNDDate=20171031Class=Y",1
"From=SPKTo=HNDDate=20171031Class=Y",1
"From=SPKTo=HNDDate=20171031Class=Y",1
"From=FUKTo=SDJDate=20171127Class=Y",1
"From=FUKTo=SDJDate=20171127Class=Y",1
"From=FUKTo=SDJDate=20171

"From=OSATo=HNDDate=20161016Class=Y",1
"From=FUKTo=HNDDate=20161015Class=Y",1
"From=HNDTo=OSADate=20161016Class=Y",1
"From=OSATo=OITDate=20161018Class=Y",1
"From=OSATo=OITDate=20161019Class=Y",1
"From=OSATo=OITDate=20161018Class=Y",1
"From=FUKTo=HNDDate=20161015Class=Y",1
"From=HNDTo=OSADate=20161016Class=Y",1
"From=KMQTo=FUKDate=20161127Class=Y",1
"From=FUKTo=HNDDate=20161015Class=Y",1
"From=FUKTo=HNDDate=20161015Class=S",0
"From=FUKTo=KMQDate=20161126Class=Y",1
"From=FUKTo=KMQDate=20161126Class=Y",1
"From=OSATo=FUKDate=20161120Class=Y",1
"From=FUKTo=OSADate=20161115Class=Y",1
"From=KOJTo=OSADate=20161118Class=Y",1
"From=KOJTo=SPKDate=20161118Class=Y",1
"From=KOJTo=SPKDate=20161118Class=Y",1
"From=OKATo=FUKDate=20161222Class=Y",1
"From=OKATo=FUKDate=20161221Class=Y",1
"From=FUKTo=OSADate=20161217Class=Y",1
"From=OSATo=SPKDate=20170308Class=Y",1
"From=HNDTo=OSADate=20170201Class=Y",1
"From=OSATo=FUKDate=20170126Class=Y",1
"From=OSATo=FUKDate=20170126Class=S",0
"From=FUKTo=OSADate=20170

"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=FUKTo=HNDDate=20180428Class=S",0
"From=HNDTo=FUKDate=20180422Class=S",0
"From=HNDTo=FUKDate=20180422Class=S",0
"From=FUKTo=HNDDate=20180428Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=S",0
"From=HNDTo=FUKDate=20180422Class=S",0
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=HNDTo=KOJDate=20180405Class=Y",1
"From=HNDTo=NKYDate=20180422Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=NKYTo=HNDDate=20180423Class=Y",1
"From=FUKTo=HNDDate=20180423Class=Y",1
"From=FUKTo=HNDDate=20180423Class=S",0
"From=FUKTo=HNDDate=20180423Class=S",0
"From=HNDTo=FUKDate=20180421Class=Y",1
"From=HNDTo=FUKDate=20180422Class=Y",1
"From=FUKTo=HNDDate=20180427Class=Y",1
"From=FUKTo=HNDDate=20180428Class=S",0
"From=HNDTo=KCZDate=20180405Class=Y",1
"From=HNDTo=KCZDate=20180

"From=OSATo=HNDDate=20180124Class=Y",1
"From=SDJTo=OSADate=20180122Class=Y",1
"From=OSATo=SPKDate=20180123Class=Y",1
"From=OSATo=SPKDate=20180123Class=Y",1
"From=AXTTo=HNDDate=20180305Class=Y",1
"From=AXTTo=HNDDate=20180304Class=Y",1
"From=AXTTo=SPKDate=20180304Class=Y",1
"From=HNDTo=OSADate=20180301Class=Y",1
"From=ITMTo=AXTDate=20180304Class=Y",1
"From=AXTTo=HNDDate=20180305Class=Y",1
"From=SPKTo=SDJDate=20180217Class=Y",1
"From=AXTTo=HNDDate=20180305Class=S",0
"From=SDJTo=SPKDate=20180326Class=Y",1
"From=AXTTo=HNDDate=20180305Class=S",0
"From=HNDTo=OKADate=20180305Class=S",0
"From=SPKTo=HNDDate=20180328Class=Y",1
"From=SDJTo=SPKDate=20180326Class=Y",1
"From=SPKTo=SDJDate=20180326Class=Y",1
"From=SDJTo=SPKDate=20180326Class=Y",1
"From=FUKTo=HNDDate=20180407Class=S",0
"From=HNDTo=OSADate=20180413Class=S",0
"From=HNDTo=OSADate=20180413Class=S",0
"From=HNDTo=OSADate=20180413Class=Y",1
"From=HNDTo=OSADate=20180413Class=Y",1
"From=OSATo=HNDDate=20180415Class=Y",1
"From=OSATo=HNDDate=20180


**`NumCmp(Val1, Val2, Delta)`**: Tests if `Val1` and `Val2` are within `Delta` of each other, i.e., whether `Abs(Val1 - Val2) <= Delta`.

* Returns 1 if true, 0 otherwise.
* `Val1`, `Val2` and `Delta` can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.
* `Delta` should be greater than or equal to zero.

We'll use the following dataset, which has `year`, `month` and `day` columns.

year|month|day
----|----|---|
2015|04|03
2015|08|08
2015|08|23
2015|12|28
2016|03|21
2016|04|11
2016|05|02
2016|05|15
2016|11|02
2016|11|04
2016|12|04
2017|04|26
2017|05|15
2017|10|23
2017|12|18
2018|02|21
2018|08|07
2018|08|07
2018|10|05
2018|12|03

We will calculate the absolute difference between `month` and `day`, then check whether it is within a `Delta` of 8.

In [122]:
forth_dim="data/aq_pp/year_month_date.csv"
aq_pp -f,+1 $forth_dim -d I:year I:month I:day -eval 'is:delta' 'abs(month - day)' -eval 'is:isWithin' 'NumCmp(month, day, 8)' -c ~year

"month","day","delta","isWithin"
4,3,1,1
8,8,0,1
8,23,15,0
12,28,16,0
3,21,18,0
4,11,7,1
5,2,3,1
5,15,10,0
11,2,9,0
11,4,7,1
12,4,8,1
4,26,22,0
5,15,10,0
10,23,13,0
12,18,6,1
2,21,19,0
8,7,1,1
8,7,1,1
10,5,5,1
12,3,9,0


You can see that `isWithin` is 0 for the rows whose `delta` value is greater than 8, and 1 otherwise.
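
The comparison itself is just the inequality from the definition. As a minimal Python sketch (the helper name `num_cmp` is hypothetical, and only a few of the rows above are reproduced):

```python
def num_cmp(val1: float, val2: float, delta: float) -> int:
    """Hypothetical stand-in for NumCmp: 1 if |val1 - val2| <= delta, else 0."""
    return 1 if abs(val1 - val2) <= delta else 0

# (month, day) pairs taken from the dataset above
rows = [(4, 3), (8, 23), (12, 4), (11, 2)]
for month, day in rows:
    # Print month, day, the absolute difference and the within-8 flag.
    print(month, day, abs(month - day), num_cmp(month, day, 8))
```

Note that the boundary case (a difference of exactly 8, as in month 12 and day 4) counts as within `Delta`.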

<a id='extract_code'></a>
### Data extraction and encode / decode Functions



**`SubStr(Val, Start [, Length])`**: Returns a substring of a string.

* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
* `Start` is the starting position (zero-based) of the substring in `Val`. It can be a numeric column’s name, a number, or an expression that evaluates to a number.
    * If `Start` is negative, the length of `Val` will be added to it. If it is still negative, 0 will be used. (Think of it as Python-style indexing from the end of the string.)
* Optional `Length` specifies the length of the substring in `Val`. It can be a numeric column’s name, a number, or an expression that evaluates to a number.
    * Max length is length of `Val` minus `Start`.
    * If `Length` is not specified, max length is assumed.
    * If `Length` is negative, max length will be added to it. If it is still negative, 0 will be used.

For this example, to keep things simple, we'll use a file containing 2 rows: one with a numeric string and the other with the good old "Hello World". It looks like this:


simple_str|
---|
0123456789
Hello World

We will provide this column as `Val` and extract the substring from index 3 (counting from zero) through the last index, as follows.

In [123]:
subStr="data/aq_pp/substr.csv"
aq_pp -f,+1 $subStr -d s:val_str -eval 's:subStr' 'SubStr(val_str, 3)' -c  val_str SubStr

"val_str","subStr"
"0123456789","3456789"
"Hello World","lo World"


As you can see, everything from index 3 (counting from zero) onward is extracted as the substring. <br>

**`Length`**<br>
Providing this argument lets users specify the length of the **extracted substring**. Note that this is NOT the ending index of the substring. 

For example, to extract the substring at indices 3 through 7 of the original `Val` string, we provide `3` as `Start` and `5` as `Length`, since the extracted substring is 5 characters long.

In [124]:
aq_pp -f,+1 $subStr -d s:val_str -eval 's:subStr' 'SubStr(val_str, 3, 5)' -c  val_str SubStr

"val_str","subStr"
"0123456789","34567"
"Hello World","lo Wo"


**Negative Index**<br>
Users can specify an index from the right side of the `Val` string by using negative indexing (similar to Python strings).

For example, say we'd like to extract the word "World" using a negative index. The letter "W" is the 5th character from the right side of the string, so we'll provide `-5` as `Start`.

aq_pp -f,+1 $subStr -d s:val_str -eval 's:subStr' 'SubStr(val_str, -5)' -c  val_str SubStr

The `Length` argument can also be a negative number. Let's provide `-2` as the `Length` parameter and `0` for `Start`.

In [125]:
aq_pp -f,+1 $subStr -d s:val_str -eval 's:subStr' 'SubStr(val_str, 0, -2)' -c  val_str SubStr

"val_str","subStr"
"0123456789","01234567"
"Hello World","Hello Wor"


This works just like a negative end index: the extracted substring ends before the 2nd character from the right side of the original string.
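
The `Start` and `Length` rules listed above can be condensed into a few lines of Python. This is only a sketch of the documented rules: `sub_str` is a hypothetical helper, and it counts characters, whereas `aq_pp` may count bytes for non-ASCII data.

```python
def sub_str(val: str, start: int, length=None) -> str:
    """Hypothetical re-implementation of the SubStr rules described above."""
    n = len(val)
    if start < 0:
        # Negative Start: add the length of Val, then clamp at 0.
        start = max(start + n, 0)
    start = min(start, n)
    max_len = n - start
    if length is None:
        # Length omitted: max length is assumed.
        length = max_len
    elif length < 0:
        # Negative Length: add max length, then clamp at 0.
        length = max(length + max_len, 0)
    length = min(length, max_len)
    return val[start:start + length]

# Print the four variants discussed in this section.
print(sub_str("Hello World", 3))
print(sub_str("Hello World", 3, 5))
print(sub_str("Hello World", -5))
print(sub_str("Hello World", 0, -2))
```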

**`ClipStr(Val, ClipSpec)`**: Returns a substring of a string, based on `ClipSpec`.

* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

* `ClipSpec` is a string constant that specifies how to clip the substring from the source. It is a sequence of individual clip elements separated by “;”.

Each clip element specifies either the starting or trailing portion of the source string `Val`. Below are some of the commonly used components.
- `Num`: number of bytes or separators (`Sep`) to clip.
- `Dir`: direction to clip the string, `>`: left to right, `<`: right to left.
- `Sep`: single-byte separator character. When given, the substring up to the `Num`-th occurrence of `Sep` is clipped.
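
To make the roles of `Num`, `Dir` and `Sep` concrete, here is a small Python sketch of a single clip element. `clip_str` is a hypothetical helper, not `aq_pp` code: it handles only the three components listed above, works on characters rather than bytes, and returning the whole string when fewer than `Num` separators exist is an assumption.

```python
def clip_str(val: str, num: int, direction: str, sep=None) -> str:
    """Hypothetical single-element ClipStr: clip num chars or num sep-delimited parts."""
    if sep is None:
        # No separator: clip num characters from the chosen side.
        return val[:num] if direction == ">" else val[max(len(val) - num, 0):]
    if direction == ">":
        pos = -1
        for _ in range(num):
            pos = val.find(sep, pos + 1)
            if pos == -1:
                return val  # assumption: not enough separators, keep everything
        return val[:pos + 1]  # the separator itself is included
    pos = len(val)
    for _ in range(num):
        pos = val.rfind(sep, 0, pos)
        if pos == -1:
            return val  # assumption: not enough separators, keep everything
    return val[pos:]  # the separator itself is included

url = "http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default"
print(clip_str(url, 5, ">"))        # first 5 characters
print(clip_str(url, 3, ">", "/"))   # up to the third "/"
print(clip_str(url, 2, "<", "/"))   # from the second "/" counted from the right
```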

Let's take a look at some simple examples.

We'll use a list of web URLs as an example here to demonstrate how `ClipSpec` can be useful. It is a single-column list of URL strings.

First, let's say you'd like to extract the first 5 characters of the URL. We can specify `ClipSpec` as `5>` where 
- `Num`: `5`
- `Dir`: `>`

The `val_str` column is provided as `Val`, and the result is stored in `subStr`. 

In [126]:
urls="data/aq_pp/clipstr.csv"
aq_pp -f,+1 $urls -d s:val_str -eval 's:subStr' 'ClipStr(val_str, "5>")' -c  val_str SubStr

"val_str","subStr"
"https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web","https"
"https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8","https"
"http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default","http:"


**`Sep`**<br>
Now, what if we'd like to extract everything up to and including the domain name from the URLs? We can do this using the `Sep` component. 

We will extract everything up to the third `/` character in the URLs by specifying `/` as `Sep` and `3` as `Num`.
Note that `Sep` is inclusive, so the extracted string includes the `Sep` as its last character.

In [127]:
aq_pp -f,+1 $urls -d s:val_str -eval 's:subStr' 'ClipStr(val_str, "3>/")' -c  val_str SubStr

"val_str","subStr"
"https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web","https://duckduckgo.com/"
"https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8","https://www.google.com/"
"http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default","http://auriq.com/"


**Different Direction**<br>

We can also clip the string from the right side. In this example we'll clip the trailing portion of the URL, back to the second `/` from the right, by providing `2` as `Num`, `<` as `Dir` and `/` as `Sep`.

In [128]:
aq_pp -f,+1 $urls -d s:val_str -eval 's:subStr' 'ClipStr(val_str, "2</")' -c  val_str SubStr

"val_str","subStr"
"https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web","/duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web"
"https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8","/www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8"
"http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default","/documentation/search.html?q=emod&check_keywords=yes&area=default"


**`StrIndex(Val, Str [, AtrLst])`**: Returns the position (zero-based) of the first occurrence of `Str` in `Val` or -1 if it is not found.

* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
* `Str` is the value to find within `Val`. It can be a string column’s name, a string constant, or an expression that evaluates to a string.
* Optional `AtrLst` is a list of `|` separated attributes containing:
    * `ncas` - Perform a case insensitive match (default is case sensitive). For ASCII data only.
    * `back` - Search backwards from the end of `Val`.
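
In Python terms, the default search corresponds to `str.find` and the `back` attribute to `str.rfind`; both return -1 when nothing is found. A minimal sketch, with `str_index` as a hypothetical helper and `ncas` approximated by lower-casing both strings:

```python
def str_index(val: str, s: str, ncas: bool = False, back: bool = False) -> int:
    """Hypothetical stand-in for StrIndex: zero-based position of s in val, or -1."""
    if ncas:
        # Case-insensitive match, approximated by lower-casing both sides.
        val, s = val.lower(), s.lower()
    return val.rfind(s) if back else val.find(s)

variety = "Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles"
print(str_index(variety, "Noodle"))             # first occurrence
print(str_index(variety, "Noodle", back=True))  # last occurrence
print(str_index(variety, "NOODLE", ncas=True))  # case-insensitive
```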

Let's check this out with the ramen dataset. We will provide the `variety` column as `Val` and "Noodle" as `Str` to get the index of that string in `Val`.

In [129]:
aq_pp -f,+1 $file -d $cols -eval 'is:isAt' 'StrIndex(variety, "Noodle")' -c variety isAt

"variety","isAt"
"T's Restaurant Tantanmen ",-1
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",0
"Cup Noodles Chicken Vegetable",4
"GGE Ramen Snack Tomato Flavor",-1
"Singapore Curry",-1
"Kimchi song Song Ramen",-1
"Spice Deli Tantan Men With Cilantro",-1
"Nabeyaki Kitsune Udon",-1
"Hokkaido Soy Sauce Ramen",-1


You can see that it outputs the index of the given string, counting from the left side of the `Val` string, or -1 when the string is not found.

**Attributes**<br>

**`back`**<br>
Note that the 2nd record (the 3rd row, counting the header) contains 2 occurrences of "Noodle", and the result is index 0. This is because `StrIndex()` only returns the first occurrence of the given string. 

We can search backwards from the end instead by giving the `back` attribute, like this.

In [130]:
aq_pp -f,+1 $file -d $cols -eval 'is:isAt' 'StrIndex(variety, "Noodle", back)' -c variety isAt

"variety","isAt"
"T's Restaurant Tantanmen ",-1
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",52
"Cup Noodles Chicken Vegetable",4
"GGE Ramen Snack Tomato Flavor",-1
"Singapore Curry",-1
"Kimchi song Song Ramen",-1
"Spice Deli Tantan Men With Cilantro",-1
"Nabeyaki Kitsune Udon",-1
"Hokkaido Soy Sauce Ramen",-1


The 2nd record's result is now 52, which is the index of the second occurrence of "Noodle".

**`ncas`**<br>

This attribute performs a case-insensitive search (the default is case sensitive). 

We'll provide "NOODLE" as `Str` this time, along with `ncas` as the attribute.

In [131]:
aq_pp -f,+1 $file -d $cols -eval 'is:isAt' 'StrIndex(variety, "NOODLE", ncas)' -c variety isAt

"variety","isAt"
"T's Restaurant Tantanmen ",-1
"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",0
"Cup Noodles Chicken Vegetable",4
"GGE Ramen Snack Tomato Flavor",-1
"Singapore Curry",-1
"Kimchi song Song Ramen",-1
"Spice Deli Tantan Men With Cilantro",-1
"Nabeyaki Kitsune Udon",-1
"Hokkaido Soy Sauce Ramen",-1


**`RxMap(Val, MapFrom [, Col, MapTo ...] [, AtrLst])`**: Extracts substrings from a string based on a `MapFrom` expression and places the results in columns based on `MapTo` expressions.

* Returns 1 if successful or 0 otherwise. MapFrom only needs to match a subpart of `Val` to be successful.

* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

* `MapFrom` is a string constant that specifies the regular expression to match. The expression should contain subexpressions for substring extractions.

* The `Col` and `MapTo` pairs define how to save the results. `Col` is the column to put the result in. It must be of string type. `MapTo` is a string constant that defines how to render the result. In the example below, `%%0%%` renders the entire matched string.

* Optional `AtrLst` is a list of `|` separated regular expression attributes.
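
For readers more familiar with Python, the core of this operation resembles `re.search` followed by reading a match group, where group 0 plays the role of the `%%0%%` placeholder used in this section's example. The sketch below is an analogy only: `rx_map` is a hypothetical helper, and it returns the extracted text instead of writing it to a column.

```python
import re

def rx_map(val: str, map_from: str):
    """Hypothetical analogue of RxMap: (1, whole match) on success, (0, None) otherwise."""
    m = re.search(map_from, val)
    if m is None:
        return 0, None
    return 1, m.group(0)  # group 0 is the entire match

# Two capital letters, one whitespace character, five digits
print(rx_map("836 Myers Road, South Cynthia, TN 70598", r"[A-Z]{2}\s[0-9]{5}"))
```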

Let's start with a simple example. 

We have an `address` column filled with fake addresses, and we'd like to extract the state and ZIP code. The data looks like this:

<a id='fake_address_data'></a>
**Fake Address Dataset**


fake_address|
-----|
06060 Cruz Loop Suite 043, Randyberg, WA 82176
11919 Wells Field Suite 087, East Dianaport, AL 96554
92586 Ferguson Inlet, Port Natalieview, HI 90811
836 Myers Road, South Cynthia, TN 70598
Unit 2992 Box 3756, DPO AE 65985
75721 Jo Bypass, Lake Kaitlin, FL 74395
9964 Justin Cliffs Apt. 446, Elizabethstad, MN 58843
499 Anderson Ridge, Pattersonton, TN 09233
USNS Harris, FPO AE 17643
741 Denise Motorway Suite 930, Desireeland, DC 76580

We will extract the state and ZIP code from each address using a regex. 
The expression `[A-Z]{2}\s[0-9]{5}` is provided as `MapFrom` to match 2 capital letters, followed by a whitespace character and a 5-digit number. 

In [132]:
# define the file path and column spec
fake_addrs="data/aq_pp/fake_addrs.csv"
aq_pp -f,+1 $fake_addrs -d S:address -eval S:State_zip '"96828 HI"' -eval - 'RxMap(address, "[A-Z]{2}\s[0-9]{5}", State_zip, "%%0%%", pcre)' -c State_zip

"State_zip"
"WA 82176"
"AL 96554"
"HI 90811"
"TN 70598"
"AE 65985"
"FL 74395"
"MN 58843"
"TN 09233"
"AE 17643"
"DC 76580"


**`PatMap(Val, MapFrom [, Col, MapTo ...] [, AtrLst])`**: Extracts substrings from `Val` based on the `MapFrom` pattern, and maps them to `Col` based on the `MapTo` pattern.

This function uses an [RT MapFrom](http://auriq.com/documentation/source/reference/manpages/aq_pp.html?highlight=pcre#rt-mapfrom-syntax) expression instead of a regex for the `MapFrom` value. 
Let's keep things simple and extract only the ZIP code this time.

In [133]:
aq_pp -f,+1 $fake_addrs -d S:address -eval S:Zip '"96828 HI"' -eval - 'PatMap(address, "%*%%ZIP:@n:5-5%%%*", Zip, "%%ZIP%%")' -c Zip

"Zip"
"82176"
"96554"
"90811"
"70598"
"65985"
"74395"
"58843"
"09233"
"17643"
"76580"


The `MapFrom` expression `%*%%ZIP:@n:5-5%%%*` was used here. 

- `%*`: matches any number of any characters
- `%%VarName%%`: a variable name of your choice, surrounded by double percent signs
- `:`: separates the attributes that specify the pattern to store in the variable
- `@n`: the character after `@` represents the character class; here `n` stands for numbers
- `5-5`: the minimum and maximum number of characters to match; here we'd like to extract exactly 5 characters
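
If you think in regular expressions, a rough analogue of this pattern is `.*(?P<ZIP>[0-9]{5}).*`. The Python sketch below relies on that analogy; the assumption that the leading `%*` is greedy (so the last 5-digit run is captured) is inferred from the output above rather than taken from the documentation, and `extract_zip` is a hypothetical helper.

```python
import re

# Rough regex analogue of %*%%ZIP:@n:5-5%%%* (an assumption, see the note above)
ZIP_RE = re.compile(r".*(?P<ZIP>[0-9]{5}).*")

def extract_zip(address: str):
    """Return the captured ZIP variable, or None when the pattern does not match."""
    m = ZIP_RE.fullmatch(address)
    return m.group("ZIP") if m else None

print(extract_zip("06060 Cruz Loop Suite 043, Randyberg, WA 82176"))
```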

**`KeyEnc(Col [, Col ...])`**: Encodes columns of various types into a single string.

* Returns a string key. The key is binary; do not try to interpret or modify it.
* `Col` are the columns to encode into the key.

This feature comes in handy when you'd like to create one composite key column out of multiple columns. Let's create a composite key from `reviewID`, `brand` and `country`.

Here are the selected columns of the data to be encoded, for your convenience.

Review|Brand|Country|
---|---|---|
2580|New Touch|Japan|
2579|Just Way|Taiwan|
2578|Nissin|USA|
2577|Wei Lih|Taiwan|
2576|Ching's Secret|India|
2575|Samyang Foods|South Korea|
2574|Acecook|Japan|
2573|Ikeda Shoku|Japan|
2572|Ripe'n'Dry|Japan|

In [134]:
aq_pp -f,+1 $file -d $cols -eval 's:key' 'KeyEnc(reviewID, brand, country)' -c key

"key"
"
  	New TouchJapan"
"
 Just WayTaiwan"
"
  NissinUSA"
"
  Wei LihTaiwan"
"
  Ching's SecretIndia"
"
Samyang FoodsSouth Korea"
"
  AcecookJapan"
"
  Ikeda ShokuJapan"
"
  
Ripe'n'DryJapan"


**`KeyDec(Key, Col|"ColType" [, Col|"ColType" ...])`**: Decodes a key previously encoded by `KeyEnc()` and places the resulting components in the given columns.

* Returns 1 if successful. A failure is considered a processing error. There is no failure return value.
* `Key` is the previously encoded value. It can be a string column’s name, a string constant or an expression that evaluates to a string.
* Each `Col` or `ColType` specifies a component in the key.
    * If a column is given, a component matching the column’s type is expected; the extracted value will be placed in the given column.
    * If a column type string is given, a component matching this type is expected, but the extracted value will not be saved.
* The components must be given in the same order as in the encoding call.

In [135]:
aq_pp -f,+1 $file -d $cols -eval 's:key' 'KeyEnc(reviewID, brand, country)' -c key | \
aq_pp -f,+1 - -d s:key -eval I:dec_ID '100' -eval S:dec_brand '"prada"' -eval S:dec_country '"South America"' \
-eval - 'KeyDec(key, dec_ID, dec_brand, dec_country)' -c dec_ID dec_brand dec_country

"dec_ID","dec_brand","dec_country"
2580,"New Touch","Japan"
2579,"Just Way","Taiwan"
2578,"Nissin","USA"
2577,"Wei Lih","Taiwan"
2576,"Ching's Secret","India"
2575,"Samyang Foods","South Korea"
2574,"Acecook","Japan"
2573,"Ikeda Shoku","Japan"
2572,"Ripe'n'Dry","Japan"


The 1st line encodes the 3 columns into one string column named `key`, then outputs only that column.
The 2nd line reads the key from the first `aq_pp` command, then sets up destination columns to receive the decoded values.
The 3rd line decodes the key into the destination columns, and outputs the 3 columns.

You can verify that the contents of the output columns match the original columns' contents.
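
`KeyEnc()` and `KeyDec()` use an internal binary format that this notebook does not describe. Purely to illustrate the idea of packing several typed columns into one opaque key and unpacking them in the same order, here is a Python sketch with a made-up layout (an 8-byte integer followed by length-prefixed UTF-8 strings). The layout and the helper names `key_enc` and `key_dec` are assumptions, not the `aq_pp` format.

```python
import struct

def key_enc(review_id: int, brand: str, country: str) -> bytes:
    """Pack one integer and two strings into a single binary key (illustrative layout)."""
    parts = [struct.pack("<q", review_id)]
    for text in (brand, country):
        raw = text.encode("utf-8")
        parts.append(struct.pack("<I", len(raw)) + raw)  # length prefix, then bytes
    return b"".join(parts)

def key_dec(key: bytes):
    """Unpack the components in the same order they were encoded."""
    (review_id,) = struct.unpack_from("<q", key, 0)
    offset = 8
    texts = []
    for _ in range(2):
        (size,) = struct.unpack_from("<I", key, offset)
        offset += 4
        texts.append(key[offset:offset + size].decode("utf-8"))
        offset += size
    return (review_id, texts[0], texts[1])

key = key_enc(2580, "New Touch", "Japan")
print(key_dec(key))
```

As with `KeyDec()`, decoding only works when the components are read back in the order and with the types used for encoding.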

**`QryDec(Val [, AtrLst], Col, KeyName [, AtrLst] [, Col, KeyName [, AtrLst] ...])`**:<br>
Given a URL string as `Val`, extracts the values of selected [query parameters](https://en.wikipedia.org/wiki/Query_string) and places the results in columns. Returns the number of parameters extracted.

As a simple example, we'll extract the `q` parameter from each URL string below and store it in the `result` column.

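**Note:** In the output below, only the second URL is decoded (`e_num` is 1). Nothing is extracted from the other two URLs, in which `q` is the first parameter immediately after `?`. The cause is not established in this notebook; the `AtrLst` options described in the documentation may be relevant.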

In [136]:
aq_pp -f,+1 $urls -d S:URL S:result -eval 'I:e_num' 0 -eval e_num 'QryDec(URL, result, "q")'

"URL","result","e_num"
"https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web",,0
"https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8","hello world",1
"http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default",,0
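
As a cross-check outside `aq_pp`, the sketch below parses the same three URLs with Python's standard `urllib.parse`. It is not how `QryDec()` is implemented; it only shows the `q` value each URL carries. The helper name `query_value` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

urls = [
    "https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web",
    "https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8",
    "http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default",
]

def query_value(url, key):
    """Return the first value of `key` in the URL's query string, or None."""
    # urlsplit isolates the part after '?'; parse_qs decodes '+' to a space.
    params = parse_qs(urlsplit(url).query)
    values = params.get(key)
    return values[0] if values else None

q_values = [query_value(u, "q") for u in urls]
for u, q in zip(urls, q_values):
    print(repr(q), "<-", u)
```

Here the query string is isolated from the rest of the URL before the parameters are looked up, so all three URLs yield a `q` value.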


**`UrlEnc(Val)`**: URL-encode a string.

* Returns the encoded result.
* `Val` is the string to encode. It can be a string column’s name, a string constant or an expression that evaluates to a string.

We'll encode the URLs used in the `QryDec()` example above, and store the result in the `encoded` column.

In [138]:
aq_pp -f,+1 $urls -d s:val_str -eval 's:encoded' 'UrlEnc(val_str)' -c encoded

"encoded"
"https%3A//duckduckgo.com/%3Fq%3Dis%2Bduckduckgo%2Bsafe%26t%3Dh_%26ia%3Dweb"
"https%3A//www.google.com/search%3Fclient%3Dubuntu%26channel%3Dfs%26q%3Dhello%2Bworld%26ie%3Dutf-8%26oe%3Dutf-8"
"http%3A//auriq.com/documentation/search.html%3Fq%3Demod%26check_keywords%3Dyes%26area%3Ddefault"


**`UrlDec(Val)`**: Decodes a URL-encoded string.

* Returns the decoded result.
* `Val` is a URL-encoded string. It can be a string column’s name, a string constant or an expression that evaluates to a string.

We'll first encode the URLs with `UrlEnc()` as in the previous example, pipe the result to a second `aq_pp` command, and decode it there with `UrlDec()`. The final result is stored in the `decoded_url` column and output.

In [139]:
aq_pp -f,+1 $urls -d s:val_str -eval 's:encoded' 'UrlEnc(val_str)' -c encoded | \
aq_pp -f,+1 - -d s:encoded -eval 's:decoded_url' 'UrlDec(encoded)' -c decoded_url

"decoded_url"
"https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web"
"https://www.google.com/search?client=ubuntu&channel=fs&q=hello+world&ie=utf-8&oe=utf-8"
"http://auriq.com/documentation/search.html?q=emod&check_keywords=yes&area=default"


We can see that the decoded URLs are identical to the original URLs before encoding was applied.
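
The round trip can be cross-checked outside `aq_pp` with Python's standard `urllib.parse`. This is only a sketch: `quote()` with its default settings leaves `/` unescaped, and for the URL below it yields the same string as the `UrlEnc()` output above, but the two encoders are not guaranteed to agree on every input.

```python
from urllib.parse import quote, unquote

url = "https://duckduckgo.com/?q=is+duckduckgo+safe&t=h_&ia=web"

# Percent-encode everything except letters, digits, '_', '.', '-', '~' and '/'.
encoded = quote(url)
# Reverse the percent-encoding.
decoded = unquote(encoded)

print(encoded)
print(decoded == url)
```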

**`Base64Enc(Val)`**: Base64-encode a string.

* Returns the encoded result.
* `Val` is the string to encode. It can be a string column's name, a string constant or an expression that evaluates to a string.

Using the ramen data, we'll encode the `country` column.

In [144]:
aq_pp -f,+1 $file -d $cols -eval 'S:encoded64' 'Base64Enc(country)' -c country encoded64

"country","encoded64"
"Japan","SmFwYW4="
"Taiwan","VGFpd2Fu"
"USA","VVNB"
"Taiwan","VGFpd2Fu"
"India","SW5kaWE="
"South Korea","U291dGggS29yZWE="
"Japan","SmFwYW4="
"Japan","SmFwYW4="
"Japan","SmFwYW4="


**`Base64Dec(Val)`**: Decodes a base64-encoded string.

* Returns the decoded result. There is no integrity check. Portions of `Val` that are not base64-encoded are simply skipped. As a result, the function may return a blank string.

* `Val` is a base64-encoded string. It can be a string column's name, a string constant or an expression that evaluates to a string.

Let's decode what we encoded in the example above.

In [146]:
aq_pp -f,+1 $file -d $cols -eval 'S:encoded64' 'Base64Enc(country)' -c country encoded64 | \
aq_pp -f,+1 - -d s:country s:encoded64 -eval 's:decoded64' 'Base64Dec(encoded64)' -c country decoded64

"country","decoded64"
"Japan","Japan"
"Taiwan","Taiwan"
"USA","USA"
"Taiwan","Taiwan"
"India","India"
"South Korea","South Korea"
"Japan","Japan"
"Japan","Japan"
"Japan","Japan"


You can see that the decoded values match the original `country` column.
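
The same round trip can be reproduced with Python's standard `base64` module, as a quick way to confirm the encoded values shown above. This is a sketch for comparison only; unlike `Base64Dec()` as described above, `b64decode` is not guaranteed to skip invalid portions silently.

```python
import base64

countries = ["Japan", "Taiwan", "USA", "India", "South Korea"]

# Encode each country name, then decode it again.
encoded = [base64.b64encode(c.encode("ascii")).decode("ascii") for c in countries]
decoded = [base64.b64decode(e).decode("ascii") for e in encoded]

for c, e, d in zip(countries, encoded, decoded):
    print(c, e, d)
```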

<a id='conversion'></a>
### General Data Conversion Functions

Several data conversion functions convert a value into another type:
- `ToIP(Val)`: IP type
- `ToS(Val)`: String type
- `ToI(Val)`: Integer type
- `ToF(Val)`: Float type

Each of these takes `Val` as its argument and returns the value as the corresponding data type.
For concrete examples of these functions, refer to the Data Conversion section in the `aq_pp -eval` notebook.

The remaining functions in this section manipulate string values. Let's take a look at each of them.


**`ToUpper(Val), ToLower(Val)`**: Returns the upper or lower case string representation of `Val`.

- For ASCII strings only. May corrupt multibyte character strings.
- `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

We'll convert the contents of the `style` column in the Ramen dataset to upper case.
`ToLower(Val)` can be used in the same manner.

In [51]:
aq_pp -f,+1 $file -d $cols -eval 's:upper' 'ToUpper(style)' -c style upper

"style","upper"
"Cup","CUP"
"Pack","PACK"
"Cup","CUP"
"Pack","PACK"
"Pack","PACK"
"Pack","PACK"
"Cup","CUP"
"Tray","TRAY"
"Pack","PACK"



**`MaskStr(Val)`**: Irreversibly masks (or obfuscates) a string value. The result should be nearly as unique as the original (the probability of two different values having the same masked value is extremely small).

* `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.
* The length of the result may be the same or longer than the original.

Let's apply this function to the `style` column as well. Note that identical original values are masked to the same string.

In [53]:
aq_pp -f,+1 $file -d $cols -eval 's:masked' 'MaskStr(style)' -c style masked

"style","masked"
"Cup","YXl3"
"Pack","_9iZFR"
"Cup","YXl3"
"Pack","_9iZFR"
"Pack","_9iZFR"
"Pack","_9iZFR"
"Cup","YXl3"
"Tray","E4bYAo"
"Pack","_9iZFR"


**`RxReplace(Val, RepFrom, Col, RepTo [, AtrLst])`**: Replaces the first or all occurrences of a substring in `Val` matching expression `RepFrom` with expression `RepTo` and places the result in `Col`.

- Returns the number of replacements performed or 0 if there is no match.

- `Val` can be a string column’s name, a string constant, or an expression that evaluates to a string.

- `RepFrom` is a string constant that specifies the regular expression to match. Substring(s) matching this expression will be replaced. The expression can contain subexpressions that can be referenced in `RepTo`.

- `Col` is the column to put the result in. It must be of string type.

Again, we will use the fake address dataset.

**Fake Address Dataset**


fake_address|
-----|
06060 Cruz Loop Suite 043, Randyberg, WA 82176
11919 Wells Field Suite 087, East Dianaport, AL 96554
92586 Ferguson Inlet, Port Natalieview, HI 90811
836 Myers Road, South Cynthia, TN 70598
Unit 2992 Box 3756, DPO AE 65985
75721 Jo Bypass, Lake Kaitlin, FL 74395
9964 Justin Cliffs Apt. 446, Elizabethstad, MN 58843
499 Anderson Ridge, Pattersonton, TN 09233
USNS Harris, FPO AE 17643
741 Denise Motorway Suite 930, Desireeland, DC 76580

In the example below, we will do 2 things.
1. Find the state and zip code in the `address` column, rewrite them in the format `STATE: WA, ZIP: 82176`, and store the resulting string in a new column called `replaced`.
2. Store the number of replacements performed on each row in an integer column, `num_rep`.

In [120]:
aq_pp -f,+1 $fake_addrs -d S:address \
    -eval 's:replaced' '"SZ"' -eval 'i:num_rep' '0' \
    -eval num_rep 'RxReplace(address, "([A-Z]{2})\s([0-9]{5})", replaced, "STATE: %%1%%, ZIP: %%2%%", pcre)' -c replaced num_rep

"replaced","num_rep"
"6060 Cruz Loop Suite 043 Randyberg STATE: WA, ZIP: 82176",1
"1919 Wells Field Suite 087 East Dianaport STATE: AL, ZIP: 96554",1
"2586 Ferguson Inlet Port Natalieview STATE: HI, ZIP: 90811",1
"836 Myers Road South Cynthia STATE: TN, ZIP: 70598",1
"Unit 2992 Box 3756 DPO STATE: AE, ZIP: 65985",1
"5721 Jo Bypass Lake Kaitlin STATE: FL, ZIP: 74395",1
"9964 Justin Cliffs Apt. 446 Elizabethstad STATE: MN, ZIP: 58843",1
"499 Anderson Ridge Pattersonton STATE: TN, ZIP: 09233",1
"USNS Harris FPO STATE: AE, ZIP: 17643",1
"741 Denise Motorway Suite 930 Desireeland STATE: DC, ZIP: 76580",1


* The 1st line specifies the input file and its column.
* The 2nd line creates the columns that store the results.
* The 3rd line applies `RxReplace()` to the `address` column and outputs the two result columns.
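
For comparison, the same substitution can be sketched with Python's `re.subn`, which likewise yields both the rewritten string and the number of replacements. The `\1` and `\2` back-references stand in for `%%1%%` and `%%2%%`, the helper name `tag_state_zip` is hypothetical, and replacing only the first match is an assumption made for this sketch. The input string is copied from the table above.

```python
import re

# Two capital letters, one whitespace character, then five digits.
STATE_ZIP = re.compile(r"([A-Z]{2})\s([0-9]{5})")

def tag_state_zip(address):
    """Rewrite the first 'ST 12345' match; return (new_string, count)."""
    return STATE_ZIP.subn(r"STATE: \1, ZIP: \2", address, count=1)

replaced, num_rep = tag_state_zip("836 Myers Road, South Cynthia, TN 70598")
print(replaced, num_rep)

# No match: the string is returned unchanged and the count is 0.
unchanged, zero = tag_state_zip("no state or zip here")
print(unchanged, zero)
```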

**`RxRep(Val, RepFrom, RepTo [, AtrLst])`**: The same as `RxReplace()` except that it returns the result string directly (for this reason, it does not have `RxReplace()`'s `Col` argument).

Let's apply this to the same column as above, this time without recording the number of replacements performed.

In [123]:
aq_pp -f,+1 $fake_addrs -d S:address -eval 's:replaced' '"SZ"' \
    -eval replaced 'RxRep(address, "([A-Z]{2})\s([0-9]{5})", "STATE: %%1%%, ZIP: %%2%%", pcre)' -c replaced

"replaced"
"6060 Cruz Loop Suite 043 Randyberg STATE: WA, ZIP: 82176"
"1919 Wells Field Suite 087 East Dianaport STATE: AL, ZIP: 96554"
"2586 Ferguson Inlet Port Natalieview STATE: HI, ZIP: 90811"
"836 Myers Road South Cynthia STATE: TN, ZIP: 70598"
"Unit 2992 Box 3756 DPO STATE: AE, ZIP: 65985"
"5721 Jo Bypass Lake Kaitlin STATE: FL, ZIP: 74395"
"9964 Justin Cliffs Apt. 446 Elizabethstad STATE: MN, ZIP: 58843"
"499 Anderson Ridge Pattersonton STATE: TN, ZIP: 09233"
"USNS Harris FPO STATE: AE, ZIP: 17643"
"741 Denise Motorway Suite 930 Desireeland STATE: DC, ZIP: 76580"


<a id='date_time'></a>
### Date/Time conversion Functions

**`DateToTime(DateVal, DateFmt)`**, **`GmDateToTime(DateVal, DateFmt)`**: Both functions take a date string `DateVal` and return the corresponding [UNIX time](https://en.wikipedia.org/wiki/Unix_time) in integral seconds.

- `DateVal` can be a string column’s name, a string constant, or an expression that evaluates to a string.
- `DateFmt` is a string constant that specifies the format of `DateVal`.

The example below converts the values of a date column into UNIX time and stores them in a new column (`Unix_time`).


In [133]:
date_data="data/aq_pp/dates.csv"
aq_pp -f,+1 $date_data -d S:date -eval 'I:Unix_time' 'DateToTime(date, "%Y.%m.%d")'

"date","Unix_time"
"2015-04-03",1428019200
"2015-08-08",1438992000
"2015-08-23",1440288000
"2015-12-28",1451260800
"2016-03-21",1458518400
"2016-04-11",1460332800
"2016-05-02",1462147200
"2016-05-15",1463270400
"2016-11-02",1478044800
"2016-11-04",1478217600
"2016-12-04",1480809600
"2017-04-26",1493164800
"2017-05-15",1494806400
"2017-10-23",1508716800
"2017-12-18",1513555200
"2018-02-21",1519171200
"2018-08-07",1533600000
"2018-08-07",1533600000
"2018-10-05",1538697600
"2018-12-03",1543795200


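The `DateFmt` in the example above uses the first four of the following conversion codes. The time-of-day codes appear in the `TimeToDate()` example further down.

- `.` (a dot) - represents a single unwanted character (e.g., a separator).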
- `%Y` - 1-4 digit year.
- `%m` - Month in 1-12.
- `%d` - Day of month in 1-31.

- `%H` or `%I` - hour in 0-23 or 1-12.
- `%M` - Minute in 0-59.
- `%S` - Second in 0-59.

We only covered a simple example, but more conversion codes are available.
For more details, please refer to the [Date/Time conversion - aq-emod](http://auriq.com/documentation/source/reference/manpages/aq-emod.html#date-time-conversion-functions) documentation.
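
The values in the `Unix_time` column can be cross-checked with Python's standard `datetime` module. This sketch assumes the dates are interpreted as midnight UTC, which is consistent with the output above. Python's `strptime` has no "skip one character" code like the dot, so the `-` separators are written literally. The helper name `date_to_unix` is hypothetical.

```python
from datetime import datetime, timezone

def date_to_unix(date_str, fmt="%Y-%m-%d"):
    """Parse date_str and return seconds since the epoch, treating it as UTC."""
    parsed = datetime.strptime(date_str, fmt)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

first = date_to_unix("2015-04-03")
last = date_to_unix("2018-12-03")
print(first, last)
```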

**`TimeToDate(TimeVal, DateFmt)`**, **`TimeToGmDate(TimeVal, DateFmt)`**: Both functions return the date string corresponding to `TimeVal`. The result string’s maximum length is 127.

- `TimeVal` can be a numeric column’s name, a numeric constant, or an expression that evaluates to a number.
- `DateFmt` is a string constant that specifies the format of the output. 
- Conversion is timezone dependent. It is done using the program’s default timezone. Set the program’s timezone, e.g., via the `TZ` environment variable, before execution if necessary.

The example below converts a column of Unix time values into date strings such as "2019-10-21 16:17:56".

In [134]:
unix_data="data/aq_pp/Unix_time.csv"
aq_pp -f,+1 $unix_data -d I:Unix_time -eval 'S:Date' 'TimeToDate(Unix_time, "%Y-%m-%d %H:%M:%S")'

"Unix_time","Date"
119731017,"1973-10-17 18:36:57"
1000000000,"2001-09-09 01:46:40"
1111111111,"2005-03-18 01:58:31"
2000000000,"2033-05-18 03:33:20"
2147483647,"2038-01-19 03:14:07"


You can also change the timezone through the `TZ` environment variable. For example, to use Japan's timezone:

In [135]:
TZ="Japan" aq_pp -f,+1 $unix_data -d I:Unix_time -eval 'S:Date' 'TimeToDate(Unix_time, "%Y-%m-%d %H:%M:%S")'

"Unix_time","Date"
119731017,"1973-10-18 03:36:57"
1000000000,"2001-09-09 10:46:40"
1111111111,"2005-03-18 10:58:31"
2000000000,"2033-05-18 12:33:20"
2147483647,"2038-01-19 12:14:07"
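
The effect of the timezone can be reproduced with Python's standard `datetime` module. This sketch models the timezone as a fixed UTC offset, with +9 hours standing in for `TZ="Japan"`; that is an assumption that holds for these timestamps but would not cover zones with daylight saving time. The helper name `unix_to_date` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def unix_to_date(ts, utc_offset_hours=0, fmt="%Y-%m-%d %H:%M:%S"):
    """Format a Unix timestamp in a fixed-offset timezone (UTC by default)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(ts, tz).strftime(fmt)

utc_date = unix_to_date(1000000000)
japan_date = unix_to_date(1000000000, utc_offset_hours=9)
print(utc_date)
print(japan_date)
```

The same timestamp maps to two different date strings, nine hours apart, which is why the timezone should be set explicitly when the output must be reproducible.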






<a id='character_encoding'></a>
### Character set encoding conversion Functions



## multiple -eval options
