[Back to README](README.ipynb)

# aq_pp command -filt

In this notebook, we'll go over common usage examples of `-filt`, an option of the data preprocessing command `aq_pp`.

## Overview

The `-filt` option of the `aq_pp` command filters (selects) records (rows) based on a given conditional statement, which is called a `FilterSpec` in the `aq_tool` ecosystem. If the `FilterSpec` evaluates to true for a record, the record is selected. Otherwise it is discarded.

You can think of it as the equivalent of boolean indexing in pandas (`df.loc[condition]`) or of SQL's `WHERE` clause.

### Prerequisites
Before going over this notebook, make sure you're familiar with the following concepts.

* Bash commands
* Regular expressions
* aq_input / input-spec
* String manipulation with `aq_pp` (only needed for the string comparison section)

If you are not familiar with any of the above, resources can be found in
- [aq-input](aq_input.ipynb)
- [aq-output notebook](aq_output.ipynb)
- aq_pp string manipulation

Also keep the [aq_pp documentation](http://auriq.com/documentation/source/reference/manpages/aq_pp.html) at hand, so you can refer to the details of each option as needed.

**If you are familiar with the syntax already, feel free to skip to the [data section](#data).**<br>

### Syntax

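```aq_pp -f ... -filt 'FilterSpec' ...```

where `FilterSpec` is the conditional expression. Note that the `FilterSpec` needs to be single quoted, so that the shell passes it to `aq_pp` unchanged instead of interpreting its operators itself.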

### Filter Spec
A filter spec has the following components.

```ColName|Constant compare ColName|Constant```

The values on both sides (`ColName|Constant`) can be either
* a column / variable name, **without quotation marks**
* a string, number or IP address constant (strings need to be quoted)


`compare` is the comparison operator:
* `==`, `>`, `<`, `>=`, `<=`: comparison.
* `!`: negation of a filter spec; it is placed in front of the spec.
* Other operators are also available; please refer to the [aq_pp documentation](http://auriq.com/documentation/source/reference/manpages/aq_pp.html#filt).


**Example of Filter Spec**<br>

Assume that we have an integer column named `age` holding people's ages, and we'd like to keep only the records with an age strictly greater than 18. We can use this option as

`aq_pp ... -filt 'age>18' ... `
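
To make the semantics concrete, here is a minimal sketch that mimics `'age>18'` with plain `awk` rather than `aq_pp`. The sample rows and their `name,age` layout are made up for illustration and are not part of this notebook's dataset.

```shell
# Hypothetical sample data in name,age form (not from the review dataset).
sample='alice,17
bob,18
carol,19
dave,42'

# Keep only the rows whose second field is strictly greater than 18.
# This mirrors what -filt 'age>18' is described as doing: bob (exactly 18) is dropped.
kept=$(printf '%s\n' "$sample" | awk -F, '$2 > 18')
printf '%s\n' "$kept"
```

Only the rows for `carol` and `dave` are printed, because the comparison is exclusive.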


<a id='data'></a>
## Data

We'll be using a small portion of the [Amazon review dataset](https://s3.amazonaws.com/amazon-reviews-pds/readme.html). 
Only the necessary columns are kept, and records are sampled from the original dataset. 
Below is a preview of the dataset.

marketplace|product_category|product_title|star_rating|verified_purchase|helpful_votes|
-----|-----|-----|-----|-----|-----|
UK|Toys|Elsa Musical Wand|5.0|1|0.0|
US|Mobile_Apps|The Cursed Ship, Collector’s Edition|3.0|1|0.0|
US|Digital_Music_Purchase|Transit Of Venus|4.0|1|1.0|
US|PC|Griffin 3m USB to Lightning Cable (GC36633)|1.0|1|0.0|
DE|Toys|Lego Duplo Winnie the Pooh 5945 - Winnie Poohs Picknick|5.0|1|0.0|
US|Mobile_Apps|Crossy Road|5.0|1|0.0|
US|Toys|Schleich Tyrannosaurus Rex|5.0|1|0.0|
US|Digital_Music_Purchase|Mylo Xyloto|5.0|0|0.0|
US|Digital_Music_Purchase|Need You Now|5.0|1|0.0|
US|Digital_Music_Purchase|Rhythm & Blues|3.0|1|0.0|

**Column Details of the Dataset**<br>

What do these columns mean?

Column Name|Data Type|Description|Unique Values/Range|
-----|-----|-----|-----|
marketplace|String|Amazon marketplace|US, UK, DE, JP, FR|
product_category|String|Category that the product belongs to|9 distinct categories|
product_title|String|Product's title|N/A|
star_rating|Float|Star rating received|1.0 ~ 5.0|
verified_purchase|Integer|Whether the purchase was verified|1 for verified, 0 for not|
helpful_votes|Float|Number of helpful votes per review|0.0 ~ 568.0|


**Column Spec**<br>
Setting up the column spec for the dataset, we have:

```S:MarketPlace S:product_category S:product_title F:star_rating I:verified_purchase F:helpful_vote```



Now that we are ready, let's get started on the samples!

## Samples

### Loading the Data

In [2]:
# set up the column spec and file path, and display the data on stdout
cols="S:MarketPlace S:product_category S:product_title F:star_rating I:verified_purchase F:helpful_vote"
file="data/aq_pp/amazon_review_binary.csv"

# only displaying the header and the top 10 records
aq_pp -f,+1 $file -d $cols | head -n 11

"MarketPlace","product_category","product_title","star_rating","verified_purchase","helpful_vote"
"UK","PC","Amazon Zip Sleeve for 7-Inch Tablets",4,1,0
"UK","Mobile_Apps","Facebook",3,1,0
"US","Digital_Music_Purchase","Hail to the King",5,1,0
"US","Musical Instruments","BEHRINGER TU300",5,1,0
"US","Digital_Music_Purchase","Rhythm & Blues",3,1,0
"US","Mobile_Apps","Candy Crush Saga",5,1,1
"US","Digital_Music_Purchase","x (Deluxe Edition)",5,1,0
"US","Mobile_Apps","Buttons and Scissors (Pro)",4,1,0
"US","Mobile_Apps","Handy Photo",5,1,0
"DE","Toys","STAR WARS Unisex Wecker Analog Kunststoff weiß STAR3",1,1,10

## Table of Samples

- [Numerical Comparison](#num_cmp)
- [String Comparison](#string_cmp)
    - [Operators](#str_ops)
    - [aq-emod](#emod)
- [RowNum](#rowNum)

<a id='num_cmp'></a>
### Numerical Comparison

Here we'll select records by a certain value or range of a numeric column.

Let's say we only want to look at reviews with a rating of 4 stars or more. <br>
We can build the `FilterSpec` from the column name `star_rating`, a comparison operator and a constant. Because the ratings in this dataset are whole numbers, `star_rating > 3` selects exactly those reviews. We'll output only the product category and rating for clarity.

In [3]:
# piping the result into the head command to display the header and the top 10 results
aq_pp -f,+1 $file -d $cols -filt 'star_rating > 3' -c product_category star_rating | head -n 11

"product_category","star_rating"
"PC",4
"Digital_Music_Purchase",5
"Musical Instruments",5
"Mobile_Apps",5
"Digital_Music_Purchase",5
"Mobile_Apps",4
"Mobile_Apps",5
"Mobile_Apps",5
"Toys",5
"PC",4

We've successfully extracted the records rated 4 stars or higher. 
Passing `star_rating >= 4` gives the same result.

In [4]:
aq_pp -f,+1 $file -d $cols -filt 'star_rating >= 4' -c product_category star_rating | head -n 11

"product_category","star_rating"
"PC",4
"Digital_Music_Purchase",5
"Musical Instruments",5
"Mobile_Apps",5
"Digital_Music_Purchase",5
"Mobile_Apps",4
"Mobile_Apps",5
"Mobile_Apps",5
"Toys",5
"PC",4

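The conditions `star_rating > 3` and `star_rating >= 4` agree here only because every rating is a whole number. As a rough illustration with `awk` (not `aq_pp`), and with a made-up list of values that includes a fractional rating, the two comparisons diverge:

```shell
# Hypothetical ratings, one per line; a value such as 3.5 does not occur in the review dataset.
ratings='3
3.5
4
5'

# Strictly greater than 3 keeps 3.5 as well.
gt3=$(printf '%s\n' "$ratings" | awk '$1 > 3')
# Greater than or equal to 4 does not.
ge4=$(printf '%s\n' "$ratings" | awk '$1 >= 4')

echo "rows kept by > 3:";  printf '%s\n' "$gt3"
echo "rows kept by >= 4:"; printf '%s\n' "$ge4"
```
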
Now we'd like to make sure that all the reviews are legitimate. We can make sure that the purchase is verified by keeping only the records with `verified_purchase == 1`. Let's take a look.

In [5]:
aq_pp -f,+1 $file -d $cols -filt 'verified_purchase == 1' -c product_category verified_purchase | head -n 11

"product_category","verified_purchase"
"PC",1
"Mobile_Apps",1
"Digital_Music_Purchase",1
"Musical Instruments",1
"Digital_Music_Purchase",1
"Mobile_Apps",1
"Digital_Music_Purchase",1
"Mobile_Apps",1
"Mobile_Apps",1
"Toys",1

**Negation Operator**<br>

You can negate a `FilterSpec` to select exactly the records it would otherwise discard. For instance, by negating the previous filter spec of verified purchase, we are able to extract only NON-verified purchases.

In [6]:
aq_pp -f,+1 $file -d $cols -filt '!(verified_purchase == 1)' -c product_category verified_purchase | head -n 11

"product_category","verified_purchase"
"Digital_Music_Purchase",0
"Digital_Music_Purchase",0
"Toys",0
"Digital_Music_Purchase",0
"Digital_Music_Purchase",0
"PC",0
"Toys",0
"Toys",0
"Digital_Music_Purchase",0
"Toys",0

This comes in very handy when you'd like to negate a group of multiple filter specs, which are introduced below.

**Combinations of Numerical Comparisons**<br>

Let's combine the filters from above to refine our search further. We'd like our reviews to be **BOTH**
- 4 stars or more
- verified purchases (`verified_purchase` equal to 1)


To do this, we can combine these 2 conditions using the `&&` operator like below.

In [13]:
aq_pp -f,+1 $file -d $cols -filt 'star_rating >= 4 && verified_purchase == 1' -c product_category star_rating verified_purchase | \
head -n 20

"product_category","star_rating","verified_purchase"
"PC",4,1
"Digital_Music_Purchase",5,1
"Musical Instruments",5,1
"Mobile_Apps",5,1
"Digital_Music_Purchase",5,1
"Mobile_Apps",4,1
"Mobile_Apps",5,1
"Mobile_Apps",5,1
"Toys",5,1
"PC",4,1
"Toys",5,1
"Toys",5,1
"Mobile_Apps",4,1
"Digital_Music_Purchase",4,1
"Toys",4,1
"Digital_Music_Purchase",5,1
"Toys",5,1
"Digital_Music_Purchase",5,1
"Mobile_Apps",4,1


You can also negate the whole filter by wrapping it in parentheses and adding `!` at the very beginning of the `FilterSpec`, like below.

In [14]:
aq_pp -f,+1 $file -d $cols -filt '!(star_rating >= 4 && verified_purchase == 1)' -c product_category star_rating verified_purchase | \
head -n 20

"product_category","star_rating","verified_purchase"
"Mobile_Apps",3,1
"Digital_Music_Purchase",3,1
"Toys",1,1
"Mobile_Apps",1,1
"Toys",3,1
"Digital_Music_Purchase",5,0
"Mobile_Apps",1,1
"Digital_Music_Purchase",2,0
"Toys",5,0
"Digital_Music_Purchase",1,0
"Digital_Music_Purchase",5,0
"Mobile_Apps",2,1
"PC",5,0
"Toys",3,1
"Toys",5,0
"Toys",4,0
"Toys",1,1
"Digital_Music_Purchase",5,0
"Mobile_Apps",1,1

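Negating a combined condition follows De Morgan's law: `!(A && B)` selects the same rows as `!A || !B`. The sketch below checks this with `awk` on a few made-up rows in `category,star_rating,verified_purchase` form; it illustrates the logic only and does not call `aq_pp`.

```shell
# Hypothetical rows: category,star_rating,verified_purchase
reviews='PC,4,1
Toys,5,0
Mobile_Apps,3,1
Toys,2,0'

# Negation of the whole condition ...
negated=$(printf '%s\n' "$reviews" | awk -F, '!($2 >= 4 && $3 == 1)')
# ... and the equivalent form with each part negated and && turned into ||.
expanded=$(printf '%s\n' "$reviews" | awk -F, '$2 < 4 || $3 != 1')

printf '%s\n' "$negated"
[ "$negated" = "$expanded" ] && echo "both forms select the same rows"
```

Every row except `PC,4,1` is kept: a row passes as soon as it fails either of the two original conditions.
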

More complicated chains of `FilterSpec`s are supported with the use of `()`. The following can be used when combining `FilterSpec`s:
- `&&`: logical AND
- `||`: logical OR
- `()`: parentheses to group expressions together.

Let's see them in action. We'd like our reviews to be
- star_rating: 5 OR 1, **AND**
- verified, **AND** have at least 10 helpful votes

As a whole `FilterSpec`, this looks like the following.<br>
```'(star_rating == 1 || star_rating == 5) && (verified_purchase == 1 && helpful_vote >= 10)'```

In [15]:
aq_pp -f,+1 $file -d $cols -filt '(star_rating == 1 || star_rating == 5) && (verified_purchase == 1 && helpful_vote >= 10)' \
-c product_category star_rating verified_purchase helpful_vote | head -n 20

"product_category","star_rating","verified_purchase","helpful_vote"
"Toys",1,1,10
"Mobile_Apps",1,1,11
"Mobile_Apps",1,1,37
"Toys",1,1,15
"Mobile_Apps",5,1,98
"Digital_Music_Purchase",5,1,17
"Musical Instruments",5,1,16
"PC",1,1,64
"Toys",5,1,57
"Mobile_Apps",5,1,25
"PC",5,1,11
"PC",5,1,106

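Parentheses matter because, without them, operator precedence decides how the conditions group. The following sketch uses `awk`, where `&&` binds more tightly than `||`; whether `aq_pp` applies the same precedence is an assumption to check against its documentation, so explicit parentheses are the safer choice. The rows are made up and use a `star_rating,verified_purchase,helpful_vote` layout.

```shell
# Hypothetical rows: star_rating,verified_purchase,helpful_vote
rows='1,1,12
1,0,3
5,0,30
5,1,2'

# Grouped as intended: (1 or 5 stars) AND (verified AND at least 10 votes).
grouped=$(printf '%s\n' "$rows" | awk -F, '($1 == 1 || $1 == 5) && ($2 == 1 && $3 >= 10)')

# Without parentheses awk reads this as: 1 star OR (5 stars AND verified AND at least 10 votes).
ungrouped=$(printf '%s\n' "$rows" | awk -F, '$1 == 1 || $1 == 5 && $2 == 1 && $3 >= 10')

echo "grouped:";   printf '%s\n' "$grouped"
echo "ungrouped:"; printf '%s\n' "$ungrouped"
```

The ungrouped form additionally lets through `1,0,3`, an unverified 1-star review with few votes.
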

<a id='string_cmp'></a>
### String Comparison

**Note:** This section requires a little knowledge of string manipulation with `aq_pp` and of the `aq-emod` builtin functions.

The `-filt` option can also filter records based on conditions on string columns and constants. There are 2 main ways of selecting records by string columns: one is to use operators dedicated to string comparison, the other is to use the [`aq-emod`](#emod) builtin functions.

<a id='str_ops'></a>
**Operators**<br>

Below are some of the operators that can be applied to string values.
* `~==`, `~>`, `~<`, `~>=`, `~<=`: case-insensitive comparison of the LHS (Left Hand Side) and RHS (Right Hand Side); string type only.
* `==`, `>`, `<`, `>=`, `<=`: comparison of the LHS and RHS; these can be applied to strings as well.

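As a rough picture of the difference between `==` and `~==`, the sketch below emulates both with `awk` on made-up marketplace codes. Lower-casing both sides before comparing is one common way to get a case-insensitive match; it is not necessarily how `aq_pp` implements `~==`.

```shell
# Hypothetical marketplace codes with mixed casing.
codes='JP
jp
Jp
US'

# Case-sensitive equality: only the exact spelling matches.
exact=$(printf '%s\n' "$codes" | awk '$1 == "JP"')
# Case-insensitive equality, emulated by lower-casing both sides.
nocase=$(printf '%s\n' "$codes" | awk 'tolower($1) == tolower("JP")')

echo "exact:";  printf '%s\n' "$exact"
echo "nocase:"; printf '%s\n' "$nocase"
```
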
We'll go over some samples using the same amazon review dataset.

_Filtering by Marketplace_<br>
Note that any string constant within a `FilterSpec` needs to be quoted as well. In the case below, double quotes are used for the string constant, while single quotes are used for the `FilterSpec` as a whole.

In [7]:
aq_pp -f,+1 $file -d $cols -filt 'MarketPlace == "JP"' | head -n 11

"MarketPlace","product_category","product_title","star_rating","verified_purchase","helpful_vote"
"JP","Toys","トランスフォーマーGo! G13 ハンターショックウェーブ",3,1,1
"JP","PC","インテル Celeron G1620 (Ivy Bridge 2.70GHz) LGA1155 BX80637G1620",5,0,1
"JP","PC","PhotoFast MS ProDuoデュアルアダプター CR-5400",3,1,0
"JP","PC","EIZO FORIS 23.0インチ TFTモニタ 1920x1080 DVI-D24ピンx1 D-Sub15ピンx1 HDMIx2 ブラック FS2333",5,1,4
"JP","PC","【Amazon.co.jp限定】TDK 録画用ブルーレイディスク BD-R 25GB 1-4倍速 ホワイトワイドプリンタブル 50枚スピンドル BRV25PWB50PK",5,1,0
"JP","PC","Belkin (Kindle Fire HD(2012年モデル)専用) クラシック ケース/カバー ブラック",5,1,1
"JP","Toys","トランスフォーマープライム AM-05 メガトロン",4,1,0
"JP","Toys","キネティックサンド　1ｋｇ",5,1,5
"JP","Mobile_Apps","OfficeSuite Professional",1,1,1
"JP","Toys","レゴ (LEGO) スター・ウォーズ ミレニアム・ファルコン 7965",4,1,0

**Alphabetical / Dictionary ordering comparison**<br>
The ordering operators can be used to compare strings by their alphabetical (dictionary) position.

In [16]:
aq_pp -f,+1 $file -d $cols -filt 'MarketPlace >= "U"' | head -n 20

"MarketPlace","product_category","product_title","star_rating","verified_purchase","helpful_vote"
"UK","PC","Amazon Zip Sleeve for 7-Inch Tablets",4,1,0
"UK","Mobile_Apps","Facebook",3,1,0
"US","Digital_Music_Purchase","Hail to the King",5,1,0
"US","Musical Instruments","BEHRINGER TU300",5,1,0
"US","Digital_Music_Purchase","Rhythm & Blues",3,1,0
"US","Mobile_Apps","Candy Crush Saga",5,1,1
"US","Digital_Music_Purchase","x (Deluxe Edition)",5,1,0
"US","Mobile_Apps","Buttons and Scissors (Pro)",4,1,0
"US","Mobile_Apps","Handy Photo",5,1,0
"US","Mobile_Apps","MY LITTLE PONY - Friendship is Magic",1,1,0
"US","Mobile_Apps","The Room Two (Kindle Tablet Edition)",5,1,0
"US","Toys","LEGO Star Wars Death Star (10188) (Discontinued by manufacturer)",5,1,0
"US","PC","Marware Axis Genuine Leather Rotating, Standing Case for Kindle Fire HD 7"" (only fits Kindle Fire HD 7"")",4,1,0
...

We've filtered out everything except "UK" and "US". 
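
Dictionary ordering compares strings character by character, and a string sorts after any of its own prefixes, which is why both `"UK"` and `"US"` satisfy `>= "U"`. Below is a small `awk` sketch of the same comparison on the five marketplace codes. Byte-wise ordering is forced with `LC_ALL=C`, which is an assumption about how the comparison is carried out, not a documented `aq_pp` behavior.

```shell
# The five marketplace codes listed in the column details.
markets='DE
FR
JP
UK
US'

# String comparison against the constant "U"; DE, FR and JP sort before it.
kept=$(printf '%s\n' "$markets" | LC_ALL=C awk '$1 >= "U"')
printf '%s\n' "$kept"
```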

**Multiple Filtering Conditions**<br>
We can also combine several `FilterSpec`s to apply more complex conditional statements, just like with numeric columns.
For example, we can extract the records from the Japanese marketplace with a product category of Toys.

In [19]:
aq_pp -f,+1 $file -d $cols -filt 'MarketPlace == "JP" && product_category == "Toys"' 

"MarketPlace","product_category","product_title","star_rating","verified_purchase","helpful_vote"
"JP","Toys","トランスフォーマーGo! G13 ハンターショックウェーブ",3,1,1
"JP","Toys","トランスフォーマープライム AM-05 メガトロン",4,1,0
"JP","Toys","キネティックサンド　1ｋｇ",5,1,5
"JP","Toys","レゴ (LEGO) スター・ウォーズ ミレニアム・ファルコン 7965",4,1,0
"JP","Toys","S.H.フィギュアーツ  スーパーサイヤ人孫悟空",5,1,2
"JP","Toys","figma METROID Other M  サムス・アラン(ABS&PVC製塗装済み可動フィギュア)",5,1,0
"JP","Toys","モンスター・ハイ 恐怖の都スカリシリーズ スカリジェンヌ オープンカー (Y4307)",4,1,0
"JP","Toys","レゴ (LEGO) クリエイター・シーサイドハウス 7346",5,1,0
"JP","Toys","RG 1/144 ZGMF-X10A フリーダムガンダム (機動戦士ガンダムSEED)",4,0,2
"JP","Toys","ラングスジャパン(RANGS) 室内用お砂遊び キネティックサンド 1kg",2,1,3
"JP","Toys","レゴ (LEGO) ムービー ゲッタウェイ・グライダー 70800",5,0,1


<a id='emod'></a>
**aq-emod**<br>

Here are some simple examples of applying `-filt` with builtin functions. Keep in mind that these are just examples to get you familiar with the idea; much more complex and advanced string filtering is possible.

`SLeng(val)`<br>
For the first example, let's say that you'd like to collect only the records with a relatively short product title, say less than 10 characters long. To get the length of a string, we can use a builtin function called `SLeng(val)`, where `val` is a string constant or a string column name. This function returns the length of the given string as an integer, which we can then compare against 10. Let's take a look.

In [18]:
aq_pp -f,+1 $file -d $cols -filt 'SLeng(product_title) < 10' | head -n 20

"MarketPlace","product_category","product_title","star_rating","verified_purchase","helpful_vote"
"UK","Mobile_Apps","Facebook",3,1,0
"US","Mobile_Apps","Quell",5,1,0
"US","Digital_Music_Purchase","Tapestry",5,1,0
"US","Mobile_Apps","Minecraft",5,1,0
"UK","Digital_Music_Purchase","Powerage",5,1,0
"DE","Mobile_Apps","Solitär",5,1,1
"US","Mobile_Apps","Quell",5,0,3
"US","Digital_Music_Purchase","Oceania",3,1,0
"US","Digital_Music_Purchase","True",5,1,0
"US","Mobile_Apps","Minecraft",5,1,0
"US","Digital_Music_Purchase","Duets II",5,0,0
"US","Mobile_Apps","Bible",1,1,11
"US","Digital_Music_Purchase","Magic",5,1,1
"US","Digital_Music_Purchase","YES!",5,0,0
"US","Mobile_Apps","BADLAND",4,1,4
"UK","Digital_Music_Purchase","Angles",4,0,1
"US","Mobile_Apps","My Horse",5,1,0
"US","Toys","Blokus",5,1,1
"US","Digital_Music_Purchase","Nikki",5,1,17

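The same length-based selection can be sketched outside `aq_pp` with `awk`'s `length()` function. The titles below are a few product titles from this dataset; the sketch sticks to ASCII titles, since how multi-byte characters are counted may differ between tools.

```shell
# A few ASCII product titles from the dataset, one per line.
titles='Facebook
Crossy Road
Minecraft
Candy Crush Saga'

# Keep only the titles shorter than 10 characters.
short=$(printf '%s\n' "$titles" | awk 'length($0) < 10')
printf '%s\n' "$short"
```

`Facebook` (8 characters) and `Minecraft` (9 characters) pass; the other two titles are too long.
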

`RxCmp(Val, Pattern)`<br>
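Given `Val` (a string constant / column) and `Pattern` (a regex string), this returns 1 if there is a match in the `Val` string, and 0 otherwise.

We can use this to select the records that do not have any alphabetic characters in their product_title. The regex used to match alphabetic characters is `"[A-z]+"`, which is provided to `RxCmp()` (in ASCII this range is slightly wider than `[A-Za-z]`, as it also covers the few punctuation characters between `Z` and `a`). The function returns 0 when there is NO match in product_title, so we compare its result with 0. The cell below also passes a third argument, `pcre`; see the documentation for the attributes `RxCmp` accepts.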

In [43]:
aq_pp -f,+1 $file -d $cols -filt 'RxCmp(product_title, "[A-z]+", pcre) == 0' -c MarketPlace product_title

"MarketPlace","product_title"
"US","10000000"
"JP","キネティックサンド　1ｋｇ"
"UK","21"
"JP","簡単ボイスレコーダー"


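To see how `[A-z]` differs from `[A-Za-z]`: in ASCII the former also spans `[`, `\`, `]`, `^`, `_` and the backtick, so a title made only of such characters counts as containing a match. The sketch below shows the difference using `awk` under the C locale on made-up titles; it does not call `RxCmp` itself.

```shell
# Hypothetical titles; the last one consists of a single underscore.
titles='10000000
Crossy Road
21
_'

# Titles with no match for the loose range: the underscore title is treated as a match and dropped.
loose=$(printf '%s\n' "$titles" | LC_ALL=C awk '!/[A-z]/')
# Titles with no ASCII letters at all: the underscore title is kept.
strict=$(printf '%s\n' "$titles" | LC_ALL=C awk '!/[A-Za-z]/')

echo "loose:";  printf '%s\n' "$loose"
echo "strict:"; printf '%s\n' "$strict"
```
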
<a id='rowNum'></a>
### RowNum

Using the `-filt` option and the [builtin variable](http://auriq.com/documentation/source/reference/manpages/aq_pp.html#builtin) `$rowNum`, we can process only a certain number of records. For example, let's say we'd like to display the first 5 records only. In the outputs below, `$rowNum` is 1 for the first data record, so `$rowNum < 6` keeps records 1 through 5.

In [45]:
aq_pp -f,+1 $file -d $cols -filt '$rowNum < 6' -c MarketPlace product_title

"MarketPlace","product_title"
"UK","Amazon Zip Sleeve for 7-Inch Tablets"
"US","Shine Runner"
"US","Subway Surfers"
"US","Temple Run 2"
"US","101-in-1 Games"


You can also display only the even-numbered records. Let's also add a column called `Index` to make the row number visible, and provide an arithmetic condition as the filter expression.

In [19]:
aq_pp -f,+1 $file -d $cols -eval 'I:Index' '$rowNum' -filt '$rowNum % 2 == 0' -c Index MarketPlace product_title | \
head -n 20

"Index","MarketPlace","product_title"
2,"UK","Facebook"
4,"US","BEHRINGER TU300"
6,"US","Candy Crush Saga"
8,"US","Buttons and Scissors (Pro)"
10,"DE","STAR WARS Unisex Wecker Analog Kunststoff weiß STAR3"
12,"JP","トランスフォーマーGo! G13 ハンターショックウェーブ"
14,"US","LEGO Star Wars Death Star (10188) (Discontinued by manufacturer)"
16,"US","The Idler Wheel Is Wiser Than the Driver of the Screw and Whipping Cords Will Serve You More Than Ropes Will Ever Do"
18,"US","Simply The Best"
20,"DE","Pegasus Spiele 54541G - Camel Up, Spiel des Jahres 2014"
22,"US","Music From Another Dimension!"
24,"US","Bloodstone & Diamonds"
26,"UK","Beyblades #BB118 Japanese Metal Fusion Phantom Orion Starter Set(Discontinued by manufacturer)"
28,"US","Wolfram|Alpha"
30,"DE","CASIO FX-5800P programmierbarer technisch-wissenschaftlicher Rechner, 4-zeilige Anzeige"
32,"UK","Orchard Toys Bus Stop"
34,"UK","Presence [Explicit]"
36,"US","Fruit Ninja Free"
38,"UK","Intex Wetset Summer Colours Swim Centre 73 x 71 Inch Pool"

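For comparison, `awk` exposes a similar 1-based record counter, `NR`. The sketch below applies the same modulo test to a few made-up records; treating `NR` as an analogue of `$rowNum` is an illustration only, not a statement about how `aq_pp` numbers rows internally.

```shell
# Hypothetical records, one per line.
records='first
second
third
fourth
fifth
sixth'

# Keep the even-numbered records and prefix each with its record number.
even=$(printf '%s\n' "$records" | awk 'NR % 2 == 0 { print NR "," $0 }')
printf '%s\n' "$even"
```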