# aq_output 

This notebook goes over aq_output's options and its sample usages. 
Based on AQ Tools version: 2.0.1-1.

Through this option, users can specify the destination, format, and output behavior of aq_tools data. 

The output spec is composed of two options: `-o`, optionally followed by `-c`. Each can also be used on its own.
* `-c`: specifies which columns to output, as well as their output format
* `-o`: specifies the output's destination and behaviors 


## Prerequisites
Readers are assumed to be equipped with knowledge of
- bash commands
- input, column and output spec for aq_tools

Keep the [aq-output](http://www.auriq.com/documentation/source/reference/manpages/aq-output.html) and [aq_pp](http://www.auriq.com/documentation/source/reference/manpages/aq_pp.html) documentation open for reference, or run `man aq-output` to see the manpage for `aq-output`'s syntax and options.

Below is the basic structure and syntax of `aq-output`, along with its functionality. If you're already familiar with it, skip to the [data](#data) section.

## Syntax

The `-o` option tells aq_commands where and how to write the output, while `-c` specifies which columns to output and how to format them. Both are part of the `output spec` in aq_commands, as shown below.

```bash
# basic structure of aq_pp command

aq_pp Global_Opt Input_Spec Prep_Spec Process_Spec Output_Spec

```
The `-o` option can be followed by the `-c` option, which specifies the columns to output. 

```bash
aq_command .. -o[,ArtLst] File -c ColSpec [ColSpec ...] 
```
### `-o`

**Attributes**<br>
where
- `[,ArtLst]`: list of _attributes_, separated by commas. Users can provide zero or more attributes in one command.<br>
  **File Format**
    - `csv`, `tsv`: comma or tab separated file
    - `jsn`: JSON object format
    
  **Behavior**
    - `sep`: specify your own separator, besides comma and tab
    - `notitle`: tells aq_commands to skip the first line (header) of the output and write only the data records
    - `app`: append the output to a file instead of overwriting it.
    
    
- `File`: Specifies _destination_ of data, such as filename, stream or pipe
    - regular file `fileName`
    - stream to stdout `-`
    - stream to named pipe `pipeName`
        - [pipe](https://thoughtbot.com/blog/input-output-redirection-in-the-shell)

These are just a few of the destinations, and the ones we'll go over here. More options are described in the [official documentation](http://www.auriq.com/documentation/source/reference/manpages/aq-output.html). 

###  `-c`

The `-c` option takes one or more `ColSpec`, each of which looks like this.<br>

`ColName[:ColLabel][,n=[-]Width][+NumPrintFormat]`<br>
where
- `:ColLabel` - alternative column label for output
- `n=[-]Width` - integer that specifies the width of the output, counted from the left; the value is padded with white space if it is shorter than `Width`. With a leading `-`, as in `n=-5`, only 5 characters are output, counted from the right.  
- `+NumPrintFormat` - specifies the output format of a numerical column, similar to the precision formatting of `printf` in C. 

For more detailed specifications of the `-c` option, run `man aq-output` in any cell.



<a id='data'></a>
## Data

We'll be using [Ramen Ratings Dataset](https://www.kaggle.com/residentmario/ramen-ratings) from kaggle for this notebook, which looks like below.

Review|Brand|Variety|Style|Country|Stars
---|---|---|---|---|---
2580|New Touch|T's Restaurant Tantanmen|Cup|Japan|3.75
2579|Just Way|Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles|Pack|Taiwan|1
2578|Nissin|Cup Noodles Chicken Vegetable|Cup|USA|2.25
2577|Wei Lih|GGE Ramen Snack Tomato Flavor|Pack|Taiwan|2.75
2576|Ching's Secret|Singapore Curry|Pack|India|3.75
2575|Samyang Foods|Kimchi song Song Ramen|Pack|South Korea|4.75
2574|Acecook|Spice Deli Tantan Men With Cilantro|Cup|Japan|4
2573|Ikeda Shoku|Nabeyaki Kitsune Udon|Tray|Japan|3.75
2572|Ripe'n'Dry|Hokkaido Soy Sauce Ramen|Pack|Japan|0.25
2571|KOKA|The Original Spicy Stir-Fried Noodles|Pack|Singapore|2.5

The columns and their corresponding data types are as follows.
- `int: Review #`: review ID number; the more recent the review, the larger the number
- `str: Brand`: brand / manufacturer of the product
- `str: Variety`: title of the product
- `str: Style`: categorical style of the product: cup, pack or tray
- `str: Country`: country of origin
- `float: Stars`: star rating of each product

## Table of Samples
### -c Option (Under construction)
- [Selected Columns]
- [Single Attributes]
- [Multiple Attributes]

### -o Option
**Output File Format**<br>
- [Builtin File Format](#builtin_ff)
- [Arbitrary Column Separators](#col_separator)

#### Destination
- [Regular File](#reg_file)
- [Named Pipe](#named_pipe)
- [Pipe](#pipe)

#### Behavior
- [notitle](#notitle)
- [app](#app)

**Note:**<br>
For simplicity, we'll be using [bash parameter substitution](https://www.cyberciti.biz/tips/bash-shell-parameter-substitution-2.html) for the filename and column specs, stored in `file` and `cols` respectively.
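
The commands in this notebook leave `$cols` unquoted on purpose: the shell splits an unquoted variable on whitespace, so each column spec reaches the command as a separate argument. The sketch below shows the difference using only `printf` (no aq_tools needed); the square brackets are purely illustrative.

```bash
# Column specs stored in one variable, as in this notebook
cols="i:reviewID s:brand s:variety s:style s:country f:stars"

# Unquoted: the shell splits the value on whitespace into 6 separate arguments,
# and printf repeats its format once per argument
unquoted=$(printf '[%s]\n' $cols)

# Quoted: the whole string is passed as one single argument
quoted=$(printf '[%s]\n' "$cols")

echo "$unquoted"
echo "$quoted"
```

The first `echo` prints six bracketed specs, one per line, and the second prints a single bracketed line. A filename that contains spaces would need the opposite treatment, that is, quoting as `"$file"`.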


Now that's out of the way, let's get started!

By default, without any output options, data is written to standard output as a comma-separated stream.

In [5]:
# First store filename and column spec in variable to simplify commands
file="data/aq_pp/ramen-ratings-part.csv"
cols="i:reviewID s:brand s:variety s:style s:country f:stars"

aq_pp -f,+1 $file -d $cols 

"reviewID","brand","variety","style","country","stars"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25


## `-c` (Under construction)

<a id='selected_col'></a>
### Selected Columns
Let's output only the columns we want. We just have to provide the column names, like below.
Note that there's **no need to provide data types for the output column spec**.

In [7]:
aq_pp -f,+1 $file -d $cols -c brand country stars

"brand","country","stars"
"New Touch","Japan",3.75
"Just Way","Taiwan",1
"Nissin","USA",2.25
"Wei Lih","Taiwan",2.75
"Ching's Secret","India",3.75
"Samyang Foods","South Korea",4.75
"Acecook","Japan",4
"Ikeda Shoku","Japan",3.75
"Ripe'n'Dry","Japan",0.25


<a id='alt_col_lab'></a>
### Alternative Column Labels

We can have the command output columns under alternative labels, specified as `ColName:NewColName`. Let's rename a few columns, namely `variety` and `stars`.

In [8]:
aq_pp -f,+1 $file -d $cols -c brand variety:product_name stars:Review

"brand","product_name","Review"
"New Touch","T's Restaurant Tantanmen ",3.75
"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles",1
"Nissin","Cup Noodles Chicken Vegetable",2.25
"Wei Lih","GGE Ramen Snack Tomato Flavor",2.75
"Ching's Secret","Singapore Curry",3.75
"Samyang Foods","Kimchi song Song Ramen",4.75
"Acecook","Spice Deli Tantan Men With Cilantro",4
"Ikeda Shoku","Nabeyaki Kitsune Udon",3.75
"Ripe'n'Dry","Hokkaido Soy Sauce Ramen",0.25


### Width
This option lets users specify the number of characters (in bytes) to output per column. The syntax is `colName,n=Width`, which keeps `Width` bytes counted from the left. In the example below, `n=7` yields five characters of each value plus the two surrounding quotes. Let's take a look.

In [30]:
aq_pp -f,+1 $file -d $cols -c variety,n=7

"varie"
"T's R"
"Noodl"
"Cup N"
"GGE R"
"Singa"
"Kimch"
"Spice"
"Nabey"
"Hokka"


**`-` option**<br>
You can specify the width from right side using `-` option.

In [29]:
aq_pp -f,+1 $file -d $cols -c variety,n=-7

"riety"
"nmen "
"odles"
"table"
"lavor"
"Curry"
"Ramen"
"antro"
" Udon"
"Ramen"


### PrintFormat
Using this, we can specify the format of numerical column output. 
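
As noted in the syntax section, `+NumPrintFormat` is similar to C's `printf` formatting, so the shell's own `printf` is a convenient way to preview what a given precision or padding does to a number. This sketch only illustrates C-style format strings; it does not show the aq_tools syntax itself, for which `man aq-output` remains the reference.

```bash
# Force the C locale so the decimal separator is always '.'
export LC_ALL=C

# Two digits after the decimal point
two_decimals=$(printf '%.2f' 2.5)

# Minimum width 6, one decimal, right-aligned and padded with spaces
padded=$(printf '%6.1f' 4)

# Same width and precision, padded with zeros instead
zero_padded=$(printf '%06.1f' 4)

printf '%s\n' "$two_decimals" "[$padded]" "$zero_padded"
```

This prints `2.50`, `[   4.0]` and `0004.0`.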

## `-o`
## Output File Format

<a id='builtin_ff'></a>
### Builtin File Format
Let's say you would like to output the data in a different format. 

**TSV**<br>
The options used in the command below are

- `-o,tsv`: **Attribute:** `tsv` tells it to output in tsv format.
- `-`: **FileName:** tells the command to output to stdout. (We use the `-` character for demonstration purposes, but feel free to use any filename to write to a file instead!)

In [2]:
aq_pp -f,+1 $file -d $cols -o,tsv -

reviewID	brand	variety	style	country	stars
2580	New Touch	T's Restaurant Tantanmen 	Cup	Japan	3.75
2579	Just Way	Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles	Pack	Taiwan	1
2578	Nissin	Cup Noodles Chicken Vegetable	Cup	USA	2.25
2577	Wei Lih	GGE Ramen Snack Tomato Flavor	Pack	Taiwan	2.75
2576	Ching's Secret	Singapore Curry	Pack	India	3.75
2575	Samyang Foods	Kimchi song Song Ramen	Pack	South Korea	4.75
2574	Acecook	Spice Deli Tantan Men With Cilantro	Cup	Japan	4
2573	Ikeda Shoku	Nabeyaki Kitsune Udon	Tray	Japan	3.75
2572	Ripe'n'Dry	Hokkaido Soy Sauce Ramen	Pack	Japan	0.25


You can see that the columns are separated by tab, instead of comma like the original file.

You can also output it in JSON format by giving the `jsn` attribute.

In [3]:
aq_pp -f,+1 $file -d $cols -o,jsn -

{"reviewID":2580,"brand":"New Touch","variety":"T's Restaurant Tantanmen ","style":"Cup","country":"Japan","stars":3.75}
{"reviewID":2579,"brand":"Just Way","variety":"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","style":"Pack","country":"Taiwan","stars":1}
{"reviewID":2578,"brand":"Nissin","variety":"Cup Noodles Chicken Vegetable","style":"Cup","country":"USA","stars":2.25}
{"reviewID":2577,"brand":"Wei Lih","variety":"GGE Ramen Snack Tomato Flavor","style":"Pack","country":"Taiwan","stars":2.75}
{"reviewID":2576,"brand":"Ching's Secret","variety":"Singapore Curry","style":"Pack","country":"India","stars":3.75}
{"reviewID":2575,"brand":"Samyang Foods","variety":"Kimchi song Song Ramen","style":"Pack","country":"South Korea","stars":4.75}
{"reviewID":2574,"brand":"Acecook","variety":"Spice Deli Tantan Men With Cilantro","style":"Cup","country":"Japan","stars":4}
{"reviewID":2573,"brand":"Ikeda Shoku","variety":"Nabeyaki Kitsune Udon","style":"Tray","country":"Japan","st

<a id='col_separator'></a>
### Arbitrary Column Separators

Besides the file type options above, the output spec lets you specify your own separator. Let's say we'd like to separate the columns with the `|` character. This can be achieved by passing the character to the `sep` attribute, like below.

In [7]:
aq_pp -f,+1 $file -d $cols -o,sep="|" -

reviewID|brand|variety|style|country|stars
2580|New Touch|T's Restaurant Tantanmen |Cup|Japan|3.75
2579|Just Way|Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles|Pack|Taiwan|1
2578|Nissin|Cup Noodles Chicken Vegetable|Cup|USA|2.25
2577|Wei Lih|GGE Ramen Snack Tomato Flavor|Pack|Taiwan|2.75
2576|Ching's Secret|Singapore Curry|Pack|India|3.75
2575|Samyang Foods|Kimchi song Song Ramen|Pack|South Korea|4.75
2574|Acecook|Spice Deli Tantan Men With Cilantro|Cup|Japan|4
2573|Ikeda Shoku|Nabeyaki Kitsune Udon|Tray|Japan|3.75
2572|Ripe'n'Dry|Hokkaido Soy Sauce Ramen|Pack|Japan|0.25


You can use any character, **as long as it is a single-byte character**. As another example, using `%`, we get

In [8]:
aq_pp -f,+1 $file -d $cols -o,sep="%" -

reviewID%brand%variety%style%country%stars
2580%New Touch%T's Restaurant Tantanmen %Cup%Japan%3.75
2579%Just Way%Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles%Pack%Taiwan%1
2578%Nissin%Cup Noodles Chicken Vegetable%Cup%USA%2.25
2577%Wei Lih%GGE Ramen Snack Tomato Flavor%Pack%Taiwan%2.75
2576%Ching's Secret%Singapore Curry%Pack%India%3.75
2575%Samyang Foods%Kimchi song Song Ramen%Pack%South Korea%4.75
2574%Acecook%Spice Deli Tantan Men With Cilantro%Cup%Japan%4
2573%Ikeda Shoku%Nabeyaki Kitsune Udon%Tray%Japan%3.75
2572%Ripe'n'Dry%Hokkaido Soy Sauce Ramen%Pack%Japan%0.25


Feel free to try with the separator of your choice.

In [None]:
aq_pp -f,+1 $file -d $cols -o,sep="" -
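
One practical use of a custom separator is that downstream tools can split fields on it directly. The following is a minimal sketch with the standard `cut` command, where a here-document stands in for the `sep="|"` output shown earlier (its rows are copied from that output); no aq_tools call is made.

```bash
# Stand-in for the output of: aq_pp ... -o,sep="|" -
piped=$(cat <<'EOF'
reviewID|brand|variety|style|country|stars
2579|Just Way|Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles|Pack|Taiwan|1
2578|Nissin|Cup Noodles Chicken Vegetable|Cup|USA|2.25
EOF
)

# Keep only the brand (field 2) and country (field 5) columns
brand_country=$(printf '%s\n' "$piped" | cut -d'|' -f2,5)
printf '%s\n' "$brand_country"
```

This prints `brand|country`, `Just Way|Taiwan` and `Nissin|USA`, one per line.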

## Destination

Instead of writing to standard output with the `-` character, we can also send data to other destinations. In this notebook we'll take a look at 3 options.
- regular file
- named pipe
- [pipe](https://thoughtbot.com/blog/input-output-redirection-in-the-shell)

<a id='reg_file'></a>
### Regular File

Provide the output filename instead of the `-` character. 
Let's start with simply outputting the ramen dataset to a local file called `result.csv`.

All of the output files from this section will be saved under `outputs/` directory.

In [13]:
# look around the outputs/ directory; if result.csv already exists, you can delete it with rm.
ls outputs/

append.txt  overwrite.txt  result.txt


In [5]:
# run this command to output to the file
aq_pp -f,+1 $file -d $cols -o outputs/result.csv

That should have written the output to the file. Using the `cat` command, we can look inside the file.

In [None]:
cat outputs/result.csv

You can specify any attribute when writing to a file as well. For instance, we can save the output as a .json file.

In [12]:
aq_pp -f,+1 $file -d $cols -o,jsn outputs/result.json
cat outputs/result.json

{"reviewID":2580,"brand":"New Touch","variety":"T's Restaurant Tantanmen ","style":"Cup","country":"Japan","stars":3.75}
{"reviewID":2579,"brand":"Just Way","variety":"Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","style":"Pack","country":"Taiwan","stars":1}
{"reviewID":2578,"brand":"Nissin","variety":"Cup Noodles Chicken Vegetable","style":"Cup","country":"USA","stars":2.25}
{"reviewID":2577,"brand":"Wei Lih","variety":"GGE Ramen Snack Tomato Flavor","style":"Pack","country":"Taiwan","stars":2.75}
{"reviewID":2576,"brand":"Ching's Secret","variety":"Singapore Curry","style":"Pack","country":"India","stars":3.75}
{"reviewID":2575,"brand":"Samyang Foods","variety":"Kimchi song Song Ramen","style":"Pack","country":"South Korea","stars":4.75}
{"reviewID":2574,"brand":"Acecook","variety":"Spice Deli Tantan Men With Cilantro","style":"Cup","country":"Japan","stars":4}
{"reviewID":2573,"brand":"Ikeda Shoku","variety":"Nabeyaki Kitsune Udon","style":"Tray","country":"Japan","st

**Arbitrary Separators for output**<br>
You can also use any character as a column separator. Try a character of your choice by changing the string value of the `sep=` attribute.

In [14]:
aq_pp -f,+1 $file -d $cols -o,sep="^" outputs/result.txt
cat outputs/result.txt

reviewID^brand^variety^style^country^stars
2580^New Touch^T's Restaurant Tantanmen ^Cup^Japan^3.75
2579^Just Way^Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles^Pack^Taiwan^1
2578^Nissin^Cup Noodles Chicken Vegetable^Cup^USA^2.25
2577^Wei Lih^GGE Ramen Snack Tomato Flavor^Pack^Taiwan^2.75
2576^Ching's Secret^Singapore Curry^Pack^India^3.75
2575^Samyang Foods^Kimchi song Song Ramen^Pack^South Korea^4.75
2574^Acecook^Spice Deli Tantan Men With Cilantro^Cup^Japan^4
2573^Ikeda Shoku^Nabeyaki Kitsune Udon^Tray^Japan^3.75
2572^Ripe'n'Dry^Hokkaido Soy Sauce Ramen^Pack^Japan^0.25


<a id='named_pipe'></a>
### Named Pipe

Data can also be written to a named pipe and processed in another bash session. 

To output to a named pipe, first create the pipe, then provide `fifo@PipeName` as the `File` argument to the `-o` option, where `PipeName` is the named pipe's path. 
The program also creates the pipe if it does not exist. 

Named pipes do not work in the Jupyter notebook, so we'll just demonstrate them in the blocks below, as TERMINAL 1 and TERMINAL 2.

```bash
# TERMINAL 1
# make the pipe 

mkfifo cigar
```

```bash
# TERMINAL 2
# outputting to the pipe named cigar
aq_pp -f,+1 $file -d $cols -o fifo@cigar
```
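
A process that opens a named pipe for writing normally waits until another process opens it for reading, so the other half of the demonstration is a reader. The sketch below runs both ends in a single script, with `printf` standing in for `aq_pp ... -o fifo@cigar` and `cat` acting as the reader; the scratch directory and the two sample rows are illustrative.

```bash
# Create the pipe in a scratch directory so nothing is left behind
workdir=$(mktemp -d)
pipe="$workdir/cigar"
mkfifo "$pipe"

# Writer (stand-in for: aq_pp ... -o fifo@cigar), run in the background
# because opening the pipe for writing waits for a reader
printf '%s\n' 'reviewID,brand' '2580,New Touch' > "$pipe" &

# Reader: what the other terminal would run against the same pipe
received=$(cat "$pipe")
wait

printf '%s\n' "$received"
rm -r "$workdir"
```

With two real terminals, the reader would simply be `cat cigar` (or any other command that reads the pipe) running in the terminal that is not writing.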

<a id='pipe'></a>
### Pipe

A pipe (unnamed pipe) is not part of the aq-output options, yet it is extremely useful for passing the output of one command to another. 

For more details on linux pipe and redirection, I found [this post](https://thoughtbot.com/blog/input-output-redirection-in-the-shell) very useful.

In this example, we'll pipe the output of the `aq_pp` command into the [`head`](https://www.geeksforgeeks.org/head-command-linux-examples/) command, which prints the first 5 lines, but feel free to pipe the output into other commands as well.

In [3]:
aq_pp -f,+1 $file -d $cols | head -n 5

"reviewID","brand","variety","style","country","stars"
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75


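Next, let's pipe the output to the `sort` command. `sort` compares whole lines as text, so the records come out ordered by the leading reviewID, and the header line is sorted along with them (here it lands at the bottom).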

In [17]:
aq_pp -f,+1 $file -d $cols -o - | sort -

2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1
2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75
"reviewID","brand","variety","style","country","stars"


When piping with the `-o` option, just make sure to specify stdout (`-`) as `FileName` so that the next command can take its input from stdout.
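
To keep the header row on top and sort the remaining rows numerically on the first field, one option is to read the header separately before handing the rest to `sort`. In this sketch the here-document stands in for `aq_pp ... -o -`, and its rows are a shortened subset of the sample data.

```bash
# Stand-in for the CSV stream produced by: aq_pp ... -o -
sample=$(cat <<'EOF'
"reviewID","brand","stars"
2580,"New Touch",3.75
2574,"Acecook",4
2578,"Nissin",2.25
EOF
)

# Read and print the header first, then sort the remaining lines
# numerically (-n) on the first comma-separated field (-t, -k1,1)
sorted=$(printf '%s\n' "$sample" | {
  IFS= read -r header
  printf '%s\n' "$header"
  sort -t, -k1,1n
})

printf '%s\n' "$sorted"
```

The header stays on the first line, followed by reviews 2574, 2578 and 2580. Another route is to drop the header at the source with the `notitle` attribute covered in the next section and pipe straight into `sort -t, -k1,1n`.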

## Behavior
Through attributes, we can change and specify the outputting behavior. We'll go over just 2 of them briefly, which are

- `notitle`: skip the header
- `app`: append on file instead of overwriting

<a id='notitle'></a>
### notitle

Let's see `notitle` in action, outputting the ramen data to stdout.

In [24]:
aq_pp -f,+1 $file -d $cols -o,notitle -

2580,"New Touch","T's Restaurant Tantanmen ","Cup","Japan",3.75
2579,"Just Way","Noodles Spicy Hot Sesame Spicy Hot Sesame Guan-miao Noodles","Pack","Taiwan",1
2578,"Nissin","Cup Noodles Chicken Vegetable","Cup","USA",2.25
2577,"Wei Lih","GGE Ramen Snack Tomato Flavor","Pack","Taiwan",2.75
2576,"Ching's Secret","Singapore Curry","Pack","India",3.75
2575,"Samyang Foods","Kimchi song Song Ramen","Pack","South Korea",4.75
2574,"Acecook","Spice Deli Tantan Men With Cilantro","Cup","Japan",4
2573,"Ikeda Shoku","Nabeyaki Kitsune Udon","Tray","Japan",3.75
2572,"Ripe'n'Dry","Hokkaido Soy Sauce Ramen","Pack","Japan",0.25


As you can see, the column names are not displayed (skipped) with this option.

<a id='app'></a>
### app

The next option appends output to a file instead of overwriting it. To understand this better, we'll write to a file with and without the `app` option.

**Without `app` option**<br>

`overwrite.txt` contains some text. Here is what it looks like initially.

In [18]:
# content of the file
cat outputs/overwrite.txt

Roses are red, violets are blue

Unexpected ‘{‘ on line 32


It contains a short piece of text. Now let's write to it and see what happens.

Run the following cell to write to the file, then print the result with the `cat` command.

In [None]:
# now overwrite the file
aq_pp -f,+1 $file -d $cols -o outputs/overwrite.txt

# check the result
cat outputs/overwrite.txt

You can see that the file is completely overwritten, and the original content is gone.
This is the default behavior of output spec. 

**With `app`**

Now we have another file with the same content, named `append.txt`. 

In [15]:
# first the original content of the file
cat outputs/append.txt

Roses are red, violets are blue

Unexpected ‘{‘ on line 32


Now let's see the result with the `app` option.

In [None]:
aq_pp -f,+1 $file -d $cols -o,app outputs/append.txt

# check the result
cat outputs/append.txt

The ramen data is appended at the bottom of the file, and the original content is kept at the top. This is handy when you're adding new records to an existing dataset.
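
The difference between the default behavior and `app` mirrors the shell's own `>` (truncate) and `>>` (append) redirections. For readers who want to see the two behaviors side by side without running aq_tools, here is a small sketch with plain `echo` and temporary files; the scratch directory and the single sample row are illustrative.

```bash
# Two scratch files that start with the same content
workdir=$(mktemp -d)
printf 'Roses are red, violets are blue\n' > "$workdir/overwrite.txt"
cp "$workdir/overwrite.txt" "$workdir/append.txt"

# '>' truncates the file first, comparable to -o without app
echo '2580,"New Touch",3.75' > "$workdir/overwrite.txt"

# '>>' adds to the end of the file, comparable to -o,app
echo '2580,"New Touch",3.75' >> "$workdir/append.txt"

# Count the lines left in each file
overwrite_lines=$(grep -c '' "$workdir/overwrite.txt")
append_lines=$(grep -c '' "$workdir/append.txt")
echo "overwrite.txt: $overwrite_lines line(s), append.txt: $append_lines line(s)"

rm -r "$workdir"
```

The overwritten file ends up with one line and the appended file with two.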