# Aq_input Tips and Samples


Aq-input is a set of options, each with its own attributes, that controls how data is read. It applies to all aq_tools (aq_commands), so it is worth understanding and becoming comfortable with it.

This notebook briefly goes over some of aq_input's options and their sample usage. For the detailed, exact specification and syntax, please refer to [aq_input - essentia documentation portal](http://auriq.com/documentation/source/reference/manpages/aq-input.html).
This notebook is based on AQ Tools version 2.0.1-1. You can check the version on your system by running `ess --version`.


### Prerequisites
Readers are assumed to have a working knowledge of
- bash commands
- the `aq_pp` command, at least at an intuitive level ([link to aq_pp official documentation](http://auriq.com/documentation/source/reference/manpages/aq_pp.html))


## Major Structure
Below is the general structure of aq-input, which is composed of an input spec and a column spec.

`aq_command ... -f[,attributes] fileName [more filenames...] -d columnSpec`

There are 2 major components in the input specification of an aq_tool.

* `-f`: input spec: specifies the data source's type, its format, its input behaviors, and the name of the input (file / pipe name).
* `-d`: column spec: specifies the number, names and data types of the columns.

Before getting into examples, let us go through the attributes accepted by both options.
**If you've already read the documentation and are familiar with these options, go ahead and skip this section and jump right into [data used for this sample](#data)**.

Otherwise let's take a look at some commonly used options here.


### `-f` Basic file reading options

With this option we can specify 3 things, plus some extras: the input source type, the input format and error handling.

#### input source type
- file
- stream from standard input (stdin)
- named pipe
- stream from connection to listener

#### input format (column separator) selection
- `csv`: the default option.
- `tsv`: tab separated values.
- `sep`: lets users specify their own separator, as long as it is a [single byte character](https://www.quora.com/What-is-the-difference-between-single-byte-or-multibyte-unicode)
- `div`: used with the `sep` option in the column spec, this lets us specify a different separator for each column.
- `tab`: HTML table format
- `jsn`: allows us to input JSON formatted files.
- `xml`: XML formatted file input.
- `bin`: aq_pp's original input format
- `aq`: input from another aq_tool outputting `aq`. No column spec needed.


#### Error Handling
- `eok`: makes errors non-critical, so that rows with errors are skipped.

#### Others
- `esc`: treat the `\` character in the data as an escape character.
- `+Num`: number of lines / bytes to skip (for example `+1` to skip a header line)
- `bz`: buffer size


### `-d` Column specs

The column spec specifies each column's name and the data type by which it will be interpreted.
Therefore, a column spec has the format
``` bash
-d dtype[,attributes]:colName dtype2[,attributes]:colName2 ... 
```


#### Generic Col Spec

Supported data types are
- `S`: string
- `F`: double precision floating point
- `L`: 64 bit unsigned integer
- `LS`: 64 bit signed integer
- `I`: 32 bit unsigned integer
- `IS`: 32 bit signed integer
- `IP`: IPv4/IPv6 address
- `X`: placeholder for an unwanted input column.

Some of the attributes that might come in handy are:
- `trm`: trim leading/trailing spaces from the field value
- `lo`, `up`: convert a string field value to lower or upper case.

More attributes are available; take a look at the documentation for further details.

Now that we are equipped with the basics of input specifications, let's get started on examples.

<a id='data'></a>
## Data 

We will start with the simplest case: reading a csv file that consists of numeric and string columns.
The data is the street price of cannabis across different states. (The data is modified for simplicity.)

The data looks like the following.

State|HighQ|HighQN|date
---|---|---|---
Alabama|339.06|1042|2014-01-01
Alaska|288.75|252|2014-01-01
Arizona|303.31|1941|2014-01-01
Arkansas|361.85000000000002|576|2014-01-01
California|248.78|12096|2014-01-01
Colorado|236.31|2161|2014-01-01
Connecticut|347.89999999999998|1294|2014-01-01
Delaware|373.18000000000001|347|2014-01-01
District of Columbia|352.25999999999999|433|2014-01-01
Florida|306.43000000000001|6506|2014-01-01

**Actual file looks like below**<br>
```
State,HighQ,HighQN,MedQ,MedQN,LowQ,LowQN,date
Alabama,339.06,1042,198.63999999999999,933,149.49000000000001,123,2014-01-01
Alaska,288.75,252,260.60000000000002,297,388.57999999999998,26,2014-01-01
Arizona,303.31,1941,209.34999999999999,1625,189.44999999999999,222,2014-01-01
Arkansas,361.85000000000002,576,185.62,544,125.87,112,2014-01-01
California,248.78,12096,193.56,12812,192.91999999999999,778,2014-01-01
Colorado,236.31,2161,195.28999999999999,1728,213.5,128,2014-01-01
Connecticut,347.89999999999998,1294,273.97000000000003,1316,257.36000000000001,91,2014-01-01
Delaware,373.18000000000001,347,226.25,273,199.88,34,2014-01-01
District of Columbia,352.25999999999999,433,295.67000000000002,349,213.72,39,2014-01-01
```



### Columns 
- 2 **string** columns, `State` and `date`
- 1 **float** column, `HighQ`
- 1 **int** column, `HighQN`

<a id='usage_examples'></a>
## Basic Usage Examples
To read in this data with aq-input, we need the following:
- [input spec](#input_spec)
    * file name 
    * attributes
- [column spec](#column_spec)

<a id='input_spec'></a>
### Input Spec
With the above data, a few things need to be considered for the input spec.
1. attributes
2. path to the file, which is `data/aq_input/partial_cannabis_price.csv`

`... -f,+1 data/aq_input/partial_cannabis_price.csv ...`

For the attributes, we provide `+1` to skip the first row of the file, because it is a header. Because csv is the default format for the input spec, we don't have to specify it explicitly.

After the attributes, we simply provide the path to the file.
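
As a rough mental model (an analogy with standard tools, not a description of how aq_tools work internally), skipping one line with `+1` yields the same rows as `tail -n +2`. The sketch below builds a throwaway sample in a temporary directory; the file name `sample.csv` is made up for illustration.

```shell
# Build a tiny throwaway CSV (hypothetical file, same layout as the sample data)
tmpdir=$(mktemp -d)
printf '%s\n' \
  'State,HighQ,HighQN,date' \
  'Alabama,339.06,1042,2014-01-01' \
  'Alaska,288.75,252,2014-01-01' > "$tmpdir/sample.csv"

# tail -n +2 starts printing at line 2, i.e. it drops the header line
body=$(tail -n +2 "$tmpdir/sample.csv")
first_row=$(printf '%s\n' "$body" | head -n 1)
row_count=$(printf '%s\n' "$body" | wc -l | tr -d ' ')

echo "first data row: $first_row"
echo "data rows: $row_count"
rm -r "$tmpdir"
```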

<a id="column_spec"></a>
### Column Spec
The column spec needs to be in the same order as the actual columns in the data. Therefore, considering the data types we've looked at earlier, we can specify the column spec with data types and column names.

`-d S:state F:highQ I:highQN S:date`

Note that the capitalization of the data type does not matter, and there is no need to put commas between column specs.
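
Since the column spec has to mirror the file's columns, a quick check with standard tools before running an aq_command can catch a missing or extra entry. This is a minimal sketch, assuming a header line with no quoted commas; the variable names are illustrative only.

```shell
# Header line as it appears in the 4-column sample shown above
header='State,HighQ,HighQN,date'
# Column spec used in this section
colspec='S:state F:highQ I:highQN S:date'

# Count comma-separated fields in the header
n_fields=$(printf '%s\n' "$header" | awk -F',' '{ print NF }')
# Count whitespace-separated entries in the column spec
n_specs=$(printf '%s\n' "$colspec" | awk '{ print NF }')

if [ "$n_fields" -eq "$n_specs" ]; then
  echo "column spec matches: $n_fields columns"
else
  echo "mismatch: $n_fields fields vs $n_specs specs" >&2
fi
```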

### Topics to be covered in the examples
- [Column Attributes](#column_attributes): using attributes on column spec
- [Different Separators and File Formats](#sep_file_formats): inputting from files with different formats and separators
- [Different Input Source Types](#input_src_type)
- [Advanced Usage Examples](#advanced_examples)
    * [Arbitrary Column Separator](#advanced_examples): reading in files with different separators on each column

### Reading in the Data

Now let's combine these options together and see it in action with `aq_pp` command.

In [14]:
# reading in the file, and displaying it by aq_pp
aq_pp -f,+1 data/aq_input/partial_cannabis_price.csv -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"Alabama",339.06,1042,"2014-01-01"
"Alaska",288.75,252,"2014-01-01"
"Arizona",303.31,1941,"2014-01-01"
"Arkansas",361.85000000000002,576,"2014-01-01"
"California",248.78,12096,"2014-01-01"
"Colorado",236.31,2161,"2014-01-01"
"Connecticut",347.89999999999998,1294,"2014-01-01"
"Delaware",373.18000000000001,347,"2014-01-01"
"District of Columbia",352.25999999999999,433,"2014-01-01"
"Florida",306.43000000000001,6506,"2014-01-01"
"Georgia",332.20999999999998,3099,"2014-01-01"
"Hawaii",310.95999999999998,328,"2014-01-01"
"Idaho",276.05000000000001,315,"2014-01-01"
"Illinois",359.74000000000001,4008,"2014-01-01"
"Indiana",336.80000000000001,1665,"2014-01-01"
"Iowa",371.69999999999999,697,"2014-01-01"
"Kansas",353.50999999999999,838,"2014-01-01"
"Kentucky",337.32999999999998,1013,"2014-01-01"
"Louisiana",377.70999999999998,1071,"2014-01-01"


<a id='column_attributes'></a>
### Column Attributes

Let's try applying some attributes to the column spec.
We will convert the state names to upper case as we input the data. We can do this by adding the `up` attribute to the column spec of the `state` column, like below.


In [16]:
aq_pp -f,+1 data/aq_input/partial_cannabis_price.csv -d S,up:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"ALABAMA",339.10000000000002,1042,"2014-01-01"
"ALASKA",288.80000000000001,252,"2014-01-01"
"ARIZONA",303.30000000000001,1941,"2014-01-01"
"ARKANSAS",361.89999999999998,576,"2014-01-01"
"CALIFORNIA",248.80000000000001,12096,"2014-01-01"
"COLORADO",236.30000000000001,2161,"2014-01-01"
"CONNECTICUT",347.89999999999998,1294,"2014-01-01"
"DELAWARE",373.19999999999999,347,"2014-01-01"
"DISTRICT OF COLUMBIA",352.30000000000001,433,"2014-01-01"
"FLORIDA",306.39999999999998,6506,"2014-01-01"
"GEORGIA",332.19999999999999,3099,"2014-01-01"
"HAWAII",311,328,"2014-01-01"
"IDAHO",276.10000000000002,315,"2014-01-01"
"ILLINOIS",359.69999999999999,4008,"2014-01-01"
"INDIANA",336.80000000000001,1665,"2014-01-01"
"IOWA",371.69999999999999,697,"2014-01-01"
"KANSAS",353.5,838,"2014-01-01"
"KENTUCKY",337.30000000000001,1013,"2014-01-01"
"LOUISIANA",377.69999999999999,1071,"2014-01-01"
"MAINE",321.10000000000002,450,"2014-01-01"


You can also convert all characters to lower case, using `lo`.
More column attributes are available; refer to the [documentation](http://auriq.com/documentation/source/reference/manpages/aq-input.html?highlight=input) for more details.

<a id='sep_file_formats'></a>
### Different Separators and File Formats

Not all data files come in clean csv format (we wish...), and as data scientists / engineers, we need to be able to handle data in all kinds of formats.

Let's take a look at how to input data in different formats.

#### TSV

This is probably the second most common file format for data.

Let's assume that the data from earlier was separated by tabs, like this.

```
State   HighQ   HighQN  MedQ    MedQN   LowQ    LowQN   date
Alabama 339.06  1042    198.63999999999999      933     149.49000000000001      123     2014-01-01
Alaska  288.75  252     260.60000000000002      297     388.57999999999998      26      2014-01-01
Arizona 303.31  1941    209.34999999999999      1625    189.44999999999999      222     2014-01-01
Arkansas        361.85000000000002      576     185.62  544     125.87  112     2014-01-01
California      248.78  12096   193.56  12812   192.91999999999999      778     2014-01-01
Colorado        236.31  2161    195.28999999999999      1728    213.5   128     2014-01-01
Connecticut     347.89999999999998      1294    273.97000000000003      1316    257.36000000000001      91      2014-01-01
Delaware        373.18000000000001      347     226.25  273     199.88  34      2014-01-01
District of Columbia    352.25999999999999      433     295.67000000000002      349     213.72  39      2014-01-01
```
The file name is `partial_cannabis_price.tsv`.
All we have to do is add the `tsv` attribute to the input spec, like below.

In [17]:
aq_pp -f,+1,tsv data/aq_input/partial_cannabis_price.tsv -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"Alabama",339.10000000000002,1042,"2014-01-01"
"Alaska",288.80000000000001,252,"2014-01-01"
"Arizona",303.30000000000001,1941,"2014-01-01"
"Arkansas",361.89999999999998,576,"2014-01-01"
"California",248.80000000000001,12096,"2014-01-01"
"Colorado",236.30000000000001,2161,"2014-01-01"
"Connecticut",347.89999999999998,1294,"2014-01-01"
"Delaware",373.19999999999999,347,"2014-01-01"
"District of Columbia",352.30000000000001,433,"2014-01-01"
"Florida",306.39999999999998,6506,"2014-01-01"
"Georgia",332.19999999999999,3099,"2014-01-01"
"Hawaii",311,328,"2014-01-01"
"Idaho",276.10000000000002,315,"2014-01-01"
"Illinois",359.69999999999999,4008,"2014-01-01"
"Indiana",336.80000000000001,1665,"2014-01-01"
"Iowa",371.69999999999999,697,"2014-01-01"
"Kansas",353.5,838,"2014-01-01"
"Kentucky",337.30000000000001,1013,"2014-01-01"
"Louisiana",377.69999999999999,1071,"2014-01-01"
"Maine",321.10000000000002,450,"2014-01-01"


With a simple modification to the command, we read in the tsv file!
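
If the `.tsv` file is not at hand, one way to produce a tab-separated copy of a csv file is `tr`. This is only a sketch: it assumes that no field contains a comma or a tab, and the temporary file names are made up for illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  'State,HighQ,HighQN,date' \
  'District of Columbia,352.25999999999999,433,2014-01-01' > "$tmpdir/sample.csv"

# Replace every comma with a tab character
tr ',' '\t' < "$tmpdir/sample.csv" > "$tmpdir/sample.tsv"

# Verify on the first data row: splitting on tabs gives 4 fields,
# and the state name keeps its embedded spaces
n_fields=$(awk -F'\t' 'NR == 2 { print NF }' "$tmpdir/sample.tsv")
state=$(awk -F'\t' 'NR == 2 { print $1 }' "$tmpdir/sample.tsv")

echo "$n_fields fields, state: $state"
rm -r "$tmpdir"
```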

#### Using `sep`

The cases for tsv and csv weren't too bad. But what if someone decided to use a completely random separator character?
No worries, we can use the `sep` attribute in the input spec, and assign **any character** as a separator. (Note that the character has to be a single byte character.)

For this example, we'll assume that the creator of the cannabis data was under the influence, and used `+` as a separator instead of a comma ;)

The file name is `partial_cannabis_price.plus`. <br>
Taking a look at the file with the `head` command, it looks like below.

In [19]:
head data/aq_input/partial_cannabis_price.plus

State+HighQ+HighQN+date
Alabama+339.1+1042+2014-01-01
Alaska+288.8+252+2014-01-01
Arizona+303.3+1941+2014-01-01
Arkansas+361.9+576+2014-01-01
California+248.8+12096+2014-01-01
Colorado+236.3+2161+2014-01-01
Connecticut+347.9+1294+2014-01-01
Delaware+373.2+347+2014-01-01
District of Columbia+352.3+433+2014-01-01


In order to read in the file with aq-input, we'll set `+` as the separator with `sep="+"`, like below.<br>
**NOTE:** Be sure to surround the separator character with single or double quotes when using the `sep` option, like below.
- `sep=";"`
- `sep=';'`
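
The quotes are there for the shell's benefit: the shell removes them before the command runs, so the program receives the bare character. Without quotes, a character such as `;` would be treated as shell syntax instead. A small sketch using only shell built-ins (the helper `show_arg` is made up for illustration and has nothing to do with aq_tools):

```shell
# Hypothetical helper: print the first argument exactly as a program would receive it
show_arg() { printf '%s\n' "$1"; }

# Single and double quotes both protect ';' from the shell
single=$(show_arg -f,+1,sep=';')
double=$(show_arg -f,+1,sep=";")

echo "with single quotes: $single"
echo "with double quotes: $double"
```

Both forms hand the same argument to the command, which is why either quoting style works.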

In [20]:
aq_pp -f,+1,sep='+' data/aq_input/partial_cannabis_price.plus -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"Alabama",339.10000000000002,1042,"2014-01-01"
"Alaska",288.80000000000001,252,"2014-01-01"
"Arizona",303.30000000000001,1941,"2014-01-01"
"Arkansas",361.89999999999998,576,"2014-01-01"
"California",248.80000000000001,12096,"2014-01-01"
"Colorado",236.30000000000001,2161,"2014-01-01"
"Connecticut",347.89999999999998,1294,"2014-01-01"
"Delaware",373.19999999999999,347,"2014-01-01"
"District of Columbia",352.30000000000001,433,"2014-01-01"
"Florida",306.39999999999998,6506,"2014-01-01"
"Georgia",332.19999999999999,3099,"2014-01-01"
"Hawaii",311,328,"2014-01-01"
"Idaho",276.10000000000002,315,"2014-01-01"
"Illinois",359.69999999999999,4008,"2014-01-01"
"Indiana",336.80000000000001,1665,"2014-01-01"
"Iowa",371.69999999999999,697,"2014-01-01"
"Kansas",353.5,838,"2014-01-01"
"Kentucky",337.30000000000001,1013,"2014-01-01"
"Louisiana",377.69999999999999,1071,"2014-01-01"
"Maine",321.10000000000002,450,"2014-01-01"



<a id='input_src_type'></a>
### Different Input Source Types

As we saw earlier, aq_tools can read in data from various input data sources.
Here we will cover
- standard input (stdin)
- named pipe

#### Standard Input
This option comes in very handy when you're working with other linux commands / aq_commands, and want to feed the output of one command into another aq_command.
You'd do this with linux piping. (For more details on piping and redirection, take a look [here](https://ryanstutorials.net/linuxtutorial/piping.php).)

In the example below we'll demonstrate it with the [`cat`](https://www.geeksforgeeks.org/cat-command-in-linux-with-examples/) command, which prints the file contents to stdout, and then pipe that into the aq_pp command.


When reading from standard input, we provide `-` instead of a file name in the input spec. Let's take a look.

In [21]:
cat data/aq_input/partial_cannabis_price.csv | aq_pp -f,+1 - -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"Alabama",339.10000000000002,1042,"2014-01-01"
"Alaska",288.80000000000001,252,"2014-01-01"
"Arizona",303.30000000000001,1941,"2014-01-01"
"Arkansas",361.89999999999998,576,"2014-01-01"
"California",248.80000000000001,12096,"2014-01-01"
"Colorado",236.30000000000001,2161,"2014-01-01"
"Connecticut",347.89999999999998,1294,"2014-01-01"
"Delaware",373.19999999999999,347,"2014-01-01"
"District of Columbia",352.30000000000001,433,"2014-01-01"
"Florida",306.39999999999998,6506,"2014-01-01"
"Georgia",332.19999999999999,3099,"2014-01-01"
"Hawaii",311,328,"2014-01-01"
"Idaho",276.10000000000002,315,"2014-01-01"
"Illinois",359.69999999999999,4008,"2014-01-01"
"Indiana",336.80000000000001,1665,"2014-01-01"
"Iowa",371.69999999999999,697,"2014-01-01"
"Kansas",353.5,838,"2014-01-01"
"Kentucky",337.30000000000001,1013,"2014-01-01"
"Louisiana",377.69999999999999,1071,"2014-01-01"
"Maine",321.10000000000002,450,"2014-01-01"


**Compressed Data**<br>

You can use the `-c` option of the [`gunzip`](https://www.geeksforgeeks.org/gunzip-command-in-linux-with-examples/) command, which streams the content of the compressed file without making any permanent change to the original file itself. The stream can then be fed into aq_tools by piping, as in the example above.
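
To confirm that `gunzip -c` leaves the compressed file in place, here is a small sketch on a throwaway file (the file names are made up for illustration):

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'State,HighQ,HighQN,date' 'Alabama,339.06,1042,2014-01-01' > "$tmpdir/sample.csv"
gzip "$tmpdir/sample.csv"    # replaces sample.csv with sample.csv.gz

# Stream the decompressed content to stdout and keep only the first line
first_line=$(gunzip -c "$tmpdir/sample.csv.gz" | head -n 1)

# The compressed file is still there, and no uncompressed copy was created
if [ -f "$tmpdir/sample.csv.gz" ]; then gz_kept=yes; else gz_kept=no; fi
if [ -f "$tmpdir/sample.csv" ]; then csv_created=yes; else csv_created=no; fi

echo "$first_line (gz kept: $gz_kept, csv created: $csv_created)"
rm -r "$tmpdir"
```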

Here we'll use the compressed version of `partial_cannabis_price.csv`, which, once uncompressed, looks like the very first data we used.

In [23]:
gunzip -c data/aq_input/partial_cannabis_price.csv.gz | aq_pp -f,+1 - -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"
"Alabama",339.10000000000002,1042,"2014-01-01"
"Alaska",288.80000000000001,252,"2014-01-01"
"Arizona",303.30000000000001,1941,"2014-01-01"
"Arkansas",361.89999999999998,576,"2014-01-01"
"California",248.80000000000001,12096,"2014-01-01"
"Colorado",236.30000000000001,2161,"2014-01-01"
"Connecticut",347.89999999999998,1294,"2014-01-01"
"Delaware",373.19999999999999,347,"2014-01-01"
"District of Columbia",352.30000000000001,433,"2014-01-01"
"Florida",306.39999999999998,6506,"2014-01-01"
"Georgia",332.19999999999999,3099,"2014-01-01"
"Hawaii",311,328,"2014-01-01"
"Idaho",276.10000000000002,315,"2014-01-01"
"Illinois",359.69999999999999,4008,"2014-01-01"
"Indiana",336.80000000000001,1665,"2014-01-01"
"Iowa",371.69999999999999,697,"2014-01-01"
"Kansas",353.5,838,"2014-01-01"
"Kentucky",337.30000000000001,1013,"2014-01-01"
"Louisiana",377.69999999999999,1071,"2014-01-01"
"Maine",321.10000000000002,450,"2014-01-01"


#### Named Pipe

A named pipe is useful for having multiple commands communicate and exchange data with each other, and aq-input supports it as an input as well. For more details on named pipes, refer to [this great website](https://www.linuxjournal.com/article/2156).<br>

Given that you have already created the named pipe to use, you can configure aq-input like below.

```... -f,+1 fifo@fileName -d ...```
where `fileName` is the path to the pipe, including the pipe's name.

Let's take a look at an application.
1. We'll start by creating a named pipe called `pipy` with the `mkfifo` command, in the current directory.
2. Then we'll use the `cat` command to output the content of `partial_cannabis_price.csv`, which is in the `data/aq_input/` directory, into the named pipe.
3. Finally we'll read the input from the pipe using `aq_pp`.

Named pipes do not work in this jupyter environment, so we'll just show you the commands.
In practice, you can open up 2 virtual terminals, TERMINAL 1 and 2, and run one of the following sets of commands in each.
In [9]:
# TERMINAL 1
# Make pipe and feed the output to the pipe

mkfifo pipy

cat data/aq_input/partial_cannabis_price.csv > pipy




In [10]:
# TERMINAL 2
# get the stream input from the pipe
aq_pp -f,+1 fifo@pipy -d S:state F:highQ I:highQN S:date

"state","highQ","highQN","date"



This comes in handy when you are working with multiple file inputs across different terminals and would like to process all of them in one aq_command.
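
If only one terminal is available, the same flow can be tried by running the writer in the background. The sketch below uses `tail` as a stand-in reader, so it demonstrates only the pipe mechanics and not aq_pp itself; the pipe name `pipy` follows the example above, and the sample rows are a throwaway.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'State,HighQ,HighQN,date' 'Alabama,339.06,1042,2014-01-01' > "$tmpdir/sample.csv"

# Create the named pipe
mkfifo "$tmpdir/pipy"

# Writer: blocks until a reader opens the pipe, so run it in the background
cat "$tmpdir/sample.csv" > "$tmpdir/pipy" &

# Reader: stand-in for the aq_pp command; skips the header line, mirroring +1
rows=$(tail -n +2 "$tmpdir/pipy")

wait    # make sure the background writer has finished
echo "$rows"
rm -r "$tmpdir"
```

The writer has to be started before the reader (or in another terminal) because opening a pipe for writing blocks until something reads from it.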

<a id='advanced_examples'></a>
## Advanced Usage Example


### Arbitrary Column Separator

#### Basics 

We can read in data that contains different separator character(s) for each column. Let's try it with the [same dataset](#data), but this time we'll interpret the date column as 3 distinct string columns: year, month and day.

You can achieve this in 2 steps:
1. use the `div` attribute in the input spec
2. specify the location and character(s) to be used as separators in the column spec, using `sep`

**1st step:**<br>
Use the `div` attribute to let the command know that we'll be using distinct separators for each column.

`..-f,+1,div fileName ...`

Remember that this option is not compatible with other format attributes, such as `csv` or `tsv`. (Those options assume that the file uses a single separator across all of its columns.)

**2nd step:**<br>
In the column spec, you can simply place a `sep:'sep_character'` block in between the columns, wherever the separator character is located in the actual file. Note that when using the `div` option, the separators for all of the columns need to be specified.

In our case, one of the rows will look like this.

`Alabama,339.06,1042,2014-01-01`

Therefore, our new column spec will look like this.

`-d S:state sep:',' F:highQ sep:',' I:highQN sep:',' S:year sep:'-' S:month sep:'-' S:day`

As you can see, we are using the separator `-` for the 3 new columns, with the `sep` option.
For the rest, we are simply using `,` as our separator.

In [26]:
aq_pp -f,+1,div data/aq_input/partial_cannabis_price.csv -d S:state sep:',' F:highQ sep:',' I:highQN sep:',' S:year sep:'-' S:month sep:'-' S:day

"state","highQ","highQN","year","month","day"
"Alabama",339.10000000000002,1042,"2014","01","01"
"Alaska",288.80000000000001,252,"2014","01","01"
"Arizona",303.30000000000001,1941,"2014","01","01"
"Arkansas",361.89999999999998,576,"2014","01","01"
"California",248.80000000000001,12096,"2014","01","01"
"Colorado",236.30000000000001,2161,"2014","01","01"
"Connecticut",347.89999999999998,1294,"2014","01","01"
"Delaware",373.19999999999999,347,"2014","01","01"
"District of Columbia",352.30000000000001,433,"2014","01","01"
"Florida",306.39999999999998,6506,"2014","01","01"
"Georgia",332.19999999999999,3099,"2014","01","01"
"Hawaii",311,328,"2014","01","01"
"Idaho",276.10000000000002,315,"2014","01","01"
"Illinois",359.69999999999999,4008,"2014","01","01"
"Indiana",336.80000000000001,1665,"2014","01","01"
"Iowa",371.69999999999999,697,"2014","01","01"
"Kansas",353.5,838,"2014","01","01"
"Kentucky",337.30000000000001,1013,"2014","01","01"
"Louisiana",377.69999999999999,1071,"2014","01","01"
"

You can see that we've successfully separated date into 3 distinct columns!
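
To make the mechanics concrete, the same split can be done by hand with standard tools: cut the row on `,` first, then cut the last field on `-`. This only illustrates the expected result and is not how aq_pp is implemented.

```shell
row='Alabama,339.06,1042,2014-01-01'

# First pass: fields separated by ','
state=$(printf '%s\n' "$row" | cut -d',' -f1)
highQ=$(printf '%s\n' "$row" | cut -d',' -f2)
highQN=$(printf '%s\n' "$row" | cut -d',' -f3)
date_str=$(printf '%s\n' "$row" | cut -d',' -f4)

# Second pass: the date string is separated by '-'
year=$(printf '%s\n' "$date_str" | cut -d'-' -f1)
month=$(printf '%s\n' "$date_str" | cut -d'-' -f2)
day=$(printf '%s\n' "$date_str" | cut -d'-' -f3)

echo "$state | $highQ | $highQN | $year | $month | $day"
```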

Let's try something more challenging.

#### Advanced `sep` example

The `sep` option is also perfect for inputting large unstructured data and organizing it into tabular form.

For this example, we'll take a look at a web log file, which is available in our [official github repository](https://github.com/auriq/EssentiaPublic). The log files are located under `/casestudies/apache/accesslog`.

Let's take a peek at the data, only the first few rows, with the `head` command.


In [21]:
head -n 5 data/aq_input/125-access_log-20141109

188.138.118.184 - - [02/Nov/2014:03:34:43 -0800] "GET / HTTP/1.1" 200 30003 "-" "Pingdom.com_bot_version_1.4_(http://www.pingdom.com)"
46.23.67.107 - - [02/Nov/2014:03:34:47 -0800] "GET / HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"
54.248.98.72 - - [02/Nov/2014:03:34:50 -0800] "GET / HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"
91.103.66.203 - - [02/Nov/2014:03:34:59 -0800] "GET /wp-content/plugins/contact-form-7/includes/js/scripts.js?ver=3.8.1 HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible;)"
173.193.219.173 - - [02/Nov/2014:03:35:12 -0800] "GET / HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"


As you can see, this data is in [common log format](https://en.wikipedia.org/wiki/Common_Log_Format) rather than structured as a table, which makes it hard to analyze directly.
Take a look at the link to get a basic idea of the web log format and the data we're trying to extract.

Let's extract the following data from this, and display them in tabular format.

* IP address: we can use dedicated datatype IP for this field
* date, time and timezone of the server: string
* Request: string
* Server's response: int
* size of the object returned: int
* browser: string

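**Note**: empty fields are recorded as `-` in the log (for example, the object size on the `301` responses above). In the output below those rows show `0` for `obj_size`; how best to handle this in the input spec is still an open question.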



In [22]:
# won't be using substitution to avoid complicated string
aq_pp -f,div,eok data/aq_input/125-access_log-20141109 \
-d IP:ip sep:' - - [' S:datetime sep:'] "' S:request sep:'" ' I:return_code sep:' ' I:obj_size sep:' "-" ' S:browser

"ip","datetime","request","return_code","obj_size","browser"
188.138.118.184,"02/Nov/2014:03:34:43 -0800","GET / HTTP/1.1",200,30003,"""Pingdom.com_bot_version_1.4_(http://www.pingdom.com)"""
46.23.67.107,"02/Nov/2014:03:34:47 -0800","GET / HTTP/1.0",301,0,"""Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"""
54.248.98.72,"02/Nov/2014:03:34:50 -0800","GET / HTTP/1.0",301,0,"""Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"""
91.103.66.203,"02/Nov/2014:03:34:59 -0800","GET /wp-content/plugins/contact-form-7/includes/js/scripts.js?ver=3.8.1 HTTP/1.1",304,0,"""Mozilla/4.0 (compatible;)"""
173.193.219.173,"02/Nov/2014:03:35:12 -0800","GET / HTTP/1.0",301,0,"""Mozilla/5.0 (compatible; monitis - premium monitoring service; http://www.monitis.com)"""
89.248.170.202,"02/Nov/2014:03:35:20 -0800","POST /xmlrpc.php HTTP/1.0",200,370,"""Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"""
89.248.170.202,"02/Nov/2014:03
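
To see how the multi-character separators in this column spec carve up a line, the sketch below walks through the first log line by hand with shell parameter expansion, using the same literal separators in the same order. It is an illustration with standard shell features only, not a description of how aq_pp parses input.

```shell
line='188.138.118.184 - - [02/Nov/2014:03:34:43 -0800] "GET / HTTP/1.1" 200 30003 "-" "Pingdom.com_bot_version_1.4_(http://www.pingdom.com)"'

# ip: everything before the first ' - - ['
ip=${line%%' - - ['*}
rest=${line#*' - - ['}

# datetime: everything before the first '] "'
datetime=${rest%%'] "'*}
rest=${rest#*'] "'}

# request: everything before the first '" '
request=${rest%%'" '*}
rest=${rest#*'" '}

# return_code: everything before the next space
return_code=${rest%% *}
rest=${rest#* }

# obj_size: everything before ' "-" '; what remains is the browser field
obj_size=${rest%%' "-" '*}
browser=${rest#*' "-" '}

echo "$ip | $datetime | $request | $return_code | $obj_size | $browser"
```

The browser value keeps its surrounding double quotes because the last separator ends just before them, which matches the extra quotes visible in the `browser` column of the output above.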

## To-dos for advanced examples

The contents below will be added in the near future.
### Extracting Key and Values
- with the beer dataset for json

### Link to weblog analysis using aq_pp command
This section will be added in the future update.
