New "awk" parser option: trim
Bumped version to 0.13.0
dbohdan committed Aug 31, 2015
1 parent 0e6886b commit dab1f7e
Showing 3 changed files with 40 additions and 7 deletions.
20 changes: 19 additions & 1 deletion README.md
@@ -76,7 +76,7 @@ A format option (`format=x`) selects the input parser with which Sqawk will pars

 | Format | Additional options | Examples | Comment |
 |--------|--------------------|--------- |---------|
-| `awk` or `raw` | `FS`, `RS` | `RS=\n`, `FS=:` | The default input parser. Splits input into records then fields using regular expressions. The options `FS` and `RS` work the same as -FS and -RS respectively but only apply to one file. |
+| `awk` or `raw` | `FS`, `RS`, `trim` | `RS=\n`, `FS=:`, `trim=left` | The default input parser. Splits input into records then fields using regular expressions. The options `FS` and `RS` work the same as -FS and -RS respectively but only apply to one file. The option `trim` removes whitespace at the beginning of each line of input (`trim=left`), at its end (`trim=right`) or both (`trim=both`). |
 | `csv`, `csv2`, `csvalt` | `csvsep`, `csvquote` | `format=csv csvsep=, 'csvquote="'` | Parse the input as CSV. Using `format=csv2` or `format=csvalt` enables [alternate mode](http://core.tcl.tk/tcllib/doc/trunk/embedded/www/tcllib/files/modules/csv/csv.html#section3) for parsing CSV files exported by Microsoft Excel. `csvsep` specifies the field separator; it defaults to `,`. `csvquote` selects the character that fields containing the separator are quoted with; it defaults to `"`. Note that only some characters can be used as `csvquote`. |

 # More examples
@@ -97,6 +97,24 @@ A format option (`format=x`) selects the input parser with which Sqawk will pars

     sqawk -1 'select a1 from a order by random()' < file

+## Pretty-print a table
+
+    ps | sqawk -output table 'select a1,a2,a3,a4,a5 from a' trim=left
+
+### Sample output
+
+```
+┌─────┬─────┬────────┬───────────────┐
+│ PID │ TTY │  TIME  │      CMD      │
+├─────┼─────┼────────┼───────────────┤
+│11476│pts/3│00:00:00│      ps       │
+├─────┼─────┼────────┼───────────────┤
+│11477│pts/3│00:00:00│tclkit-8.6.3-mk│
+├─────┼─────┼────────┼───────────────┤
+│20583│pts/3│00:00:02│      zsh      │
+└─────┴─────┴────────┴───────────────┘
+```
+
 ## Find duplicate lines

 Print them and how many times they are repeated.
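
The `trim=left` in the pretty-print example matters because `ps` pads its first column with leading spaces, and splitting such a line on a whitespace regular expression produces an empty first field. The sketch below illustrates this under stated assumptions: `splitFields` is a hypothetical stand-in for `::textutil::splitx` (so the snippet runs without tcllib), the separator `[ \t]+` is assumed, and the sample line is made up.

```tcl
# Hypothetical stand-in for ::textutil::splitx: split a string on a regexp.
# Assumes the input never contains the \u0001 marker character.
proc splitFields {str {regexp {[ \t]+}}} {
    return [split [regsub -all -- $regexp $str \u0001] \u0001]
}

# A made-up line in the style of ps output, with leading padding.
set line "  941 pts/3    00:00:00 ps"

# Without trimming, the leading spaces yield an empty first field.
set untrimmed [splitFields $line]

# With the equivalent of trim=left, the PID becomes the first field.
set trimmed [splitFields [string trimleft $line]]

puts "untrimmed: $untrimmed"
puts "trimmed:   $trimmed"
```

Without the trim, every column would shift by one, so `a1` would be empty and the PID would land in `a2`.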
25 changes: 20 additions & 5 deletions lib/parsers/awk.tcl
@@ -10,6 +10,7 @@ namespace eval ::sqawk::parsers::awk {
         FS {}
         RS {}
         merge {}
+        trim {}
     }
 }

@@ -127,6 +128,16 @@ proc ::sqawk::parsers::awk::splitmerge {str regexp mergeRanges} {
     return $fields
 }

+# Trim the contents of the variable "record".
+proc ::sqawk::parsers::awk::trim-record mode {
+    upvar 1 record record
+    set record [switch -exact -- $mode {
+        both { string trim $record }
+        left { string trimleft $record }
+        right { string trimright $record }
+        default { error "unknown mode: \"$mode\"" }
+    }]
+}

 # Convert raw text data into a list of database rows using regular
 # expressions.
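
`trim-record` relies on `upvar 1` to modify the caller's `record` variable in place instead of returning a value. Note that the default options above set `trim` to `{}`, and with the `switch` as written an empty mode would reach the `default` branch if it arrives unchanged. The sketch below is one possible way to handle that, not the commit's code: `trimRecord` and the `none` mode are hypothetical names, and treating an empty mode as "no trimming" is an assumption about the intended behavior.

```tcl
# Sketch of a mode-dispatching trim helper. "trimRecord" and the "none"
# mode are hypothetical; they are not part of the commit.
proc trimRecord {mode} {
    # Link the local name "record" to the caller's variable of that name.
    upvar 1 record record
    switch -exact -- $mode {
        both    { set record [string trim $record] }
        left    { set record [string trimleft $record] }
        right   { set record [string trimright $record] }
        none    -
        {}      { }
        default { error "unknown mode: \"$mode\"" }
    }
    return
}

# Apply each mode to the same padded sample string.
foreach mode {left right both {}} {
    set record "  a b  "
    trimRecord $mode
    puts [format {%-5s -> "%s"} $mode $record]
}
```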
@@ -135,6 +146,7 @@ proc ::sqawk::parsers::awk::parse {data options} {
     set RS [dict get $options RS]
     set FS [dict get $options FS]
     set mergeRanges [dict get $options merge]
+    set trim [dict get $options trim]

     # Split the raw data into records.
     set records [::textutil::splitx $data $RS]
@@ -144,9 +156,15 @@ proc ::sqawk::parsers::awk::parse {data options} {
         set records [lrange $records 0 end-1]
     }

+
     # Split records into fields.
     set rows {}
-    if {$mergeRanges ne {}} {
+    if {$mergeRanges eq {}} {
+        foreach record $records {
+            ::sqawk::parsers::awk::trim-record $trim
+            lappend rows [list $record {*}[::textutil::splitx $record $FS]]
+        }
+    } else {
         # Allow both the {1-2,3-4,5-6} and the {1 2 3 4 5 6} syntax for the
         # "merge" option.
         set rangeRegexp {[0-9]+-[0-9]+}
@@ -155,13 +173,10 @@ proc ::sqawk::parsers::awk::parse {data options} {
             set mergeRanges [string map {- { } , { }} $mergeRanges]
         }
         foreach record $records {
+            ::sqawk::parsers::awk::trim-record $trim
             lappend rows [list $record {*}[::sqawk::parsers::awk::splitmerge \
                     $record $FS $mergeRanges]]
         }
-    } else {
-        foreach record $records {
-            lappend rows [list $record {*}[::textutil::splitx $record $FS]]
-        }
     }

     return $rows
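
The `string map` call in the hunk above is what lets the `merge` option accept both the `{1-2,3-4,5-6}` and the `{1 2 3 4 5 6}` syntax: dashes and commas are each replaced by a space, leaving a flat list of range endpoints. A minimal sketch of that normalization follows; `normalizeRanges` is a hypothetical helper name, and the sketch applies the mapping unconditionally instead of reproducing the regexp check that is elided from the diff.

```tcl
# Hypothetical helper: turn "1-2,3-4" into the flat list form "1 2 3 4".
proc normalizeRanges {ranges} {
    return [string map {- { } , { }} $ranges]
}

set fromRanges [normalizeRanges {1-2,3-4,5-6}]
set fromList   [normalizeRanges {1 2 3 4 5 6}]

# Both syntaxes end up as the same flat list of start/end pairs.
foreach {first last} $fromRanges {
    puts "range: $first to $last"
}
```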
2 changes: 1 addition & 1 deletion sqawk-dev.tcl
@@ -9,7 +9,7 @@ package require sqlite3
 package require textutil

 namespace eval ::sqawk {
-    variable version 0.12.0
+    variable version 0.13.0
 }

 interp alias {} ::source+ {} ::source