v2.0.0 Release: Named Fields
To download and unpack prebuilt binaries:
$ # Linux
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v2.0.0/tsv-utils-v2.0.0_linux-x86_64_ldc2.tar.gz | tar xz
$ # macOS
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v2.0.0/tsv-utils-v2.0.0_osx-x86_64_ldc2.tar.gz | tar xz
Installation instructions are in the ReleasePackageReadme.txt file in the release package.
To be notified of new releases:
GitHub supports notification of new releases. Click the "Watch" button on the repository page and select "Releases Only".
Release 2.0.0 Changes: Named Field Support
Release 2.0.0 adds named field support to all tools in the tsv-utils toolkit. This is a significant usability improvement.
Named fields can be used with any file or data stream that has a header line. Named fields are enabled by the --H|header option (that is, -H or --header). Field numbers can still be used, just as in prior versions of the toolkit. Glob-style wildcards are supported, and escapes can be used to specify field names containing special characters.
Details are available in the Field Syntax section of the Tools Reference manual.
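As a rough illustration of what named fields save you from doing by hand, the sketch below resolves a header name to a field number using only standard Unix tools and then passes that number to a number-based tool. The file name sample.tsv and its contents are made up for this example, and this is not how tsv-utils implements named fields internally.

```shell
# Hypothetical two-column sample file; printf expands \t to a tab.
printf 'name\tscore\nalpha\t10\nbeta\t20\n' > sample.tsv

# Resolve the header name 'score' to a 1-based field number:
# split the header line into one name per line, then find the exact match.
field_num=$(head -n 1 sample.tsv | tr '\t' '\n' | grep -n -x 'score' | cut -d: -f1)
echo "score is field $field_num"

# Use the resolved number with a number-based tool such as cut.
cut -f "$field_num" sample.tsv
```

With named field support, the lookup step is no longer needed: the field name is passed directly to the tool along with the header option.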
Examples - Assume a file with the header fields:
1 test_name
2 run
3 elapsed_time
4 user_time
5 system_time
6 max_memory
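To experiment with these fields, a small sample file could be created as sketched here. The file name data.tsv matches the one used in this section, but the row values are invented purely for illustration.

```shell
# Write the header line; printf expands \t to a tab character.
printf 'test_name\trun\telapsed_time\tuser_time\tsystem_time\tmax_memory\n' > data.tsv

# Append a few made-up data rows (values are illustrative only).
printf 'parse\t1\t120\t95\t18\t2048\n' >> data.tsv
printf 'parse\t2\t98\t80\t15\t2040\n' >> data.tsv
printf 'sort\t1\t240\t200\t25\t4096\n' >> data.tsv

# List the header fields, one per line, with their field numbers.
head -n 1 data.tsv | tr '\t' '\n' | cat -n
```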
Commands like the following can be used:
$ # Select individual fields, like 'cut'
$ tsv-select data.tsv -H -f user_time # Field 4
$ tsv-select data.tsv -H -f test_name,user_time # Fields 1,4
$ tsv-select data.tsv -H -f '*_time' # Fields 3,4,5
$ # Filter lines using numeric comparisons against individual fields
$ tsv-filter data.tsv -H --lt elapsed_time:100
$ tsv-filter data.tsv -H --gt elapsed_time:100 --lt system_time:20
$ # Statistical summaries
$ tsv-summarize data.tsv -H --median elapsed_time
$ tsv-summarize data.tsv -H --median '*_time'
$ tsv-summarize data.tsv -H --group-by test_name --median '*_time'
$ # Uniq'ing on a field
$ tsv-uniq data.tsv -H -f test_name
$ # Joins - Assume another file 'test_info.tsv' with 'test_name' and
$ # 'expected_time' fields. A join can be performed using column names.
$ tsv-join -H -f test_info.tsv data.tsv --key-fields test_name --append-fields expected_time
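The '*_time' wildcard in the commands above is annotated as selecting fields 3, 4, and 5. The sketch below checks that mapping with the shell's own glob matching in a `case` statement. This is a stand-in under the assumption that simple `*` patterns behave the same way; the toolkit's wildcard handling may differ in detail, so see the Field Syntax section for the actual rules.

```shell
# Header fields from the example file, in order.
fields="test_name run elapsed_time user_time system_time max_memory"
pattern='*_time'

matches=""
i=0
for name in $fields; do
    i=$((i + 1))
    # An unquoted variable in a 'case' pattern is treated as a glob.
    case "$name" in
        $pattern) matches="$matches $i:$name" ;;
    esac
done

# Print each matching field as number:name.
echo "Matched:$matches"
```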
See the reference docs or online help for details on specific tools. There is also documentation in the Tools Overview section of the main project README file.
Named field support addresses enhancement request #25. It was implemented via PRs #284 through #300.
Other Changes
- Prebuilt binaries have been updated to use the latest LDC compiler (ldc-1.22.0).