A unix cut command that supports: multiple delimiters, outputting fields in any specified order, outputting a different delimiter from those in the input, treating runs of delimiters as a single delimiter, and honoring quoting within the target document.

title: ccut
mansection: 1
date: 2016/05/15

ccut - COLUM'S CUT

ccut is a unix 'cut' command with a couple of extra features (and a couple of features missing).

ccut supports:

utf-8 input in -c mode
multiple delimiters
using a string as the delimiter rather than a character
treating runs of a delimiter as one delimiter
quoting using either quotes or backslash within the cut document (for example, honor " quotes or \, in a csv)
honoring quotes or backslash in a document but stripping them from output
outputting fields in any specified order
outputting a different delimiter than those within the cut document
setting variables in the shell to values of cut fields

SYNTAX

Usage: ccut OPTION... [FILE]...

Mandatory arguments to long options are mandatory for short options too.

-b, --bytes=[list] : select only these bytes

-c, --characters=[list] : select only these characters

-d, -t, --delimiter=[list] : list of delimiter characters. The default is just the 'tab' character. Multiple instances of -t or -d are allowed.

-D, --delimstr=[delim] : use a string as the delimiter rather than a list of single-character delimiters. Only one string delimiter can be used, and it cannot be combined with the -d or -t options.

-f, --fields=LIST : select only these fields; also print any line without delimiter characters, unless the -s option is specified

--complement : complement the set of selected bytes, characters or fields

-j, --join-delims : combine runs of delimiters and treat them as one delimiter

-q, --quote : honor quoting within target document using \ or ' or "

-Q, --quote-strip : honor quoting within target document, but strip quotes off output fields

-s, --only-delimited : do not print lines not containing delimiters

-T, --output-delimiter=[string] : use string as the output delimiter. The default is to use the input delimiter.

--utf8 : honor UTF-8 unicode characters on input. This causes unicode strings to be treated as single characters in both -c and -b operations.

-V, --vars=[list] : print out bash commands to set variables using the supplied comma-separated list of names

-z, --zero-terminated : read input where lines are null terminated

-?, --help : display this help and exit

-v, --version : output version information and exit

Use one, and only one, of -b, -c or -f. Each LIST is made up of one range, or many ranges separated by commas. THIS CUT DOES NOT SUPPORT WIDE CHARACTERS (yet), so '-c' and '-b' are equivalent.

Multiple characters can be specified as the input delimiter. The following quoted characters are recognized:

        \e      escape
        \t      tab
        \r      carriage-return
        \n      newline
        \xnn    where 'nn' is a two-digit hex-code
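
Since ccut recognizes these sequences itself, they need to reach it as literal backslash sequences rather than as control characters. A minimal POSIX shell sketch of that point follows. It does not call ccut; it only checks what the shell passes along, and the variable names `delims` and `tab` are illustrative.

```shell
# Inside double quotes the shell leaves "\e" and "\t" untouched,
# so a program given this argument receives four characters: \ e \ t
delims="\e\t"
printf '%s\n' "${#delims}"

# By contrast, printf expands \t in its own format string into one real tab.
tab=$(printf '\t')
printf '%s\n' "${#tab}"
```

The first length printed is 4 and the second is 1, which is why `-d "\e\t"` can be written without any special shell quoting tricks.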

Selected input is written in the SPECIFIED ORDER (unlike GNU cut), and fields can be output multiple times. However, order has no meaning when cut is run with --complement, so in that case fields are output in the order they are encountered in the data.

Each range is one of:

  N     N'th byte, character or field, counted from 1
  N-    from N'th byte, character or field, to end of line
  N-M   from N'th to M'th (included) byte, character or field
  -M    from first to M'th (included) byte, character or field
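
The range forms above, together with the specified-order rule, can be sketched as a small POSIX shell function. This is an illustration only and is not taken from ccut.c: the name `expand_ranges` and its second argument (the number of fields on the line, needed to resolve `N-`) are assumptions made for the example.

```shell
# expand_ranges LIST NFIELDS
# Print the positions selected by LIST in the order they were written.
# Hypothetical helper: it only expands numbers, it does not read any input.
expand_ranges() {
    list=$1
    nfields=$2
    result=""
    old_ifs=$IFS
    IFS=','
    set -f                  # no filename globbing while splitting on commas
    set -- $list            # one positional parameter per range
    set +f
    IFS=$old_ifs
    for range in "$@"; do
        case $range in
            -*)  first=1;           last=${range#-} ;;    # -M
            *-)  first=${range%-};  last=$nfields ;;      # N-
            *-*) first=${range%-*}; last=${range#*-} ;;   # N-M
            *)   first=$range;      last=$range ;;        # N
        esac
        n=$first
        while [ "$n" -le "$last" ]; do
            result="$result${result:+ }$n"
            n=$((n + 1))
        done
    done
    printf '%s\n' "$result"
}

expand_ranges 3,1,5- 6
```

With six fields on the line this prints `3 1 5 6`. Repeating a number repeats the field, so `expand_ranges -2,2 4` prints `1 2 2`.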

The '-V' or '--vars' option allows a comma-separated list of variable names to be supplied. Cut will then match output fields to those variable names and print out commands to set those variables in a Bourne-style shell. This can then be used with the 'eval' command to set variables in the shell.
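
The eval step can be sketched without ccut by using a hand-written string in place of its output. The exact text that `-V` prints is not shown in this document, so the `assignments` string below is a hypothetical stand-in and the real format may differ.

```shell
# Hypothetical stand-in for the commands printed by `ccut -V`.
assignments="citrus1='oranges'; poma1='apples'"

# eval runs the text as shell code, which defines the variables in this shell.
eval "$assignments"
printf '%s and %s\n' "$citrus1" "$poma1"
```

This prints `oranges and apples`. Because eval executes whatever text it is given, this pattern is best kept to input that is trusted.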

EXAMPLES

Cut using either [ or ] as the delimiter

cat file | ccut -d "[]" -f 3

Cut and output fields in a particular order

cat file | ccut -f 3,1,6,5
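
For comparison, here is how the standard tools behave on the same kind of request. This sketch uses only the system `cut` and `awk`, not ccut, and the sample line `a,b,c` is made up for the example.

```shell
# Standard cut writes selected fields in input order, whatever order -f lists them in.
printf 'a,b,c\n' | cut -d , -f 3,1

# awk prints fields in the order they are named.
printf 'a,b,c\n' | awk -F , -v OFS=, '{ print $3, $1 }'
```

The first command prints `a,c` and the second prints `c,a`. Going by the description of specified-order output, `ccut -d , -f 3,1` should behave like the awk line.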

Cut using escape and tab as delimiters

cat file | ccut -d "\e\t" -f 4

Cut honoring document quoting (quoting can use \ ' or ")

echo 'field1,"field2 with , in it",field3,field4\,comma,field5' | ccut -d , -q -f 4
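
What "honoring quoting" means for field splitting can be sketched with a short awk scan wrapped in a shell function. This is an illustration of the idea, not ccut's implementation: the name `quoted_field` is hypothetical, the delimiter is fixed to a comma, and quotes and backslashes are left in the output (the behaviour described for -q rather than -Q).

```shell
# quoted_field N : print the N-th comma-separated field of each input line,
# treating commas inside quotes or after a backslash as ordinary text.
quoted_field() {
    awk -v want="$1" '
    {
        field = 1; quote = ""; out = ""
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\\" && i < len) {
                # A backslash protects the next character; keep both verbatim.
                c = c substr($0, i + 1, 1)
                i++
            } else if (quote != "") {
                if (c == quote) quote = ""      # closing quote
            } else if (c == "\"" || c == "\047") {
                quote = c                       # opening double or single quote
            } else if (c == ",") {
                field++                         # unquoted delimiter starts the next field
                continue
            }
            if (field == want) out = out c
        }
        print out
    }'
}

printf '%s\n' 'field1,"field2 with , in it",field3,field4\,comma,field5' | quoted_field 4
```

The sketch prints `field4\,comma`, because the comma inside the quoted second field and the backslash-escaped comma are not counted as delimiters.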

Cut using space, comma and semicolon as delimiters, replacing delimiters with '-' on output

echo "field1 field2,field3;field4,field5" | ccut -d " ,;" -f 2,4 -T -

Cut using space as a delimiter, treating runs of multiple spaces as one delimiter

echo "field1  field2   field3 field4    field5" | ccut -d " " -j -f 2,4,3 -T -
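
The effect of joining delimiter runs can be approximated with standard tools by squeezing the runs before cutting. This sketch uses `tr -s` and the system `cut`, not ccut. The function name `squeeze_cut` is hypothetical, it handles a single delimiter character only, and it neither reorders fields nor changes the output delimiter.

```shell
# squeeze_cut DELIM FIELDS : collapse runs of DELIM into one, then cut as usual.
squeeze_cut() {
    tr -s "$1" | cut -d "$1" -f "$2"
}

printf 'field1  field2   field3 field4\n' | squeeze_cut ' ' 2,4
```

This prints `field2 field4`. Without the `tr -s` step, each extra space would count as an empty field and the same field numbers would select different text.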

Set variables in the shell from fields in input

eval `echo apples,oranges,pears,lemons,lime | ccut -d , -f 2,4,5,1,3 -V citrus1,citrus2,citrus3,poma1,poma2`

AUTHOR

Written by Colum Paget colums.projects@gmail.com

COPYRIGHT

Copyright (c) 2016 Colum Paget. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.