__ __ __
/ /\ ____ / /_ ____ / /\
/ / / / __/\/ __/\ / __/\/ \/
/ /_/ /_ /\/ /___/_ /_ /\/ / /\
\__/\/___/ /\__/\/_/\/___/ /_/_/ /
\__/\____/ \__/ \_/\____/\____/
[ STANDALONE LIBRARY est. 2021 ]
LST.sh is a simple portable shell array library. It provides a way of treating a string as an array by splitting it using a Record Separator character.
Arrays can be accessed by index:
. ./lst.sh
rec a= 'First entry' $'Second\nentry' 'Third entry$'
rec a[1] # prints: First entry
rec a[-1] # prints: Third entry$
rec a[2] entry
test "${entry}" = $'Second\nentry' # succeeds
Entries can be appended or prepended:
rec a.push_back 'Final entry'
rec a.push_front 'Preliminary entry'
ORS=\| rec a # prints:
# Preliminary entry|First entry|Second
# entry|Third entry$|Final entry
Entries can be iterated over by index:
i=0
while rec a[i+=1] entry; do
    printf '%2d: <%s>\n' $i "${entry}"
done # prints:
# 1: <Preliminary entry>
# 2: <First entry>
# 3: <Second
# entry>
# 4: <Third entry$>
# 5: <Final entry>
Or by regular variable expansion:
rec a.set_ifs
for entry in $a; do
    echo "<${entry}>"
done # prints:
# <Preliminary entry>
# <First entry>
# <Second
# entry>
# <Third entry$>
# <Final entry>
The array functionality is accessed by using the function lst() with the variable RS set to the value of the separator character. The first argument to lst() is always the name of an array followed by an operator.
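The underlying idea can be illustrated without the library. The following is a minimal sketch in plain POSIX shell, not the implementation of lst(): it stores entries in one string, separated by the ASCII Record Separator, and reads an entry by index through the shell's own field splitting. The helper name nth_entry is hypothetical and exists only for this illustration.

```shell
# Build the ASCII Record Separator character (octal 036) portably.
RS=$(printf '\036')

# A "string array": three entries separated by RS.
a="First entry${RS}Second entry${RS}Third entry"

# nth_entry INDEX STRING: hypothetical helper, prints the 1-based entry.
# The function body is a subshell, so IFS, set -f and the positional
# parameters do not leak into the caller.
nth_entry() (
    index="$1"
    case "$index" in ''|*[!0-9]*) exit 1;; esac   # digits only
    set -f          # no pathname expansion while splitting
    IFS="$RS"
    set -- $2       # field splitting turns the string into $1, $2, ...
    [ "$index" -ge 1 ] && [ "$index" -le "$#" ] || exit 1
    eval "printf '%s\n' \"\${$index}\""
)

nth_entry 2 "$a"    # prints: Second entry
```

Entries may contain spaces or Line Feeds because only the Record Separator character is treated as a delimiter.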
There are three ways of providing the Record Separator to the lst() function:
- Setting it globally:
RS='|'
- Providing it constrained to a single invocation:
RS='|' lst ...
- Or defining a wrapper function that calls lst():
array() { RS='|' lst "$@"; }
The first way has most merit when a single Record Separator is the correct choice for an entire shell script. The third method provides an opportunity to give more semantic meaning to calls of lst().
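The three ways of supplying the separator can be tried out without the library. In the sketch below, show_rs is a hypothetical stand-in for lst() that only reports the RS value it sees, and colon_list is an illustrative wrapper name; neither is part of LST.sh.

```shell
# show_rs: hypothetical stand-in for lst(); reports RS and its arguments.
show_rs() { printf 'RS=<%s> args=<%s>\n' "${RS}" "$*"; }

# 1. Global: every later call sees the same separator.
RS='|'
show_rs a.count             # prints: RS=<|> args=<a.count>

# 2. Per invocation: the assignment prefix applies to this call.
RS=',' show_rs a.count      # prints: RS=<,> args=<a.count>

# 3. Wrapper: the separator is fixed inside a descriptively named function.
colon_list() { RS=':' show_rs "$@"; }
colon_list a.count          # prints: RS=<:> args=<a.count>
```

Whether RS keeps the prefixed value after a function call returns differs between shells, so a script mixing the first two ways should not rely on the global value surviving a prefixed call.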
Six lst() wrappers are provided:
- log(): Uses the Line Feed character as a Record Separator. This is subject to limitations, detailed in the next section.
- uni(): Uses the ASCII Unit Separator character (US) as a Record Separator (see ascii(7)). Other good candidates from this character group are the Record Separator (RS), Group Separator (GS) and the File Separator (FS).
- rec(): Uses the ASCII Record Separator character (RS) as a Record Separator.
- grp(): Uses the ASCII Group Separator character (GS) as a Record Separator.
- fil(): Uses the ASCII File Separator character (FS) as a Record Separator.
- csv(): Uses the comma character (,) to separate array entries. This is akin to using other characters like slashes, colons, semicolons or pipes when they happen to not be used as characters within the array entries.
The rec() function is a sane default providing full functionality using a character that is unlikely to occur within non-binary data.
Using a White Space character such as Space, Line Feed or Horizontal Tab as the Record Separator affects the functionality of lst() in conjunction with empty array entries.
This is due to some methods internally using Field Splitting, which is subject to special rules when the Input Field Separator is set to a White Space character. See the White Space Splitting (Field Splitting) section of sh(1).
The Method/Function table column WS RS in the Operators and Methods section documents which methods/functions are affected by this. The source code documentation of the affected methods/functions provides details on the specific effects.
If the use case does not make use of empty array entries, all functions and methods can be used without limitations.
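The special treatment of White Space during Field Splitting can be observed in plain POSIX shell, independent of the library. The helper count_fields in this sketch is hypothetical; it prints how many fields the shell produces for a given separator and string.

```shell
# count_fields IFS_CHAR STRING: hypothetical helper, prints the number of
# fields produced by Field Splitting. The subshell keeps IFS changes local.
count_fields() (
    set -f          # no pathname expansion while splitting
    IFS="$1"
    set -- $2
    echo "$#"
)

# A variable holding a single Line Feed character.
nl='
'

# Non-White-Space separator: the empty middle entry survives.
count_fields '|' 'a||b'             # prints: 3

# Line Feed separator: adjacent separators collapse, the empty entry is lost.
count_fields "$nl" "a${nl}${nl}b"   # prints: 2
```

With a White Space separator, a run of adjacent separators counts as a single delimiter, which is why empty entries cannot be represented reliably.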
There are two functions to convert arrays from one Record Separator to another. The difference is in the delivery of the Input and Output Record Separators.
The lst:convert() function allows providing arbitrary Separators:
# converts `log src` to `rec dst`
IRS=$'\n' ORS=$'\036' lst:convert src dst
# converts `log a` to `rec a`
IRS=$'\n' ORS=$'\036' lst:convert a a
The lst:cast() function uses lst() wrapper functions to determine the Record Separators:
# converts `log src` to `rec dst`
lst:cast log:src rec:dst
# converts `log a` to `rec a`
lst:cast log:a rec:a
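Conceptually, a conversion splits on the input separator and joins with the output separator. The sketch below shows one way to do that in plain POSIX shell. It is an assumption for illustration, not the code behind lst:convert() or lst:cast(); the helper convert_sep is hypothetical and, unlike the library functions, takes the string value rather than variable names.

```shell
# convert_sep IRS ORS STRING: hypothetical helper that prints STRING with
# the IRS separators replaced by ORS. The subshell keeps IFS changes local.
convert_sep() (
    ors="$2"
    set -f              # no pathname expansion while splitting
    IFS="$1"
    set -- $3           # split on the input separator
    IFS="$ors"
    printf '%s\n' "$*"  # "$*" joins with the first character of IFS
)

convert_sep ',' '|' 'a,b c,d'   # prints: a|b c|d
```

Because this sketch relies on Field Splitting, it is subject to the same loss of empty entries when the input separator is a White Space character.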
The following operators are available (a is the name of an array, i is an index value / arithmetic expression, m is the name of a method):
Operator | Action |
---|---|
a[i].m | Call array subscript method |
a[i]= | Array subscript assign (equivalent to a[i].set) |
a[i] | Array subscript access (equivalent to a[i].get) |
a.m | Call array method |
a=cat | Create by concatenating arrays (equivalent to lst:cat a) |
a= | Array create/reset and assign values |
a | Print array (equivalent to a.print) |
Supported methods and functions are listed in the table below.
The Complexity column is based on the number of shell operations, so even operations like field splitting that clearly have a size-dependent cost are considered constant cost. The variable n refers to the size of the array, the variable # to the number of arguments provided to the function/method call.
The WS RS column lists which functions/methods are subject to limitations if the Record Separator is a White Space character.
Method/Function | Description | Complexity | WS RS |
---|---|---|---|
a[i].get | Read a single indexed entry | O(1) | Limited |
a[i].set | Overwrite a single indexed entry | O(n) | Limited |
a[i].rm | Remove an indexed array entry | O(n) | Limited |
a.resize | Change the size of the array | O(n) | Limited |
a.push_front | Prepend values | O(#) | |
a.push_back | Append values | O(#) | |
a.peek_front | Read first value | O(1) | |
a.peek_back | Read last value | O(1) | |
a.pop_front | Read first value and remove it | O(#) | |
a.pop_back | Read last value and remove it | O(#) | |
a.rm_first | Remove first match with the given value | O(#) | |
a.rm_last | Remove last match with the given value | O(#) | |
a.map_front | Map array entries to variables from front to back | O(#) | Limited |
a.map_back | Map array entries to variables from back to front | O(#) | Limited |
a.count | Provide the number of entries | O(1) | Limited |
a.contains | Return whether the given value is in the array | O(1) | |
a.contains_all | Return whether all the given values are in the array | O(#) | |
a.contains_any | Return whether any of the given values are in the array | O(#) | |
a.is_defined | Check whether the array is defined | O(1) | |
a.is_undefined | Check whether the array is undefined | O(1) | |
a.is_empty | Check whether the array is empty | O(1) | |
a.is_not_empty | Check whether the array is not empty | O(1) | |
a.print | Print array ORS separated | O(1) | Limited |
a.printf | Print array with custom formatting | O(1) | Limited |
a.append | Append the given array(s) | O(1) | |
a.set_irs | Set the Input Record Separator (IRS) to RS | O(1) | |
a.set_ors | Set the Output Record Separator (ORS) to RS | O(1) | |
a.set_ifs | Set the shell Input Field Separator (IFS) to RS | O(1) | |
lst:cat | Concatenate arrays | O(1) | |
lst:convert | Convert array Record Separator IRS to ORS | O(1) | Limited |
lst:cast | Convert array Record Separator based on lst() wrappers | O(1) | Limited |
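As an illustration of how a membership test can count as O(1) shell operations regardless of array length, the sketch below uses a single case pattern match. This is an assumed technique shown for clarity, not taken from the LST.sh source; the helper contains is hypothetical and unrelated to the a.contains method.

```shell
# Build the ASCII Record Separator character (octal 036) portably.
RS=$(printf '\036')

# contains VALUE STRING: hypothetical helper; succeeds if VALUE is an entry.
# Wrapping both sides in RS ensures only whole entries match.
contains() {
    case "${RS}${2}${RS}" in
    *"${RS}${1}${RS}"*) return 0;;
    *) return 1;;
    esac
}

a="red${RS}green${RS}blue"
contains green "$a" && echo 'green: yes'    # prints: green: yes
contains gre "$a"   || echo 'gre: no'       # prints: gre: no
```

The shell still scans the whole string internally, but the script issues one operation, which matches the counting convention described for the Complexity column.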