Skip to content

`shellexpand` is a replacement for Golang's standard `os.Expand()`, with added support for UNIX shell string substitution and expansion.

License

Notifications You must be signed in to change notification settings

ganbarodigital/go_shellexpand

Repository files navigation

Welcome to shellexpand

Introduction

ShellExpand is a replacement for Golang's standard os.Expand() that supports UNIX shell string substitution and expansion.

It is released under the 3-clause New BSD license. See LICENSE.md for details.

import (
    "os"
    shellexpand "github.com/ganbarodigital/go_shellexpand"
)

cb := ExpansionCallbacks{
    AssignVar: os.Setenv,
    LookupVar: os.Lookupenv,
}
result, err := shellexpand.Expand(input, cb)

Table of Contents

Why Use ShellExpand?

Who Is ShellExpand For?

We've built ShellExpand for anyone who wants UNIX shell string-expansion support in their Golang projects.

We're using it ourselves in Scriptish, our library that helps you port your UNIX shell scripts into pure Golang.

Why UNIX Shell String Expansion?

UNIX shell string expansion allows you to take a string of text, and replace it with the value of variables from your environment or from a backing store like Envish. You don't have to do a straight replacement either; there are lots of different ways that the value of a variable can be manipulated before it's substituted back into the original string.

Almost every non-trivial UNIX script uses string expansion. By bringing it to Golang, we hope that this will save you time when you're porting your shell scripts to Golang.

How Does It Work?

Getting Started

Import ShellExpand into your Golang code:

import shellexpand "github.com/ganbarodigital/go_shellexpand"

Create a set of expansion callbacks:

cb := ExpansionCallbacks{
    AssignToVar: func(...)
    LookupVar: func(...)
    LookupHomeDir: func(...)
    MatchVarNames: func(...)
}

Call shellexpand.Expand() to expand your string:

output, err := shellexpand.Expand(input, cb)

How Are Errors Handled?

As a general principle, when an expansion fails, the input string is returned unmodified - and no error is returned.

That said, UNIX shells have grown organically over the last 40+ years, and there will be times where a failed expansion is simply removed from the input string instead ... and still no error returned.

Our main unit tests are in expand_test.go, and each supported expansion is run through a real UNIX shell too, to confirm that ShellExpand is as 100% compatible as possible.

Golang errors can come from two places:

We return all errors back to you. When we do, the contents of the string we return is undefined.

Expansion Callbacks

The vast majority of supported string expansions need to look things up:

  • variables
  • home directories for users

You need to tell ShellExpand how to do this.

These callbacks are responsible for searching and updating what we call your variable backing store. This might be your Golang program's environment, or it might be an emulation of that like our own Envish package.

There are four callbacks that you need to define:

ExpansionCallbacks.AssignToVar()

func AssignToVar(key, value string) error

ShellExpand will call AssignToVar() when it needs to set the value of a variable. Your callback must update your variable backing store.

If an error occurs, your callback should return that error back to ShellExpand. We'll then pass that back to the caller of shellexpand.Expand().

ExpansionCallbacks.LookupVar()

func LookupVar(key string) (string, bool)

ShellExpand will call LookupVar() when it needs to get the value of a variable.

  • If the variable exists in your backing store, return its value and true
  • If the variable does not exist in your backing store, return "" (empty string) and false

Please do not return "" and true if the variable does not exist in your backing store. That behaviour may lead to undefined results from ShellExpand. Even if your own testing says that you can get away with it today, we do not guarantee you'll get the results you expect in a future release.

For positional parameters and special parameters, key will always start with a $ sign.

For all other parameters, ShellExpand will strip away the $ sign before calling LookupVar().

(If you're familiar with Golang's os.LookupEnv(), LookupVar() does the same job.)

ExpansionCallbacks.LookupHomeDir()

func LookupHomeDir(user string) (string, bool)

ShellExpand will call LookupHomeDir() when it needs to know the home directory of a user. This is needed for tilde expansion.

  • If the user exists on your computer, return the user's home directory and true
  • Otherwise, return "" (empty string) and false

You can use Golang's os/user package to do this.

ExpansionCallbacks.MatchVarNames()

func MatchVarNames(prefix string) []string

ShellExpand will call MatchVarNames when it needs to know which variable names start with the given prefix. This is needed for parameter expansion.

Your callback must return a list of all variable names that start with the given prefix. If no names match, return an empty list.

Supported Expansions

UNIX shells perform 10 different types of string expansion. This table tracks which ones we currently support, and what we (currently) plan to do about the rest of them.

Expansion Status Planned?
Brace expansion fully supported n/a
Tilde expansion fully supported n/a
Parameter expansion (almost) fully supported n/a
Command substitution not supported no plans to add
Arithmetic expansion not supported yes
Process substitution not supported no plans to add
Word splitting not supported if there is a need
Pathname expansion not supported if there is a need
Quote removal not supported if word splitting is implemented
Escape sequence expansion not supported no plans to

We have put more details about each of them below.

Brace Expansion

What Is Brace Expansion?

Brace expansion is a way to generate longer strings from simple patterns or range sequences.

This example uses a brace-pattern:

input := "ab{c,d,e}fg"
cb := ExpansionCallbacks{}
output := shellexpand.Expand(input, cb)

// output is: abcfg abdfg abefg

This example uses a brace-sequence:

input := "ab{c..e}fg"
cb := ExpansionCallbacks{}
output := shellexpand.Expand(input, cb)

// output is: abcfe abdfg abefg

Why Use Brace Expansion?

It's commonly used in shell scripts to expand a list of filenames without checking that they exist (ie, without relying on pathname expansion).

Rough Grammar

Brace expansion takes the form:

[preamble]((brace-pattern|brace-sequence)+)[postscript]

where:

  • preamble is optional text immediately before the brace-pattern or brace-sequence
  • brace-pattern is {text,text[,text...]}
    • each brace-pattern is surrounded by braces
    • each brace-pattern contains a comma-separated list of text to substitute in
    • each brace-pattern must contain at least two phrases to be a valid brace-pattern
  • brace-sequence is {lo..hi[..incr]} or {hi..lo[..incr]}
    • hi and lo can be characters or number, as long as they're both the same type
    • incr is optional, and must be a number
    • incr's sign is always auto-corrected to match the order you've put hi and lo in
  • postscript is optional text immediately after the brace-pattern or brace-sequence

Other Notes

  • Brace expansions can be nested.
  • Left-to-right order is preserved. The result of a brace expansion is never sorted.
  • You can escape the opening brace (ie do \\{) to prevent a brace triggering brace expansion.

Status

Brace expansion is fully supported in v1.0.0 and later.

If you find any bugs in brace expansion, please let us know.

Tilde Expansion

What Is Tilde Expansion?

Tilde expansion turns ~ into the path to a user's home directory.

input := "~/.storyplayer/storyplayer.json"
cb := ExpansionCallbacks{
    // you need to provide this
    LookupHomeDir: func(...),
}
output, err := shellexpand.Expand(input)

// on my system, output would be: /home/stuart/.storyplayer/storyplayer.json

The ~ can be followed by a username.

input := "~stuart/.storyplayer/storyplayer.json"
cb := ExpansionCallbacks{
    // you need to provide this
    LookupHomeDir: func(...),
}
output, err := shellexpand.Expand(input)

// on my system, output would be: /home/stuart/.storyplayer/storyplayer.json

The ~ can be followed by a + or -. A + is replaced by the value of PWD, and a - is replaced by the value of OLDPWD.

input := "~+/storyplayer.json"
cb := ExpansionCallbacks{
    // you need to provide this
    LookupVar: func(...),
}
output, err := shellexpand.Expand(input)

// on my system, output would be: (current working directory)/storyplayer.json

Why Use Tilde Expansion?

Many programs these days create/maintain/support a config file stored in the user's home directory. Tilde expansion is a very easy way

Rough Grammar

Tilde expansion takes the form:

~(+|-|username|<blank>)[/path/to/folder/or/file]

where:

  • ~ must be the first character of the word
  • ~+ is replaced by the value of PWD via a call to LookupVar
  • ~- is replaced by the value of OLDPWD via a call to LookupVar
  • ~username is replaced by user's home directory, via a call to LookupHomeDir
  • ~ on its own is replaced by the value of HOME via a call to LookupVar

The /path/to/folder/or/file is optional.

Other Notes

  • Tilde expansion does not check that the expanded filepath is valid, or that whatever it points to exists.

Status

Tilde expansion is fully supported in v1.0.0 or later.

If you find any bugs in tilde expansion, please let us know.

Parameter Expansion

What Is Parameter Expansion?

Parameter expansion replaces variable names with their values, and allows you to change those values before they're plugged back into the result string.

Parameter expansion makes up the majority of UNIX shell string expansion (and, arguably, the majority of your average shell script's code too).

Why Use Parameter Expansion?

In a word: convenience. Being able to assign a value to a variable, and then use that value whereever you need to ... that's one of the main reasons that shell scripts are so quick and easy to knock up.

Supported Parameter Expansions

Here's a list of all the different types of UNIX shell parameter expansion that we know about. Most of them are fully supported by ShellExpand.

Syntax Name Status
$PARAM / ${PARAM} expand-to-value supported
${PARAM:-word} expand-with-default-value supported
${PARAM:=word} expand-assign-default-value supported
${PARAM:?word} expand-write-error supported
${PARAM:+word} expand-use-alternate-value supported
${PARAM:offset} expand-to-substring supported
${PARAM:offset:length} expand-to-substring-length supported
${!prefix*} / ${!prefix@} expand-prefix-match-names supported
${!name[*]} / ${!name[@]} list-of-array-keys not supported
${#PARAM} expand-parameter-length supported
${#*} / ${#@} expand-no-positional-params supported
${PARAM#pattern} expand-remove-shortest-prefix supported
${PARAM##pattern} expand-remove-longest-prefix supported
${PARAM%pattern} expand-remove-shortest-suffix supported
${PARAM%%pattern} expand-remove-longest-suffix supported
${PARAM/old/new} expand-search-replace-all-matches supported
${PARAM//old/new} expand-search-replace-first-match supported
${PARAM/#old/new} expand-search-replace-prefix supported
${PARAM/%old/new} expand-search-replace-suffix supported
${PARAM^pattern} expand-uppercase-first-char supported
${PARAM^^pattern} expand-uppercase-all-chars supported
${PARAM,pattern} expand-lowercase-first-char supported
${PARAM,,pattern} expand-lowercase-all-chars supported
${PARAM@operator} expand-parameter-transform not supported

Indirection

Most parameter expansions support something called indirection.

#!/usr/bin/env bash

PARAM1=PARAM2
PARAM2=foo

# this echoes the value of $PARAM2
echo ${!PARAM1}

If the first character after the opening brace is a ! (pling), then the value of the named parameter ($PARAM1 in our example) is used as the name of the parameter to apply the expansion to.

ShellExpand supports all the indirection expansions that we know if. If you find a case where indirection doesn't work in the same way that a UNIX shell does, please let us know.

Positional Parameter Support

In UNIX shell scripts, $1, $2 et al are known as positional parameters. In UNIX shells, they're originally set to the arguments that the shell script was called with, and then to the arguments passed into each function call in the shell script.

(Note that, technically, $0 isn't a positional parameter, even though it looks like it should be! We cover that below.)

Additionally, $# is always the number of positional parameters that are set at any one time, and both $* and $@ are (slightly different) expansions of all the positional parameters.

We support (almost) all parameter expansion involving positional parameter.

  • We keep the $ sign as part of the name of the variable, when we make calls to your expansion callbacks. Normal variables, we strip off the $. During development, we decided that positional parameters and special parameters are much easier to read if we keep the $ sign.
  • It's up to you to create the variables $1, $2 etc and $# in your variable backing store before you call shellexpand.Expand().
  • When we're expanding $* and $#, we always get the value of $# first. We then use $# to work out how many positional parameters currently exist, and then we get each of them in turn.
  • We never retrieve $* and $@ by name via your expansion callbacks.
  • These variables are all treated as read-only by UNIX shells. We don't enforce that explicitly (yet).

$@ Expansion

In UNIX shell scripts, $* and $@ sometimes expand to different results. If you use $@ inside double quotes, that expands to an array of words.

We don't implement support for arrays in ShellExpand. $* and $@ currently both expand to the same results.

If/when we implement word splitting, $* and $@ will start to expand the same way they do in shell scripts. That will probably break backwards compatibility of your code.

Using $* And $@ In Parameter Expansion

Several of the parameter expansion operations behave differently if you apply them to $* or $@. Instead of being applied to the string as a whole, $* or $@ is expanded first, and then the operation is applied to each chunk of that expanded string in turn.

For example:

input := "${*%*.doc}"
cb := ExpansionCallbacks{
    // you need to supply this
    LookupVar: func(...)
}
output, err := shellexpand.Expand(input)

will do remove-shortest-suffix from each word in the expansion of $*.

Special Parameters

These parameters are all known as special parameters in man bash:

  • $?
  • $-
  • $$
  • $!
  • $0
  • $-

ShellExpand will call your LookupVar() expansion callback to get their value. The variable name passed into LookupVar() will always start with a $ sign.

Word Expansion

Some parameter expansion operators (see table above) take a word as their right-hand side.

ShellExpand performs tilde expansion and parameter expansion on each word before it is used. (UNIX shells also perform command substitution and arithmetic expansion during word expansion. ShellExpand doesn't support these today.)

Command Substitution

What Is Command Substitution?

Command substitution calls an external program, and puts the output from that program in the returned string.

#!/usr/bin/env bash

CURRENT_BRACH=$(git branch --no-color | grep '^\* ' | grep -v 'no branch' | sed 's/^* //g')

or

#!/usr/bin/env bash

CURRENT_BRACH=`git branch --no-color | grep '^\* ' | grep -v 'no branch' | sed 's/^* //g'`

Status

Command substitution is not supported.

There are no plans to add support for command substitution at this time.

  • Command substition needs word splitting implementing first.
  • If the input string has come from user input, it should be treated as untrusted to avoid security problems. Calling arbitrary external programs from string expansion is asking for trouble.

If we add support in a future version, we'll make it send the command name and arguments to a callback that you provide.

Arithmetic Expansion

What Is Arithmetic Expansion?

Arithmetic expansion is how UNIX shells support math operations.

#!/usr/bin/env bash

short_names=$(git branches | grep "^feature/" | sed 's ^feature/  g')

local width=0
local branch
for branch in $short_names; do
        local len=${#branch}
        width=$(max $width $len)
done
width=$(($width+3))

Rough Grammar

Arithmetic expansion is of the form:

$((expression))

where expression is a topic for another day :)

Status

Arithmetic expansion is not supported.

The main reason it isn't supported at the minute is that we haven't had time to add it yet. It's something you can (and arguably, should) handle in native Golang code instead.

We plan to add for v2.0.

Process Substitution

What Is Process Substitution?

Process substitution is a way to obtain input from / send input to an external program that is running in the background (ie, an asynchronous operation).

Status

Process substitution is not supported.

All the reasons for not implementing command substitution apply here.

Additionally, it's a feature that's rarely used in the wild. We simply haven't needed it for our code at all.

If we implement command substitution, we'll probably add process substitution at the same time, in a similar manner. It'll use Golang channels to communicate to/from your code.

Word Splitting

What Is Word Splitting?

Word splitting is the algorithm that UNIX shells use to split up a line of text into separate chunks. These chunks are called words.

The basic idea is that a line of text is split up using separator characters:

  • text surrounded by single quotes is treated as a single word (and immune from further expansion)
  • text surrounded by double quotes is also treated as a single word
  • the variable IFS is used to tell the UNIX shell what the separator characters are (they default to whitespace characters)

Status

Word splitting is currently not supported.

We haven't implemented it simply because we haven't needed it yet.

It's needed for command substitution and process substitution: word splitting is required for the list of arguments to be passed into a process.

If/when we add word splitting, we'll either have to change the API for shellexpand.Expand() (it will need to return a []Word instead of a string), or we'll need to export a second function instead.

Pathname Expansion

What Is Pathname Expansion?

Pathname expansion replaces a glob pattern with a list of matching filenames. It is often called globbing.

ls *.log

In this example, it isn't the ls command that lists all the files that end in .log. It's actually the UNIX shell that turns *.log into a list of files to pass on to ls.

Status

Pathname expansion is currently not supported.

At the minute, we don't have any need for it in ShellExpand. It's main use is to generate arguments to external commands, and we haven't added support for that to date.

Basic globbing support is available via Golang's filepath.Match().

If we add command substitution and/or process substitution, then it will definitely make sense to implement pathname expansion.

If you're interested in implementing pathname expansion, it would make sense to implement word splitting first, and completing the implementation of quote removal.

Escape Sequence Expansion

What Is Escape Sequence Expansion?

Escape sequence expansion turns escape sequences (listed in the table below) into other characters. Many of these characters are interpreted by UNIX terminals as commands.

Rough Grammar

An escape sequence is a \ (forward-slash) followed by one or more characters. The supported characters are in this table:

Escape Sequence Shell Expansion
\a alert (bell)
\b backspace
\e and \E ANSI escape sequence
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\' escaped single quote
\" escaped double quote
\? escaped question mark
\nnn the 8-bit character for the octal number nnn
\xNN the 8-bit character for the hexadecimal XX
\uHHHH the Unicode character for the hexadecimal HHHH
\UHHHHHHHH the Unicode character for the hexadecimal HHHHHHHH
\cX a control-X character

Any other sequence starting with a \ is treated as an escaped character.

Status

Escape sequence expansion is not supported.

There are no plans to add support for escape sequence expansion at the moment.

Why? Many escape sequences exist for working with interactive shells. There's no direct target to translate them to in a Golang library. Many (all?) of the rest are already supported by Golang's fmt package.

Quote Removal

What Is Quote Removal?

Quote removal is the removal of:

  • the \ in front of escaped characters
  • single quotes surrounding words
  • double quotes surrounding words

Why Do Shells Perform Quote Removal?

If a word contains spaces - for example, a long filename on Windows or MacOS - you have to surround it with either single or double quotes. That's how a UNIX shell knows that the filename is a single word.

The quotes aren't actually part of the filename on disk. The UNIX shell has to remove the quotes before passing the filename to whatever command is being called.

UNIX shells remove the \ in front of escaped characters in case the command being called also supports some form of escape sequence expansion.

Status

Quote removal is not supported in ShellExpand v1.0.0.

  • the \ in front of escaped characters is not removed during quote removal
    • at the moment, these are getting removed during earlier expansion phases, and don't reach the expandQuoteRemoval() function
  • single and double quotes surrounding words are not removed
    • we can't safely attempt this until word splitting has been implemented

Common Terms

Escaped Character

An escaped character is any character that has a \ (forward-slash) in front of it.

They are used to tell ShellExpand to treat the escaped character as a normal character (ie, just text).

For example, in brace expansion, a { (brace) on its own denotes the start of a brace pattern:

# ab{c,d,e}fg becomes 'abcfg abdfg abefg'

If you escape the brace, ShellExpand no longer treats it as the start of a brace pattern

# ab\{c,d,e}fg remains `ab{c,d,e}fg`

All of our expansions should correctly support escaped characters. By correctly, we mean that it should:

  • always treat them as normal text
  • the result should be the same that a real UNIX shell would give you

If you find any bugs related to this, please let us know.

Glob Pattern

A glob pattern is a search and/or filter term that supports wildcards and character sets.

In a glob pattern,

  • ? matches any single character
  • * matches zero or more characters (exactly how many depends on whether you're doing a greedy or ungreedy match)
  • [...] matches any one of the characters in the set
  • any other character matches itself

Glob patterns are mostly used as the search patterns for commands like ls *.log in the UNIX terminal. That's called pathname expansion.

They're also used as the pattern in several parameter expansion operations.

ShellExpand uses the Glob package.

Word

When a UNIX shell splits up a text string into chunks, each of those chunks is called a word. UNIX shells normally use the spaces in a text string to split the chunks up, but you can surround text in single and double quotes to tell the shell where a chunk starts and ends.

These chunks are the smallest units that UNIX shells work with:

  • When a shell script calls an external program, each chunk is passed as a separate parameter to that program.
  • When the shell script does string expansion, the shell actually expands each chunk at a time. It doesn't actually work on the string as a whole.

ShellExpand v1.x operates on the whole string. It doesn't implement word splitting. So far, we haven't needed it.

Reporting Problems

See our separate CONTRIBUTING GUIDE.

About

`shellexpand` is a replacement for Golang's standard `os.Expand()`, with added support for UNIX shell string substitution and expansion.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages