proposal: runtime: remove unnecessary copies when converting the command line to os.Args under Windows #73507

@xformerfhs

Proposal Details

First, I have to explain the context in which this proposal came up:

A program that encrypts and decrypts data gets its passwords and keys from the command line.
I know there are better and preferable ways of storing secrets, like KMS and secrets managers.
However, in this special setting it is necessary that the passwords are given as command line parameters.

What can be done is to wipe the command line parameters from memory after they have been converted to internal data structures and before the encryption/decryption is carried out.
To achieve this, one has to overwrite both the command line of the operating system and os.Args in the Go runtime.

It is easy to wipe the command line in the Linux environment.
It is more complicated to wipe the command line in Windows, but I successfully managed to do it, finding a bug in *NTUnicodeString.Slice() in package golang.org/x/sys/windows along the way.

Then came the part to clear os.Args.
This can be done easily with this function:

func wipeGoCommandLine() {
   // Clear all command line arguments in os.Args, except the program name.
   for i := 1; i < len(os.Args); i++ {
      clear(unsafe.Slice(unsafe.StringData(os.Args[i]), len(os.Args[i])))
   }

   // Make os.Args only contain the program name.
   os.Args = os.Args[:1]
}

It works perfectly under Linux.
However, under Windows I observed a very strange phenomenon:

The above function worked, except when an os.Args string had a length of 1!
In that case the program crashed with an access violation fault.
All other lengths worked fine, only parameters of length 1 caused this.

I did quite a bit of research and dug through the Go runtime, reading the source code and debugging with IDA Free.
Finally, I found the cause of this strange phenomenon:

The Go runtime handles the command line under Windows quite differently from other OSes.
The command line under Windows is encoded in UTF-16, and so it has to be converted to UTF-8.
This makes a copy of the command line text in UTF-8.
Then this UTF-8 copy of the command line is converted to the os.Args slice in the function commandLineToArgv that is defined in os/exec_windows.go:

// commandLineToArgv splits a command line into individual argument
// strings, following the Windows conventions documented
// at http://daviddeley.com/autohotkey/parameters/parameters.htm#WINARGV
func commandLineToArgv(cmd string) []string {
	var args []string

	for len(cmd) > 0 {
		if cmd[0] == ' ' || cmd[0] == '\t' {
			cmd = cmd[1:]
			continue
		}
		var arg []byte
		arg, cmd = readNextArg(cmd)
		args = append(args, string(arg))
	}
	return args
}

At the end there is the statement args = append(args, string(arg)).
It converts the byte slice returned by readNextArg into a string and appends it to the args slice.
The statement string(arg) creates a copy of the byte slice by calling the function runtime.slicebytetostring.

This runtime function has a strange quirk:
If the argument has a length of 1, no copy of arg is made. Instead, the returned string points to the entry of the table runtime.staticuint64s (shown as runtime_staticuint64s in the disassembly) that has the same value as the byte in arg.
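The sharing described above can be observed from user code. The following is a minimal sketch, not part of the original report: the helper name toString is hypothetical, and the result depends on the internals of the gc toolchain, so no particular output is promised here.

```go
package main

import (
	"fmt"
	"unsafe"
)

// toString is a hypothetical helper that forces a runtime
// []byte-to-string conversion. The noinline directive is meant to keep
// the compiler from optimizing the conversion away at the call site.
//
//go:noinline
func toString(b []byte) string {
	return string(b)
}

func main() {
	a := toString([]byte{'x'})
	b := toString([]byte{'x'})
	c := toString([]byte{'x', 'y'})
	d := toString([]byte{'x', 'y'})

	// If one-byte conversions point into a shared static table, the data
	// pointers of a and b are equal; longer strings get separate copies.
	fmt.Println("one-byte strings share storage:", unsafe.StringData(a) == unsafe.StringData(b))
	fmt.Println("two-byte strings share storage:", unsafe.StringData(c) == unsafe.StringData(d))
}
```

Comparing the data pointers avoids writing to the memory, so the sketch should not fault even when the storage is read-only.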

It is this constant that causes the access violation fault.
The constant table is in a memory section marked as read-only.
It cannot be written to. When clear receives the address of this constant, it tries to overwrite the read-only content, which causes the access violation fault.

Contrast this with the way the command line is handled by the Go runtime on Linux:

The file runtime/runtime1.go contains the following function:

func goargs() {
   if GOOS == "windows" {
      return
   }
   argslice = make([]string, argc)
   for i := int32(0); i < argc; i++ {
      argslice[i] = gostringnocopy(argv_index(argv, i))
   }
}

As one can see the arguments are converted to Go strings with the gostringnocopy function.
It does not copy the strings, but rather builds Go string structures that point to the bytes.

So why are the arguments copied twice on Windows?
The first copy, made when converting from UTF-16 to UTF-8, is necessary.
The second copy, made when converting the byte slices to strings, is not.

TL;DR:

The proposal is:
Change the function that converts the command line to os.Args on Windows to use gostringnocopy(arg) or unsafe.String(&arg[0], len(arg)), whichever is appropriate.

This saves unnecessary copy operations and brings the behavior of os.Args under Windows in line with the behavior on other platforms.

Also, I expected the function that processes the command line to be a part of the runtime package, not os.

As a side note, I would really like to know the rationale behind the special handling of one-byte slices in runtime.slicebytetostring.

Labels: NeedsInvestigation, Proposal, compiler/runtime