Description
This is similar to #4, but it's more related to child process output content rather than output/buffer sizes.
Child processes that output a lot of data, or run long enough, but don't emit line breaks will eventually cause buffer overflows and therefore broken pipe errors.
This is easily reproducible:
package main
import (
"fmt"
"log"
"math/rand/v2"
"os"
"time"
"github.com/go-cmd/cmd"
)
func main() {
if len(os.Args) > 1 {
server()
} else {
client()
}
}
func server() {
ttl := time.Second * 10
timeout := time.After(ttl)
fmt.Printf("[server] start (running for %s)\n", ttl)
for {
n := time.Duration(rand.Int64N(10))
step := time.After(time.Millisecond * n)
select {
case <-step:
fmt.Printf("\033[2K\r[server] %d", n)
case <-timeout:
fmt.Println("\033[2K\r[server] exit")
return
}
}
}
func client() {
selfExe, err := os.Executable()
if err != nil {
panic(err)
}
svCmd := []string{selfExe, "server"}
log.Printf("[client] launching server: %q", svCmd)
cmd := cmd.NewCmdOptions(cmd.Options{Streaming: true}, selfExe, "server")
go func() {
for output := range cmd.Stdout {
fmt.Fprintln(os.Stdout, output)
}
}()
go func() {
for output := range cmd.Stderr {
fmt.Fprintln(os.Stderr, output)
}
}()
log.Println("[client] wait server")
status := <-cmd.Start()
if status.Error != nil {
log.Fatal(status.Error)
}
log.Printf("[client] server status: %v", status)
log.Println("[client] exit")
}
You can notice that in each server loop iteration, it doesn't emit a line break to the output, it uses a ANSI control sequence (\033[2K
) to clear the line and return the cursor to the start of the line (\r
).
go-cmd
will keep accumulating output, because it doesn't see a \n
and will never send a message to the output channel. This causes two issues.
The overflow issue
Eventually go-cmd
's internal buffer overflows:
> go run streaming.go
2024/05/31 14:17:18 [client] launching server: ["/tmp/go-build653253379/b001/exe/streaming" "server"]
2024/05/31 14:17:18 [client] wait server
[server] start (running for 10s)
2024/05/31 14:17:24 signal: broken pipe
exit status 1
I believe there are no buffer size adjustments that could solve this.
The issue with buffering based on assumptions about the data format
go-cmd
assumes the output will be line oriented and sends channel messages for each line.
I imagine this is done for convenience, but it breaks streaming flows that could be important, like showing progress status (e.g.: rclone
, aria2c
, wget
, curl
, etc).
Even for simpler cases (ANSI control sequences are not used) or smaller cases (all of the output would fit in the buffer), for example when a process prints progress on the same line (like a series of dots or percentages), go-cmd
's implementation breaks those flows (e.g.: some mkfs
implementations do this if I remember correctly).
Another broken use case would be processes that emit very long lines. It could be the case for minifiers/uglyfiers/obfuscators, or language servers (LSPs) that are known to emit long JSON lines to stdio, for example. go-cmd
would have to keep increasing buffer size to accomodate for this.
Proposed solution
I believe the solution to both issues is to just do byte oriented streaming with buffering: don't make assumptions about data format and emit chunks downstream whenever the buffer is filled. The client is responsible to reassemble the data if needed.