[streaming] Line breaks and buffer overflows (broken pipes)

This is similar to #4, but it's more related to child process output content rather than output/buffer sizes.

Child processes that output a lot of data, or run long enough, but don't emit line breaks will eventually cause buffer overflows and therefore broken pipe errors.

This is easily reproducible:

```go
package main

import (
	"fmt"
	"log"
	"math/rand/v2"
	"os"
	"time"

	"github.com/go-cmd/cmd"
)

func main() {
	if len(os.Args) > 1 {
		server()
	} else {
		client()
	}
}

func server() {
	ttl := time.Second * 10
	timeout := time.After(ttl)

	fmt.Printf("[server] start (running for %s)\n", ttl)

	for {
		n := time.Duration(rand.Int64N(10))
		step := time.After(time.Millisecond * n)

		select {
		case <-step:
			fmt.Printf("\033[2K\r[server] %d", n)
		case <-timeout:
			fmt.Println("\033[2K\r[server] exit")
			return
		}
	}

}

func client() {
	selfExe, err := os.Executable()

	if err != nil {
		panic(err)
	}

	svCmd := []string{selfExe, "server"}

	log.Printf("[client] launching server: %q", svCmd)

	cmd := cmd.NewCmdOptions(cmd.Options{Streaming: true}, selfExe, "server")

	go func() {
		for output := range cmd.Stdout {
			fmt.Fprintln(os.Stdout, output)
		}
	}()

	go func() {
		for output := range cmd.Stderr {
			fmt.Fprintln(os.Stderr, output)
		}
	}()

	log.Println("[client] wait server")

	status := <-cmd.Start()

	if status.Error != nil {
		log.Fatal(status.Error)
	}

	log.Printf("[client] server status: %v", status)
	log.Println("[client] exit")
}
```

You can notice that in each server loop iteration, it doesn't emit a line break to the output, it uses a ANSI control sequence (`\033[2K`) to clear the line and return the cursor to the start of the line (`\r`).

`go-cmd` will keep accumulating output, because it doesn't see a `\n` and will never send a message to the output channel. This causes two issues.

## The overflow issue

Eventually `go-cmd`'s internal buffer overflows:

```console
> go run streaming.go
2024/05/31 14:17:18 [client] launching server: ["/tmp/go-build653253379/b001/exe/streaming" "server"]
2024/05/31 14:17:18 [client] wait server
[server] start (running for 10s)
2024/05/31 14:17:24 signal: broken pipe
exit status 1
```

I believe there are no buffer size adjustments that could solve this.

## The issue with buffering based on assumptions about the data format

`go-cmd` assumes the output will be line oriented and sends channel messages for each line.

I imagine this is done for convenience, but it breaks streaming flows that could be important, like showing progress status (e.g.: `rclone`, `aria2c`, `wget`, `curl`, etc).

Even for simpler cases (ANSI control sequences are not used) or smaller cases (all of the output would fit in the buffer), for example when a process prints progress on the same line (like a series of dots or percentages), `go-cmd`'s implementation breaks those flows (e.g.: some `mkfs` implementations do this if I remember correctly).

Another broken use case would be processes that emit very long lines. It could be the case for minifiers/uglyfiers/obfuscators, or language servers (LSPs) that are known to emit long JSON lines to stdio, for example. `go-cmd` would have to keep increasing buffer size to accomodate for this.

## Proposed solution

I believe the solution to both issues is to just do byte oriented streaming with buffering: don't make assumptions about data format and emit chunks downstream whenever the buffer is filled. The client is responsible to reassemble the data if needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[streaming] Line breaks and buffer overflows (broken pipes) #104

The overflow issue

The issue with buffering based on assumptions about the data format

Proposed solution

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[streaming] Line breaks and buffer overflows (broken pipes) #104

Description

The overflow issue

The issue with buffering based on assumptions about the data format

Proposed solution

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions