strings: optimize WriteTo to use an intermediate buffer for large strings #13848

Open
@dsnet


Using go1.5

NOTE: This issue was originally about using an intermediate buffer in io.WriteString; the optimization will instead be performed in strings.Reader.WriteTo. The description below still refers to WriteString, but the performance numbers will probably be the same once the change is applied to strings.Reader.WriteTo.

Currently, io.WriteString calls w.Write([]byte(s)) if w does not implement stringWriter. The []byte conversion allocates memory proportional to len(s). Instead, we should copy large strings through a fixed-size intermediate buffer.

Using this test code:

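// Save as a _test.go file; the package name is arbitrary.
package writestring

import (
    "io"
    "io/ioutil"
    "strings"
    "testing"
)

var (
    large      = strings.Repeat("the quick brown fox jumped over the lazy dog", 1024*1024)
    writerOnly = struct{ io.Writer }{ioutil.Discard} // hides any WriteString method
)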

// WriteString2 is modified from io.WriteString to use an
// intermediate buffer for large strings.
func WriteString2(w io.Writer, s string) (n int, err error) {
    const chunkSize = 32 * 1024
    if sw, ok := w.(interface {
        WriteString(s string) (n int, err error)
    }); ok {
        return sw.WriteString(s)
    }
    if len(s) < chunkSize {
        return w.Write([]byte(s))
    }

    buf := make([]byte, chunkSize)
    for len(s) > 0 {
        cnt := copy(buf, s)
        s = s[cnt:]
        cnt, err = w.Write(buf[:cnt])
        n += cnt
        if err != nil {
            break
        }
    }
    return n, err
}

func BenchmarkWriteString(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        if _, err := io.WriteString(writerOnly, large); err != nil {
            b.Error(err)
        }
    }
}

func BenchmarkWriteString2(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        if _, err := WriteString2(writerOnly, large); err != nil {
            b.Error(err)
        }
    }
}

With the intermediate buffer, the number of bytes allocated per call is capped at roughly the chunk size instead of growing with len(s):

BenchmarkWriteString-4       200       6482427 ns/op    46137405 B/op          2 allocs/op
BenchmarkWriteString2-4      500       2437841 ns/op       32784 B/op          2 allocs/op
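
A quick sanity check on these figures, as a sketch: the test phrase is 44 bytes long, so `large` holds 44 × 1024 × 1024 = 46137344 bytes. That accounts for nearly all of the 46137405 B/op in the first row, while the 32 KiB buffer accounts for 32768 of the 32784 B/op in the second. The few leftover bytes presumably belong to the second, small allocation reported in each row, though the benchmark output alone does not confirm that. The helper `largeLen` below is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// phrase is the string repeated in the test code above.
const phrase = "the quick brown fox jumped over the lazy dog"

// chunkSize is the buffer size used by WriteString2.
const chunkSize = 32 * 1024

// largeLen returns len(large) without building the string.
// Hypothetical helper, for illustration only.
func largeLen() int {
	return len(phrase) * 1024 * 1024
}

func main() {
	fmt.Println("len(phrase):", len(phrase))
	fmt.Println("len(large): ", largeLen())
	fmt.Println("chunkSize:  ", chunkSize)
}
```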

I was pleasantly surprised that the runtime also decreased, but this may be because the small buffer fits entirely in the cache.
