runtime/map: Memory pre-allocation high for maps with large number of keys #40163

Closed
prashanthpai opened this issue Jul 11, 2020 · 1 comment

@prashanthpai prashanthpai commented Jul 11, 2020

What version of Go are you using (go version)?

$ gotip version
go version devel +3a43226 Fri Jul 10 11:32:36 2020 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ gotip env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/prashanthpai/Library/Caches/go-build"
GOENV="/Users/prashanthpai/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/prashanthpai/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/prashanthpai/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/prashanthpai/sdk/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/prashanthpai/sdk/gotip/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/sn/l1sbjm4d76d9r8g3nmfs6l9h0000gn/T/go-build313555691=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Created a Go program that loads a CSV file and builds a Go map from the values in it. Simplified loading code looks like this:

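// loadCSV takes an *os.File and the exact number of records (lines) in the CSV file.
func loadCSV(r io.Reader, numRowsHint uint32) (map[uint64]uint32, error) {
	m := make(map[uint64]uint32, numRowsHint)
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 2 // assumed layout for illustration: key,value
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			return m, nil
		} else if err != nil {
			return nil, err
		}
		key, kerr := strconv.ParseUint(rec[0], 10, 64)
		value, verr := strconv.ParseUint(rec[1], 10, 32)
		if kerr != nil || verr != nil {
			return nil, fmt.Errorf("bad record %q", rec)
		}
		m[key] = uint32(value)
	}
}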

CSV files:

$ ls -lh ./*.csv
-rw-r--r--  1 prashanthpai  staff   189M Jul 11 13:42 ./new.csv
-rw-r--r--  1 prashanthpai  staff   185M Jul 11 13:42 ./old.csv
$ wc -l ./*.csv
 3414775 ./new.csv
 3404693 ./old.csv

The new.csv file has only about 10,000 more rows than old.csv, but the loaded Go map takes around 50 MB more memory! In my specific case, the sudden increase in memory occurs at this point:

sizeHint  memAlloc (bytes)
3407872   72035288
3407873   124913528

General reproducer:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"runtime"
)

func memAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Alloc
}

func main() {
	f, err := os.Create("plot.txt")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	for i := 1; i <= 5000000; i += 50000 {
		runtime.GC()
		m := do(uint64(i))
		runtime.GC()

		fmt.Fprintf(f, "%d %d\n", i, memAlloc())
		fmt.Fprint(ioutil.Discard, len(m)) // keep a reference to m around
	}
}

func do(sizeHint uint64) map[uint64]uint32 {
	m := make(map[uint64]uint32, sizeHint)

	var i uint64
	for i < sizeHint {
		m[i] = uint32(0)
		i++
	}

	return m
}

Memory usage plot

(Image: Screenshot 2020-07-11 at 5 40 57 PM; attachment: plot.txt)

What did you expect to see?

A less dramatic increase in memory. I apologize if my expectation is unreasonable.

What did you see instead?

For maps with a large number of keys, memory is over-allocated by a large margin, even when a size hint is given and the map is only ever loaded with a number of keys <= the size hint.

@prashanthpai prashanthpai changed the title Memory pre-allocation high for maps with large number of keys runtime/map: Memory pre-allocation high for maps with large number of keys Jul 11, 2020
Contributor

@randall77 randall77 commented Jul 11, 2020

This is expected. Maps grow by a factor of 2 under the covers, to amortize the cost of growing.
When presized, we have to round to a power of 2.

(BTW, you should notice this behavior with or without presizing.)

I'm going to close this issue, as I don't think there's much we can do without a fundamental redesign of maps.

@randall77 randall77 closed this Jul 11, 2020