Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filepath: filepath.Walk readDirName will cause large memory leak #36197

Open
wcc526 opened this issue Dec 18, 2019 · 9 comments · May be fixed by #36198
Open

filepath: filepath.Walk readDirName will cause large memory leak #36197

wcc526 opened this issue Dec 18, 2019 · 9 comments · May be fixed by #36198
Milestone

Comments

@wcc526
Copy link

@wcc526 wcc526 commented Dec 18, 2019

https://github.com/golang/go/blob/master/src/path/filepath/path.go#L363

  // walk recursively descends path, calling walkFn.
  func walk(path string, info os.FileInfo, walkFn WalkFunc) error {
	if !info.IsDir() {
		return walkFn(path, info, nil)
	}
	names, err := readDirNames(path)
	err1 := walkFn(path, info, err)
	// If err != nil, walk can't walk into this directory.
	// err1 != nil means walkFn want walk to skip this directory or stop walking.
	// Therefore, if one of err and err1 isn't nil, walk will return.
	if err != nil || err1 != nil {
		// The caller's behavior is controlled by the return value, which is decided
		// by walkFn. walkFn may ignore err and return nil.
		// If walkFn returns SkipDir, it will be handled by the caller.
		// So walk should return whatever walkFn returns.
		return err1
	}

names, err := readDirNames(path)
the readDirNames will lead large memory if directory has large number of files

which conflict skipDir purpose.

@wcc526
Copy link
Author

@wcc526 wcc526 commented Dec 19, 2019

package main

import (
	"context"
	"path/filepath"
	"time"

	"os"
	"strings"

	log "github.com/sirupsen/logrus"

	"sort"
)

var lstat = os.Lstat // for testing
func walk(path string, info os.FileInfo, walkFn filepath.WalkFunc) error {
	if !info.IsDir() {
		return walkFn(path, info, nil)
	}
	err1 := walkFn(path, info, nil)
	if err1 != nil {
		return err1
	}
	names, err := readDirNames(path)
	log.Info("names",names)
	if err != nil {
		return err
	}
	for _, name := range names {
		filename := filepath.Join(path, name)
		fileInfo, err := lstat(filename)
		if err != nil {
			if err := walkFn(filename, fileInfo, err); err != nil && err != filepath.SkipDir {
				return err
			}
		} else {
			err = walk(filename, fileInfo, walkFn)
			if err != nil {
				if !fileInfo.IsDir() || err != filepath.SkipDir {
					return err
				}
			}
		}
	}
	return nil
}
func CustomWalk(root string, walkFn filepath.WalkFunc) error {
	info, err := os.Lstat(root)
	if err != nil {
		err = walkFn(root, nil, err)
	} else {
		err = walk(root, info, walkFn)
	}
	if err == filepath.SkipDir {
		return nil
	}
	return err
}
func readDirNames(dirname string) ([]string, error) {
	f, err := os.Open(dirname)
	if err != nil {
		return nil, err
	}
	names, err := f.Readdirnames(-1)
	f.Close()
	if err != nil {
		return nil, err
	}
	sort.Strings(names)
	return names, nil
}

func walkLarge(rootpath string) {
	ctx, cancel := context.WithTimeout(context.Background(), 7200*time.Second)
	defer cancel()
	err := CustomWalk(rootpath, func(path string, info os.FileInfo, err error) error {
		time.Sleep(1 * time.Second)
		log.Info(path)
		if err != nil {
			return nil
		}
		select {
		case <-ctx.Done():
			log.Warn("Timeout", rootpath)
			return ctx.Err()
		default:
		}

		rel, err := filepath.Rel(rootpath, path)
		if err != nil {
			return err
		}
		if uint(len(strings.Split(rel, "/"))) > 4 {
			return filepath.SkipDir
		}

		if info.IsDir() {
			if info.Name() == ".git" {
				return filepath.SkipDir
			}
			f, err := os.Open(path)
			defer f.Close()
			if err != nil {
				return filepath.SkipDir
			}
			files, err := f.Readdir(1000)
			if err != nil {
				return filepath.SkipDir
			}
			if len(files) > 999 {
				log.Warn("Skip dir ", path)
				return filepath.SkipDir
			}

		} else {
			if !info.Mode().IsRegular() {
				return nil
			}
		}
		return nil
	})
	if err != nil {
		log.Errorf("walk error [%v]\n", err)
	}
}

func main() {
	walkLarge("/")
}

@wcc526
Copy link
Author

@wcc526 wcc526 commented Dec 19, 2019

        err1 := walkFn(path, info, nil)
	if err1 != nil {
		return err1
	}
	names, err := readDirNames(path)
	log.Info("names",names)
	if err != nil {
		return err
	}

walkFn should before readDirNames

@dmitshur
Copy link
Member

@dmitshur dmitshur commented Dec 20, 2019

Can you please include information about what version of Go you tested with (go version), whether it's an issue with the latest Go 1.13.5 version, and your environment (go env)?

When you say "large memory leak", what kind of numbers were you seeing?

That will help with reproducing this and being able to make progress. Thanks.

@dmitshur dmitshur added this to the Backlog milestone Dec 20, 2019
@wcc526
Copy link
Author

@wcc526 wcc526 commented Dec 23, 2019

go version is 1.13.5

[root@el6 ~]# go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/usr/local/:/builds/lucci/lucci-agent/:/builds/lucci/"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build288278525=/tmp/go-build"

I have pull request #36198 Please review and merge it!

When the directory has 8151134 files, it will cause 820MB meomory leak.

the root cause is "names, err := readDirNames(path)" which read the large files of directory.

Just swap the orders, User can SkipDir for this large directorys

thank you for your attention to this matter @dmitshur

@agnivade
Copy link
Contributor

@agnivade agnivade commented Dec 23, 2019

There is already a PR for this: https://go-review.googlesource.com/c/go/+/211802.

@wcc526 - I was not aware of this issue. Could you please link this issue in your PR ? In the PR message, just add a new line "Fixes #3697"

@gopherbot
Copy link

@gopherbot gopherbot commented Dec 25, 2019

Change https://golang.org/cl/211802 mentions this issue: path/filepath: fix readDirNames to not unconditionally read a directory

@anmathew
Copy link

@anmathew anmathew commented Dec 26, 2019

does not seem these changes help much. walking a folder with 21k files still hogs about 82M RES size - after the walk.

@wcc526
Copy link
Author

@wcc526 wcc526 commented Dec 26, 2019

does not seem these changes help much. walking a folder with 21k files still hogs about 82M RES size - after the walk.

                         files, err := f.Readdir(1000)
			if err != nil {
				return filepath.SkipDir
			}
			if len(files) > 999 {
				log.Warn("Skip dir ", path)
				return filepath.SkipDir
			}

You should add filepath.SkipDir to skip this large files.

these changes help to user to right control the SkipDir.

Before change even if user skip large files, walk still readDirNames(path) whick hogs large RES size

@dajohi
Copy link

@dajohi dajohi commented Apr 9, 2020

Is this just waiting on a regress test?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

6 participants
You can’t perform that action at this time.