
Poor performance on IO-bound task: 10x slower than runc #451

jirfag opened this issue Jun 23, 2019 · 13 comments

@jirfag commented Jun 23, 2019


I've run everything on a new GCE n1-standard-8 machine.

System info
$ gvisor -v
flag provided but not defined: -v
Usage: gvisor <flags> <subcommand> <subcommand args>

	checkpoint       checkpoint current state of container (experimental)
	create           create a secure container
	delete           delete resources held by a container
	do               Simplistic way to execute a command inside the sandbox. It's to be used for testing only.
	events           display container events such as OOM notifications, cpu, memory, and IO usage statistics
	exec             execute new process inside the container
	flags            describe all known top-level flags
	gofer            launch a gofer process that serves files over 9P protocol (internal use only)
	help             Print help documentation.
	kill             sends a signal to the container
	list             list containers started by runsc with the given root
	pause            pause suspends all processes in a container
	ps               ps displays the processes running inside a container
	restore          restore a saved state of container (experimental)
	resume           Resume unpauses a paused container
	run              create and run a secure container
	spec             create a new OCI bundle specification file
	start            start a secure container
	state            get the state of a container
	wait             wait on a process inside a container

Subcommands for internal use only:
	boot             launch a sandbox process (internal use only)
	debug            shows a variety of debug information
	gofer            launch a gofer process that serves files over 9P protocol (internal use only)

Use "gvisor flags" for a list of top-level flags

I downloaded it via wget; I also tried a few earlier builds, and the result was the same.

$ docker version
 Version:           18.09.6
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        481bc77
 Built:             Sat May  4 02:36:00 2019
 OS/Arch:           linux/amd64
 Experimental:      false
$ docker info
Containers: 6
 Running: 0
 Paused: 0
 Stopped: 6
Images: 2
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: gvisor_kvm gvisor_prof runc gvisor_debug
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
  Profile: default
Kernel Version: 4.9.0-9-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 29.46GiB
Name: example-nested-vm
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Experimental: false
Insecure Registries:
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support
$ uname -a
Linux example-nested-vm 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u3 (2019-06-16) x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 9.9 (stretch)
Release:	9.9
Codename:	stretch

The issue

First of all, thank you for this awesome technology!
I've tried to use it in a SaaS service where users typically run builds of their Go programs in containers, but the slowdown with gVisor is 4x–10x.

How to reproduce the 10x slowdown

To reproduce the 10x slowdown, execute the following:

git clone
cd golangci-api
GO111MODULE=on go mod vendor

It takes about 20 seconds on a GCE n1-standard-8 machine without gVisor and 240 seconds with gVisor.

How to reproduce the 4x slowdown

I've tried to reproduce the 10x slowdown with a simple example, but could only reproduce a 4x slowdown. Maybe it will help.

$ cat ~/go.mod
module m

require v2.2.1
$ sudo docker run --runtime=gvisor -w /goapp -v ~/go.mod:/goapp/go.mod -it -e GO111MODULE=on golang:1.12.6 bash -c "time go mod download"
go: finding v2.2.1
go: finding v0.0.0-20161208181325-20d25e280405

real	0m9.637s
user	0m4.190s
sys	0m2.470s
$ sudo docker run -w /goapp -v ~/go.mod:/goapp/go.mod -it -e GO111MODULE=on golang:1.12.6 bash -c "time go mod download"
go: finding v2.2.1
go: finding v0.0.0-20161208181325-20d25e280405

real	0m2.579s
user	0m0.564s
sys	0m0.204s

Debug logs

I've attached debug logs from one of the runs. I've also tried to extract statistics from them; maybe it will help.

Top system call times total
futex: 2m15.733738002s
wait4: 12.47037273s
epoll_pwait: 11.112928873s
read: 10.575159935s
waitid: 9.158005727s
nanosleep: 1.978170689s
select: 1.555938321s
openat: 315.219887ms
write: 263.92663ms
renameat: 244.598318ms
open: 229.862991ms
getdents: 164.218294ms
newfstatat: 92.353902ms
rename: 84.018281ms
getdents64: 76.73747ms
fsync: 65.010792ms
unlink: 37.377295ms
execve: 35.727299ms
clone: 34.550085ms
brk: 13.27073ms
connect: 7.148152ms
mkdirat: 4.435736ms
access: 3.060792ms
sched_yield: 1.615556ms
stat: 1.320963ms
lstat: 1.243392ms
sysinfo: 1.213435ms
close: 1.068737ms
mmap: 1.014153ms
mkdir: 1.007307ms
fcntl: 68.749µs
getegid: 4.882µs
getgid: 4.754µs
geteuid: 4.635µs
getuid: 3.037µs
ioctl: 974ns
rt_sigprocmask: 952ns
arch_prctl: 926ns
exit: 743ns
Top syscall-by-file times total
/heads/master: 2.12630434s
/heads/v1: 1.658275023s
./objects/pack/pack-00a19d4f357b1612edca644fc1dfc9a60cd06f73.keep: 1.408336718s
/goapp: 656.607843ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/objects/pack: 57.530264ms
/go/pkg/mod/cache/download/ 38.358534ms
/etc/ssl/certs: 35.311047ms
/go/pkg/mod/cache/download/ 33.356223ms
/u: 33.002017ms
/usr/lib/x86_64-linux-gnu/ 32.702934ms
/lib/x86_64-linux-gnu/ 25.771194ms
/lib/x86_64-linux-gnu/ 25.327247ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/config.lock: 22.004032ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/config.lock: 21.309737ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/objects/pack: 18.532718ms
/q: 17.319877ms
/usr/bin/git: 14.218023ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs: 12.839608ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/HEAD.lock: 12.671706ms
/go/pkg/mod/ 11.406743ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/objects/pack/tmp_pack_v7xS1g: 10.409262ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/HEAD.lock: 10.073025ms
/go/pkg/mod/ 9.792402ms
/usr/lib/git-core/git: 9.499358ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/refs: 7.906987ms
/usr/lib/git-core/git-remote-https: 7.841109ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/heads: 7.749685ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/remotes/origin/master.lock: 7.385629ms
/var/run/nscd/socket: 7.148152ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/shallow.lock: 6.990799ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/heads/v1.lock: 6.818456ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/remotes/origin/v1.lock: 6.732988ms
./objects/pack: 6.627003ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/heads/master.lock: 6.495823ms
/go/pkg/mod/ 6.236151ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/refs/tags: 5.945923ms
/usr/share/git-core/templates: 5.563759ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/remotes: 5.508308ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/refs/tags/v2.2.1.lock: 5.284988ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/remotes/origin: 4.854879ms
/goapp/go.sum065097093.tmp: 4.703922ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/objects/pack/tmp_idx_ifdK6D: 4.695062ms
/usr/share/git-core/templates/hooks: 4.675798ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/objects/54: 4.524487ms
./objects/4c/45e660a8f2e1020907cd4f05a46cfa3c421e97: 4.279272ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/tpJJmG7: 3.930056ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/objects/17: 3.924113ms
/usr/share/git-core/templates/branches: 3.900993ms
/usr/share/git-core/templates/info: 3.779083ms
/etc/ssl/certs/DigiCert_Assured_ID_Root_G2.pem: 3.703102ms
/etc/localtime: 3.603089ms
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9/refs/tags: 3.478505ms
./objects/pack/tmp_pack_v7xS1g: 3.246566ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/refs/heads: 3.061064ms
./objects/54/tmp_obj_IxSdxD: 2.983034ms
./objects/7c/tmp_obj_uysOXN: 2.882551ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/tYeSycV: 2.87449ms
./objects/92/tmp_obj_2HMuZI: 2.862281ms
./objects/8d/tmp_obj_yQgM3C: 2.837885ms
./objects/17/tmp_obj_sNMg7K: 2.698299ms
./objects/pack/tmp_idx_ifdK6D: 2.693626ms
./objects/b5/tmp_obj_GeDwWE: 2.670285ms
./objects/a1/tmp_obj_EJV34G: 2.577941ms
/bin/date: 2.569766ms
./objects/81/tmp_obj_QrjW5J: 2.567559ms
./objects/6c/tmp_obj_aGzTuN: 2.563423ms
./objects/f0/tmp_obj_2FpZuK: 2.539998ms
./objects/07/tmp_obj_k3GsNL: 2.452942ms
./objects/1f/tmp_obj_ox5DwF: 2.424578ms
./objects/a1/tmp_obj_61fVzH: 2.408735ms
./config: 2.396375ms
/etc/ssl/certs/ccc52f49.0: 2.362462ms
./objects/a2/tmp_obj_OehqkQ: 2.347485ms
./objects/e4/tmp_obj_YLpLsI: 2.319997ms
./objects/e2/tmp_obj_yWJBlO: 2.314943ms
./objects/8d/tmp_obj_aaVa0F: 2.296146ms
./objects/fa/tmp_obj_qUG9bE: 2.274839ms
./objects/c5/tmp_obj_0stuqP: 2.249093ms
/usr/lib/x86_64-linux-gnu/ 2.235308ms
/lib/x86_64-linux-gnu/ 2.130039ms
/etc/ssl/certs/c2c1704e.0: 2.043096ms
/usr/lib/x86_64-linux-gnu/ 1.997861ms
./objects/c3/53b3e80edf714c187e6c23e8681f0c3107faa7: 1.942185ms
./objects/e4/e56e28e0e807e84ff6828a669c7b55a3c8d06d: 1.686849ms
/usr/lib/x86_64-linux-gnu/ 1.659932ms
/usr/lib/x86_64-linux-gnu/ 1.629524ms
/usr/local/go/bin/go: 1.599043ms
/lib/x86_64-linux-gnu/ 1.546362ms
./objects/9f/tmp_obj_8YhHtC: 1.545493ms
/lib/x86_64-linux-gnu/ 1.504098ms
/usr/lib/x86_64-linux-gnu/ 1.502959ms
/lib/x86_64-linux-gnu/ 1.481554ms
/usr/lib/x86_64-linux-gnu/ 1.475695ms
/usr/lib/x86_64-linux-gnu/ 1.472054ms
/lib/x86_64-linux-gnu/ 1.461186ms
/usr/lib/x86_64-linux-gnu/ 1.411349ms
/lib/x86_64-linux-gnu/ 1.408774ms
/usr/lib/x86_64-linux-gnu/ 1.403558ms
/lib/x86_64-linux-gnu/ 1.373322ms
/usr/lib/x86_64-linux-gnu/ 1.369503ms
/lib/x86_64-linux-gnu/ 1.369477ms
/lib/x86_64-linux-gnu/ 1.356245ms
/usr/lib/x86_64-linux-gnu/ 1.34411ms
./objects/8d/a58fbf6f84a9280f8351ed3c8bfb47c9d787c2: 1.337339ms
/usr/lib/x86_64-linux-gnu/ 1.33639ms
/usr/lib/x86_64-linux-gnu/ 1.327105ms
./objects/92/69f12b8b0b1c2004ff8f702e183d3901af2f30: 1.324231ms
/heads: 1.320963ms
/lib/x86_64-linux-gnu/ 1.29998ms
/lib/x86_64-linux-gnu/ 1.290125ms
./objects/86/tmp_obj_4StGFG: 1.289778ms
/z: 1.288516ms
/usr/lib/x86_64-linux-gnu/ 1.284589ms
/usr/lib/x86_64-linux-gnu/ 1.28276ms
/lib/x86_64-linux-gnu/ 1.271286ms
/etc/ssl/certs/DigiCert_Assured_ID_Root_G3.pem: 1.255138ms
/usr/lib/x86_64-linux-gnu/ 1.253455ms
/lib/x86_64-linux-gnu/ 1.25088ms
./objects/19/tmp_obj_YlE7qJ: 1.247552ms
/usr/lib/x86_64-linux-gnu/ 1.237169ms
/usr/lib/x86_64-linux-gnu/ 1.21427ms
./objects/4c/tmp_obj_YdyinM: 1.211094ms
/go/pkg/mod/cache/vcs/4f939cf8921e8662ef2d8cf78b0db3bcacccda0d79c9e1b4cef06c640b4a8cef/hooks/pre-push.sample: 1.205618ms
/lib/x86_64-linux-gnu/ 1.182773ms
/k: 1.182638ms
/lib/x86_64-linux-gnu/ 1.162966ms
./objects/81/tmp_obj_Q7Ek4O: 1.149958ms
/usr/lib/x86_64-linux-gnu/ 1.14672ms
/lib/x86_64-linux-gnu/ 1.130787ms
./objects/e2/5cee563be827c201da827c9c3aacc4bfef7830: 1.130478ms
./objects/pack/pack-00a19d4f357b1612edca644fc1dfc9a60cd06f73.idx: 1.12719ms
./objects/81/e26d17b0d32beabd4df1df23dc556cdf4ea05b: 1.118607ms
./objects/de/tmp_obj_uecpQP: 1.117269ms
./objects/07/7fd1dd2d446ddb9c000b6e4a15ac4518452334: 1.069661ms
./objects/a2/dde608cb7a39850340eba008e796e10a730b26: 1.048925ms
/qr: 1.018391ms
./objects/1f/7e87e67275af6c2767393d53b9aab2c0b51962: 1.006736ms
/dev/null: 3.939µs
/go/pkg/mod/cache/vcs/7e5fa1eab4705eb80c9746632736cea906708d060702d529df6241d1c8c2c9f9.lock: 2.874µs
/usr/share/ca-certificates/mozilla/Go_Daddy_Class_2_CA.crt: 1.934µs
/usr/share/ca-certificates/mozilla/TrustCor_RootCert_CA-2.crt: 1.928µs
/usr/share/ca-certificates/mozilla/Taiwan_GRCA.crt: 1.898µs
/usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt: 1.873µs
/usr/share/ca-certificates/mozilla/Buypass_Class_3_Root_CA.crt: 997ns
/usr/share/ca-certificates/mozilla/ComSign_CA.crt: 996ns
/usr/share/ca-certificates/mozilla/Verisign_Class_1_Public_Primary_Certification_Authority_-_G3.crt: 996ns
/usr/share/ca-certificates/mozilla/Hellenic_Academic_and_Research_Institutions_RootCA_2015.crt: 994ns
/usr/share/ca-certificates/mozilla/Amazon_Root_CA_4.crt: 993ns
/usr/share/ca-certificates/mozilla/DigiCert_Assured_ID_Root_G2.crt: 991ns
/usr/share/ca-certificates/mozilla/USERTrust_ECC_Certification_Authority.crt: 991ns
/usr/share/ca-certificates/mozilla/Starfield_Root_Certificate_Authority_-_G2.crt: 991ns
/usr/share/ca-certificates/mozilla/QuoVadis_Root_CA_3_G3.crt: 991ns
/usr/share/ca-certificates/mozilla/Comodo_AAA_Services_root.crt: 991ns
/usr/share/ca-certificates/mozilla/DigiCert_Global_Root_G3.crt: 989ns
/usr/share/ca-certificates/mozilla/ePKI_Root_Certification_Authority.crt: 986ns
/usr/share/ca-certificates/mozilla/Hongkong_Post_Root_CA_1.crt: 984ns
/usr/share/ca-certificates/mozilla/Starfield_Class_2_CA.crt: 981ns
/usr/share/ca-certificates/mozilla/SSL.com_Root_Certification_Authority_ECC.crt: 979ns
/usr/share/ca-certificates/mozilla/GlobalSign_Root_CA_-_R3.crt: 977ns
/usr/share/ca-certificates/mozilla/VeriSign_Universal_Root_Certification_Authority.crt: 977ns
/go/pkg/mod/cache/lock: 974ns
/usr/share/ca-certificates/mozilla/AffirmTrust_Networking.crt: 969ns
/usr/share/ca-certificates/mozilla/TrustCor_ECA-1.crt: 968ns
/usr/share/ca-certificates/mozilla/OISTE_WISeKey_Global_Root_GA_CA.crt: 958ns
/usr/share/ca-certificates/mozilla/Microsec_e-Szigno_Root_CA_2009.crt: 957ns
/usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt: 956ns
/usr/share/ca-certificates/mozilla/SecureSign_RootCA11.crt: 953ns
/usr/share/ca-certificates/mozilla/QuoVadis_Root_CA.crt: 953ns
/usr/share/ca-certificates/mozilla/OpenTrust_Root_CA_G2.crt: 953ns
/usr/share/ca-certificates/mozilla/Amazon_Root_CA_1.crt: 952ns
/usr/share/ca-certificates/mozilla/Certplus_Root_CA_G1.crt: 947ns
/usr/share/ca-certificates/mozilla/EE_Certification_Centre_Root_CA.crt: 945ns
/usr/share/ca-certificates/mozilla/COMODO_ECC_Certification_Authority.crt: 945ns
/usr/share/ca-certificates/mozilla/thawte_Primary_Root_CA.crt: 943ns
/usr/share/ca-certificates/mozilla/AddTrust_External_Root.crt: 941ns
/usr/share/ca-certificates/mozilla/Certum_Trusted_Network_CA.crt: 941ns
/usr/share/ca-certificates/mozilla/DigiCert_High_Assurance_EV_Root_CA.crt: 913ns
/usr/share/ca-certificates/mozilla/DigiCert_Global_Root_G2.crt: 910ns
/usr/share/ca-certificates/mozilla/Buypass_Class_2_Root_CA.crt: 883ns
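
The per-call totals above were extracted from the debug logs. As an illustration only (the actual gVisor debug-log format differs; this sketch handles just the `name: duration` summary lines shown above), a small Go program that parses such lines and re-sorts them by descending total time:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sort"
	"strings"
	"time"
)

type entry struct {
	name string
	d    time.Duration
}

// parseTotals reads lines of the form "name: duration" (the format of the
// summaries above) and returns them sorted by descending duration.
// Lines that don't match (e.g. truncated paths) are skipped.
func parseTotals(lines []string) []entry {
	var out []entry
	for _, line := range lines {
		i := strings.LastIndex(line, ": ")
		if i < 0 {
			continue
		}
		d, err := time.ParseDuration(strings.TrimSpace(line[i+2:]))
		if err != nil {
			continue
		}
		out = append(out, entry{name: line[:i], d: d})
	}
	sort.Slice(out, func(a, b int) bool { return out[a].d > out[b].d })
	return out
}

func main() {
	// Read summary lines from stdin and print them sorted.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	for _, e := range parseTotals(lines) {
		fmt.Printf("%s: %v\n", e.name, e.d)
	}
}
```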


top 50 -cum
(pprof) top 50 -cum
Showing nodes accounting for 2.67s, 65.28% of 4.09s total
Dropped 349 nodes (cum <= 0.02s)
Showing top 50 nodes out of 197
      flat  flat%   sum%        cum   cum%
         0     0%     0%      3.64s 89.00%*Task).run
     0.05s  1.22%  1.22%      3.59s 87.78%*runApp).execute
     0.03s  0.73%  1.96%      1.78s 43.52%*context).Switch
         0     0%  1.96%      1.75s 42.79%*subprocess).switchToApp
         0     0%  1.96%      1.31s 32.03%*Task).doSyscall
     0.01s  0.24%  2.20%      1.30s 31.78%*Task).doSyscallEnter
         0     0%  2.20%      1.29s 31.54%*Task).doSyscallInvoke
     0.01s  0.24%  2.44%      1.28s 31.30%*Task).executeSyscall
        1s 24.45% 26.89%         1s 24.45%  syscall.RawSyscall
     0.03s  0.73% 27.63%      0.69s 16.87%*thread).wait
     0.68s 16.63% 44.25%      0.69s 16.87%  syscall.Syscall6
         0     0% 44.25%      0.66s 16.14%  syscall.Wait4
         0     0% 44.25%      0.66s 16.14%  syscall.wait4
         0     0% 44.25%      0.62s 15.16%
         0     0% 44.25%      0.54s 13.20%*Client).sendRecv
     0.48s 11.74% 55.99%      0.48s 11.74%  syscall.RawSyscall6
         0     0% 55.99%      0.44s 10.76%
         0     0% 55.99%      0.41s 10.02%*MountNamespace).FindInode
         0     0% 55.99%      0.41s 10.02%*MemoryManager).HandleUserFault
         0     0% 55.99%      0.36s  8.80%
         0     0% 55.99%      0.34s  8.31%*MountNamespace).resolve
         0     0% 55.99%      0.34s  8.31%*subprocess).syscall
         0     0% 55.99%      0.34s  8.31%*thread).syscall
         0     0% 55.99%      0.34s  8.31%*thread).syscallIgnoreInterrupt
     0.32s  7.82% 63.81%      0.32s  7.82%  runtime.futex
         0     0% 63.81%      0.28s  6.85%*clientFile).Readlink
         0     0% 63.81%      0.28s  6.85%*Inode).Readlink
         0     0% 63.81%      0.28s  6.85%
         0     0% 63.81%      0.28s  6.85%*contextFile).readlink
         0     0% 63.81%      0.28s  6.85%*inodeOperations).Readlink
         0     0% 63.81%      0.27s  6.60%*Client).waitAndRecv
         0     0% 63.81%      0.26s  6.36%*Client).handleOne
     0.02s  0.49% 64.30%      0.25s  6.11%
     0.01s  0.24% 64.55%      0.25s  6.11%*thread).bind
         0     0% 64.55%      0.24s  5.87%*MemoryManager).mapASLocked
         0     0% 64.55%      0.24s  5.87%*subprocess).MapFile
         0     0% 64.55%      0.24s  5.87%  runtime.futexwakeup
         0     0% 64.55%      0.24s  5.87%  runtime.notewakeup
         0     0% 64.55%      0.22s  5.38%
         0     0% 64.55%      0.21s  5.13%*SocketWriter).WriteVec
         0     0% 64.55%      0.21s  5.13%  runtime.schedule
         0     0% 64.55%      0.20s  4.89%*MemoryManager).getPMAsInternalLocked
         0     0% 64.55%      0.20s  4.89%*MemoryManager).getPMAsLocked
         0     0% 64.55%      0.20s  4.89%  runtime.mcall
     0.01s  0.24% 64.79%      0.20s  4.89%  runtime.startm
     0.01s  0.24% 65.04%      0.19s  4.65%*MountNamespace).FindLink
     0.01s  0.24% 65.28%      0.19s  4.65%  runtime.mallocgc
         0     0% 65.28%      0.19s  4.65%  runtime.systemstack
         0     0% 65.28%      0.17s  4.16%*SocketReader).ReadVec
         0     0% 65.28%      0.16s  3.91%  runtime.wakep
top 50
(pprof) top 50
Showing nodes accounting for 3.38s, 82.64% of 4.09s total
Dropped 349 nodes (cum <= 0.02s)
Showing top 50 nodes out of 197
      flat  flat%   sum%        cum   cum%
        1s 24.45% 24.45%         1s 24.45%  syscall.RawSyscall
     0.68s 16.63% 41.08%      0.69s 16.87%  syscall.Syscall6
     0.48s 11.74% 52.81%      0.48s 11.74%  syscall.RawSyscall6
     0.32s  7.82% 60.64%      0.32s  7.82%  runtime.futex
     0.10s  2.44% 63.08%      0.10s  2.44%
     0.09s  2.20% 65.28%      0.09s  2.20%  runtime.memmove
     0.06s  1.47% 66.75%      0.06s  1.47%  runtime.memclrNoHeapPointers
     0.05s  1.22% 67.97%      3.59s 87.78%*runApp).execute
     0.05s  1.22% 69.19%      0.08s  1.96%  runtime.heapBitsSetType
     0.05s  1.22% 70.42%      0.05s  1.22%  runtime.usleep
     0.05s  1.22% 71.64%      0.06s  1.47%  syscall.Syscall
     0.04s  0.98% 72.62%      0.04s  0.98%  runtime.mapaccess1_fast64
     0.04s  0.98% 73.59%      0.04s  0.98%  sync.(*Pool).Get
     0.03s  0.73% 74.33%      1.78s 43.52%*context).Switch
     0.03s  0.73% 75.06%      0.69s 16.87%*thread).wait
     0.03s  0.73% 75.79%      0.03s  0.73%  runtime.lock
     0.03s  0.73% 76.53%      0.03s  0.73%  runtime.mapaccess2_fast32
     0.02s  0.49% 77.02%      0.25s  6.11%
     0.02s  0.49% 77.51%      0.03s  0.73%*MemoryManager).incPrivateRef
     0.01s  0.24% 77.75%      0.14s  3.42%*FeatureSet).ExtendedStateSize
     0.01s  0.24% 78.00%      0.06s  1.47%*File).Readv
     0.01s  0.24% 78.24%      0.19s  4.65%*MountNamespace).FindLink
     0.01s  0.24% 78.48%      0.03s  0.73%
     0.01s  0.24% 78.73%      0.03s  0.73%
     0.01s  0.24% 78.97%      1.30s 31.78%*Task).doSyscallEnter
     0.01s  0.24% 79.22%      1.28s 31.30%*Task).executeSyscall
     0.01s  0.24% 79.46%      0.05s  1.22%*Timer).init
     0.01s  0.24% 79.71%      0.06s  1.47%*MemoryManager).withVecInternalMappings
     0.01s  0.24% 79.95%      0.25s  6.11%*thread).bind
     0.01s  0.24% 80.20%      0.10s  2.44%
     0.01s  0.24% 80.44%      0.10s  2.44%
     0.01s  0.24% 80.68%      0.04s  0.98%
     0.01s  0.24% 80.93%      0.12s  2.93%  runtime.chansend
     0.01s  0.24% 81.17%      0.14s  3.42%  runtime.findrunnable
     0.01s  0.24% 81.42%      0.09s  2.20%  runtime.futexsleep
     0.01s  0.24% 81.66%      0.03s  0.73%  runtime.gentraceback
     0.01s  0.24% 81.91%      0.19s  4.65%  runtime.mallocgc
     0.01s  0.24% 82.15%      0.03s  0.73%  runtime.newdefer
     0.01s  0.24% 82.40%      0.14s  3.42%  runtime.newobject
     0.01s  0.24% 82.64%      0.20s  4.89%  runtime.startm
         0     0% 82.64%      0.03s  0.73%*FeatureSet).HasFeature
         0     0% 82.64%      0.03s  0.73%*FeatureSet).UseXsave
         0     0% 82.64%      0.26s  6.36%*Client).handleOne
         0     0% 82.64%      0.54s 13.20%*Client).sendRecv
         0     0% 82.64%      0.27s  6.60%*Client).waitAndRecv
         0     0% 82.64%      0.05s  1.22%*clientFile).Close
         0     0% 82.64%      0.05s  1.22%*clientFile).Open
         0     0% 82.64%      0.28s  6.85%*clientFile).Readlink
         0     0% 82.64%      0.07s  1.71%*clientFile).Walk
         0     0% 82.64%      0.06s  1.47%*clientFile).WalkGetAttr


I've also tried the KVM platform: it was even slower.

@amscanne commented Jun 24, 2019

Given that this is a build, I don't think it's pure I/O; I think it's filesystem (VFS) operations. (Sucking down dependencies in a build system ends up looking a lot like a VFS microbenchmark.)

Our internal VFS implementation is not efficient. We have a large-scale rewrite of the internals (#447) in the works that should address a lot of these costs, but it will take a little while longer for it to land. We'll see if we can pull these cases into the workload benchmarks and get a sense for whether the rewrite fixes the VFS performance issues.

@yoshiat commented Jun 24, 2019

Thank you for the feedback, @jirfag! As @amscanne mentioned, gVisor hasn't been geared toward FS-intensive use cases like builds; however, we've heard from many external users that they want to use gVisor for build use cases, so we have been prioritizing the VFS rewrite, which should unblock a lot of users. Stay tuned!

@jirfag commented Jun 27, 2019

thank you!

@jirfag commented Oct 16, 2019

I've checked on the latest release (where VFS was rewritten) - it's still 4x slower:

$ sudo docker run --rm --runtime=runsc -w /goapp -it golang:1.13 bash -c "time go get"
real    0m8.148s
user    0m4.220s
sys     0m2.020s
$ sudo docker run --rm -w /goapp -it golang:1.13 bash -c "time go get"
real    0m1.928s
user    0m1.655s
sys     0m0.234s

@prattmic commented Oct 16, 2019

I've checked on the latest release (where VFS was rewritten) - it's still 4x slower:

Just to be clear, the new VFS is not finished yet, so it is not enabled in the latest release.

@jirfag commented Oct 17, 2019

ok, got it, thank you

@fvoznika commented Oct 18, 2019

I think there is some confusion because the PR (#447) pointed to above was checked in. However, that was only the initial skeleton of the VFS2 code; there is a lot more work in progress before we can switch over to it. You can follow #1035 for updates.

@cuonglm commented Sep 16, 2021

I'm facing the same performance issue with go-command-related actions:

$ docker run --rm -it -v /home/cuong/go:/go --runtime=runsc golang:latest /bin/bash
root@e17a2362ab06:/go# cd src/
root@e17a2362ab06:/go/src# time go mod tidy

real	0m18.609s
user	0m16.870s
sys	0m20.990s
$ docker run --rm -it -v /home/cuong/go:/go --runtime=runc golang:latest /bin/bash
root@77ede87f87d9:/go# cd src/
root@77ede87f87d9:/go/src# time go mod tidy

real	0m0.419s
user	0m1.140s
sys	0m0.588s

For broader context, we use gVisor to run Go benchmarks, and under gVisor the package failed to compile, with the detailed error here:

@prattmic commented Sep 16, 2021

@cuonglm Just to clarify, you are describing two issues, correct?

  1. Go toolchain tools are slow inside gVisor (likely directly related to this issue of IO performance).
  2. Building/testing cosmos-sdk inside gVisor causes segfaults in the test and Go toolchain (this should be tracked in another issue).

@cuonglm commented Sep 16, 2021

@prattmic Yes, that's correct. I want to investigate further before opening a new issue. (2) happened first in our benchmarking system; I then investigated and found that (1) was happening.

I'm going to put together a reproducible case for (2).

@prattmic commented Sep 16, 2021

Great, thanks!

@cuonglm commented Sep 16, 2021

Great, thanks!

I filed #6600

@fvoznika commented Sep 28, 2021

@cuonglm Enabling VFS2 improves your scenario quite a bit. -v mounts require extra validation when accessed, and this has been greatly improved with VFS2. From the numbers you posted, it seems that go mod tidy didn't need to download modules, so I've posted the results for both cases for comparison:

Workload                  VFS1    VFS2    Native
go mod tidy (download)    116s    21s     7s
go mod tidy (2nd time)    12.1s   4.5s    0.4s

@jirfag, for the 10x case, most of the overhead comes from the file system and the syscall-trapping cost of ptrace. If you switch to VFS2, things improve, but the overhead is still dominated by the cost of trapping syscalls. If you change the platform from ptrace to KVM (+VFS2), the results get a lot better. For the 4x case, there have been several improvements in networking performance since you first looked at this; in combination with VFS2, that brings gVisor practically on par with native:

Workload           VFS1    VFS2    KVM+VFS2    Native
go mod vendor      316s    118s    54s         35s
go mod download    10s     8.8s    8.6s        7s

With that, I'm closing out this issue. If you have other workloads that don't perform well with VFS2 enabled, please file another issue.

Note: to enable VFS2, add the --vfs2 flag to your runtime configuration.
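
For Docker setups, the flags go into the runsc runtime entry of the daemon configuration. A sketch of /etc/docker/daemon.json (the binary path is illustrative, and --platform=kvm is needed only for the KVM numbers above; in later releases VFS2 became the default and the flag was removed):

```json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc",
      "runtimeArgs": [
        "--vfs2",
        "--platform=kvm"
      ]
    }
  }
}
```

After editing the file, restart the Docker daemon and select the runtime with `docker run --runtime=runsc ...`.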

@fvoznika closed this Sep 28, 2021