x/build: scaleway builders write_snapshot_tar consistently taking > 1h #21839

Closed
adams-sarah opened this issue Sep 11, 2017 · 6 comments

@adams-sarah adams-sarah self-assigned this Sep 11, 2017

@adams-sarah adams-sarah changed the title x/build: scaleway builders: write_snapshot_tar consistently taking > 1h x/build: scaleway builders write_snapshot_tar consistently taking > 1h Sep 11, 2017

@gopherbot gopherbot added this to the Unreleased milestone Sep 11, 2017

@gopherbot gopherbot added the Builders label Sep 11, 2017

@adams-sarah

Contributor Author

commented Sep 11, 2017

gopherbot does not seem to like a second colon in the title.

@dsnet

Member

commented Sep 11, 2017

Unfortunately all of the links you posted are now stale and returning 404s :(

@adams-sarah

Contributor Author

commented Sep 11, 2017

Argh, shoot.

I still had the windows open, so I'm uploading the logs as text files.
Some example culprit lines:

4:
2017-09-11T18:23:16Z write_snapshot_tar
2017-09-11T19:44:11Z finish_write_snapshot_tar after 1h20m54.5s

5:
2017-09-11T18:49:54Z write_snapshot_tar
2017-09-11T20:26:43Z finish_write_snapshot_tar after 1h36m48.4s

6:
2017-09-11T18:49:52Z write_snapshot_tar
2017-09-11T20:09:12Z finish_write_snapshot_tar after 1h19m20s

7:
2017-09-11T19:04:35Z write_snapshot_tar
2017-09-11T20:38:02Z finish_write_snapshot_tar after 1h33m26.9s

temporarylogs.txt
temporarylogs2.txt
temporarylogs3.txt
temporarylogs4.txt
temporarylogs5.txt
temporarylogs6.txt
temporarylogs7.txt

@bradfitz

Member

commented Nov 13, 2017

Looks like a configuration problem: the --workdir being passed to the buildlet isn't the tmpfs mount, so it's doing lots of slow I/O over the network (Scaleway's disk is an NBD device, IIRC).

I saw during a build:

:: Running /tmp/workdir/go/src/make.bash with args ["/tmp/workdir/go/src/make.bash"] and env ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" "HOSTNAME=scaleway-prod-07" "ARCH=armv7l" "UBUNTU_SUITE=xenial" "DOCKER_REPO=multiarch/ubuntu-debootstrap" "DEBIAN_FRONTEND=noninteractive" "SCW_BASE_IMAGE=scaleway/ubuntu:xenial" "GO_BOOTSTRAP=/usr/local/go" "GO_BUILD_KEY_PATH=/buildkey/gobuildkey" "GO_BUILD_KEY_DELETE_AFTER_READ=true" "IN_KUBERNETES=1" "GO_BUILDER_ENV=host-linux-arm-scaleway" "META_BUILDLET_BINARY_URL=https://storage.googleapis.com/go-builder-data/buildlet.linux-arm" "HOME=/root" "USER=root" "WORKDIR=/tmp/workdir" "GOROOT_BOOTSTRAP=/usr/local/go" "GO_BUILDER_NAME=linux-arm" "GO_BUILDER_FLAKY_NET=1" "GOBIN="] in dir /tmp/workdir/go/src

(Note the work directory is /tmp/workdir.)

But in rundockerbuildlet I see:

                out, err := exec.Command("docker", "run",
                        "-d",
                        "--memory="+*memory,
                        "--name="+name,
                        "-v", filepath.Dir(keyFile)+":/buildkey/",
                        "-e", "HOSTNAME="+name,
                        "--tmpfs=/workdir:rw,exec",
                        *image).CombinedOutput()

... putting a tmpfs inside the container at /workdir, not at /tmp/workdir.
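
To make the mismatch concrete, here is a rough Go sketch of a check that would catch it: given a `docker run` argument list and a work directory, does the directory fall under one of the `--tmpfs` mounts? The names `tmpfsTargets` and `underTmpfs` are hypothetical and not part of rundockerbuildlet, and only the `--tmpfs=PATH[:OPTS]` spelling from the snippet above is handled.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// tmpfsTargets returns the container paths named by "--tmpfs=PATH[:OPTS]"
// arguments. (Hypothetical helper; other spellings of the flag are ignored.)
func tmpfsTargets(args []string) []string {
	var targets []string
	for _, a := range args {
		if !strings.HasPrefix(a, "--tmpfs=") {
			continue
		}
		spec := strings.TrimPrefix(a, "--tmpfs=")
		// Drop mount options such as ":rw,exec".
		if i := strings.Index(spec, ":"); i >= 0 {
			spec = spec[:i]
		}
		targets = append(targets, path.Clean(spec))
	}
	return targets
}

// underTmpfs reports whether dir is a tmpfs target or lies beneath one.
func underTmpfs(args []string, dir string) bool {
	dir = path.Clean(dir)
	for _, t := range tmpfsTargets(args) {
		if dir == t || strings.HasPrefix(dir, strings.TrimSuffix(t, "/")+"/") {
			return true
		}
	}
	return false
}

func main() {
	// Only the flags relevant to the check are listed here.
	args := []string{"run", "-d", "--tmpfs=/workdir:rw,exec"}
	fmt.Println(underTmpfs(args, "/workdir"))     // true
	fmt.Println(underTmpfs(args, "/tmp/workdir")) // false
}
```

The prefix test appends a trailing slash on purpose, so that a path like /workdir2 is not mistaken for something under /workdir.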

@bradfitz

Member

commented Nov 13, 2017

Yup, confirmed with a gomote ssh session to a linux-arm builder.

$ gomote ssh user-bradfitz-linux-arm-0
$ ssh -p 2222 user-bradfitz-linux-arm-0@farmer.golang.org # auth using https://github.com/bradfitz.keys
# Welcome to the gomote ssh proxy, bradfitz.
# Connecting to/starting remote ssh...
#
# `gomote push` and the builders use:
# - workdir: /tmp/workdir
# - GOROOT: /tmp/workdir/go
# - GOPATH: /tmp/workdir/gopath
# - env: GO_BUILDER_NAME=linux-arm GO_BUILDER_FLAKY_NET=1 GOROOT_BOOTSTRAP=/usr/local/go
# Happy debugging.
....
root@d41b38a67ad1:~# df
Filesystem     1K-blocks    Used Available Use% Mounted on
none            47929956 6711792  38760376  15% /
tmpfs            1033900       0   1033900   0% /dev
tmpfs            1033900       0   1033900   0% /sys/fs/cgroup
/dev/nbd0       47929956 6711792  38760376  15% /buildkey
tmpfs            1033900       0   1033900   0% /workdir
shm                65536       0     65536   0% /dev/shm
root@d41b38a67ad1:~# df /tmp/workdir
Filesystem     1K-blocks    Used Available Use% Mounted on
none            47929956 6711792  38760376  15% /
root@d41b38a67ad1:~# cat /proc/mounts 
none / aufs rw,relatime,si=79190e56,dio,dirperm1 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev tmpfs rw,nosuid,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666 0 0
sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,relatime,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup ro,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup ro,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup ro,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/freezer cgroup ro,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup ro,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup ro,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/perf_event cgroup ro,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/memory cgroup ro,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/blkio cgroup ro,nosuid,nodev,noexec,relatime,blkio 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
/dev/nbd0 /buildkey ext4 rw,relatime,data=ordered 0 0
tmpfs /workdir tmpfs rw,nosuid,nodev,relatime 0 0
/dev/nbd0 /etc/resolv.conf ext4 rw,relatime,data=ordered 0 0
/dev/nbd0 /etc/hostname ext4 rw,relatime,data=ordered 0 0
/dev/nbd0 /etc/hosts ext4 rw,relatime,data=ordered 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k 0 0
proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /proc/timer_list tmpfs rw,nosuid,mode=755 0 0
tmpfs /proc/timer_stats tmpfs rw,nosuid,mode=755 0 0
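
The same lookup that `df /tmp/workdir` performs can be sketched in Go by scanning text in /proc/mounts format for the longest mount point that covers a path. This is only an illustration: `fsTypeFor` is a made-up name, the sample is three lines copied from the output above, and escaped characters in mount points (such as `\040` for a space) are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fsTypeFor returns the mount point and filesystem type backing dir,
// given text in /proc/mounts format. The longest covering mount point
// wins; if several entries share a mount point, the later one wins.
// (Hypothetical helper, for illustration only.)
func fsTypeFor(mounts, dir string) (mountPoint, fsType string) {
	sc := bufio.NewScanner(strings.NewReader(mounts))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 3 {
			continue // not a well-formed mounts line
		}
		mp := f[1]
		covers := mp == "/" || dir == mp || strings.HasPrefix(dir, mp+"/")
		if covers && len(mp) >= len(mountPoint) {
			mountPoint, fsType = mp, f[2]
		}
	}
	return mountPoint, fsType
}

// sample holds three lines copied from the /proc/mounts output above.
const sample = `none / aufs rw,relatime,si=79190e56,dio,dirperm1 0 0
/dev/nbd0 /buildkey ext4 rw,relatime,data=ordered 0 0
tmpfs /workdir tmpfs rw,nosuid,nodev,relatime 0 0
`

func main() {
	for _, dir := range []string{"/workdir", "/tmp/workdir"} {
		mp, typ := fsTypeFor(sample, dir)
		fmt.Printf("%s is on %s (%s)\n", dir, mp, typ)
	}
}
```

With this sample, /workdir resolves to the tmpfs mount while /tmp/workdir falls through to the aufs root, which matches what `df` reported.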

@gopherbot

commented Nov 13, 2017

Change https://golang.org/cl/77370 mentions this issue: cmd/buildlet: use tmpfs workdir if flag value is unspecified
