
severe performance issue with buildah bud when copying recursive folders #2047

Closed
gireeshpunathil opened this issue Dec 25, 2019 · 22 comments

@gireeshpunathil

Description

I am composing an image from a large (~200MB) project structure using buildah bud, and the build never completes. Here is the full bug report for reference: kabanero-io/kabanero-pipelines#121

I have reduced it to a simple reproduction for easier diagnosis.

Steps to reproduce the issue:

$ ls -lrt

total 8
-rw-r--r-- 1 root root  29 Dec 25 05:34 Dockerfile
-rwxr-xr-x 1 root root 272 Dec 25 05:37 foo.sh

$ cat Dockerfile

FROM scratch
COPY . /project

$ cat foo.sh

#!/bin/sh
mkdir foo
cd foo
x=0
while [ $x -lt 10 ]
do
	mkdir $x
	cd $x
	y=0
	while [ $y -lt 10 ]
	do
		mkdir $y
		cd $y
		z=0
		while [ $z -lt 10 ]
		do
			mkdir $z
			echo "hello" > $z/foo.txt
			z=`expr $z + 1`
		done
		cd ..
		y=`expr $y + 1`
	done
	cd ..
	x=`expr $x + 1`
done
$ ./foo.sh
$ find ./foo | wc -l
2111
$ du -ms .
4	.
$ time buildah bud --tls-verify=false --format=docker .
STEP 1: FROM scratch
STEP 2: COPY . /project
STEP 3: COMMIT
Getting image source signatures
Copying blob 13b57060f1d4 done
Copying config 3b5993213c done
Writing manifest to image destination
Storing signatures
3b5993213cca36d60981efd72cdd74d03eaa721843e897520afb8c76a171be3e

real	1m9.819s
user	0m47.776s
sys	0m24.280s
$ rm -rf foo
$ fallocate -l 5M dummy.txt
$ du -ms .
6	.
$ time buildah bud --tls-verify=false --format=docker .
STEP 1: FROM scratch
STEP 2: COPY . /project
STEP 3: COMMIT
Getting image source signatures
Copying blob b35b1eb9a11e done
Copying config 45ceda274a done
Writing manifest to image destination
Storing signatures
45ceda274a743389aa270c8d36b9d436f17f14d656b00edc2c2576482fb8a5c4

real	0m1.733s
user	0m0.603s
sys	0m0.276s

As you can see, a folder structure with 1K+ directories holding about 4MB of data takes over a minute, whereas a single monolithic 5MB file takes under two seconds.
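For anyone else reproducing this, the nested-loop generator above can be condensed into a short POSIX script (a sketch equivalent to foo.sh; `seq` from coreutils is assumed):

```shell
#!/bin/sh
# build the same 10x10x10 tree as foo.sh: 1111 directories under ./foo
# plus one foo.txt in each of the 1000 leaf directories
for x in $(seq 0 9); do
  for y in $(seq 0 9); do
    for z in $(seq 0 9); do
      mkdir -p "foo/$x/$y/$z"
      echo "hello" > "foo/$x/$y/$z/foo.txt"
    done
  done
done
```

`find ./foo | wc -l` should again report 2111 entries (1111 directories plus 1000 files).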

Describe the results you received:

An unexpected delay in the COPY directive. When I traced with strace, things seemed to move faster, but since strace itself perturbs the timing, the results are inconclusive.
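strace's summary mode (`strace -c`) perturbs timing far less than full tracing, and the per-entry overhead can also be sized outside buildah entirely by timing a plain tar of a comparable tree versus a single file of similar size (a rough baseline only, not buildah's actual copy path; `tree/` and `big.bin` are throwaway names):

```shell
#!/bin/sh
# regenerate a 10x10x10 tree of tiny files, then compare archiving it
# against a single 5MB file; if both finish quickly, the slowness lies
# in buildah's copy path rather than in handling many small files per se
for x in $(seq 0 9); do for y in $(seq 0 9); do for z in $(seq 0 9); do
  mkdir -p "tree/$x/$y/$z"
  echo "hello" > "tree/$x/$y/$z/foo.txt"
done; done; done
t0=$(date +%s); tar -cf tree.tar tree;   t1=$(date +%s)
echo "tree.tar: $((t1 - t0))s"
dd if=/dev/zero of=big.bin bs=1M count=5 2>/dev/null
t2=$(date +%s); tar -cf big.tar big.bin; t3=$(date +%s)
echo "big.tar:  $((t3 - t2))s"
```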

Describe the results you expected:

A few seconds for the copy, on par with the docker build command.

Output of rpm -q buildah or apt list buildah:

$ rpm -q buildah
buildah-1.9.0-2.el7.x86_64

Output of buildah version:

$ buildah version
Version:         1.9.0
Go Version:      go1.10.8
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  
Git Commit:      
Built:           Wed Dec 31 16:00:00 1969
OS/Arch:         linux/amd64

Output of podman version if reporting a podman build issue:

(paste your output here)

Output of cat /etc/*release:

$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.6 (Maipo)

Output of uname -a:

$ uname -a
Linux <redacted> 3.10.0-1062.4.1.el7.x86_64 #1 SMP Wed Sep 25 09:42:57 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

$ cat /etc/containers/storage.conf
# storage.conf is the configuration file for all tools
# that share the containers/storage libraries
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# mountopt specifies extra mount options used when mounting the thin devices.
# mountopt = ""

# use_deferred_removal Marking device for deferred removal
# use_deferred_removal = "True"

# use_deferred_deletion Marking device for deferred deletion
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
$ 

@TomSweeneyRedHat
Member

@gireeshpunathil Thanks for the issue. Most of the team is enjoying PTO until the new year. If we don't get to it before then, we'll definitely dive in at the start of the year.

@TomSweeneyRedHat TomSweeneyRedHat self-assigned this Dec 26, 2019
@TomSweeneyRedHat
Member

@mheon FYI

@gireeshpunathil
Author

👍 thanks @TomSweeneyRedHat, you have been greatly helpful as always! sure, let us tackle this in the new year!

@rhatdan
Member

rhatdan commented Jan 2, 2020

Are you using a .dockerignore file?

@gireeshpunathil
Author

I am not; but can try that, if required. Any steps / suggestions?

@rhatdan
Member

rhatdan commented Jan 2, 2020

No, we have seen performance issues when using .dockerignore.

Could you try a newer version of buildah to see if it has an issue, say on a Fedora box?

@gireeshpunathil
Author

@rhatdan - thanks. I don't have a Fedora box, but I created a container from fedora:latest and tried; I got this error:

[root@a099739f4c99 foo]# buildah version
Version:         1.12.0
Go Version:      go1.13.5
Image Spec:      1.0.1-dev
Runtime Spec:    1.0.1-dev
CNI Spec:        0.4.0
libcni Version:  
image Version:   5.0.0
Git Commit:      
Built:           Thu Jan  1 00:00:00 1970
OS/Arch:         linux/amd64
[root@a099739f4c99 foo]# uname -a
Linux a099739f4c99 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@a099739f4c99 foo]# cat /etc/fedora-release 
Fedora release 31 (Thirty One)
[root@a099739f4c99 foo]# buildah bud .
ERRO 'overlay' is not supported over overlayfs    
'overlay' is not supported over overlayfs: backing file system is unsupported for this graph driver
[root@a099739f4c99 foo]# 

any suggestions?

@rhatdan
Member

rhatdan commented Jan 4, 2020

It looks like your storage is on an overlayfs file system. Are you running buildah inside of a container?

@gireeshpunathil
Author

yes, in this case - as I don't have a normal fedora system.

@rhatdan
Member

rhatdan commented Jan 4, 2020

Then volume mount a directory on /var/lib/containers and overlay will work.

@gireeshpunathil
Author

@rhatdan - I am unable to upgrade buildah; it looks like this is the best version available for Fedora. Am I using the right Fedora in the first place?

# cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.6 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"

# buildah version
Version:         1.9.0
Go Version:      go1.10.8
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  
Git Commit:      
Built:           Wed Dec 31 16:00:00 1969
OS/Arch:         linux/amd64
# yum upgrade buildah
Loaded plugins: product-id, search-disabled-repos
No packages marked for update
# 

@rhatdan
Member

rhatdan commented Jan 7, 2020

You are using RHEL.

Did you try to mount a file system at /var/lib/containers?

@gireeshpunathil
Author

@rhatdan - thanks

You are using RHEL.

Yes, unfortunately I don't have a system with Fedora installed; I am using a container (fedora:latest) on a RHEL host.

Did you try to mount a file system at /var/lib/containers?

Yes, it did not make any visible difference with respect to performance.

@rhatdan
Member

rhatdan commented Jan 8, 2020

Perhaps this will fix the issue.
#2070

@gireeshpunathil
Author

thanks @rhatdan for fixing this. Do you know when a release will be cut with this fix? I tried building buildah from source by following the docs, but ran into issues:

# github.com/containers/buildah/vendor/github.com/vbatts/tar-split/archive/tar
vendor/github.com/vbatts/tar-split/archive/tar/writer.go:178:11: undefined: strings.Builder
# github.com/containers/buildah/vendor/github.com/vbauerster/mpb/v4/internal
vendor/github.com/vbauerster/mpb/v4/internal/percentage.go:14:9: undefined: math.Round
# github.com/containers/buildah/vendor/github.com/docker/docker/pkg/archive
vendor/github.com/docker/docker/pkg/archive/archive.go:365:5: hdr.Format undefined (type *tar.Header has no field or method Format)
vendor/github.com/docker/docker/pkg/archive/archive.go:365:15: undefined: tar.FormatPAX
vendor/github.com/docker/docker/pkg/archive/archive.go:1160:7: hdr.Format undefined (type *tar.Header has no field or method Format)
vendor/github.com/docker/docker/pkg/archive/archive.go:1160:17: undefined: tar.FormatPAX
vendor/github.com/docker/docker/pkg/archive/copy.go:346:7: hdr.Format undefined (type *tar.Header has no field or method Format)
vendor/github.com/docker/docker/pkg/archive/copy.go:346:17: undefined: tar.FormatPAX
[root@g1 buildah]# 

@TomSweeneyRedHat
Member

@gireeshpunathil Thanks for the heads-up on the install instructions; I'm not quite sure what's going on there. I'll most likely spin up a new Buildah release by early next week.

@gireeshpunathil
Author

gireeshpunathil commented Jan 11, 2020

@rhatdan @TomSweeneyRedHat - some updates.

When I upgraded the Go version to 1.13.6 (by default, it was installing 1.9.4 for some reason), the build issues went away. With that, I was able to build from source and test the patch for #2070.

[root]# du -ms .
41	.
[root]

old case

[root]# buildah version
Version:         1.9.0
Go Version:      go1.10.8
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  
Git Commit:      
Built:           Wed Dec 31 16:00:00 1969
OS/Arch:         linux/amd64
[root]# 
[root]# time buildah bud .
STEP 1: FROM scratch
STEP 2: COPY . /project
STEP 3: COMMIT
Getting image source signatures
Copying blob fd191a377db7 done
Copying config 4511d8faeb done
Writing manifest to image destination
Storing signatures
4511d8faeb8d608bbdcfeecb6fcb4570b0b33e57ad9b031fb1e323515cbeaf49

real	1m27.032s
user	0m56.247s
sys	0m34.159s

good case

[root]# ./buildah version
Version:         1.14.0-dev
Go Version:      go1.13.6
Image Spec:      1.0.1-dev
Runtime Spec:    1.0.1-dev
CNI Spec:        0.4.0
libcni Version:  v0.7.1
image Version:   5.1.0
Git Commit:      4e23b7a
Built:           Sat Jan 11 03:30:08 2020
OS/Arch:         linux/amd64
[root]# time ./buildah bud .
STEP 1: FROM scratch
STEP 2: COPY . /project
STEP 3: COMMIT
Getting image source signatures
Copying blob 83a7eb8c927e done  
Copying config f549711ba9 done  
Writing manifest to image destination
Storing signatures
f549711ba95d5fe7b911d36bcfef16940f4f49fbaf62e6cc3f7da1d411832985
f549711ba95d5fe7b911d36bcfef16940f4f49fbaf62e6cc3f7da1d411832985

real	0m5.305s
user	0m3.652s
sys	0m2.383s

As we can see, the 1.14.0-dev version takes just over 5 seconds, as opposed to the old one (87 seconds).

thanks once again, will wait upon quay.io for a release!

@TomSweeneyRedHat
Member

Glad it's working, and thanks for the Go version update too. That's not something that I've run into. I can't take much credit on the fix though; @nalind did all the heavy lifting on this one.

@gireeshpunathil
Author

thanks @nalind for turning this around so quickly!

@marikaj123

@gireeshpunathil - Could you please accommodate Kyle G. Christianson's request:
Please make sure to test this with the node pipelines. This was another reason we stayed on an old version of buildah: appsody/appsody-buildah#10. Thank you.

@gireeshpunathil
Author

@marikaj123 - looks like you probably meant to put this comment in kabanero-io/kabanero-pipelines#121 instead of here? either way, answer is yes, will do.

gireeshpunathil added a commit to gireeshpunathil/appsody-buildah that referenced this issue Jan 18, 2020
buildah was pinned at v1.9.0 due to bind mount issues
when it interacted with a buggy fuse-overlay module.
But this proved to perform poorly for certain
stacks when exercised in the pipeline.

Upgrade buildah to the latest version, which has the fix for the
performance issue, while documenting the fuse-overlay requirement

Fixes: kabanero-io/kabanero-pipelines#121
Refs: containers/buildah#2047
@TomSweeneyRedHat
Member

As this appears to be addressed in Buildah based on @gireeshpunathil 's comments, I'm closing this. @gireeshpunathil Please feel free to open another issue if things aren't going as expected with Buildah.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 13, 2023