Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pwalkdir, speed up chcon #150

Merged
merged 2 commits into from Aug 9, 2021
Merged

Conversation

kolyshkin
Copy link
Collaborator

@kolyshkin kolyshkin commented Aug 6, 2021

TL;DR: speed up chcon for up to 1.5x to 2x with Go 1.16+.

  1. pkg/pwalkdir: add

    This is a copy-paste implementation of pwalk but utilizing
    filepath.WalkDir instead (added in Go 1.16). As WalkDir do not call
    stat(2) on every entry, it is much (2x on my machine) faster than Walk
    (see below).

    The idea is to have pwalk and pwalkdir coexist for some time, then
    gradually retire pwalk once Go < 1.16 is unsupported.

  2. Chcon: speedup for Go 1.16+

    Using the newly added pwalkdir, the benchmark added by and described
    in commit ba956ca shows a considerable speed improvement
    (0.8s with pwalk, 0.5s with pwalkdir, 0.9s with chcon from GNU coreutils).

@kolyshkin
Copy link
Collaborator Author

A few non-scientific benchmarks (testdata is Linux kernel source tree, find is for reference only)

Op Time
find -print >/dev/null 0.19s
chcon -R 0.92s
Before this PR (pwalk) 0.83s
This PR (pwalkdir) + go 1.16 0.51s
This PR (pwalkdir) + go 1.17rc2 0.45s

@thaJeztah
Copy link
Member

Nice! Will try to do a review soon, but ping me otherwise 😅

rhatdan
rhatdan previously approved these changes Aug 7, 2021
Copy link
Collaborator

@rhatdan rhatdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Collaborator

rhatdan commented Aug 7, 2021

Awesome work @kolyshkin

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Spotted a typo, and one question/suggestion (rename or drop the WalkFunc type alias); let me know what you think

pkg/pwalkdir/pwalkdir.go Outdated Show resolved Hide resolved
pkg/pwalkdir/pwalkdir.go Outdated Show resolved Hide resolved
@thaJeztah
Copy link
Member

Looks like you forgot to remove some references that used the alias;

  level=error msg="Running error: buildir: analysis skipped: errors in package: [/home/runner/work/selinux/selinux/pkg/pwalkdir/pwalkdir_test.go:124:42: undeclared name: WalkFunc /home/runner/work/selinux/selinux/pkg/pwalkdir/pwalkdir_test.go:127:34: undeclared name: WalkFunc /home/runner/work/selinux/selinux/pkg/pwalkdir/pwalkdir_test.go:141:8: undeclared name: WalkFunc /home/runner/work/selinux/selinux/pkg/pwalkdir/pwalkdir_test.go:153:24: cannot use filepath.WalkDir (value of type func(root string, fn fs.WalkDirFunc) error) as walkerFunc value in struct literal /home/runner/work/selinux/selinux/pkg/pwalkdir/pwalkdir_test.go:154:21: cannot use Walk (value of type func(root string, walkFn fs.WalkDirFunc) error) as walkerFunc value in struct literal]"

This is a copy-paste implementation of pwalk but utilizing
filepath.WalkDir instead (added in Go 1.16). As WalkDir do not call
stat(2) on every entry, it is much (2x on my machine) faster than Walk
(see below).

The idea is to have pwalk and pwalkdir coexist for some time, then
gradually retire pwalk once Go < 1.16 is unsupported.

pwalk:

BenchmarkWalk/Empty/filepath.Walk-4         	      50	  24297251 ns/op
BenchmarkWalk/Empty/pwalk.Walk-4            	      37	  27564272 ns/op
BenchmarkWalk/ReadFile/filepath.Walk-4      	      22	  50904602 ns/op
BenchmarkWalk/ReadFile/pwalk.Walk-4         	      37	  30498257 ns/op
BenchmarkWalk/ChownChmod/filepath.Walk-4    	      22	  49689793 ns/op
BenchmarkWalk/ChownChmod/pwalk.Walk-4       	      36	  32384297 ns/op

pwalkdir:

BenchmarkWalk/Empty/filepath.WalkDir-4         	      96	  12537753 ns/op
BenchmarkWalk/Empty/pwalkdir.Walk-4            	      80	  14289699 ns/op
BenchmarkWalk/ReadFile/filepath.WalkDir-4      	      27	  41524600 ns/op
BenchmarkWalk/ReadFile/pwalkdir.Walk-4         	      50	  24767977 ns/op
BenchmarkWalk/ChownChmod/filepath.WalkDir-4    	      27	  40676721 ns/op
BenchmarkWalk/ChownChmod/pwalkdir.Walk-4       	      42	  24256671 ns/op

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using the newly added pwalkdir, the benchmark added by and described
in commit ba956ca shows a considerable speed improvement
(0.8s with pwalk, 0.5s with pwalkdir, 0.9s with chcon from GNU coreutils).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Collaborator Author

Looks like you forgot to remove some references that used the alias

🤦🏻 never trust 3am edits! Fixed.

@thaJeztah
Copy link
Member

🤦🏻 never trust 3am edits! Fixed

More than familiar with those 😂

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@thaJeztah
Copy link
Member

thaJeztah commented Aug 8, 2021

@rhatdan ptal (CI wants 2 LGTMs, and the update invalidated your previous one 😅)

@rhatdan rhatdan merged commit f23ae77 into opencontainers:main Aug 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants