Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(PUP-2169) Invalidate SELinux matchpathcon cache #6278

Closed
wants to merge 1 commit into from

Conversation

alexjfisher
Copy link
Contributor

The file resource has 4 selinux parameters (selrange, selrole,
seltype, seluser). When not set, these default to "the value
returned by matchpathcon for the file". When puppet is run as a
service, the first call to matchpathcon correctly returns the default
SELinux security context for the specified file. But on subsequent
runs, matchpathcon returns a cached value. This is a problem if the
default context has changed since the puppet service was first started.
This leads to puppet doing things like reverting file contexts that were
set when an RPM was installed (there are several RPMs which bundle
SELinux modules and run restorecon on file paths when installed).

This bad behaviour is easy to replicate. eg create a very simple puppet
module 'selinux-test' and apply that to a test node through PE or
Foreman etc.

 # selinux-test/init.pp
class selinux-test {
  file { '/selinux-test':
    ensure => file,
  }
}

Set a very short runinterval...

puppet config set --section agent runinterval 30

Wait for puppet to create /selinux-test with whatever the default
context type is for this path (etc_runtime_t).

[root@testhost ~]# ls -Z /selinux-test
-rw-r--r--. root root system_u:object_r:etc_runtime_t:s0 /selinux-test

Change the default context type for this file using fcontext

semanage fcontext -a -t tmp_t /selinux-test

Wait another 30 seconds, tailing the puppet agent log and notice how
puppet does not update the file.

Use restorecon to relabel the file manually.

[root@testhost ~]# restorecon /selinux-test
[root@testhost ~]# ls -Z /selinux-test
-rw-r--r--. root root system_u:object_r:tmp_t:s0 /selinux-test

Wait for puppet to unhelpfully revert the seltype back to
etc_runtime_t.

Restart the puppet service and note how puppet now changes the seltype
back to tmp_t.

The fix is really simple. Just call matchpathcon_fini after the call
to matchpathcon. The code never explicitly called
matchpathcon_init, but this was done automatically by the call to
matchpathcon.

From the man page...

If matchpathcon_init() has not already been called, then this function
will call it upon its first invocation with a NULL path, defaulting to
the active file contexts configuration.

And on matchpathcon_fini()...

matchpathcon_fini() frees the memory allocated by a prior call to
matchpathcon_init.() This function can be used to free and reset the
internal state between multiple matchpathcon_init() calls

I was initially concerned that calling matchpathcon_fini after each
call to matchpathcon might be awful for performance. I did some
testing, and even though it is much slower, it's still so fast as to not
be an issue (over 500 calls to matchpathcon with matchpathcon_fini
in less than a second).

[root@testhost ~]# cat test.rb
require 'selinux'
require 'benchmark'
list_of_files = Dir['/etc/**/*.*']
puts "Testing with all #{list_of_files.size} files under /etc"
time = Benchmark.realtime do
        list_of_files.each do |file|
                Selinux.matchpathcon('file',0)
                Selinux.matchpathcon_fini
        end
end
puts "Time elapsed #{time*1000} milliseconds"
time = Benchmark.realtime do
        list_of_files.each do |file|
                Selinux.matchpathcon('file',0)
        end
end
puts "Time elapsed #{time*1000} milliseconds"
[root@testhost ~]# /opt/puppetlabs/puppet/bin/ruby test.rb
Testing with all 507 files under /etc
Time elapsed 881.8768259999999 milliseconds
Time elapsed 12.083007 milliseconds

The `file` resource has 4 selinux parameters (`selrange`, `selrole`,
`seltype`, `seluser`).  When not set, these default to "the value
returned by matchpathcon for the file".  When puppet is run as a
service, the first call to matchpathcon correctly returns the default
SELinux security context for the specified file.  But on subsequent
runs, matchpathcon returns a cached value.  This is a problem if the
default context has changed since the puppet service was first started.
This leads to puppet doing things like reverting file contexts that were
set when an RPM was installed (there are several RPMs which bundle
SELinux modules and run `restorecon` on file paths when installed).

This bad behaviour is easy to replicate.  eg create a very simple puppet
module 'selinux-test' and apply that to a test node through PE or
Foreman etc.
```puppet
 # selinux-test/init.pp
class selinux-test {
  file { '/selinux-test':
    ensure => file,
  }
}
```

Set a very short runinterval...
```
puppet config set --section agent runinterval 30
```

Wait for puppet to create `/selinux-test` with whatever the default
context type is for this path (`etc_runtime_t`).

```
[root@testhost ~]# ls -Z /selinux-test
-rw-r--r--. root root system_u:object_r:etc_runtime_t:s0 /selinux-test
```

Change the default context type for this file using `fcontext`
```
semanage fcontext -a -t tmp_t /selinux-test
```

Wait another 30 seconds, tailing the puppet agent log and notice how
puppet does *not* update the file.

Use `restorecon` to relabel the file manually.
```
[root@testhost ~]# restorecon /selinux-test
[root@testhost ~]# ls -Z /selinux-test
-rw-r--r--. root root system_u:object_r:tmp_t:s0 /selinux-test
```

Wait for puppet to unhelpfully revert the seltype back to
`etc_runtime_t`.

Restart the puppet service and note how puppet now changes the seltype
back to `tmp_t`.

The fix is really simple.  Just call `matchpathcon_fini` after the call
to `matchpathcon`.  The code never explicitly called
`matchpathcon_init`, but this was done automatically by the call to
`matchpathcon`.

From the man page...
> If matchpathcon_init() has not already been called, then this function
> will call it upon its first invocation with a NULL path, defaulting to
> the active file contexts configuration.

And on `matchpathcon_fini()`...
> matchpathcon_fini() frees the memory allocated by a prior call to
> matchpathcon_init.() This function can be used to free and reset the
> internal state between multiple matchpathcon_init() calls

I was initially concerned that calling `matchpathcon_fini` after *each*
call to `matchpathcon` might be awful for performance.  I did some
testing, and even though it is much slower, it's still so fast as to not
be an issue (over 500 calls to `matchpathcon` with `matchpathcon_fini`
in less than a second).
```
[root@testhost ~]# cat test.rb
require 'selinux'
require 'benchmark'
list_of_files = Dir['/etc/**/*.*']
puts "Testing with all #{list_of_files.size} files under /etc"
time = Benchmark.realtime do
        list_of_files.each do |file|
                Selinux.matchpathcon('file',0)
                Selinux.matchpathcon_fini
        end
end
puts "Time elapsed #{time*1000} milliseconds"
time = Benchmark.realtime do
        list_of_files.each do |file|
                Selinux.matchpathcon('file',0)
        end
end
puts "Time elapsed #{time*1000} milliseconds"
[root@testhost ~]# /opt/puppetlabs/puppet/bin/ruby test.rb
Testing with all 507 files under /etc
Time elapsed 881.8768259999999 milliseconds
Time elapsed 12.083007 milliseconds
```
@puppetcla
Copy link

CLA signed by all contributors.

@MosesMendoza MosesMendoza self-requested a review October 12, 2017 20:49
@MosesMendoza
Copy link
Contributor

I'm kind of nervous about the performance hit. That increase is non-trivial, as you noted. Ie for something like /opt, this is a big deal (not that you would manage everything in opt, but)... changing your script to do so returns

☂ : /opt/puppetlabs/puppet/bin/ruby test_selinux
Testing with all 20883 files under /opt
Time elapsed 81188.125529 milliseconds
Time elapsed 341.796001 milliseconds

81 seconds vs 341 milliseconds. Do you know how often this would get invoked and in what scenarios? I'm not super familiar with our selinux support yet.

@alexjfisher
Copy link
Contributor Author

The worse case scenario I could think of is when you're using a file resource on a directory with recurse -> true. This is already pretty slow though on huge directories. With the patch, it takes very roughly twice as long. In reality, catalogs normally contain more than just file resources and I suspect they are usually not the slowest thing to apply. (If they're slow because you're enforcing large content, any slow down in enforcing selinux context is probably no longer relevant).

The 8746 files in my /opt take very approximately 1 minute on my test system or 2 minutes with the change.
Is < 10ms extra per file worth worrying about? If it is, you can always set selinux_ignore_defaults => true and things are then much faster (< 20 seconds). ie file resources on selinux enabled systems are already 3 times slower compared to non selinux systems. (In the output below, I apply the patch after 6 runs.)

[root@testhost ~]# rm -rf /opt.test; cp -a /opt /opt.test ; chown -R root /opt.test ; restorecon -RF /opt.test
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.06 seconds
Notice: Applied catalog in 59.55 seconds

real    1m24.024s
user    1m4.462s
sys     0m19.275s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.07 seconds
Notice: Applied catalog in 49.80 seconds

real    1m12.288s
user    1m3.803s
sys     0m8.439s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.07 seconds
Notice: Applied catalog in 52.04 seconds

real    1m16.299s
user    1m4.921s
sys     0m11.051s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.08 seconds
Notice: Applied catalog in 62.22 seconds

real    1m35.604s
user    1m1.958s
sys     0m16.417s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.08 seconds
Notice: Applied catalog in 52.73 seconds

real    1m24.736s
user    1m17.163s
sys     0m7.033s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.06 seconds
Notice: Applied catalog in 53.61 seconds

real    1m15.956s
user    1m8.448s
sys     0m5.985s
[root@testhost ~]# cp -a /opt /opt.test ; chown -R root /opt.test ; restorecon -R /opt.test
[root@testhost ~]# vim /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/selinux.rb
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.06 seconds
Notice: Applied catalog in 132.48 seconds

real    2m31.482s
user    2m16.648s
sys     0m14.643s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.07 seconds
Notice: Applied catalog in 133.13 seconds

real    2m33.695s
user    2m19.337s
sys     0m14.293s
[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,}'
Notice: Compiled catalog for testhost. in environment production in 0.07 seconds
Notice: Applied catalog in 130.36 seconds

real    2m29.765s
user    2m15.296s
sys     0m14.396s

[root@testhost ~]# find /opt.test | wc -l
8746

[root@testhost ~]# time puppet apply -e 'file { "/opt.test": ensure  => directory, recurse => true, owner => root,  selinux_ignore_defaults => true}'
Notice: Compiled catalog for testhost. in environment production in 0.07 seconds
Notice: Applied catalog in 17.27 seconds

real    0m36.105s
user    0m34.980s
sys     0m1.090s

I've been trying to come up with ways you could conditionally invalidate the cache. I can't think of anything simple, sane or without corner-cases waiting to bite. (eg perhaps if you knew that the next resource you applying was also a file resource you'd skip the invalidation somehow?)

I think living with the performance drop and documenting selinux_ignore_defaults => true as a possible workaround in the release notes could be preferable to introducing complicated and/or fragile logic.

Copy link
Contributor

@joshcooper joshcooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling matchpathcon_fini definitely seems like the correct behavior. Otherwise we're relying on the ruby layer to free whatever memory was allocated implicitly during the matchpathcon call, and relying on finalizers for memory deallocation is asking for trouble.

@MosesMendoza
Copy link
Contributor

Thanks for the contribution @alexjfisher. That is good enough for me.

@alexjfisher
Copy link
Contributor Author

@MosesMendoza Is this suitable for a backport to 4.x? (or even 4.10.x)?

@MosesMendoza
Copy link
Contributor

Yeah I think this is probably safe for the 4.10.x line. @alexjfisher can you rebase / retarget?

@MosesMendoza
Copy link
Contributor

(I'm also happy to do the rebase / separate PR if you don't have time, @alexjfisher)

@alexjfisher
Copy link
Contributor Author

@MosesMendoza I've opened a new PR targeting 4.10.x. I'm not sure how you manage your branches, but I guess you'll merge the new PR, close this one, and all of 4.10.x gets periodically merged into master (or maybe into 5.3.x and then into master)?

@MosesMendoza
Copy link
Contributor

@alexjfisher yes that's correct

@MosesMendoza
Copy link
Contributor

superseded by #6289

@alexjfisher
Copy link
Contributor Author

BTW, Sorry about the daft bug in my test.rb script above. 'file' should obviously not be quoted! As it happens, this doesn't really change the results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants