Skip to content

fcontext is painfully slow #54

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kkeane opened this issue Nov 3, 2016 · 14 comments
Closed

fcontext is painfully slow #54

kkeane opened this issue Nov 3, 2016 · 14 comments

Comments

@kkeane
Copy link

kkeane commented Nov 3, 2016

Each selinux_policy_fcontext resource takes a very long time to run - five to ten seconds on my system. This is of course due to the underlying semanage utility also being slow.

When adding many fcontexts, that slow down my cookbook to unacceptable levels (from 2 to about 20 minutes) - not just the first time, but on every run because testing the guards also takes a long time.

The semanage utility supports importing many fcontexts (and booleans and ports) at once (semanage -i), which is dramatically faster.

Also, testing if a context already exists could be optimized by caching the output from semanage fcontext -ln

Similar optimizations should also be possible for booleans and ports, although it is, at least for me, less critical.

@BackSlasher
Copy link
Contributor

RE many contexts at once - this not good CM methodology, but if we make it elegant we can consider it. Can you suggest a syntax?
RE caching, consider this:

selinux_policy_fcontext '/var/www' do
  action :delete
end
# Caching that fcontext does not exist
package 'apache'
# ^ Fcontext is created because of package installation
selinux_policy_fcontext '/var/www' do
  action :delete
end
# ^ Will not do anything because cached as does not exist 

When I was looking at speeding these things up, I looked into using custom Python scripts instead of semanage, thinking the problem was that the code there was bloated. Sadly, this proved false.
AFAIK semanage is slow because manipulating the SELinux store requires you to load and parse it, modify whatever you want in-memory and then save it back. The load/save take forever.

In your case, since you want batching, I suggest you create a python script that modifies the store or an SELinux module, depending on your needs / time you can spend on this.
That way, you gain run speed but give up on separation and maintainability.

WDYT?

@kkeane
Copy link
Author

kkeane commented Nov 3, 2016

Yes, I see the point. It's a common situation in software development: you have a very nice design, but then in the end you find that it isn't all that performant, and you have to tweak it. Another example where the same problem surfaced was multipackage support in the package resource, as well as the multipackage cookbook. Also a situation where the gospel CM way ended up extremely slow. Same thing with video drivers. You could have the software draw every pixel individually, but the more smarts you put in the driver (and the hardware) the better performance you get.

Your analysis on why it is so slow is close to correct. It does seem to be the "load/save" that takes forever. But it's not quite that. selinux works based on transactions. The loading and saving is actually fast, but starting and finishing a transactions is extremely expensive. That is why semanage -i even exists: everything you submit through that call is processed in a single transaction. You can submit 100 fcontexts all at once with semanage -i, and it will take about the same amount of time as a single fcontext with semanage fcontext -a.

On the design:

The fcontext resource code could do basically the same thing as the package resource: allow passing an array of file/directory names, and an array of secontexts.

That said: I actually already had a solution based on the selinux cookbook. That cookbook used arrays of attributes to manage booleans, and I extended that concept to also work for fcontexts and ports. Sadly, my PR for that improvement was rejected. When I wanted to switch to the selinux_policy cookbook, I ran into the performance issue.

Another way would be to model this batch after the multipackage cookbook.

Use one resource just to collect the requests.

Use a different resource (or multipackage actually uses the same one) to check all of the requested fcontexts against what's already there, and then - if necessary - install all the missing ones at once with semanage -i.

You are right about caching. My design suggestion would solve that. You'd retrieve the fcontexts only immediately before generating the input for semanage -i.

@BackSlasher
Copy link
Contributor

Well, if you have a concrete suggestion for batching in some way, even
syntax description, I'll be happy to read it.
I don't think we can achieve something like that without doing some Ruby
dark magic, or making one ugly resource, but let's see.

Nitzan Raz
http://backslasher.net

On Thu, Nov 3, 2016 at 10:46 AM, Kevin Keane notifications@github.com
wrote:

Yes, I see the point. It's a common situation in software development: you
have a very nice design, but then in the end you find that it isn't all
that performant, and you have to tweak it. Another example where the same
problem surfaced was multipackage support in the package resource, as well
as the multipackage cookbook. Also a situation where the gospel CM way
ended up extremely slow. Same thing with video drivers. You could have the
software draw every pixel individually, but the more smarts you put in the
driver (and the hardware) the better performance you get.

Your analysis on why it is so slow is close to correct. It does seem to be
the "load/save" that takes forever. But it's not quite that. selinux works
based on transactions. The loading and saving is actually fast, but
starting and finishing a transactions is extremely expensive. That is why
semanage -i even exists: everything you submit through that call is
processed in a single transaction. You can submit 100 fcontexts all at once
with semanage -i, and it will take about the same amount of time as a
single fcontext with semanage fcontext -a.

On the design:

The fcontext resource code could do basically the same thing as the
package resource: allow passing an array of file/directory names, and an
array of secontexts.

That said: I actually already had a solution based on the selinux
cookbook. That cookbook used arrays of attributes to manage booleans, and I
extended that concept to also work for fcontexts and ports. Sadly, my PR
for that improvement was rejected. When I wanted to switch to the
selinux_policy cookbook, I ran into the performance issue.

Another way would be to model this batch after the multipackage cookbook.

Use one resource just to collect the requests.

Use a different resource (or multipackage actually uses the same one) to
check all of the requested fcontexts against what's already there, and then

  • if necessary - install all the missing ones at once with semanage -i.

You are right about caching. My design suggestion would solve that. You'd
retrieve the fcontexts only immediately before generating the input for
semanage -i.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#54 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AG3Hexrgz0ZEw3jIu7NK7HMmBZgN_Itxks5q6Z9IgaJpZM4KoEi4
.

@kkeane
Copy link
Author

kkeane commented Nov 3, 2016

This idea is taken from https://supermarket.chef.io/cookbooks/multipackage . I think there are also some useful implementation ideas in that cookbook; it uses an accumulator pattern.

selinux_policy_multi "some contexts" do
  file_specs [ "/firstpath", "/secondpath" ]
  secontexts [ "context1_t", "context2_t" ]
  file_types [ "a", "a" ]
  action :addormodify
end

selinux_policy_multi "some ports" do
  ports [ "80", "443" ]
  protocols [ 'tcp', 'tcp' ]
  secontexts [ "context1_t", "context2_t" ]
  action :addormodify
end

And similar for booleans

# not providing any args will install all the collected fcontexts, ports and booleans
selinux_policy_multi

@BackSlasher
Copy link
Contributor

I like Lamont's work but this multipackage thing seems hella ugly.
A better alternative IMO is define some resources with action :nothing, and have a transaction resource that collects them and runs them, something like:

s1 = selinux_policy_port '5678' do
    action :nothing
    protocol 'tcp'
    secontext 'http_port_t'
end

s2 = selinux_policy_fcontext '/var/www/moodle(/.*)?' do
  action :nothing
  secontext 'httpd_sys_rw_content_t'
end

selinux_policy_transaction "everything" do
  actions ({
    create: [ s1, s2 ]
  })
end

Not sure how complicated it'll be to implement though.

@kkeane
Copy link
Author

kkeane commented Nov 5, 2016

I think there should be several considerations here. For one, optimizations often tend to be ugly. It's the nature of the beast. And to some extent, it is also a matter of taste. I find multipackage quite neat, although I could imagine two improvements (both of which may end up not having much value in real life).

I would argue for a multipackage-style solution for several reasons:

  • consistency. Multipackage is widely used, and the paradigm is familiar. Coming up with yet another paradigm may, by itself, be cleaner, but in the end the proliferation of paradigms is ugly in its own right. Reminds me of this: https://xkcd.com/927/
  • Multipackage was implemented this way in part because, if I understand it correctly, it allows placing the transaction anywhere, including at the beginning of the chef run. It seems to me that your approach would be sequence-dependent? Being able to install all packages - and also to update all fcontexts etc. - at the beginning of a chef run can be highly beneficial.

If I am right about your approach being sequence dependent, it could only be used within one cookbook. For that, the much simpler suggestion of expanding the resource to accept multiple path/fcontext pairs (much as package now accepts multiple package names) would be much more straightforward.

  • I haven't tried it, but I believe multipackage would allow not just installing, but also uninstalling in the same transaction.

If you do go for your proposal, how about changing the action from ":nothing" to ":collect"? A second suggestion could be to add a chef handler that completes the transaction at the end of the run (that would deal with sequence-related issues).

@BackSlasher
Copy link
Contributor

Widely used? Iunno, doesn't seem so popular according to my quick checks. I'll admit I'm not working with Chef so much, so I may be out of sync.
My approach isn't sequence driven (I think), since the actual action only happens in convergence phase. You can go crazy and do something like

selinux_policy_transaction "all" do
  actions ({
    create: ["selinux_policy_fcontext[/var/www/moodle(/.*)?]"]
  })
end

# other stuff

selinux_policy_fcontext '/var/www/moodle(/.*)?' do
  action :nothing
  secontext 'httpd_sys_rw_content_t'
end

Just like with notifications. Looks weird to me that you'd want to do that, but whatevs
RE single cookbook - don't think this is a technical constraint, as you can always use resource() to locate other resources or refer to them using their names. however, I do think this should be constrained to a single cookbook on the human level, to allow readability. These should be topical and not cluster into "all of the SELinux actions in the Chef run", unless you explicitly want to do it, which then you can do with some fancy resource() tricks and it sounds bad IMO (e.g. if you determine what to do with some fcontext, and then a package is installed and overwrites it, you're going to have a bad time).

To sum up, I don't like the idea of collecting these resources implicitly (action :collect or Chef handler). Trying to work around the need to explicitly specify the timing for your SELinux settings and risking unintended edge cases (cookbook a applies resources from b by accident) sounds lazy and dangerous to me.
If this transaction thing will roll, it should be explicit AF IMO.
I think I might be coming off a bit angry, so have no fear. At least we have an interesting idea here, that might be worth expanding to a Chef resource pattern

@kkeane
Copy link
Author

kkeane commented Nov 6, 2016

Yeah, I completely hear you about the drawbacks - there seem to be advantages and drawbacks on both sides, and reasonable minds can disagree on what's more important. I just like the consistency in the pattern, but your points have merit, too. And in the end, it's your call.

What I care about most is solving the performance problem. Exactly how to do that is secondary - it seems like we have at least three different designs floating around, which is a lot better than having no ideas!

@BackSlasher
Copy link
Contributor

So, to sum up - I'd gladly read bits of code if you (or anyone) wanna slide some my way.
I'm not a big fan of anything implicit though - we'll need to have all of the "collection" explicit as much as possible to avoid any potential side effects.

@BackSlasher
Copy link
Contributor

Should be solved by #60

@kkeane
Copy link
Author

kkeane commented Nov 24, 2016

I didn't have a chance to test it yet, but I don't see how #60 would address the issue. The slowness does not occur during restorecon, but rather is due to semanage fcontext being called frequently, once for each selinux_policy_fcontext resource.

semanage fcontext basically does three things: open an selinux transaction, update the fcontext database, and close the selinux transaction. Updating the fcontext database is actually very fast, but the transactions operations are very slow.

@BackSlasher
Copy link
Contributor

Yeah I got confused again, sorry. Will not be solved by #60

@xorima
Copy link
Contributor

xorima commented May 11, 2019

Closing due to inactivity.

If this is still an issue please reopen or open another issue. Alternatively drop by the #sous-chefs channel on the Chef Community Slack and we'll be happy to help!

Thanks,
Sous-Chefs

@xorima xorima closed this as completed May 11, 2019
@lock
Copy link

lock bot commented Aug 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

5 participants